The Startup Genome

Hidden within Siddhartha Mukherjee's magisterial work, The Gene, is an incredible quasi-equation about the interplay between the code of humanity (the genotype) and the physical expression of that code (the phenotype). Basically, it is not a one-to-one mapping: code != reality. This is because the phenotype is shaped not only by the genotype but also by the environment the person is in, by triggers, and by random chance, hence:

phenotype = genotype + environment + triggers + chance

It seems plausible to port this into the realm of early-stage companies, where it can support the discussion around using quantitative tools for private investment. Here we can think of the phenotype as the eventual outcome of the venture, the genotype as the attributes of the company, and the environment as the competitive and broader financial landscape, with triggers and chance keeping their usual meanings.

Machine learning tools for venture investment selection really only use the genotype of the deal (and perhaps the environment). This lets us see the inherent limitation of quantitative tools in private investment (as Chamath Palihapitiya recently quipped on Twitter, to paraphrase: it's hard to use Excel to predict the future): the phenotype also depends on triggers and chance, which such tools cannot observe in advance.
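
To make the limitation concrete, here is a toy simulation of the quasi-equation. It is only a sketch under loud assumptions: the four terms are independent, each is drawn from a standard normal distribution, and they are weighted equally. None of that comes from The Gene or from real deal data, and the function names are made up for illustration. Under those assumptions, a model that sees only the genotype can explain about a quarter of the variance in outcomes, and one that also sees the environment about half.

```python
import random


def simulate_ventures(n=5000, seed=0):
    """Draw synthetic ventures from phenotype = genotype + environment + triggers + chance.

    Assumption (illustrative only): all four terms are independent
    standard normals with equal weight.
    """
    rng = random.Random(seed)
    genotype, environment, phenotype = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)   # attributes of the company
        e = rng.gauss(0, 1)   # competitive and financial landscape
        t = rng.gauss(0, 1)   # triggers, unobserved at investment time
        c = rng.gauss(0, 1)   # chance, unobserved by definition
        genotype.append(g)
        environment.append(e)
        phenotype.append(g + e + t + c)
    return genotype, environment, phenotype


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


genotype, environment, phenotype = simulate_ventures()

# For a one-variable least-squares fit, R^2 is the squared correlation.
r2_genotype = pearson_r(genotype, phenotype) ** 2

# Combine genotype and environment into one equally weighted score,
# which matches how the synthetic outcome was generated.
combined = [g + e for g, e in zip(genotype, environment)]
r2_genotype_env = pearson_r(combined, phenotype) ** 2

print(f"R^2 using genotype only:          {r2_genotype:.2f}")
print(f"R^2 using genotype + environment: {r2_genotype_env:.2f}")
```

The exact split is an artifact of the equal-variance assumption. The point is only that whatever share belongs to triggers and chance sets a ceiling on what any model built on the genotype can predict.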

But this hasn't stopped medicine from turbo-charging genomics to do what it can to eradicate disease. So shouldn't we do what we can (build databases, ML models, graph theory, etc.) to understand the startup genome as best we can, so as to remove as much uncertainty as possible? Obviously I think the answer is yes, which is why we've built and deployed ML models to support our investment decisions.