Step 1: Capture your likeness and create a photorealistic digital avatar
Step 2: Train an LLM on your entire message history
Step 3: Put your glasses on and talk to yourself.
New technologies allow us to do more, in new ways, in new places. Renowned computer historian Paul Ceruzzi notes that the emergence of a new computing platform “is as much the result of social and political negotiation among a variety of groups as it is the natural emergence of the most efficient or technically best design”.
The introduction of a new technology comes with the same series of questions and objections: “What’s it for?” “What’s the first application?” “It looks like a toy.” “I can basically already do this with my current technology.” Answering these questions, and successfully deploying a new general purpose computing platform, requires a collection of solutions. We need people to recognize the new technology, via metaphor. We need the platform to allow any creative developer to build their application. And we need the components of the platform to offer applications that could never have been built with the previous technology.
Platform shifts are rare, but when they happen they re-architect the entire industry. Understanding how they happen helps us create, deploy, and thrive in a new, more capable future.
Metaphors
Metaphors give us a way of recognizing the future. Even back in the 1830s, Charles Babbage saw the value of metaphor in introducing a new technology, calling his invention the “analytical engine”. More recently, Steven Levy points out in Insanely Great, the founding story of the Mac, that metaphor was the “key to making computers comprehensible”. Steve Jobs himself used metaphor to introduce the value of the personal computer, calling it a “bicycle for the mind”. The metaphor that made computers truly comprehensible was the “desktop” metaphor created by Alan Kay and the team at Xerox PARC. (PARC was given the directive to examine the implications of a “paperless office”, something important to Xerox, so it makes some sense that they came up with the desktop metaphor and “printing digital bits” using WYSIWYG - what you see is what you get.) People were familiar with the concept of a desktop and could envision how it would work, and importantly, what it could be used for.
Design teams looking to build the next platform beyond the PC had a confusing mix of metaphors. In part this was due to the anchoring of the personal computer, the overwhelming success of the iPod, and the creative destruction of the digital camera. These influences on the eventual smartphone industry were strong: Andy Rubin has said that Android was originally conceived as a better OS for digital cameras.
This led to a number of confused metaphors (and Frankenstein-like product mutants): the N91 “music phone”, the N-Gage “game-deck phone”, and the N95 “multimedia computer phone”. Complexity theorist and technology historian Brian Arthur calls this “combinatorial evolution”.
Since the late eighties, the idea that seemed to dominate was the “extended desktop”. The metaphor came in many forms: “handheld computer”, “pocket PC”, “multimedia computer”, “personal digital assistant”. The PC-in-your-pocket metaphor is still in use today, in the common “supercomputer in your pocket” idea. I would argue that the word computer is too ambiguous to be useful and contributed to these devices (prior to the smartphone) not generating a new platform shift.
This metaphor class was thrown out abruptly in January 2007. On stage, Steve Jobs said explicitly that Apple was “reinventing the phone.” In a NYT interview Jobs was again explicit: “these are more like iPods than they are like computers.” There were many commercial drivers for this too; the iPod was a blowout success, and in anchoring consumers to the iPod, Jobs introduced the iPhone with a success halo.
Jobs also made a very deliberate choice in introducing the product, famously announcing three products in one: a “widescreen iPod with touch controls, a revolutionary mobile phone, and a breakthrough internet communications device” (nothing about computing.) In this way, repeated so often it became a joke for the audience, Jobs was pounding the table with the idea that this one device was a new, complementary, general purpose platform.
Metaphors allow us to recognize the future. They are a necessary simplification of something new that allows for faster and broader adoption. A metaphor anchors the developer to some understanding of purpose, and at the same time hints at a new opportunity. Platform shift step 1 done.
Samir Kaji from First Republic, who often tweets the early stage LP perspective, expects 2,000+ seed funds by the end of 2020. Now (good) founders can be more selective about angel investors. So what makes an angel investor valuable (other than respect, decision speed, and attractive deal terms)?
1. Talent networks - hire better people
2. Capital networks - remove financial risk
3. “Advice” - work smarter
Talent Networks
Hiring is among the most critical tasks for a founder, especially at the very early stages. A good angel investor has a broad and active talent network with which to provide personal referrals to the company, particularly at the senior level. The reason this is listed first is because of the non-linearity of the benefit: an exceptional hire brings her own talent network, and is a very strong signal to the undecided candidate pool. One exceptional hire becomes three, becomes ten, becomes fifty.
Capital Networks
You need money to hire (generally). Fundraising is consistently one of the most draining things a founder does, mostly because it feels like time not spent on core company development, which can be particularly frustrating. Personal referrals to Partners at target funds (who trust and value the referral) short-circuit this process.
“Advice”
Before you read the following, please be aware I’ve never owned a Patagonia vest. Angel investors’, and more pointedly VCs’, “advice” is sometimes a simple truism or, at worst, an overfit generalization from a single sample. This is partly a result of the statistic mentioned at the top combined with the universal megaphone of Twitter. Harvard Law School professor Yochai Benkler has coined a phrase for this: the Babel objection, where having “too much information” devolves into a “cacophony of noise”.
So, what do I think is helpful [1]? Getting clarity on how to align your limited resources towards a future goal. Why is this valuable? 1) You have to clearly define your future goal (harder than you think), and 2) you create a strategy for achieving your goal. (Also, you’ll notice that the first two categories of value, talent networks and capital networks, are helpful in increasing these “limited resources”.) I did not come up with the above phrasing myself; it’s taken from Pulitzer Prize-winning author John Lewis Gaddis’ book On Grand Strategy.
Notes
[1] Bonus points for identifying the irony of the author referencing the “cacophony of noise” from too many people offering advice, and then proceeding to offer his advice.
“Gayford is in effect recounting the fall of Paris as the adjudicatory centre, the supreme court, of modern art. From the Impressionists, to Cézanne, to Matisse and Picasso, Paris ruled.”
The FT on Martin Gayford’s ‘Modernists and Mavericks’
In 1969 Francis Bacon, while in London, created “Three Studies of Lucian Freud”. In 2013 Christie’s would sell the work for $142m. The sale was a commercial representation of the social and economic network that had been built in London’s modern art world since 1940. At that time, however, Paris, not London, was considered the “supreme court” of modern art. Read from the perspective of a participant in the developing New York tech ecosystem, “Modernists and Mavericks”, which tells the story of how London created the social and economic network required to pull the center of gravity away from Paris, provides an interesting parallel to what’s happening today between SV and New York.
Shifting the center of gravity of something so deeply geographically centered doesn’t happen often. It doesn’t happen often because the geographic center is a physical instantiation of the underlying social and economic networks present within that geography. To examine this, we require the tools of modern network theory. For example, the below diagram represents a selection of the most influential technology companies and associated investors of the last 6 decades. It is clear that a network representation is essential to our understanding of how this (and these) ecosystems develop.
Visually, the overwhelming degree centrality (number of links incident on a node) of both Fairchild Semiconductor and KPCB is clearly evident (indeed the K in KPCB was an employee of Fairchild). It is also clear how this node can influence the ecosystem for decades: Fairchild is first degree connected to Apple, KPCB, Intel, Sequoia, and second degree connected to Google, Amazon, and Netscape.
The influence of the network is particularly exacerbated in the venture industry given the presence of the positive feedback loop: Sequoia invests in a company, better talent is attracted, better guidance is provided, less competition, the investment outperforms, other founders want to be associated with Sequoia, they pitch Sequoia, Sequoia sees better deals. In network theory, this resembles the idea of positive assortativity: relatively high-degree nodes have a higher tendency to be connected to other high-degree nodes. This is exactly why the Power Law is the topic of choice for VC dinner conversations.
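As a concrete (toy) illustration, here is how degree centrality and degree assortativity can be computed with the networkx library, using only the handful of links named above as the edge list (the real diagram has far more nodes and edges):

```python
import networkx as nx

# Toy edge list built only from the links named above
G = nx.Graph()
G.add_edges_from([
    ("Fairchild", "Apple"), ("Fairchild", "KPCB"),
    ("Fairchild", "Intel"), ("Fairchild", "Sequoia"),
    ("KPCB", "Google"), ("KPCB", "Amazon"), ("KPCB", "Netscape"),
])

# Degree centrality: share of possible links incident on each node
print(sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1])[:3])

# Degree assortativity: do high-degree nodes tend to link to other high-degree nodes?
print(nx.degree_assortativity_coefficient(G))
```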
These networks are ‘easy’ to generate in a historical, static capacity. What’s more interesting (and difficult) is to generate them in a dynamic, contemporary way that may enable us both to forecast the development of an ecosystem in real time and to understand the requirements for a new ecosystem to flourish. For SV, once Fairchild gained its incredible concentration of talent, and then that talent dissipated, the ecosystem exploded. Such is the present hope for New York tech.
“People feel that it is very important for artists to have an aim. Actually, what’s vital is to have a beginning. You find your aim in the process of working. You discover it.” - Bridget Riley, “Modernists and Mavericks”
Many people have pointed out the irony that venture funds invest in the most cutting edge technology yet still operate with Excel, QuickBooks, and "gut feel". Given the current awareness of the value of data and the rise of machine learning, is there an opportunity for technology to radically alter the venture landscape like it did for the public markets in the 80s?
At my previous fund, Hone Capital based in Palo Alto, we thought so. We built machine learning models to enhance the GP decision in investing in over 350 early stage technology companies. You can read more about our approach in McKinsey Quarterly.
In this post I first want to look at quantitative approaches in the public markets, how the market structure influenced these strategies, and why these don't directly translate to the private markets. I'll then look at how the private market architecture is changing and how that might present new opportunities for quant strategies in VC.
The goal of traditional investing, put simply, is to find undervalued companies. Traditional investors have always used data (well maybe I shouldn't say always; Benjamin Graham introduced the idea of an "investment policy" rather than speculation and hoped to implant in the reader of his canonical "The Intelligent Investor" a "tendency to measure or quantify.") The traditional investors' data consists of revenue, margins, growth rates, etc.: metrics I call 'primary' to the company.
Perhaps inevitably, the data used by traditional investors is growing. Now the term 'quantamental' is used for those using "alternative data" like satellite images, app download numbers etc. This is, however, still using data (albeit new forms of data) to achieve the same goal: identify undervalued businesses.
It's important to note, when translating public quant strategies to VC, that quant hedge funds don't use data to enhance the traditional goal. They created an entirely new goal - find repeatable, short term statistical patterns - and have profited handsomely, to say the least. In effect, the quant strategies grew out of the "data exhaust" of traditional investing.
The architecture of public and private markets is very different. Below is an examination of the elements of the public markets that led to the development of the quant strategies that have been so successful. To make the comparison clear with the private markets (since our goal here is to explore opportunities for quant VC) I also list the challenges of translating the public market strategies to the private market.
Public market: Short. The ability to trade within minutes (seconds, microseconds etc.) allows the quant hedge funds to isolate a statistical driver of profit. This is originally how Renaissance Technologies started: they "predicted" the very short term response to a macroeconomic announcement (non-farm payrolls, consumer price index, GDP growth etc) by analyzing a huge (analog) database of how securities responded to those announcements in the past. By trading in and out within minutes, no other exogenous factors influence the security response in that time (theoretically.)
Private market: Very Long. Venture investments have a long time horizon and the investment is extremely illiquid (this is changing now to some extent with companies like EquityZen, SharesPost and Equidate - but even still these are mostly for later stage secondaries and still only offer "trade" windows of months at least.) The long time horizon between opportunities to exit an investment means that many exogenous, unplanned and unpredictable factors undermine the potential for statistical patterns to provide any alpha. These exogenous factors lead to an exponential decay in the accuracy of any forecast over time.
Public market: Difficult. Quant hedge funds compete on identifying a "signal" with which to trade. RenTech famously released one signal they had found - fine weather in the morning correlated with an upward trend in the stock market in that city. However, the trend wasn't big enough to overcome transaction costs - so they released it publicly. The point here is that the quant hedge funds have an almost unlimited amount of data to mine: intra-second, machine readable, tick-by-tick price movements on thousands of securities and derivatives all over the world.
Private market: Difficult. By definition, private companies keep information private. There are no tick-by-tick data libraries of valuation movements (indeed valuations only move in discrete steps.) There is also a limited historical set of information on startups (changing rapidly thanks to PitchBook, Crunchbase and CBInsights.) This means that if you want to build quant models for the private market you need to get creative (beyond PitchBook, Crunchbase etc.) For example, although it was still for their public quant hedge fund, Winton released a blog post systematically examining a company's proclivity to register its domain name in different countries and whether that could be a signal for the competence of the company's technical leadership. Systematic feature extraction also seems to be the original direction of Google Ventures when they discussed strategy more publicly at launch back in 2013.
Public market: Rich. Long historical record of continuous, machine readable and easily accessible data.
Private market: Sparse. Various sources offer incomplete data (missing data, missing funding rounds, conflicting reports etc.) The data is often not easily machine readable: the name of a single fund might appear as First Round, First Round Capital, FRC, Josh Kopelman, or FirstRound. Extensive data cleaning is required (not to say this doesn't happen at the quant hedge funds, but less is required given the maturity of the data market for quant funds.)
Public market: Easy. The public markets are continuous and liquid. One only needs to identify a signal, and the expression of that signal is trivial (assuming liquidity is not an issue, which it most certainly is in some high frequency trading scenarios.)
Private market: Hard. Even after the data is acquired, cleaned, made machine readable and a signal is found, the venture investor has to identify a new opportunity that matches that signal and also win access to that deal. Whereas the public markets are liquid and freely accessible, the private markets, again almost by definition, require 'permission based access.'
Public market: Low. The CIO of a $30b+ quant hedge fund once told me that if a signal is >50.1% accurate it is in play. The only way this works is if there are thousands of possible trades using this signal. Invoking the Law of Large Numbers, a 50.1% signal becomes profitable. So the extremely high number of possible trades (given the highly liquid, global, permission-less public market architecture) makes it a lot easier to identify a signal that can be used. (A quick sketch of this arithmetic follows the private market comparison below.)
Private market: High. In contrast, a venture fund has to have some concentration in portfolio companies to achieve the traditional 3.0x+ ROIC that LPs expect. This translates to a low number of portfolio companies (to build and maintain high ownership.) A low number of "trades" therefore requires a highly accurate signal, much greater than 50% (a false negative in venture is very bad). One could push back on this, however: in trying to develop a benchmark for our ML models, I once asked a Partner at Sequoia what he considers a "high" seed to Series A conversion rate. His answer: 50%. I've explored the mathematical dynamics and efficacy of a larger, index-style VC fund here.
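Here is the promised sketch of the Law of Large Numbers point: a stylized model (my own toy model, not how any fund actually trades) in which each trade wins or loses one unit with a 50.1% win rate, showing how the edge compares to the noise as the number of trades grows.

```python
import numpy as np

p = 0.501                                   # signal accuracy
for n_trades in [1_000, 100_000, 10_000_000]:
    edge = n_trades * (2 * p - 1)                        # expected profit, in units
    noise = np.sqrt(n_trades * (1 - (2 * p - 1) ** 2))   # one standard deviation
    print(f"{n_trades:>10,} trades: expected {edge:>8.0f} units, +/- {noise:>7.0f}")

# With ~1,000 trades the 0.2% edge is swamped by noise; with millions of trades
# (liquid, global, permission-less markets) the edge dominates. A venture fund
# making 25 'trades' per fund can never average away a 50.1% signal.
```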
So how do we address these limitations of the private markets? Well the early stage private market architecture itself has been changing.
In an almost Soros-like reflexive loop, the number of deals done and the amount of data on these deals has dramatically increased over time.
The graph below (from PwC Moneytree) shows the number of deals done (line) and amount of capital deployed to 'seed stage' deals from 1Q02 to 1Q18. It shows a 30x increase in the number of seed deals and a 40x increase in capital deployed at seed.
Over that time, companies were created to operationalize the data generated from these deals: PitchBook (2007), Crunchbase (2007), CBInsights (2008), AngelList (2010). Over this exact same timeframe, LinkedIn profiles grew from 10 at launch in 2003 to over 500m today (the same trend can probably be seen in AngelList, Twitter, and ProductHunt profiles.) This increase in the number of deals and the data available (on those deals, founders, and other features) means more training data for machine learning models. The amount and quality of data will only increase.
Liquidity has substantially increased in the early stage private market for three reasons: incredible proliferation of 'micro-VCs', mega-funds (SoftBank et. al.) and new secondary market options. Senior Managing Director at First Republic (and great VC blogger) Samir Kaji recently mentioned in a post they are tracking close to 650 micro-VCs and characteristically offers some insight into how this growth might influence the market architecture going forward.
Given this dispersion of ownership interests in companies among many distinct funds, like Samir I find it very likely that consolidation will occur - and potentially the development of a new market - something I'm calling Synthetic Liquidity. This would be when a fund sells their ownership position prematurely (like at Series A or B) and the buying fund pays the selling fund carry. Obviously the selling fund is forgoing potentially lucrative upside but they are buying quick liquidity. I see AngelList being very well positioned to be the intermediary here.
Shortening the time to realization may make some quant strategies viable (like using ML to forecast 1 - 2 years out rather than 10.) The idea here is that the early stage venture market is in a period of realignment which may introduce opportunity for new quant strategies.
Not much needs to be said here about the other secondary options available, just that they have contributed to the changing landscape of liquidity in the private market: SharesPost (2009), EquityZen (2013) and Equidate (2014) and also that the innovation and growth here unfortunately seems to have plateaued.
Public quant funds used the data exhaust of the traditional investors as their fuel to build incredibly successful funds. The data exhaust here is the time-series price fluctuation in security markets. Quant funds ran statistical models on these time-series data sets and identified repeatable statistical patterns that became the foundation of their fund.
So what is the data exhaust in venture? "Secondary" information about funding rounds: Who were the investors in the round? Who was the lead investor? Were they new investors or follow-on investors? What industry is the company in? How much VC funding has gone into this industry this year? What is the growth of VC funding in this industry this year? The number of VC deals in this industry? Location of the startup? Schools of the founders? Are they first time founders? etc. etc.
The above data can be ripped from Crunchbase, AngelList, PitchBook, CBInsights, LinkedIn, SEC EDGAR, PwC MoneyTree and many other creative sources.
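Purely as an illustration (the field names and values below are hypothetical, not a real schema from any of these sources), a single record of this data exhaust for one funding round might look like:

```python
# Hypothetical feature record for one funding round, stitched together from
# sources like Crunchbase, AngelList, PitchBook, LinkedIn and SEC EDGAR
round_features = {
    "round_stage": "Seed",
    "lead_investor": "Example Ventures",          # hypothetical name
    "investor_count": 5,
    "share_new_investors": 0.6,                   # vs. follow-on investors
    "industry": "Enterprise Software",
    "industry_vc_funding_ytd_usd_m": 1200.0,      # funding into the industry this year
    "industry_vc_funding_growth_yoy": 0.35,
    "industry_deal_count_ytd": 310,
    "hq_location": "New York, NY",
    "founder_schools": ["Example University"],    # hypothetical
    "first_time_founders": True,
}
```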
Part of the benefit of examining the efficacy of quant approaches to VC (and indeed of building ML models to support VC investment decisions) is that it forces an examination of the way we currently understand the venture business. Here we've systematically analyzed the architecture of the public and private markets, which I think greatly helps us understand, in Turing's words, "how we think ourselves."
Today, there is a question of how much data could or should be used in venture. In High Output Management, Andy Grove said "Anything that can be done, will be done, if not by you then by someone else." I believe leveraging data in the venture investment process can be done. Building a fully standalone quant fund may still be some years off; the reason, I believe, lies in the developing architecture of the private market. The 'electronification' of the public markets in the 80s greatly enhanced the ease with which quant strategies could be built and deployed there, and we are seeing an equivalent "datafication" of the venture business today.
I believe the near future is quantamental VC funds. This is already starting to be realized: Sequoia has a data science group, Social Capital (had) a Head of Data Science, and many other funds are not public with their data efforts in the hope of maintaining a competitive advantage. The combination of the relentless growth of available data, the changing architecture of the early stage market, and the extreme need for differentiation given the explosion of funds will, I believe, lead to inevitable innovation in the venture industry over the coming years.
This industry is incredibly interesting not only because databases are the core of every data strategy (and data is becoming the core of every strategy) but also because the business model is itself still in question. With Benchmark's recent Series A lead of Timescale, a time series database packaged on PostgreSQL, and MongoDB's 2017 IPO, I thought it might be a good time to dig in a little to the industry, companies, business models and prospects for startups.
Primer: Ajay Kulkarni, Co-Founder and CEO of Timescale has a great post on the history of SQL versus NoSQL and where the industry is heading today.
In looking to get context on growth rates, margins and revenue targets for new open source database startups, I looked at the IPO filings of 4 recent open source enterprise companies (Hortonworks, MongoDB, Pivotal and Cloudera) to understand what "success" looks like - at least what Wall St. is expecting should the companies make it to IPO. Importantly this influences the internal growth targets and financial forecasts of new startups.
Open source database companies typically generate revenue through subscription agreements (for enterprises to use their software for commercial purposes) and through professional services. Red Hat was able to build a multi-billion dollar business through services revenue alone (but this is most likely due to the fact that they were part of the generational shift from UNIX to Linux.) Although the code is open source, companies implement an "open-core" model where the core of the code is open source but enterprise grade features require a license.
The below tables show the Subscription revenue and gross margin at IPO (Yr 0) and growth rates and margins for the 2 years preceding the IPO. We can use the table to generate a Subscription revenue target of $150m at 85% margin (and $30m Services revenue at 15%) at IPO for our fictional new open source database company.
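A quick sanity check on that target, using only the numbers just quoted (illustrative arithmetic, not taken from any filing):

```python
# Blended economics at IPO for the fictional open source database company above
subs_rev, subs_margin = 150.0, 0.85   # $m subscription revenue, gross margin
svc_rev, svc_margin = 30.0, 0.15      # $m services revenue, gross margin

total_rev = subs_rev + svc_rev
gross_profit = subs_rev * subs_margin + svc_rev * svc_margin
print(f"Total revenue ${total_rev:.0f}m, blended gross margin {gross_profit / total_rev:.0%}")
# -> Total revenue $180m, blended gross margin 73%
```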
The above graphs show why open source database companies are moving away from relying on services revenue as a key driver. Both the growth rates and margins are unreliable: the average services revenue growth rate at IPO was 24% at a gross margin of only 13%.
The amount of capital required to go public (or just build something of value) depends very heavily on the industry of the company. This is why software businesses have allowed for another golden age of VC (there are obvious exceptions however; according to Crunchbase, Snap raised $4.6b in VC funding.)
Below are graphs showing the amount of capital raised by each company before and after monetization. This distinction is important. The burn for engineers to build the database is small and as such the amount raised (and therefore founder dilution) is low. However, during the monetization phase significant amounts of capital are required. This money goes into the sales and marketing process for the new database to compete against Oracle, Amazon and the other open source solutions available.
The practical implication of raising this much (MongoDB's Series F was $150m, Cloudera's Series F was $740m - Crunchbase) is that for this amount of capital to be invested the valuation must reach commensurately high levels. For example, if in one of these rounds the founders are willing to give away 20% of the company for $150m, the pre-money valuation must have been $600m. It would be interesting to see how this was justified, especially given monetization had only begun.
The shift back to SQL from the NoSQL sidetrack is going to offer massive potential for value generation. The question is who will benefit? AWS CEO Andy Jassy recently said on CNBC that Amazon Aurora (the scalable SQL database) is their fastest growing product - ever. And Google's leadership in the database community basically ushered in the move back to SQL (after they were the ones to usher in the first move from SQL to NoSQL.) Their Spanner product, given their reputation, is dominant.
But the startups are taking advantage of companies' growing wariness of perceived lock-in with larger vendors. Oracle has been the most damaging in this regard, and that wariness translates directly to Amazon and Google. Companies want strategic flexibility, especially given the mission critical nature of the database. They can only really get this by deploying a truly independent offering like those from Cockroach Labs, Timescale or the many other independent solutions. It is hard to bet against Amazon and Google. It will be very interesting to see how the future unfolds.
This question seems to be on the mind of many investors today. From SoftBank's giant $100b fund to the proliferation of hundreds of "micro-VCs", understanding the importance of fund size has become critical.
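The arithmetic, for a hypothetical $150m round sold for 20% of the company:

$$\text{post-money} = \frac{\text{amount raised}}{\text{ownership sold}} = \frac{\$150\text{m}}{0.20} = \$750\text{m}, \qquad \text{pre-money} = \$750\text{m} - \$150\text{m} = \$600\text{m}.$$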
In this post I want to provide some ways to formalize the fund sizing question for the emerging manager. Simply put, the returns of a fund are the summation of exited portfolio company valuations multiplied by the ownership of those companies at exit. Although the mathematical formulation of this is trivial (see below) I believe it helps in understanding the dynamics of fund size and ownership:
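In notation (one way to write the relationship just described):

$$\text{Fund multiple} = \frac{1}{F}\sum_{i=1}^{N} o_i \, V_i$$

where $F$ is the fund size, $N$ the number of exited portfolio companies, $o_i$ the ownership held in company $i$ at exit, and $V_i$ its exit valuation.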
This equation shows us why ownership is so important for VC funds: if ownership is decreased, to maintain the same returns the exit value of the company must increase. However, given the power law of VC outcomes, this is exponentially more difficult.
This is shown below, with three hypothetical funds, $100m, $150m and $200m. Assuming average exit ownership of 3.0%, the graph shows the required exited market cap to generate various fund multiples. Again, moving up the y-axis is exponentially more difficult given power law VC.
In a recent post, Partner at Founder Collective, Micah Rosenbloom stated "it's easier to make money on carry if you make money on fees." To drive this home for the emerging manager, next we look at how carry dollars are influenced by fund size. Table 2.0 provides the graph above in table form (for example: with a $150m fund and 3.0% exit ownership returning 3.0x requires a combined market cap on exit of $15b.) Table 1.0 links to Table 2.0 and shows the carry dollars to the Partner group in each fund size and outcome scenario.
Here again, moving along the x-axis (higher exits) is exponentially difficult. But what is most interesting in the tables above is comparing the likelihood of scenarios. What is more likely: generating $4.2b of exits on a $50m fund or $7.5b on a $150m fund? Both generate $15m in carry for the Partner group. Sounding like a broken record now (power law of VC), I would say the smaller required exit value is better (but, as is the game in VC, both scenarios are very unlikely.)
There are many factors that can alter the above calculus (initial ownership, ability to execute pro-rata, exposure to quality companies etc.) but hopefully this formalization can contribute to building the optimal fund given your specific circumstances and goals.
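The arithmetic behind these comparisons, assuming a standard 20% carry (the carry rate isn't stated above; 20% is the assumption that reproduces the $15m figures):

```python
def required_exit_cap(fund_size_m, multiple, exit_ownership):
    """Combined exit market cap ($m) needed to return `multiple` x the fund."""
    return multiple * fund_size_m / exit_ownership

def carry_dollars(fund_size_m, exit_ownership, combined_exit_cap_m, carry=0.20):
    """Carry to the Partner group ($m) on proceeds above returned capital."""
    proceeds = exit_ownership * combined_exit_cap_m
    return carry * max(proceeds - fund_size_m, 0.0)

print(required_exit_cap(150, 3.0, 0.03))   # -> 15,000 ($15b of exits for 3.0x on a $150m fund)
print(carry_dollars(50, 0.03, 4_200))      # -> ~15.2 ($50m fund, $4.2b of exits)
print(carry_dollars(150, 0.03, 7_500))     # -> 15.0 ($150m fund, $7.5b of exits)
```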
Note: Most material in the above post is taken from my introductory Fund Sizing Deck I use when consulting with emerging managers. For a copy of the full deck and/or the excel model please email building58ti@gmail.com.
A wise man once told me that a business is just a group of people. We have a tendency to think of some very successful companies as larger than life for many reasons (like the "taste the feeling" ephemeral joy of Coca Cola or the minimalist sophistication of Apple.) But history is littered with once vaunted companies that suffered leadership complacency or willful blindness in the face of change.
What I want to explore here, however, is the network element of the "group of people" description of a business. We will see how valuable the network was to leadership roles in new startups and to new investment opportunities for early VC funds. The 1950 - 1985 period of U.S. and Silicon Valley history is immensely rich, with network effects permeating everything from the birth of the "minicomputer" to the development of synthetic insulin.
There are incredible resources that explore in more detail the narratives behind these networks (see notes at end.) Below is a concise filtering of the key network elements in rough chronological order and a diagram for further effect.
Now, a couple points from the above network. There are a number of instances where new people were introduced to the network with limited connections or personal history (Atari founders to Valentine, Robert Swanson cold calling scientists etc.) So even though the power of networks here is self evident, there are exceptions for exceptional people [See the John Doerr - Amazon bonus note below.]
Second, working at Fairchild (or in any other part of this network) was not a necessary, nor sufficient condition for building a successful startup in this period. It took an intense amount of hard work and (probably) equal parts luck. But investors in startups often talk about reducing risk as much as possible. Perhaps being in this network was a tangible risk reducer.
Think about Valentine's investment in Apple. He didn't invest directly; he brought in Mike Markkula (whom he must have trusted given their professional relationship) because Valentine thought Jobs and Wozniak didn't have "any sense of the size of the market, they weren't thinking anywhere near big enough". Even after bringing in Markkula, Valentine only invested in the next round [1].
A few months ago I came across Jido Maps, a startup enabling persistent AR. While I'm not exactly sure yet what could act as an adoption catalyst for the developer community to build persistent AR apps (perhaps another Pokémon Go or equivalent?) I do believe that AR will inevitably transform the city environment.
But whereas traditional, physical advertising (and indeed every other physical object in the city) is limited in that it can only ever be one instance of a billboard, neon sign, or display in a shop window, with AR these advertisements can be individually targeted in physical space just as they are online today. This idea is not new; back in 2007 (!) Microsoft patented "Personal augmented reality advertising" - see below for an image from that patent.
Taking this to the extreme, we could be walking together through the city (with some sort of AR contact lenses - hopefully soon!) and our experiences could be very different. I'm interested in exploring what will happen in this "hyper-unique", fragmented cultural environment. We've already seen how dangerous this is online; will it be equally troubling in an augmented offline world?
The beginnings of this dangerous world were implicitly foreshadowed in "Cellular Convergence and the Death of Privacy" by Stephen B. Wicker, published by Cambridge University Press. The book describes the (now forgotten) narrative of how the explosive adoption of the smartphone (by users and developers) created a single point of failure for privacy (that is to say, one device had data on every aspect of your life.) What will happen when we have, as Steve Gu at AiFi phrased it so elegantly, pervasive, perceptual computing?
To me, the technology underlying this "pervasive, perceptual computing" has already been discovered, and the applications that could run on top of this platform could save lives, save money and entertain. It seems inevitable that it will be deployed and adopted (persistent AR, fragmented and hyper-personalized city environments and pervasive, perceptual computing.) In this world, therefore, the key questions are those being explored around data ownership and monetization.
From a technology perspective, the decentralization and self-sovereignty movements being enabled by blockchain technology provide an enticing potential solution to many of these problems. From a legal and social responsibility perspective, it's hard to look past the incredible content coming out of the Yale Law School Information Society Project. Despite having run for a long time, Jack Balkin's ISP seems perfectly timed for the questions we are raising today.
I don't know what the right answers are, but I am definitely interested in finding them.
Running the startup miner uncovers great startups almost every day (like Thread Genius, Trove and Lively.) Side note: you can see the updated list of mined, high potential startups here. I started to get curious whether some value could also be extracted from a 'meta-data' style analysis.
The output of the miner is the list of startups that were listed on AngelList the previous day per geographic region; I run it for NYC and SF Bay (note: you could also cut the newly listed startups by market, so you could scrape new blockchain, healthcare or AI startups from the previous day.) So now we can explore the raw number of startups listed on AngelList per week and the market composition of these startups. This could help to create a real time awareness of what founders are excited about and potential differences between regions.
To explore this idea, I ran these numbers for the week of 03/05/18 to 03/12/18.
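Here is a rough sketch of how that weekly cut can be produced from the miner's output (the CSV layout and column names below are my simplification, not the miner's actual format):

```python
import glob
import pandas as pd

# Assumed layout: one CSV per day of scraped listings with name, region, market, listed_date
frames = [pd.read_csv(path) for path in glob.glob("miner_output/*.csv")]
listings = pd.concat(frames, ignore_index=True)
listings["listed_date"] = pd.to_datetime(listings["listed_date"])

week = listings[listings["listed_date"].between("2018-03-05", "2018-03-12")]

print(week.groupby("region").size())          # raw listing counts per region
market_mix = (week.groupby(["region", "market"]).size()
                  .groupby(level=0, group_keys=False)
                  .apply(lambda s: s / s.sum()))
print(market_mix)                             # market composition within each region
```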
So we see approximately 1.7x the number of startups listed in SF Bay versus NYC. This is a much smaller gap than common wisdom suggests (given the prominent position SV holds in the tech community.) Indeed this is mirrored by data from PwC Moneytree. Taking the median of the number of startups and the amount of capital deployed to NYC and SF Bay seed stage startups (since this most closely approximates AngelList startups) over the last 2 years, we see 2.1x the number of startups in SV over NYC and 2.0x the capital deployed.
The market breakdown for the NYC startups is shown below:
It might be hard to make any meaningful observations from this data in isolation [1]. But there are some we can see clearly: Consumer dominates, and Blockchain and AI/ML/Data startups are low (based on my prior expectation.) But comparing with SF Bay will be most helpful. The market breakdown for SF Bay area startups for the previous week is shown below.
Here we can see a (nice, somewhat predictable) balance between Enterprise Software and Consumer startups (this may be representative of the "maturity" of SF Bay as a startup ecosystem.) Healthcare and Blockchain seem low and Education surprisingly high.
As common VC wisdom suggests, I think the (ongoing) market examination of these startup ecosystems will be helpful in a contradictory way: the best startups are often tackling markets that are not hot (home sharing, transportation, social etc.) and many resist, and in fact break, rigid data structures by virtue of being highly innovative (which is precisely the point.)
I'm looking forward to continuing this series (with more than just one week's worth of data!)
Notes:
[1] AngelList's UI allows users to write free form text for their market categorization when creating a new startup profile. If it matches a previous tag it autofills, but if not, a new market tag can be created. This makes it a little difficult to run analytics (some of the best market categories for scraped startups last week include: swimming, USA and livestock options.) So I created my own 'umbrella' market tags to consolidate the free form text tags. Disclaimer: obviously this could introduce distortion, but it is assumed to be negligible.
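A sketch of that consolidation step (the mapping below is hypothetical and far smaller than the real one):

```python
# Hypothetical mapping from free-form AngelList tags to umbrella market tags
UMBRELLA = {
    "cryptocurrency": "Blockchain",
    "bitcoin": "Blockchain",
    "machine learning": "AI/ML/Data",
    "artificial intelligence": "AI/ML/Data",
    "e-commerce": "Consumer",
    "saas": "Enterprise Software",
}

def umbrella_tag(raw_tag: str) -> str:
    """Map a free-form market tag to an umbrella tag; unknown tags fall into 'Other'."""
    return UMBRELLA.get(raw_tag.strip().lower(), "Other")

print(umbrella_tag(" Machine Learning "))   # -> AI/ML/Data
print(umbrella_tag("livestock options"))    # -> Other
```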
The timing of Uber’s launch may be another (less talked about) important element to its eventual success. According to Crunchbase, it was founded in 2009, superficially not the best year to start building a business, right after the biggest financial collapse since the Great Depression.
But for Uber’s business model the crisis may have in fact been beneficial. Uber relied on the assumption that it could be an attractive option for the ‘non-taxi driver’ public to start driving and earn a bit of money on the side (Uber even later centered a whole marketing campaign around promoting the “side hustle.”) At a time when the unemployment rate was 8.5%, having an ‘easy’ way to earn more money probably attracted many people to join Uber as drivers, and because their model is sticky (which I think it is), they were able to create a stable, large network of ‘contractor’ drivers for the Uber network. Indeed there are similarities with Airbnb, founded in 2008, which enabled people to list their spare space and earn a bit of “side hustle” income at a time when it was needed most.
This raises the question: how important are exogenous factors (stock market value, unemployment, social unrest, political climate, pop culture) in successful early stage technology company building? This isn't an altogether new question. In the wonderfully prophetic philosophical exploration of the potential of the Internet, Internet Dreams, published in 1997 (only $7.00!), Mark Stefik provides a passage on the "Gutenberg Myth" from Scott D. N. Cook:
"At the very least, this account of the printing revolution suggests that the traditional, one dimensional model of new technologies (or a single new material gadget) causing broad social change must be regarded with deep suspicion."
Cook introduces the beautiful phrase 'political and moral myopia' for revisionist history that weights the technological innovation too heavily in a whirlpool of social, political and economic upheaval (the revolutionary period in France and the United States created a social climate of equality leading to increased education, literacy and therefore demand for printing - which the printing press could accommodate.)
So what are the exogenous political, moral, economic factors most important today?
The benefit of this exploration is in the brainstorm and not the answer (even if there is one.) There are many ways to explore the above.
This website could be helpful in exogenous factor exploration: https://ourworldindata.org/
This Jeff Bezos quote has done the rounds for a while and for good reason (Bill Gurley and Collab Fund). It's equal parts obvious and contrarian. It has been in the back of my mind for a while and I think I finally understood why.
“I very frequently get the question: ‘What’s going to change in the next 10 years?’ And that is a very interesting question; it’s a very common one. I almost never get the question: ‘What’s not going to change in the next 10 years?’ And I submit to you that that second question is actually the more important of the two — because you can build a business strategy around the things that are stable in time. … [I]n our retail business, we know that customers want low prices, and I know that’s going to be true 10 years from now. They want fast delivery; they want vast selection. It’s impossible to imagine a future 10 years from now where a customer comes up and says, ‘Jeff I love Amazon; I just wish the prices were a little higher,’ [or] ‘I love Amazon; I just wish you’d deliver a little more slowly.’ Impossible.” - Jeff Bezos
For me understanding what doesn't change is almost half the equation. The other half is understanding where these things that don't change happen: the technological event horizon. OK yes that is super jargon but hear me out. It seems as though there are values that don't change and outcomes that don't change. We can look at this using personal transportation as an example:
Personal entertainment offers another example:
In both cases the value doesn't change and the outcome doesn't change, but where these things happen - the technological event horizon - does change. Even Jeff Bezos knows this. Amazon was built on people wanting wide selection at low prices, enabled by the thing that has changed: the internet.
Disclaimer: Even before I write this post I know it's going to appear very buzzwordy. I'm going to attempt to make it as practical and actionable as possible.
How do you know your idea for a company has potential? I'm not talking about superficial halo effect success metrics like VC funding, co-founder excitement or press. I mean how do you structure a rigorous examination of the potential of your idea?
I looked online briefly and didn't really find any compelling information. Google offered "10 Ways to Know if You Have a Good Business Idea" which seemed to be the online equivalent of a tax "expert" at a strip mall. First Round's new First Search tool offered Chris Dixon's "Why You Shouldn't Keep Your Startup Idea Secret," still not exactly what I was looking for. Sequoia has a brief but interesting post on "Writing a Business Plan" that is definitely worth checking out.
So it seems there may be room for some examination of the structured diligence that could be applied to the brainstorming session of a co-founding team. Below is that examination. It is still a work in progress but, like most helpful frameworks I think, it breaks the problem down into 3 main components: the thematic layer, the strategic layer and the tactical layer.
The math of venture requires funds invest in companies that have, as Sequoia puts it, "legendary" potential. These legendary companies generally take advantage of tectonic shifts in history. These shifts can be difficult to identify by incumbents since they have a financial incentive to believe in the stability of their value proposition. As 'The Sovereign Individual' points out "if you know nothing else about the future, you can rest assured that dramatic changes will be neither welcomed nor advertised by conventional thinkers...the tendency will be to downplay the inevitability of these changes." Famous themes that led to the development of legendary companies include:
Examples of decades-long themes that could form the core of a company today could be:
Resources that help with the thematic layer diligence of a company idea:
Assuming you are able to get funding, ship product, get customers... what is your company's competitive differentiation that allows you to keep those customers and what gives your company escape velocity? Is it internally building a proprietary database? First mover and product leader? Brand differentiation? Design differentiation? Interestingly in First Round's 2017 State of Startups only 5% of founders said they think they could fail because a "competitor outdid them."
Resources that help with strategic layer diligence:
Getting from 0 to 1: The hardest, loneliest stage. Where do you start, what are your priorities? How much runway do you have? Which verticals do you start with? Who do you know in VC? Who could you convince to join in the earliest stages? Who else is out there trying to do what you are doing (be honest)? Why are you a better team to build this company instead of them?
Resources that help with the tactical layer of diligence:
Legendary Sequoia Partner Mike Moritz recently penned an article in the Financial Times, as he is wont to do, titled "Silicon Valley Would be Wise to Follow China's Lead." It describes the culture of working at a startup in China; long hours, family sacrifices and a "furious" pace of work. This would be fine in my opinion, if it were a singular exploration of a very unique culture. By not only contrasting it with Silicon Valley, but also being prescriptive in saying we should "follow" the "lead" of China, Moritz misses a chance to focus on output as the main measure of success rather than the outward impression of how busy you are. The two cultures are so drastically different, a suggestion that one should simply follow the other seems sub-optimal.
Having said all that, the comparison of China and the U.S. technologically, culturally, intellectually, has been a long one. Now, in exploring output or value creation as the main comparison factor we see there are many elements of this story that are worth investigating:
In my view, a major story that will play out over the next decade is the internal conflict of technological regulation. It is both a necessity to maintain the core value of privacy and a potential hindrance to technological advancement. Will China's more lax regulations allow it to adopt and deploy autonomous vehicles quicker? Virtual reality as entertainment quicker? A.I. for medical diagnosis? For everything?
People have seen this coming. a16z has Partners who worked in the White House under Bush and Obama, and Tusk Ventures was recently created specifically to help startups "thrive in heavily regulated markets." It's going to be a fascinating decade.
Buy it, Cambridge University Press, Robert C. Allen
When you stop to think about it, the idea of using the past to help predict the future of technological development seems kind of self-contradictory. But there is obvious value in understanding the path of development. Below are some interesting insights from renowned historian Robert C. Allen on the British Industrial revolution that I think can be translated to the present, with the goal of shedding light on the highest potential that exists today for company builders.
Questions that arose after reading and reviewing:
Looking back on it now, I was unequivocally unprepared for the process of becoming a founder. I say this with the benefit of hindsight obviously, but also because in my current job after meeting with many founders, I’ve devoted more time to thinking about the “profile” of a “good” founder (even exploring whether we could build predictive algos around this profile). I put these terms in quotations very deliberately because this profile does not exist. But that doesn’t mean there is nothing we can do to prepare ourselves to become or identify creative, high potential founders.
I want to outline how hard it is being a founder by exploring some of the things that we try and look for in the best ones. What we’re looking for is an idea of dynamic balance. I don’t mean work life balance (although that is important.) What I mean is, how do they manage two perhaps contradictory, and potentially both “good” responses to key questions, strategy directions, decisions etc.
Ben Horowitz hints at this idea in “The Hard Thing About Hard Things” when referring to CEO stress, saying CEOs make one of the following mistakes: “1. They take things too personally, 2. They do not take things personally enough.” The “right” response is highly context dependent; sometimes taking things personally is the “right” call, sometimes not. What is important here is that the founder has the ‘dynamic balance’ to make the correct judgement.
For me the best way to think about it is using a fulcrum to visualize the competing response. The key founder criteria, with competing responses are shown below:
Let's investigate these a little further.
In 'Make Your Bed', a deceptively simply titled book, former U.S. Navy Admiral William McRaven highlights the benefits of being "unshackled by fear." Founders need this too. But he also outlines how detailed and calculated each move they made was. The dynamic balance here is understanding the fact (more than just a cursory acceptance) that your company will most probably die, and still being unafraid (and excited!)
This one is really interesting. Jim Simons at Renaissance Technologies, arguably the best quant hedge fund in the history of the world, famously does not hire anyone out of Wall Street. He wants physics PhDs straight out of school because he doesn't want them to be 'corrupted' by the 'wrong' ways of making investment decisions.
MIT Professor Andrew Lo has a similar take on this; in his new book Adaptive Markets, he describes the scene of a shark thrashing about on the shore of a beach. Lo does this to illuminate how a creature of such (terrifying) hunting perfection can be reduced to ridicule simply by changing context. Now, this may be obvious, but the point here is that the shark is so hyper-adapted to hunting in the ocean that it is useless on land. I say this to explore the idea of why Hilton didn't make Airbnb. Hilton (or any other global hotel chain) was so hyper-adapted to hunting in its context that it missed (and arguably could never have seen or executed on) this multi-billion dollar opportunity.
So, depending on the market the founder is operating in, we either tolerate (or indeed seek out) creative, contrarian outsiders, or the market will necessitate domain knowledge (like building ML-specific ASICs, for example.)
Is decisive, takes action, makes decisions, fails fast, etc. versus has been deliberate about building the "communication architecture", as Ben Horowitz calls it, around seeking feedback from team members, investors, and advisers on important questions. Again, the idea here is that every situation will be different; what we want to diligence is the founder's awareness of when to tip towards one or the other.
Andy Grove pivoted his huge, public company away from what it had been doing successfully for many, many years to save the entire company. This one, for me, is actually one and the same thing. You need to think of adaptations to support having a thriving, continuing vision; the shark needs to think about whether it will ever need legs.
There is no "right" answer to the dynamic balance on these questions, and they change depending on many factors. But we believe this high level, translatable framework is helpful in learning more about how the founder will run her business in the future.
(also let’s talk! You can reach me at s.mcanearney@gmail.com and stephen@honecap.com)
“Financial markets don’t follow economic laws. Financial markets are a product of human evolution and follow biological laws instead”
- Andrew Lo, Adaptive Markets
In attempting to understand (and exploit) the operation of modern financial markets, academics and investors alike have long found comfort in the reductionism of all-encompassing equations. The problem is that these equations can sometimes be wrong, and when they are wrong, they are destructively wrong. Richard Feynman once quipped “imagine how difficult physics would be if electrons had feelings,” pointing out, with characteristic wit, the inappropriateness of translating physics to human financial markets. In trying to solve this problem there has been a recent shift towards recognizing the role that humans play in these markets (and the irrationality and unpredictability they create) and, by extension, the role that biological laws play in finance. From Herbert Simon’s bounded rationality to Daniel Kahneman’s heuristics and biases, this focus on human biology in financial markets has been a long time coming.
Our focus here will be on how we incorporate (and exploit) these new techniques and tools as an investor in the venture capital market. In this context, we can see that attempting to incorporate the ‘physics’ of machine learning alone will be suboptimal. We need to leverage biological laws in optimizing our investments. We do this by interpreting the venture capital market as a complex adaptive system, drawing on insights from machine learning, theoretical computer science, graph theory, and evolutionary game theory.
Building quantitative tools to support investment decisions is valuable in itself. Alan Turing once said “I believe that the attempt to make a thinking machine will help us greatly in finding out how we think ourselves.” I believe all venture investors, for every decision, invoke an internal ‘model’ that they’ve ‘learned’ over their career through all the companies reviewed, decisions made, successes and failures. In building a model we can learn more about how we make decisions, and how we can improve them. This is based on the problem that humans forget, are biased, and generally make sub-optimal, heuristic-based shortcut decisions. Also, how many sufficiently detailed deals could a human have possibly seen? 2,000? 10,000? And how many ‘features’ do they remember about each of those deals? Psychologist George Miller of Princeton famously found that humans can only hold 7 objects (plus or minus 2) in their working memory.
But a machine learning model does not forget, is not biased (as long as the training data is appropriate) and can evaluate all 30,000+ deals in making a decision. But what happens when the first Blockchain deal is reviewed by the model? What ‘market’ feature do we assign to this new market? Here we see the breakdown of using only machine learning to make decisions; it violates the invariance assumption (from theoretical computer science professor Leslie Valiant), which states that the context in which the generalization (prediction) is to be applied cannot be fundamentally different from that in which it was made.
But in almost every successful case, the entrepreneur is deliberately trying to violate this assumption. ‘We are doing something completely unique’: every entrepreneur is deliberately trying to break the current context and introduce something new, and if it is sufficiently new that it is unrecognizable to a model that has learned over the past 10 years (like blockchain technology), the model is broken.
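A minimal illustration of this failure mode, using scikit-learn's one-hot encoding as a stand-in for whatever feature pipeline a production model would use (the market labels are made up):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Markets the model has 'seen' over its years of training data
train_markets = np.array([["SaaS"], ["Consumer"], ["Healthcare"], ["SaaS"]])
encoder = OneHotEncoder(handle_unknown="error")   # default: refuse unseen categories
encoder.fit(train_markets)

try:
    encoder.transform(np.array([["Blockchain"]]))  # a market that did not exist at training time
except ValueError as err:
    print("Invariance assumption violated:", err)
```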
“I remember the first time I met Edsger Dijkstra. He was noted not only for his pioneering contributions to computer science, but also for having strong opinions and a stinging wit. He asked me what I was working on. Perhaps just to provoke a memorable exchange I said, “AI.” To that he immediately responded, ‘Why don’t you just work on I.’ ”
- Harvard Professor Leslie Valiant, Probably Approximately Correct
So it is that highly complex, non-linear systems must be treated (at the time of writing) with more than just artificial intelligence. We are trying to get to a more complete understanding, and with that goal in mind we introduce elements of biology into the mental model.
The venture capital market lends itself naturally to biology; it is completely driven by human interactions, networks and relationships, it is constantly evolving, and it involves concepts of competition and survival analogous to evolutionary biology. Indeed, with some (all?) companies it is inherently human; Sequoia Capital reportedly analyzes which of the ‘7 Deadly Sins’ the company under question exploits.
Applying X to venture capital:
To be continued.
Note: Kendrick Kho contributed heavily to the modeling in this post
There are a few examples of funds building very large portfolios of early stage companies, mainly YC, 500 Startups, SV Angel and, to some extent, A16Z. The reasons for building a larger portfolio vary between funds (since they have different incentives), but common to all of them is a higher likelihood of having a hyper-successful company in the portfolio. The trade-off, however (assuming the fund dollar amount is constant), is that the ownership per portfolio company will be lower (assuming post-money valuations hold steady as well). So is the higher likelihood of a hyper-successful portfolio company worth the lower ownership per company?
Given the Power Law sensitivity of VC returns (what Nassim Nicholas Taleb famously called the Black Swan effect), to the point where one Airbnb is (currently) worth roughly 30 'regular' unicorns [1], precisely predicting any individual outcome is out of reach. But what we can do is build a higher-level abstraction of what happens with a larger portfolio, shown below.
Effectively what we are doing is moving from the light blue line (high likelihood of low return, but very small likelihood of extremely high return - far right) to that of the dark blue line (high likelihood of 'decent' return but lower likelihood of extremely high return - far right.)
This is because the small portfolio (say 20 - 30 companies) is concentrated, and in some simulations an Airbnb (or equivalent) lands in that concentrated portfolio, leading to extremely high returns. In the larger portfolio, most simulations contain a hyper-success company, but with lower ownership that company doesn't contribute 10x the fund, maybe only 1.5x. It does, however, happen far more often than in the smaller portfolio, since we are now highly diversified.
The above is merely meant to represent the idea of portfolio diversity and concentration. We have now simulated these two scenarios using the Power Law Hybrid outlined in a previous post. The results of a 1,000-trial Monte Carlo simulation are shown below.
Here we see (in a bit more detail) the effect described above. Almost 50% of the time, our smaller portfolio (light blue) loses money (negative IRR). But, 10% of the time we return >30% IRR. Compare this with the larger portfolio (dark blue). Here we see (almost) 0% chance of returning <0% IRR (graph rounds very small numbers to 0%) but also (almost) 0% chance of returning >30%. Herein lies the tradeoff.
The larger portfolio is more reliable in returning an IRR of 15%+, but we miss the chance of returning 30%+. In many cases LPs can deploy capital that reliably generates 15% in other asset classes; they choose early stage venture specifically for the extreme, out-sized returns (despite these occurring only 10% of the time in this simulation).
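As a rough illustration of this tradeoff, the sketch below runs the kind of two-portfolio Monte Carlo described above. It reports fund return multiples rather than IRR, and the per-company distribution is only an illustrative heavy-tailed stand-in for the Power Law Hybrid; the fund size, company counts, and mixture weight are all assumptions, not the parameters behind the charts in this post.

```python
# Monte Carlo comparison of a concentrated vs. a diversified portfolio.
# Per-company return multiples come from an illustrative heavy-tailed
# mixture (lognormal body plus occasional Pareto-tailed outlier).
import numpy as np

rng = np.random.default_rng(0)

def company_multiples(shape):
    """Sample per-company return multiples (exit value / post-money)."""
    body = rng.lognormal(mean=np.log(0.3), sigma=1.0, size=shape)  # most outcomes
    tail = rng.pareto(1.159, size=shape) + 1.0                     # rare outliers
    return np.where(rng.random(shape) < 0.05, tail, body)          # 5% tail weight (assumed)

def fund_roic(n_companies, trials=1000):
    """Fund-level multiple for an equal-check portfolio. With the fund size
    held constant, checks (and ownership) shrink as n_companies grows, so
    fund ROIC is simply the mean company multiple."""
    return company_multiples((trials, n_companies)).mean(axis=1)

small = fund_roic(25)    # concentrated: 20 - 30 companies
large = fund_roic(250)   # diversified

for name, sims in (("small (25)", small), ("large (250)", large)):
    print(f"{name}: P(lose money) = {(sims < 1.0).mean():.0%}, "
          f"P(return > 3x fund) = {(sims > 3.0).mean():.0%}, "
          f"median = {np.median(sims):.2f}x")
```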
So now we can look at estimating how many portfolio companies are required to build a 'diversified' early stage portfolio. The graph below holds the fund $ size constant and varies the number of companies in the portfolio along the x-axis, adjusting the ownership per portfolio company accordingly. The dark blue line is the median expected ROIC of the portfolio (again over a 1,000-trial Monte Carlo simulation) and the shaded blue band represents the 1st and 3rd quartiles. From the graph we can see that at around 250 - 300 companies the benefits of diversification are realized (disclaimer: this simulation is highly stylized and for discussion purposes only).
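A compact extension of the same recipe sweeps the number of portfolio companies and reports the median and interquartile range of fund ROIC, mirroring the shape of the graph described above (again, the return distribution and its parameters are purely illustrative).

```python
# Sweep portfolio size and report median and interquartile range of fund
# ROIC, with the fund $ size held constant and equal checks per company.
import numpy as np

rng = np.random.default_rng(1)

def fund_roic(n_companies, trials=1000):
    shape = (trials, n_companies)
    body = rng.lognormal(mean=np.log(0.3), sigma=1.0, size=shape)
    tail = rng.pareto(1.159, size=shape) + 1.0
    multiples = np.where(rng.random(shape) < 0.05, tail, body)
    return multiples.mean(axis=1)

for n in (25, 50, 100, 200, 300, 500):
    q1, med, q3 = np.percentile(fund_roic(n), [25, 50, 75])
    print(f"{n:>3} companies: Q1 {q1:.2f}x  median {med:.2f}x  Q3 {q3:.2f}x")
```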
[1] http://graphics.wsj.com/billion-dollar-club/
Before building models to sensitize the returns of different portfolio constructions we need a representation of an individual company's return. Empirically we know that roughly 50% - 75% of seed stage companies die and that the hyper-success case is, well, hyper-rare, maybe 1 in 200. Of course any mathematical representation of the likely returns of a seed stage company is steeped in uncertainty. The best we can do is use a distribution that doesn't seem too implausible.
By many people's estimation the Power Law is the most 'usable' and 'accurate' distribution to model seed stage returns. The Power Law is also the basis of our approximation. But we create a mutant hybrid with the Log Normal because in generating and experimenting with various Power Law configurations it seems a little too harsh on the death rate (>75% die in many cases.) The Power Law also seems a little too willing to produce hyper-success cases (1 - 2 in 100.)
In sensitizing and understanding the returns of our portfolio constructions, these two 'errors' (the lower-than-realistic death rate and higher-than-realistic unicorn rate) can be accepted. Indeed, in any fund there must be some 'unfair advantages' that the fund manager has (superior network, proprietary deal flow, superior selection rate, etc.) for the fund to exist in the first place. For Hone Capital, machine learning models support our lower seed stage death rate, and the AngelList network supports the higher-than-normal unicorn rate.
With return measured as a multiple of the original post-money valuation, the Log Normal is run with a mean of 0.3x and a standard deviation of 1.0x. The Power Law is modeled using a coefficient of 1.159. Both are shown below (vertical axis cut for clarity).
With these coefficients we can see that the 'effective' death rate is approximately 40%, another 40% return 1.0x - 3.0x (heavily weighted towards the lower end), and around 1.5% exit at 'unicorn' status. These are clearly just approximations that help us get a sense of the sensitivity of a portfolio construction; they are at best over-engineered and at worst wrong.
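For concreteness, one plausible way to implement such a hybrid is sketched below. The post does not spell out how the Log Normal and Power Law are blended, so the mixture weight, the 'death' and 'unicorn' thresholds, the reading of 0.3x as the median of the Log Normal, and the treatment of 1.159 as a Pareto shape parameter are all assumptions; the rates this sketch prints are indicative only and will not exactly reproduce the figures quoted above.

```python
# One plausible construction of a Power Law Hybrid: a lognormal body
# blended with a Pareto tail. Only the headline parameters (0.3x, 1.0x,
# 1.159) come from the text; everything else is an assumption.
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

lognormal_body = rng.lognormal(mean=np.log(0.3), sigma=1.0, size=N)  # bulk of outcomes
pareto_tail = rng.pareto(1.159, size=N) + 1.0                        # heavy tail, x_min = 1x

tail_weight = 0.10  # assumed mixture weight
multiples = np.where(rng.random(N) < tail_weight, pareto_tail, lognormal_body)

print(f"below 0.25x ('death', threshold assumed):       {(multiples < 0.25).mean():.1%}")
print(f"between 1x and 3x:                              {((multiples >= 1) & (multiples < 3)).mean():.1%}")
print(f"10x or more ('unicorn-ish', threshold assumed): {(multiples >= 10).mean():.1%}")
```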
Investors in early stage private technology companies produce a very rich graph to investigate. The graph is a secondary result of the investment activity (no one focuses on the graph while they invest). In this regard it is more like 'data exhaust'.
But the information is very valuable since it allows you to quantify the 'connectedness' of an individual, or their centrality. There are many different forms of centrality (eigenvector centrality, betweenness centrality etc.) that are appropriate depending on the context of the problem.
In investigating emerging early stage investors, one could look at the 'ego network' of the investor in question at two points in time (after 2 or 3 deals and then again a year later, assuming they've executed more deals in that time). Using network analysis software such as UCINET [1], you can quantify the difference in that investor's centrality. In an environment where access via relationships is important (as in early stage tech investing), this quantification of an investor's increase in centrality could be very valuable as a proxy for their future success.
[1] https://sites.google.com/site/ucinetsoftware/home
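A minimal sketch of this before/after centrality comparison is below, using the open-source networkx library rather than the package referenced in [1]. The investor names and co-investment edges are hypothetical, and betweenness and eigenvector centrality are just two of the many centrality measures mentioned above.

```python
# Compare an emerging investor's centrality in a co-investment graph at
# two points in time. Nodes and edges here are hypothetical.
import networkx as nx

def centrality_of(edges, investor):
    """Build a co-investment graph and return the investor's betweenness
    and eigenvector centrality. (One could also restrict the graph to
    nx.ego_graph(g, investor, radius=2) for a pure ego-network view.)"""
    g = nx.Graph()
    g.add_edges_from(edges)
    return (nx.betweenness_centrality(g)[investor],
            nx.eigenvector_centrality(g, max_iter=1000)[investor])

# Snapshot after the investor's first 2 - 3 deals.
early = [("new_investor", "fund_a"), ("new_investor", "fund_b"),
         ("fund_a", "fund_b"), ("fund_b", "fund_c"), ("fund_c", "fund_d")]

# Snapshot a year later, after more deals broaden the network.
later = early + [("new_investor", "fund_c"), ("new_investor", "fund_e"),
                 ("fund_e", "fund_d"), ("new_investor", "fund_f")]

for label, edges in (("early", early), ("a year later", later)):
    btw, eig = centrality_of(edges, "new_investor")
    print(f"{label}: betweenness = {btw:.3f}, eigenvector = {eig:.3f}")
```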