The main point of this note is to suggest that an investor should have a higher tolerance for false positive classifications when selecting shares with right-skew and/or fat-tailed returns (potential “moonshots”).
Statistical discussions of hypothesis testing commonly refer to "false positive" (Type I) and "false negative" (Type II) errors. The term “error” is value-neutral in statistical testing, but may be unhelpful in a portfolio selection context, because it might be misread as insinuating some analytical mistake on the part of the investor. To avoid this blameworthy connotation, I will instead use the (slightly) more neutral term “disappointment.”
In portfolio selection I can then seek to avoid ex-ante errors of analysis, whilst recognising that there is an optimal rate of ex-post disappointments. (I can't think of a crisp and widely recognised word pair for the distinction I am stressing here: on the one hand blameworthy ex-ante errors, and on the other hand blameless ex-post disappointments. This seems an interesting linguistic lacuna.)
Portfolio selection as a classification problem
Portfolio selection can be viewed as a classification problem, with “positives” being shares which are added to (or retained in) your portfolio, and “negatives” being shares which are rejected (or sold). Normally in classification problems you seek to minimise the error rate, defined as a weighted sum of false positives and false negatives. The two weights – one for false positives, and one for false negatives – are set according to the cost (the payoff) of each type of disappointment.
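As a rough illustration of the weighted sum just described, here is a minimal Python sketch. The function name, the counts for the two selection rules, and the weights are all hypothetical, chosen only to show how the weights change which rule looks better.

```python
def weighted_error_cost(false_positives, false_negatives, fp_weight, fn_weight):
    """Weighted sum of the two kinds of disappointment."""
    return fp_weight * false_positives + fn_weight * false_negatives


# Hypothetical counts for two candidate selection rules (illustrative only).
loose_rule = {"false_positives": 6, "false_negatives": 1}
strict_rule = {"false_positives": 2, "false_negatives": 4}

# Hypothetical weights: a false negative is assumed to cost five times
# as much as a false positive.
weights = {"fp_weight": 1.0, "fn_weight": 5.0}

loose_cost = weighted_error_cost(**loose_rule, **weights)
strict_cost = weighted_error_cost(**strict_rule, **weights)

print(f"loose rule cost:  {loose_cost}")
print(f"strict rule cost: {strict_cost}")
```

With these assumed weights the loose rule has the lower cost; swapping the two weights reverses the ranking.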
To give an example from medicine: if a disease is potentially fatal but has a reliable and safe treatment, then false negatives are costly, and false positives are benign. Hence we should tolerate a higher rate of false positives (eg pap smear testing for cervical cancer in young women). On the other hand, if a condition is benign and treatment tends to be worse than the disease, converse payoffs would apply and we should tolerate a higher rate of false negatives (eg PSA antigen testing for prostate cancer in elderly men).
In portfolio selection, false positives and false negatives cannot sensibly be identified with just two payoffs. Instead each type of disappointment generates a return distribution, reflecting the underlying share returns.
The underlying shares can be thought of as two classes, with different return distributions – potential moonshots, and mundanes. The graphs below show the return distribution for an individual stock drawn at random from each class.
Potential moonshots are shares with the potential for very high growth: shares which might “go to the moon”. Say Apple in 1997, or ASOS in 2003, or QXL in 2005 (of course examples are easy to identify ex-post!). Potential moonshots are few in number. Most potential moonshots never take off (call these “duds”). Actual moonshots are very rare (call these “hits”).
Mundanes are (unleveraged versions of) housebuilders, manufacturers, engineers – reliable businesses, but not plausible moonshots. Mundanes are plentiful, and easy to recognise. The nature of their operations makes long-term high growth unlikely. Compared to potential moonshots, mundanes produce outcomes which are less extreme, and more evenly distributed between “hits” and “duds”.
Generally, any adjustment to our tolerance of disappointments involves a trade-off between false positives and false negatives: it is not possible to minimise both types of disappointment simultaneously. The optimal trade-off depends on the (dis-)utility of each type of disappointment.
Is portfolio selection more like pap smear testing (false negatives are costly), or PSA antigen testing (false positives are costly)? The answer depends on the type of share: false negatives have much higher (opportunity) costs for potential moonshots than mundanes. For potential moonshots, a false negative means we miss one of the very few Apple-like shares which could make a big difference to our portfolio return. For mundanes, false negatives are not much of a problem, because a single missed mundane won’t make much difference to our portfolio return, and there are always plenty more largely interchangeable mundanes we could include.
This implies we should apply a heavier penalty to false negatives (and therefore necessarily also accept more false positives) when classifying potential moonshots than when classifying mundanes.
The differing optimal strategies for portfolio selection from potential moonshots and mundanes can be illustrated with toy examples, as follows.
Toy example: portfolio selection from potential moonshots
Suppose that the entire class of potential moonshots comprises 100 shares, with these payoffs: 90 out of 100 go bust, and the other ten can be sold for 3x their cost.
Assume we select 10 shares for our portfolio, and make equal investments in each of them (these assumptions are simplifying, but not necessary).
If we select at random, the expected portfolio return is a loss of 70% (the expected end value per unit invested is 0.1 x 3 x 1 + 0.1 x 0 x 9 = 0.3).
But if we can manage to select four true moonshots, the portfolio return becomes a positive 20% (end value 0.1 x 3 x 4 + 0.1 x 0 x 6 = 1.2). With five true moonshots, it’s 50%. With 6, 7, 8, 9 or 10, it’s then 80%, 110%, 140%, 170% or 200%.
So in this (extreme) example, we can have a false positive rate as high as 60% when selecting moonshots, and yet still generate a positive portfolio return.
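The arithmetic of this toy example can be checked with a short Python sketch. The function name and its parameters are my own labels for illustration; the payoffs (3x for a hit, nil for a dud) and the 10-share equal-weight portfolio are those assumed above.

```python
def portfolio_return(n_hits, n_selected=10, hit_multiple=3.0, dud_multiple=0.0):
    """Return on an equally weighted portfolio, as a fraction (0.2 means +20%)."""
    n_duds = n_selected - n_hits
    end_value = (n_hits * hit_multiple + n_duds * dud_multiple) / n_selected
    return end_value - 1.0


# Random selection from a class with 10 hits in 100 shares gives, on average,
# 1 hit in a 10-share portfolio. The return is linear in the number of hits,
# so the expected return under random selection equals portfolio_return(1).
random_return = portfolio_return(1)
print(f"random selection: expected return {random_return:+.0%}")

for hits in range(1, 11):
    false_positive_rate = (10 - hits) / 10
    print(
        f"{hits:2d} hits, false positive rate {false_positive_rate:.0%}: "
        f"return {portfolio_return(hits):+.0%}"
    )
```

Changing hit_multiple or dud_multiple shows how the break-even false positive rate depends on the assumed payoffs.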
Isn’t it better to tighten our criteria, and so reduce the false positive rate below 60%? For example, if our criteria for potential moonshots include forecast sales growth of 30%pa, we could tighten this to 40%pa. Yes, that would probably reduce false positives – but it would also probably increase false negatives – that is, we exclude more true moonshots. Because moonshots are so rare, the combination of reducing false positives and increasing false negatives may produce a portfolio with a lower fraction of moonshots, and hence a lower expected return.
(Technical note: The argument as stated here is implicitly in cross-sectional form, adding contemporaneous raw returns in one period to get portfolio return for that period. But it also applies in longitudinal form: for compound returns, you just add the log returns over time.)
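To make the longitudinal form concrete, here is a minimal Python sketch showing that adding log returns over time is equivalent to compounding. The three period returns are hypothetical numbers chosen purely for illustration.

```python
import math

# Hypothetical gross returns for one holding over three periods
# (1.5 means +50%, 0.8 means -20%); illustrative numbers only.
gross_returns = [1.5, 0.8, 1.2]

# Compounding multiplies the gross returns together.
compound_gross = math.prod(gross_returns)

# Log returns simply add over time.
sum_of_logs = sum(math.log(r) for r in gross_returns)

# Exponentiating the summed log returns recovers the compound gross return.
print(f"compound gross return:       {compound_gross:.4f}")
print(f"exp(sum of log returns):     {math.exp(sum_of_logs):.4f}")
```

One caveat about this sketch: a share which goes bust has a gross return of zero, whose log is undefined, so the log form only works for periods in which the holding keeps some value.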
Toy example: portfolio selection from mundanes
Suppose the entire class of mundanes comprises 500 shares, with these payoffs: 250 give a 20% loss, and 250 give a 30% gain. (Note that realistically, there are 5 times as many mundanes as potential moonshots.)
As before, we select 10 shares for our portfolio, and make equal investments in each of them (these assumptions are simplifying, but not necessary).
If we select at random, the expected portfolio return is 5%.
If we select mundanes with a 60% false positive rate – the same disappointment-tolerant strategy which produced a positive 20% return from the potential moonshots class above – then our expected return is nil (end value 0.1 x 0.8 x 6 + 0.1 x 1.3 x 4 = 1.0). In this case, a 60% false positive rate produces a lower result than chance; the disappointment-tolerant selection strategy which worked for moonshots doesn’t work for mundanes.
To achieve a positive return from mundanes, we need to penalise false positives more heavily. Say we tighten our selection criteria to reduce the false positive rate to 40%. Then our expected end value is 0.1 x 0.8 x 4 + 0.1 x 1.3 x 6 = 1.10, a return of 10%. With false positive rates of 30%, 20%, 10%, and 0%, the expected returns are 15%, 20%, 25%, and 30%.
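The mundanes arithmetic can be checked in the same way. This is a minimal Python sketch; the function name and parameters are my own labels, and the payoffs (a 30% gain or a 20% loss) and the 10-share equal-weight portfolio are those assumed in this toy example.

```python
def mundane_portfolio_return(n_false_positives, n_selected=10,
                             good_multiple=1.3, bad_multiple=0.8):
    """Return on an equally weighted portfolio of mundanes, as a fraction."""
    n_good = n_selected - n_false_positives
    end_value = (n_good * good_multiple
                 + n_false_positives * bad_multiple) / n_selected
    return end_value - 1.0


# Random selection from a class split 250/250 gives, on average, 5 bad picks
# in 10. The return is linear in the number of bad picks, so the expected
# return under random selection equals mundane_portfolio_return(5).
random_return = mundane_portfolio_return(5)
print(f"random selection: expected return {random_return:+.0%}")

for n_bad in (6, 4, 3, 2, 1, 0):
    print(
        f"false positive rate {n_bad / 10:.0%}: "
        f"return {mundane_portfolio_return(n_bad):+.0%}"
    )
```

The break-even point sits at a 60% false positive rate, so a strategy which only matches that rate does worse than picking mundanes at random.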
By tightening our selection criteria, we also probably increase the false negative rate; that is, we reject some good mundanes. But we don’t care much, because no mundane makes a big difference to the portfolio, and there are hundreds more good mundanes to look at.
In advocating raised tolerance for false positives when selecting potential moonshots, I am not saying that we should set out to make careless judgments. We should strive to avoid ex-ante errors of analysis; but we also need to accept that even diligent judgment may lead to a high rate of ex-post disappointments, and we need to be comfortable with this pattern of outcomes.
A problem with advocating higher tolerance for false positives when selecting potential moonshots than when selecting mundanes is that shares are not labelled as belonging to one or other of these categories. The categorisation is itself a matter of judgment. I have no solution to this.
How do we increase the false positive rate for potential moonshots, and reduce it for mundanes? The most obvious way is just to be (a little) more credulous when assessing potential moonshots, and conversely for mundanes. To formalise this, one can use looser requirements for current financial metrics when assessing moonshots.
Another way might be to apply an inclusive checklist for potential moonshots, and a disqualifying checklist for mundanes.
By inclusive checklist I mean that the presence of certain positive features (say a management team with exceptional previous start-up success) guarantees inclusion in the portfolio, largely irrespective of any other concerns. By disqualifying checklist I mean that the presence of certain negative features (say Debt > 3 x EBITDA, or large share sales by insiders) guarantees exclusion from the portfolio, irrespective of any other merits of the company.
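The two kinds of checklist might be sketched in Python as follows. This is only an illustration: the Company fields, the function names, and the two example companies are hypothetical, and the only criteria used are the ones mentioned above (a proven start-up team, Debt > 3 x EBITDA, large insider sales).

```python
from dataclasses import dataclass


@dataclass
class Company:
    # Hypothetical fields, chosen to match the features named in the text.
    name: str
    debt: float
    ebitda: float
    large_insider_sales: bool
    proven_startup_team: bool


def passes_inclusive_checklist(company):
    """For potential moonshots: any one positive feature is enough to include."""
    return company.proven_startup_team


def fails_disqualifying_checklist(company):
    """For mundanes: any one negative feature is enough to exclude."""
    return company.debt > 3 * company.ebitda or company.large_insider_sales


# Two made-up companies, for illustration only.
moonshot_candidate = Company("Example Moonshot", debt=50.0, ebitda=5.0,
                             large_insider_sales=False, proven_startup_team=True)
mundane_candidate = Company("Example Mundane", debt=50.0, ebitda=10.0,
                            large_insider_sales=False, proven_startup_team=False)

# The moonshot candidate is included despite its high debt.
print(passes_inclusive_checklist(moonshot_candidate))
# The mundane candidate is excluded on the debt test alone.
print(fails_disqualifying_checklist(mundane_candidate))
```

The asymmetry is the point of the design: the inclusive checklist ignores weaknesses once a strong positive is present, while the disqualifying checklist ignores strengths once a single negative is present.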
One extant manifestation of my suggested strategy “fatter-tailed and/or right-skew returns => be more tolerant of false positives” is the tech start-up sector. For angel investors in tech start-ups, most investments are duds, but they hope to more than make up for this with a few runaway hits. Peter Thiel (of PayPal / Facebook fame) suggests that to a first approximation, an angel investor will achieve a positive return only if their single best investment ends up being worth more than all the others combined.
Update (21 October 2012): Paul Graham makes much the same point: angel investors in tech start-ups are Black Swan farming.