What does (and doesn’t) the data tell us about differences between charities?
On our comparing charities page, we discuss one of our core claims, echoed by many others in the effective giving ecosystem: many of us can easily 100x our impact by donating to highly impactful charities. We cover why differences in impact exist and which factors contribute, and we demonstrate the vastness of these differences through a couple of real-life case studies. But we wouldn’t be doing justice to our effective giving values without discussing some of the data that is often used to support claims about the differences between charities. We chose to leave that discussion off the main page to avoid overwhelming our readers, so let’s dive into it here.
One of the most frequently cited data sets on how much different programs (or “interventions”) vary in the good they do per dollar, or “cost-effectiveness,” is the Disease Control Priorities in Developing Countries 2006 report (DCP2), a project of The World Bank, the National Institutes of Health, the World Health Organization, and the Population Reference Bureau. This data compared the cost-effectiveness of different interventions in global health. As discussed by Toby Ord in his foundational essay The Moral Imperative Towards Cost-Effectiveness, it found huge variance between the best and worst global health interventions, and a still quite sizable difference (Ord suggests around 60x) between the median and the best. It also suggested that the best interventions were so much better that, as Ord writes, “if we funded all of these interventions equally, 80% of the benefits would be produced by the top 20% of the interventions.”
While we think the patterns in the DCP2 data demonstrate a key effective giving principle (that there are substantial impact differences between the programs a charity may operate), we don’t rely on this data to demonstrate differences between charities. There are two reasons for this:
1. The data describes the average cost-effectiveness of interventions, not the marginal cost-effectiveness of donating to a particular charity.
2. There are questions about the reliability of the data itself.
On the first point:
Average vs. marginal
We think that when you make a charitable donation, the relevant question is “How much good will my donation do?” (cost-effectiveness on the margin) rather than “How much good does the average donation do?” (average cost-effectiveness). For example, suppose a charity is doing cutting-edge research on vaccine development, but is already fully staffed and resourced, with no need for additional funding. It may be highly cost-effective on average, but not on the margin.
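To make the distinction concrete, here’s a minimal sketch in Python. The benefit curve and all the numbers are hypothetical assumptions for illustration, not data about any real charity:

```python
# A hypothetical charity whose benefit curve flattens as funding grows:
# early dollars buy a lot of good, later dollars buy much less.
import numpy as np

def total_good(dollars):
    # Illustrative diminishing-returns curve (assumed, not real data):
    # total benefit grows with the square root of total funding.
    return 100 * np.sqrt(dollars)

funding = 1_000_000  # the charity's current funding level (assumed)
donation = 10_000    # a prospective new donation

average_ce = total_good(funding) / funding
marginal_ce = (total_good(funding + donation) - total_good(funding)) / donation

print(f"Average cost-effectiveness:  {average_ce:.4f} units of good per $")
print(f"Marginal cost-effectiveness: {marginal_ce:.4f} units of good per $")
```

With this curve, the average dollar produced 0.1 units of good, but the next $10,000 produces only about half that per dollar. Average figures can therefore overstate what a new donation will actually accomplish.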
Interventions vs. charities
We think it’s important not to conflate data on the cost-effectiveness of an intervention with data on the cost-effectiveness of a charity. Differences in the cost-effectiveness of interventions do suggest that the cost-effectiveness of charities will vary based on which programs (interventions) they facilitate, but other factors are at play as well. For example, supporting a charity that facilitates a particularly cost-effective intervention will likely still cost more per unit of good than the DCP2 data suggests, since that data won’t necessarily include all the associated costs a charity might incur.
On the second point:
Reliability
In 2011, the charity evaluator GiveWell found some concerning errors when examining the DCP2’s figures for deworming, and is now hesitant to put any weight on DCP2 cost-effectiveness estimates unless its research team can fully verify the calculations. Importantly, GiveWell only examined one intervention of many, so it isn’t clear whether there are additional errors in the data set; however, given the extent of the errors, and the fact that they weren’t caught prior to publication, there is good reason to question the reliability of this data source. As such, there is some doubt over how heavily donors should weigh the DCP2 conclusions: on the one hand, this was a project of over 350 experts and very credible agencies; on the other, GiveWell found the DCP2’s cost-effectiveness figure for deworming to be off by a factor of about 100.
Giving What We Can still displays the DCP2 data in some places on our site, because we think the “heavy tailed” pattern (which demonstrates the very large extent to which interventions vary in efficacy) is still accurate, even assuming some errors in the exact cost-effectiveness calculations. In other words, we think it’s fair to say that even if there were more errors in this data set, it’s highly likely it would still be heavy-tailed (given the very large spread we are starting with). That said, we don’t use the DCP2 to make claims about the exact cost-effectiveness figures of different interventions (or the exact amount interventions differ in impact) as we think GiveWell’s deep dive into the data casts doubt on those numbers. We remain confident, however, that the overall pattern shown by the data — that interventions vary greatly in their cost-effectiveness — holds up.
One of the reasons we are confident about this is that the 80/20 pattern (whereby the top 20% of interventions are so much better that implementing just those would do about 80% as much good as implementing all of them) is repeated in more recent data sets from a variety of sources. Indeed, when 80,000 Hours’ Benjamin Todd “checked” Ord’s foundational paper on the DCP2 against a collection of “all the studies [they] could find,” he concluded that “the 80/20 pattern basically holds up,” and found the top 2.5% of interventions in the data sets he examined to be around 20-200 times more cost-effective than the median intervention (one in the middle of the range) and 8-20 times more cost-effective than the mean (the effectiveness you’d expect from picking an intervention at random).
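To give a feel for how one heavy-tailed distribution can produce all of these statistics at once, here’s a small Python sketch. The lognormal shape and its parameters are our own assumptions for illustration; they are not fitted to the DCP2 or to Todd’s data, so the exact numbers it prints won’t match his:

```python
# Simulate a heavy-tailed (lognormal) spread of intervention
# cost-effectiveness and compute the kinds of summary statistics
# discussed above. Distribution parameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
cost_effectiveness = rng.lognormal(mean=0.0, sigma=2.0, size=100_000)

sorted_ce = np.sort(cost_effectiveness)[::-1]  # best interventions first

# Share of total benefit produced by the top 20% of interventions,
# assuming each intervention received equal funding.
top_20_share = sorted_ce[: len(sorted_ce) // 5].sum() / sorted_ce.sum()

# How an intervention at the top-2.5% cutoff compares to the median
# intervention and to the mean of the whole distribution.
top_2_5 = np.percentile(cost_effectiveness, 97.5)
ratio_to_median = top_2_5 / np.median(cost_effectiveness)
ratio_to_mean = top_2_5 / cost_effectiveness.mean()

print(f"Top 20% produce {top_20_share:.0%} of the total benefit")
print(f"Top 2.5% cutoff vs median: {ratio_to_median:.0f}x")
print(f"Top 2.5% cutoff vs mean:   {ratio_to_mean:.0f}x")
```

Note that the best interventions look far more exceptional relative to the median than relative to the mean: in a heavy-tailed distribution, the mean is itself pulled upward by the top performers. That’s why Todd reports two quite different multipliers.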
So the data sets we have do support vast differences in the cost-effectiveness of different interventions, which means depending on which programs a charity implements, there will be vast differences in its overall impact.
In the examination referenced above, Todd also argues that the true impact difference might be lower than what we see in the data, because of the possibility of data errors like those GiveWell encountered in the DCP2, regression to the mean, the real-world availability of the interventions examined, and positive secondary effects of some interventions. However, even taking these considerations into account, he still believes that the most effective interventions in an area are at least 3-10 times more cost-effective than the mean (“where the mean is the expected effectiveness you’d get from picking randomly”).
In case there is any confusion about why Todd’s numbers (3-10x) are lower than our estimate of “easily doing 100x more,” we want to clarify that we are making slightly different claims. Todd is saying that if a donor previously picked an intervention at random, they’d be able to 3-10x their impact by switching to an intervention in the top 2.5% for that area. We’re saying that, depending on one’s starting point, we can easily 100x our impact by following some key principles when choosing where to donate. We also suspect that the starting point of most donors is more likely to be closer to the median of interventions (a “typical” intervention) than to the mean (choosing randomly). Perhaps more importantly, Todd’s estimates are confined to differences between interventions working in the same area, while ours are not. Because the gains from choosing a more pressing problem multiply with the gains from choosing a better solution within it, Todd estimates the difference in effectiveness between interventions in different cause areas to be considerably higher, stating that if one were to “both find a pressing problem and an effective solution...the combined spread in effectiveness could be 1,000 or even 10,000 fold.”
That said, as GiveWell points out, there are limitations to cost-effectiveness calculations. When GiveWell analyses its top charities, its researchers take into account not only cost-effectiveness but also factors like confidence in the organisation (such as whether it has a proven track record) and how much evidence there is that a specific intervention works well.
We hope you enjoyed this deep dive into the data! If you have thoughts or feedback on this discussion, please let us know.