The Wisdom of the Crowd and Wine Critic Ratings

I must admit, when I first saw the announcement that the American Association of Wine Economists (AAWE) existed, I assumed it was just some economists having a good time and looking for an excuse to drink some more wine.

But then the papers started to be published, and now the AAWE has made it clear that they’re quite serious in trying to apply the dark arts of statistics and economics to the world of wine — a world that can increasingly be quantified and examined thanks to rafts of data available online.

The association's publications are often met with controversy or criticism, and that's putting it mildly. I've heard one wine writer call their work utterly fraudulent, and I've read many a lambasting blog post criticizing the mathematics behind it.

I didn't pay nearly enough attention in my college statistics class to be able to judge the quality of the work in the 91 papers available through the association, but I can say that I find what this group of academic wine lovers is doing quite fascinating.

Their latest paper is a case in point. Entitled “The Buyer’s Dilemma – Whose Rating Should a Wine Drinker Pay Attention to?” (232k PDF), this paper looked at the relationships between the scores from major wine critics and the scores found on CellarTracker for around 100 Bordeaux wines.

As with most of their papers, I have a bit of a hard time decoding the numerical voodoo. Things like "We run a two sample t-test (with unequal variances) to check if the 1.6 points difference between community and RP [Robert Parker scores] is statistically significant. The t-statistics is 4.58 and the critical t-value is 1.97; therefore we reject the null hypothesis that there is no difference between community and RP average scores" make my head spin a bit.
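For the curious, the test the authors quote can be sketched in a few lines of Python. This is not the paper's code, and the scores below are invented for illustration; only the 1.6-point gap and the 1.97 critical value come from the quote above.

```python
# A minimal sketch of the Welch (unequal-variances) two-sample t-test the
# authors describe. Not the paper's code; these scores are invented so that
# the community averages 1.6 points below the critic.
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t-statistic: mean difference scaled by the combined standard error."""
    na, nb = len(sample_a), len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / sqrt(
        variance(sample_a) / na + variance(sample_b) / nb
    )

community = [88, 89, 90, 91, 88, 90, 89, 87, 90, 89]  # hypothetical CellarTracker scores
critic    = [90, 91, 92, 91, 90, 92, 91, 89, 91, 90]  # hypothetical critic scores

t = welch_t(community, critic)
# If |t| exceeds the critical value (1.97 in the paper), reject the null
# hypothesis that the two average scores are equal.
print(abs(t) > 1.97)  # → True
```

The "unequal variances" wrinkle simply means the two groups aren't assumed to be equally spread out, which matters when comparing thousands of amateur scores against one critic's.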

But here’s essentially what these folks are claiming that the data support:

1. There's "significant" variance among the scores of major critics (Parker, Tanzer, and the Spectator) on the same wine.
2. There's "significant" disagreement between community scores on CellarTracker and Robert Parker's scores in particular (but also Wine Spectator's).
3. There's considerably more "agreement" between Stephen Tanzer's ratings and those of the CellarTracker community than there is for the other critics.
4. There is greater correlation between the community score and the price of the wine than between the critics' scores and the price of the wine.

Of these findings, I see the third as the most interesting, simply as a fact.

I do have some questions about the findings, which perhaps readers with better math skills than mine can answer.

In particular, I'm not sure that what the world of statistics considers a "significant" difference actually translates into the wine world. There may be a significant difference (in pure mathematical terms) between one critic's rating of 92.5 and another's 96 for the same wine, but really, in every sense that matters to the wine world, those two critics agree on its quality.

I was glad to see the paper's authors offer a hypothesis that the last point above may be skewed by psychology: namely, that consumers tend to rate more expensive wines higher simply because they (subconsciously) believe more expensive wines ought to be better. This sounds quite plausible to me.

The authors also offer another interesting hypothesis, which I find less likely: that the divergence between CellarTracker scores and Robert Parker's scores is somewhat deliberate, a backlash against Parker in which consumers "resent Robert Parker's influence – or shall we say hegemony over the wine community – and systematically challenge his ratings by either giving higher scores to the wines with low RP ratings and lower scores to the wines with high RP ratings."

I just find it hard to believe that consumers, at the moment they rate a wine, are aware enough of its specific Parker score for this sort of behavior to occur on any broad scale.

One thing the paper's authors don't devote much attention to, which I think is at least as interesting as all their other findings, is that in general the broader wine-drinking population doesn't think wines are as good as the major critics do. If I am understanding the data correctly, in every case the community rating was below all of the critics' ratings.

This is somewhat surprising, given claims by some commentators in the wine industry that most people can't tell the difference between an $18 bottle of wine and a $90 bottle. If that were true, you would expect the scores the broader population assigns to skew toward either end of the spectrum in a kind of "yum/yuck" or "love it / hate it" volatility.

Of course, CellarTracker users are not necessarily representative of the broader population, a point the paper's authors seem to acknowledge, though it isn't clear just how far they go in addressing this bias. Then again, CellarTracker is really the only broad and deep set of consumer wine-evaluation data that is publicly available, so one can hardly fault them for using it.

In any case, the paper is worth a read, and it’s pretty easy to skim the technical parts. Take a gander and then tell me what you think.