Piling onto so-called expert wine evaluators has become all the rage lately. Remember when the California State Fair commercial wine competition judges got steamrolled (again) by data showing that blind tasting medals are awarded in a random distribution?
So expert wine evaluation is all just donkey-bong bunk, right?
Not so fast, Jerky.
According to data collected over the last several months by VineSleuth, it turns out that when we live by the wine evaluation data sword, we also die by the wine evaluation data sword. VineSleuth’s data show that expert wine evaluators “are able to repeat their observations on individual wine samples about 90% of the time” when tasting wines blind.
Now, where I come from, 90% is a sh*t-ton better performance than can be explained by random chance. It suggests that the blind wine evaluation game isn’t so clearly flawed as some might make it out to be.
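For the numbers nerds, here’s a quick back-of-the-napkin sketch of why 90% repeatability is hard to chalk up to luck. To be clear: the scale size, the repeat count, and the chance-agreement rate below are my own made-up assumptions for illustration, not VineSleuth’s actual (proprietary) setup.

```python
# Toy sanity check -- NOT VineSleuth's analysis. Assumptions: intensity is
# scored on a 5-point scale (so blind guessing matches an earlier score
# ~20% of the time), and an evaluator re-tastes 30 wines blind.
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

n_repeats = 30    # assumed number of blind re-tastings
p_chance = 0.20   # assumed odds of repeating a score by pure luck
k_needed = 27     # 90% of 30

print(prob_at_least(k_needed, n_repeats, p_chance))  # ~2.9e-16: not luck
```

Tweak those assumptions however you like; you have to make the chance-agreement rate absurdly generous before 90% starts looking like a coin flip.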
And before you start manically flailing away at your keyboards typing me flaming e-mails about how the experts chosen for VineSleuth’s analysis must not actually be experts, or that their (patent-pending and proprietary) methodology is somehow flawed, you should know that they ran it with the help of sensory scientists and numerical algorithms researchers/experts, and that they stocked their tasting panels with folks who make their livings tasting wine: winemakers, oenologists, sommeliers, writers… and little ol’ me.
And pretty soon, you’ll be able to test out my work for yourself…
I served as a “core evaluator” for VineSleuth’s upcoming wine app, wine4.me, as part of a process based on scoring wines by intensity. The results went through some pretty rigorous (and proprietary) methods of analysis to ensure that the data tested “clean” and repeatable from a scientific standpoint – statistical analysis that rejected any inconsistent, inaccurate, or imprecise data.
In other words, VineSleuth’s team knows exactly how consistent (or inconsistent) I am at blind wine sensory evaluation, and I’m somewhere in the realm of 90%, which I’ll gladly take [insert golf claps here]. I see that much less as an ego thing and much more as a “if you do something for 10,000 hours, you’ll probably get good at it” thing.
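I can’t show you the real (NDA’d) pipeline, but just to make the idea concrete, here’s a hand-wavy toy of what screening evaluators for repeatability *could* look like. The data, the tolerance, and the cutoff here are all my own illustrative stand-ins.

```python
# Toy repeatability screen -- illustrative only, not VineSleuth's method.
from collections import defaultdict

# (evaluator, wine_id, intensity_score) from blind sessions in which each
# wine was secretly poured more than once
observations = [
    ("taster_1", "wine_A", 4), ("taster_1", "wine_A", 4),
    ("taster_1", "wine_B", 2), ("taster_1", "wine_B", 3),
    ("taster_2", "wine_A", 5), ("taster_2", "wine_A", 1),
]

def repeatability(obs, tolerance=1):
    """Fraction of repeated pours each evaluator scores within `tolerance`."""
    scores = defaultdict(list)
    for evaluator, wine, score in obs:
        scores[(evaluator, wine)].append(score)
    tallies = defaultdict(lambda: [0, 0])  # evaluator -> [consistent, total]
    for (evaluator, _), s in scores.items():
        if len(s) < 2:
            continue  # never re-poured, so nothing to compare
        tallies[evaluator][0] += int(max(s) - min(s) <= tolerance)
        tallies[evaluator][1] += 1
    return {e: hits / total for e, (hits, total) in tallies.items()}

rates = repeatability(observations)
core = {e for e, r in rates.items() if r >= 0.9}  # keep the ~90% crowd
print(rates, core)  # {'taster_1': 1.0, 'taster_2': 0.0} {'taster_1'}
```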
This kind of thing wouldn’t normally float my gloat boat, but it took on extra significance when I was told that I was (understandably, considering the tasters with whom I shared the experience) the dark horse candidate in all of this work. According to my friend and VineSleuth CEO/co-founder Amy Gross, the decision to include me in the process went down something like this (paraphrased version of events):
Super-smart scientists: “Amy, here is the list of evaluators we think that you should use for your project. These are experts who we strongly suspect will be excellent tasters.”
Amy: “Okay, great! I want to add this guy, too – Joe Roberts, from 1WineDude.com”
Super-smart scientists (looking quite concerned): [Awkward pause] “Uhm… okaaaaay… but when he totally blows it and we have to reject **all** of his data, that’s on you.”
As for the VineSleuth data analysis process, there’s not much I’m allowed to tell you about it (due to an NDA), but I think I’m permitted to mention that the scientific rigor with which the tasting sessions were executed, and the strong focus on cleaning the resulting data, set this sort of wine evaluation quite far apart (as in, say, North-Pole-to-Antarctica-distance apart) from any wine competition in which I’ve judged so far. Not that the comps. are somehow incompetent in their execution (far from it, in fact – see my take on the 2013 Critics Challenge as an example of how it’s done right); it’s just that they have an entirely different focus than VineSleuth’s work. That different focus (pinpoint sensory evaluation versus quick quality assessment) might account for why the VineSleuth evaluator results are so strikingly different from those being proffered by wine competition detractors lately.
The main point is that, while the two kinds of evaluation have somewhat different focuses, non-rigorous (in a scientific sense) wine competition data show expert wine tasters to be inconsistent, while quite rigorous and clean scientific data show that wine tasters can be consistent about 90% of the time. So which data set are you gonna go with (I know which one I’m picking)?
VineSleuth is about to release wine4.me, a smartphone app and (eventually) a web application. You can sign up to be a beta tester now at http://wine4.me and put all of my tasting evaluation work to, well, work.
Cheers!
Thanks, Joe, for rocking it as one of our evaluators and not, well, totally blowing it. :)
We're glad to have you on our team and can't wait for the moment when we get to share all of this amazing data with the rest of the world… it won't be long!
Thanks, Amy. I miss the sessions! :-)
Color me skeptical.
The 90% figure flies in the face of prior studies, and so far we only have VineSleuth's assertion of that result (and your support). We lack sufficient info concerning VineSleuth's methodology to ascertain whether it was reliable, valid or trustworthy. So at this point, I think people need to be very wary of such an assertion until much more info is released.
Joe, how many wines did you taste for them?
I also question VineSleuth's repeated assertions on their website that their wine characterizations are "objective." That is a heavy standard, and flies in the face of the neuroscience of tasting. As VineSleuth does not provide an example of a wine profile on their site though, it is difficult to determine what is objective about their profiles.
As they say, when something sounds too good to be true…
Richard – you're always skeptical :-) . But in this case, you should be, for exactly the reasons that you specify: a lack of details. I can't reveal them, due to an NDA. But I'm (obviously) pretty confident that, when those details can be revealed, they'll reinforce my endorsement of the methodology. I'll say this, FWIW: there are several wine people who aren't evaluators but who have been walked through those details (scientifically, not “marketingly” :-) ) under NDAs (some of them are mutual acquaintances of ours), and none of them walk away totally skeptical at the end of it. I guess time will tell. As for the objective part: the eval isn't a quality assessment; sorry I can't say more than that yet, but I am convinced that difference is why the data are consistent (vs. more subjective measurements such as “how good is this wine”).
“flies in the face of prior studies” – not necessarily, this sounds like apples and oranges to me. The prior studies that have received press (some reasonably designed/controlled, some not) have involved things like:
–switching the label or color of the same wine, and finding that people rate it differently, which just tells you that visual stimuli and prior knowledge can override taste receptors.
–wine judges vary considerably in their quality ratings (single point or ordinal score) of the same wine on differing occasions. BTW, these are statistical tests of existing data, rather than sensory analysis.
In contrast (Joe – correct me if I’m wrong here), from what I’ve read, the VineSleuth sensory research focused on consistent identification or quantification of specific flavors or aromas in a wine, rather than rating or identifying the wine. This use of human “sensory meters” is well-established and documented in sensory research (see for example Sensory Evaluation Practices; Stone, Bleibaum & Thomas).
CMMwine – Not sure I am allowed to confirm that last part, but let's just say that you are on the right track :).
What I meant by the "flies in the face" thing is that at a high level the VS data suggests something opposite to what has been suggested in the coverage of those other studies (that expert wine tasters are not consistent). You are probably *technically* correct that in some of those examples we are not comparing apples to apples, but I am pretty sure that the general wine buying public is NOT reading the coverage of those studies that way. To them, the view is probably one big pile-on when it comes to the abilities of expert tasters in any context (and the VS data, to me, contradicts that view).
Yep, that skepticism is the lawyer in me. :)
I understand about NDAs so I'll have to wait until more details are released.
And I'll ask again, how many wines did you taste for them?
Richard – :) Sorry, missed the question on the volume tasted. I'm not sure I'm allowed to share that number yet, but it's… not insignificant. Let's just say a session was basically like a day of judging at a wine competition, in terms of the volume of wines tasted.
1WineDoody,
Isn't scoring wines by intensity sort of like judging music by volume? There's far more to wine than intensity, as you well know. And it seems to me that if you use an app that only reinforces your previous tastes, that's pretty limiting, and not at all the way to learn about wine, or even pairing food and wine.
VineSleuth reads like it's basically Pandora for wine. A pre-programmed crap shoot that might work for the user once in a while, but it's hard to push a "Skip" button after you've purchased and opened a bottle. Another way to trust a wine review created by someone you don't know, no different than any other wine review. It also assumes that new wine lovers are consistent in their tastes, and, worse, that they should be. As with so much else, it treats wine as simply the sum of its parts. Only a simpleton thinks that.
And I've never been bothered by those who criticize wine competition judges, of which I'm one far too often. It's fine, even appropriate, to be skeptical of any results of any wine judging. I don't really think VineSleuth's data validate wine judges anyway, to be honest. Are we consistent when we judge? I hope not. Part of the point of judging with others is to sow a little inconsistency into our old patterns of tasting, open our minds to other people's opinions, and maybe veer from consistency. Consistency is for sauce, not much else.
Hey Ron – I don't disagree about the judging (as you know, and as I've written about at way too much length on these virtual pages already). It is fun, though, to offer a potential counterpoint on the data side to the unfair pile-on that structured wine tasting of any kind has endured lately. As for VineSleuth's app, it's actually a bit different from what you're describing (wish I could reveal more in support of that statement, but need to err on the side of NDA caution for now), and I imagine it will be of most use to those who simply don't care to learn about wine that deeply and just want a decent glass to enjoy from time to time (and there are a crap-ton of those people fueling the wine market; they don't care about wine appreciation in the same way that I don't care to learn about needlepoint appreciation – nothing wrong with deep dives into that subject, I just don't care about it with any passion whatsoever). The VS recommendations are not quality assessments or reviews, and to the best of my knowledge my name won't appear on any particular wine that pops up as a result of someone getting a reco. from their app. So it's not really something for those of us who want that deep dive and revel in the joy of it (and there's no way I'd be smarmy enough to suggest that is the only way to enjoy wine – I'm smarmy, but I'm not *that* smarmy…).
Frankly, I just loved the "donkey-bong bunk" exclamation… still trying to figure that one out. Joe, your poetic waxing and wailing of English prose is reason enough to glance thru…
As for the tasting evals, I have little doubt that certain factors are repeatable and reliable. How to prove it, well, there's the rub, aye? Worthwhile dialogue in any case.
Z
Zoeldar – thanks, that one is a staple here on 1WD. Regarding the data… hopefully VS will open some more of that up to the public in the future (I'm getting the sense that I posted this stuff a bit prematurely). As for your name… it sounds like you once tried to take over the galaxy, is that true??? ;-)
What do you mean "tried "?
Z – my bad, overlord :-)
I'm glad that you elaborated on my article "VineSleuth metrics validate expert tasters," http://www.examiner.com/article/vinesleuth-metrics-valid... Joe. I intentionally drew in wine competitions to fuel the dialog around "the unfair pile-on that has resulted on structured wine tasting of any kind lately." With headlines like the Observer's "Wine Tasting: it's junk science" generating hundreds of comments, most of which are uninformed, it's time someone pointed out that there are studies (CMMwine) and metrics like VineSleuth's that prove otherwise. It's my opinion that they validate our professional activities. Apples and oranges, sure, and absolutely guaranteed to draw fire. I'll let you duke it out with the blogarazzi.
Deb – thanks! I was happy to see you take up the topic. I'm okay with calling out the BS portions of the wine biz (and there are enough to fuel the media frenzy, for sure) but I (obviously) don't see structured tasting as one of those areas. Cheers!
Great post, Joe. At this point in my blog-reading life, I've learned that people believe what they want to believe. That said, I always appreciate some actual research and scientific data. I've done some objective tasting myself, so I know it to be something that is real and repeatable. I'm sure this blog won't stop people from making combative and declarative statements about wine tasting, but I appreciate the voice of reason.
Thanks, Gabe. What do you mean this post won't change the future of wine media?!?? ;-)
LoL, I wish it would
Gabe – yeah, I’m still working on that one…
If anyone would like to see a possible explanation on how wine4me works, check out this article I wrote. It also highlights some of the advantages and disadvantages of wine recommendation, compared to music recommendation.
http://www.cellarswineclub.com/wine/index.php/how…
@TheWineByron – Thanks for the link. Another difference: a bottle of wine costs A LOT more to take a chance on than a $0.99 song download…
That's a good point, although I think that, for casual drinkers, this app might be great for some restaurants that sell by the glass. As the problem of open wine bottles gets solved, more and more of these places will be popping up.
I just wish I could remember the name of that wine bottle opener that leaves the cork in…
Here it is! http://mashable.com/2013/07/31/coravin-wine/
@ TheWineByron – I thought about the same thing while working with them…
I'm pleased to see that VineSleuth's app and methods are getting coverage and prompting good discussion. Like Joe and Deborah Parker Wong, I've also served on the tasting panels and been selected as a core evaluator.
Yes, I felt validated at having been selected to the core group, but only because I've spoken at length with Amy Gross, CEO, and Michael Tompkins, CSO, who developed the methodology to test the tasters' consistency and repeatability. I also know quite a bit about their approach because – full disclosure – I've been working with VineSleuth to develop marketing copy for their website, http://vinesleuth.com, and other channels.
Core evaluators are able to repeat their observations 90%+ of the time. Each wine is tested by a panel of evaluators, and the results are run through statistical analysis to reject inconsistencies. The result is a comprehensive picture of a wine that's more objective than if it were evaluated by any single person. The fact that VineSleuth has *also* tested their tasters' consistency adds robustness to the method.
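To make that a little more concrete (in a deliberately hand-wavy way): the snippet below is my own toy stand-in for "reject inconsistencies, then aggregate," not the actual pipeline, and the spread cutoff and the median step are purely illustrative assumptions on my part.

```python
# Toy panel aggregation -- illustrative only, not VineSleuth's statistics.
from statistics import median

def panel_profile(panel_scores, max_spread=2):
    """Aggregate per-attribute intensity scores across a tasting panel,
    dropping any attribute on which the panel disagrees too widely."""
    profile = {}
    for attribute, scores in panel_scores.items():
        if max(scores) - min(scores) <= max_spread:  # panel is consistent
            profile[attribute] = median(scores)      # robust consensus
    return profile

scores = {
    "fruit":  [4, 4, 5, 4],  # consistent -> kept
    "oak":    [1, 5, 2, 4],  # panel scattered -> rejected
    "tannin": [3, 3, 2, 3],
}
print(panel_profile(scores))  # {'fruit': 4.0, 'tannin': 3.0}
```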
I'm a deeply skeptical person by nature, and like Joe am under NDA, but what I know about the method has convinced me that it's solid. In fact, if I didn't believe in it, I wouldn't be part of the project, and certainly wouldn't be talking about it here.
Thanks, Meg. Will be interesting to see if some of the reactions change if/when more of the methodology is made public. Cheers!