Big Data and more statisticulation

Illustration by Irving Geis, from How to Lie with Statistics

Today’s New York Times made my hair stand on end. Kenneth Chang’s piece, “Parsing of Data Led to Mixed Messages on Organic Food’s Value,” admirably points out how easy it is to confuse the public. The article raises good questions about research methods and about findings that stand on institutional reputation rather than merit. I wonder: is this just another sign that academic practice needs a bit of a makeover?

The Case in point

Stanford, whose sterling reputation rose even higher with news celebrating its latest Nobel Prize-winning economist, added new fodder to the debate over organic vs. traditional agricultural methods and practices. The respect enjoyed by Stanford University and its published research shuts down public questioning and, personally, reminded me of Darrell Huff’s slim 1954 volume How to Lie with Statistics. Irving Geis captured the same sentiment whimsically in his illustration for that volume.

Huff coined the word “statisticulation” to describe the use of statistical material to misinform people. Chang’s article doesn’t judge. He merely helps innocent readers understand something about meta-analysis, a perfectly acceptable method that produces very useful and cost-effective insights by re-examining other researchers’ data. As is customary in scientific circles, the results and conclusions Stanford published have drawn questions and challenges from other researchers.

The problem is that this kind of dispute and airing works in closed-loop academic circles, where everyone understands and accepts the evaluation process and accepts that knowledge never comes from a single paper. Chang’s article points out that a 2011 study led by Kirsten Brandt, a scientist at Newcastle University in England, was also a meta-analysis yet produced very different results. It did not enjoy the same wide public audience or hit the social media airwaves. Consumers making daily buying decisions are instead left vulnerable to sound bites shared by those who stand to benefit most from the Stanford researchers’ finding that organic doesn’t offer material differences.

On average, Stanford’s findings may be true, but since when do we actually care about the average? The tomato I buy isn’t average, and how, where and when it’s produced changes its quality. For example, this year’s drought didn’t produce the same volume or quality of tomatoes locally as last year. Chances are the nutritional quality differs too. So why would averages across geography and year be meaningful to compare?
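To make the point about averages concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and is not taken from the Stanford or Newcastle studies. The labels "wet year" and "drought year" and the helper `pool_by_method` are hypothetical. It shows two growing methods with identical overall averages even though the year-to-year differences within each method are large.

```python
from statistics import mean, pstdev

# Hypothetical nutrient readings for tomatoes (arbitrary units).
# All values are made up for illustration only.
readings = {
    ("organic", "wet year"): [16.0, 17.5, 18.0],
    ("organic", "drought year"): [11.0, 12.5, 12.0],
    ("conventional", "wet year"): [15.0, 16.0, 17.0],
    ("conventional", "drought year"): [12.0, 13.5, 13.5],
}


def pool_by_method(data):
    """Merge every year's readings into one list per growing method."""
    pooled = {}
    for (method, _year), values in data.items():
        pooled.setdefault(method, []).extend(values)
    return pooled


pooled = pool_by_method(readings)

# The pooled averages match, which is all a headline would report.
for method, values in pooled.items():
    print(f"{method}: mean={mean(values):.2f}, std dev={pstdev(values):.2f}")

# The per-year averages tell a different story.
for (method, year), values in readings.items():
    print(f"{method}, {year}: mean={mean(values):.2f}")
```

In this made-up data the gap between a wet year and a drought year is far larger than any gap between the two methods, which is exactly the kind of detail a single average hides.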

Big Data, for all its ongoing hype, does offer greater opportunity. The capability to look at and analyze more attributes makes it possible to discover more relationships, produce new insights, and increase our understanding of subtle differences in the relationships between things or ideas. In the short term, however, the costs and benefits of this explosion of hype around the promise of big data and analytics may prove lopsided. As the Chang article illustrates, more data doesn’t by itself improve the quality of the analysis. Limited capability to test findings and to engage in the healthy process of scientific challenge and defense may lead to greater manipulation, or statisticulation.

Few people really understand differences in error terms, or appreciate that some reported differences just aren’t as significant as others because the method relied on averages and not the full range of the distribution. If I’ve already lost you, then perhaps you should take a look at Darrell Huff’s book.
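For readers who prefer to see this in numbers, here is a rough Python sketch with invented data. The function name `mean_difference_with_interval` is my own, and the interval uses a simple normal approximation (z = 1.96), which is only a crude guide for samples this small. Both pairs of samples differ by the same amount on average, but only one difference stands clear of the noise.

```python
from math import sqrt
from statistics import mean, stdev


def mean_difference_with_interval(a, b, z=1.96):
    """Return (difference in means, lower bound, upper bound).

    Uses a rough normal approximation; an assumption made for
    illustration, not a substitute for a proper statistical test.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference between two independent sample means.
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return diff, diff - z * se, diff + z * se


# Invented samples: same average gap, very different spread.
tight_a = [10.9, 11.0, 11.1, 11.0, 11.0]
tight_b = [10.4, 10.5, 10.6, 10.5, 10.5]
wide_a = [8.0, 14.0, 11.0, 9.5, 12.5]
wide_b = [7.5, 13.5, 10.5, 9.0, 12.0]

for label, a, b in [("tight", tight_a, tight_b), ("wide", wide_a, wide_b)]:
    diff, low, high = mean_difference_with_interval(a, b)
    verdict = "excludes zero" if low > 0 or high < 0 else "includes zero"
    print(f"{label}: diff={diff:.2f}, interval=({low:.2f}, {high:.2f}), {verdict}")
```

With the tightly clustered samples the interval sits well above zero. With the widely scattered ones it straddles zero, so the same average gap could easily be chance.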

Required reading for any published author and every student is a basic primer in statistics.

 
