Insight, the counterfactual to what we know


dot dot dot

A friend's belated response to a package I had unexpectedly left for him, in which he apologized for not acknowledging its arrival sooner, ended with:

“More importantly, I am at a loss for words. Truly, I don’t know what to say.”

My return reply was this:

It’s ok,

“it’s possible to talk about something and have the words themselves not be very telling”

associations ..the dog that didn’t bark?
I stole the quote, but it’s applicable
We live in strange times, and life is strange.
Obviously I was feeling a bit spunky… just looking at the word makes me laugh.

to exchanges robbed of words but positive sentiments
you are very welcome.

His response:

“Riddle me this.”

I share this as an example of human-to-human conversation.
There’s no chance that a cheeky chatbot would have written such a response. People are less predictable, which makes their every move less certain, and here is the exciting part: they are capable of learning, for good or for bad.

This was my reply.

riddle suggests you want an answer,
don’t have a personal one,
here’s what I’ve been reading and thinking
One, Roger Schank’s latest blasts, courtesy of a reminder from another cognitive scientist acquaintance I made recently:
(Roger writes some great pieces on this… if you find it as interesting as I do, here’s another series)
Two, the dog that doesn’t bark? Shorthand for a famous turn of phrase by Conan Doyle, ascribed to Sherlock Holmes, who solved a crime by cleverly using the absence of information (see the short story The Adventure of Silver Blaze). He collected the data, framed it in context to get information and then used the counterfactual to obtain the insight. What happened? What did you see or hear? When do dogs not bark? When they recognize someone they know, so obviously it was the trainer….
[note this reference appears in The Big Short too… great movie!]
Three, a comment I noticed by Leda Glyptis, who is now at Sapient… what a great bio!
Extracting value from a wealth of structured and unstructured data, however, is not as much a technical problem, as a business problem. Technical heavy lifting will undeniably be needed to get you from having a ‘data lake’ to being a data-driven organisation but fundamentally: saying ‘there are 10,000 species of snake in the world’ is data; ‘there’s one under your seat’ is information; ‘it is asleep right now, so you can get up and walk away’ is actionable insight.

The whole interview is here:   http://www.femtechleaders.com/europe/leda-glyptis/

My point: make the time to be human; take the time to notice and connect to more of what you know. The payoff? Surprising insights will be yours.

 

My big Data Donut


Two days in a row I managed to catch very different talks about big data, but came away with one big duh and several new insights.  In short, my prior training and experience using analytics to drive strategic decision-making placed me comfortably up the curve.  In return for my limited investment of time and attention, I gained a few new ideas, collected some cogent descriptors to share with clients and reawakened  elements in my strategic thinking process.

Big DATA, just a conjunction

We all know Big because we know small. Everything classifies as one when we decide it’s not the other. Big is also a euphemism for many. Statistically, the bigger the sample, the smaller the sampling error and the greater the significance we can attach to a result. Bigness ensures enough cases to draw general conclusions about a population. Most of the time we don’t care about the population as such, but we do care that a sample represents the population we care about. An “Everyman” should be average and appear at the top of the bell curve, or normal distribution, right? Will being average change the odds of being big or small? Hold that thought.

We recognize data when we see it too. In Excel, big spreadsheets contain many rows and/or many columns of stuff that we call data.

Changes in technology bring more data: we record and keep records of events that previously were not possible to record. More data gets created when instruments simplify its recording over ever smaller intervals. For example, satellites continuously record and transmit atmospheric particle movements, and the metrics measured by Nike’s Fuel band can provide streaming data on people’s location and changing heart rate.

Put Big together with Data, add the ease of access, and you find yourself understanding Big Data as coincident with the cultural shift that its wider access produces.

If you build it they will come

In Big Data’s case, technology shifts made lots of data more accessible, which increased people’s use of it in their decision-making. At this hour, I can hear the helicopters hovering over the major highway junctions nearby to monitor traffic and issue the reports broadcast over radio and TV. Everyone wants to avoid sitting in traffic, and their consumption of this information, along with their decisions about when and which route to drive, naturally impacts the pattern. Widely available GPS and map services rely on alternative information sources to generate traffic congestion maps, and they influence consumer travel decisions as well. Don’t you rely on one or more of these information sources? Why? Few of us know the details behind the projection. Instead, we feel better with more information available; after all, traffic information helps us avoid the inevitable: the likelihood of being stuck and delayed in rush hour.

Bottom line, consumption makes Big Data valuable. Its availability raises questions, but we often skip the critical ones. We ponder its use before questioning its reliability, as in: What do I do with it? How can and should it impact my decisions?

Why?

Humans’ daily actions rely on the process of cause and effect. I turn on the faucet to make water come out. I say “please,” you say “thank you.” How many miles must I run to burn off the fat calories I consumed eating a donut for breakfast? Hmm, can I measure my fat burn rate? If I work for the donut producer, I may focus on the sales effects that result from posting this information.

These sets of reactionary questions miss the opportunity that Subway anticipated and took to the bank. I don’t know the story behind Subway’s marketing strategy and haven’t looked into the chain’s profitability, but they clearly took advantage of a trend fueling both awareness and their revenue. They twisted cause and effect to create a successful cause marketing campaign.

Worry about Bad not Big Data

In the second talk, Casey Winters, the head of digital marketing for a growing web-based start-up called Grub Hub, spoke about the poor decisions being made using vanity metrics. Traffic isn’t a new metric for retailers or commuters. In business, Cost per Acquisition, Lifetime Value and Conversion rates represent a few key performance metrics that, when properly calculated, effectively drive strategic investment decisions.
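To make the contrast between a vanity metric and the three performance metrics concrete, here is a minimal sketch. The formulas are common textbook simplifications, not anything presented in the talk, and the channel names and every number are hypothetical, invented purely for illustration.

```python
def cost_per_acquisition(marketing_spend, new_customers):
    """Dollars spent to win one new customer."""
    return marketing_spend / new_customers


def conversion_rate(conversions, visitors):
    """Share of visitors who actually become customers."""
    return conversions / visitors


def lifetime_value(avg_order_value, orders_per_year, years_retained, gross_margin):
    """A deliberately simple lifetime value: margin earned per customer over time."""
    return avg_order_value * orders_per_year * years_retained * gross_margin


# Hypothetical channels: identical spend, very different traffic.
channels = {
    "banner_ads": {"spend": 5000, "visitors": 50000, "customers": 100},
    "email": {"spend": 5000, "visitors": 8000, "customers": 250},
}

for name, c in channels.items():
    cpa = cost_per_acquisition(c["spend"], c["customers"])
    rate = conversion_rate(c["customers"], c["visitors"])
    print(f"{name}: visitors={c['visitors']}, conversion={rate:.2%}, CPA=${cpa:.2f}")
```

In this made-up case the channel with far more traffic is also the more expensive way to win a customer, which is the sense in which raw traffic can flatter a bad decision.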

The challenge today isn’t their availability as much as their reliability. More sources of information reflect the ease with which some data can be measured. For example, Google Analytics offers basic traffic stats freely to any website that embeds its code. Advertising agencies spent a decade redefining themselves to be digitally capable and to help their clients use these new tools to distribute their marketing dollars across physical and virtual locations. The result: more data, and Data Scientists emerging as guides through the complexity associated with Big Data.

STOP making Data into donuts

More data spread around doesn’t make anyone smarter, especially when not all available measurements of existing data prove trustworthy. Standards help a lot, but they may not sufficiently help separate the noise from the signal. Don’t just use the data that’s available; be sure you understand its creation. Take the case of the glazed donut comparison shown above, which sets the calculated calories for Krispy Kreme’s Famous against Dunkin’s Glazed donut figures. The fact that they appear together in one chart doesn’t mean their calculations used the same computation process. The information on its face leads to one conclusion, which may or may not support your own experience of these donuts. Haven’t you already put that experience to use and attributed the cause of the observed differences to something other than the method of calculation? In short, you used cause and effect, favoring intuition over critical thinking.

When it comes to talking about strategy, we often forget to ask the questions before we pull the data. ROI may justify one investment choice over another, and then again it may merely be used to confirm the value of your investment decisions after the fact. Data should move you from insight to reality. Remember, a dot in one dimension is a line in another; the value of the era of big data is that it increases our opportunity to capture more dimensions. The challenge is to use data to gain more perspective while staying wary of our biases.

Big Data and more statisticulation


by Irving Geis in How to Lie with Statistics

Today’s New York Times made my hair stand on end. Kenneth Chang’s piece, Parsing of Data Led to Mixed Messages on Organic Food’s Value, admirably points out how easy it is to confuse the public. The article raises good questions about research methods and about standing by institutional reputation rather than merit. I wonder: is this just another sign of academic practice needing a bit of a makeover?

The Case in point

Stanford, whose sterling reputation rose higher with news celebrating its latest Nobel Prize-winning economist, added new fodder to the debate over organic vs. traditional agricultural methods and practices. The respect enjoyed by Stanford University and its published research shuts down public questioning and, personally, reminded me of Darrell Huff’s slim 1954 volume How to Lie with Statistics, a sentiment also captured whimsically by Irving Geis in his illustration (shown on the left) in the same volume.

Huff coined the word “statisticulation” to refer to misinforming people by the use of statistical material. Chang’s article doesn’t judge. He merely helps innocent readers understand something about meta-analysis, a perfectly acceptable method that produces very useful and cost-effective insights by re-examining other researchers’ data. As is customary in scientific circles, Stanford’s published results and conclusions raised questions and challenges pressed by other researchers.

The problem is that this kind of dispute and airing works in closed-loop academic circles, where everyone understands and accepts the evaluation process and accepts that knowledge never arrives with only one paper. Chang’s article points out that the 2011 study led by Kirsten Brandt, a scientist at Newcastle University in England, though also a meta-analysis, produced very different results, yet it did not enjoy the same wide public audience or hit the social media airwaves. Consumers making daily buying decisions are instead made vulnerable by sound bites shared by those who stand to benefit the most from the Stanford University researchers’ findings: that organic doesn’t offer material differences.

On average, Stanford’s findings may be true, but since when do we actually care about the average? The tomato I buy isn’t average, and how, where and when it’s produced changes its quality. For example, this year the drought didn’t produce the same volume or quality of tomatoes locally as last year. Chances are the nutritional quality differs too. So why would averages across geography and year prove meaningful to compare?

Big Data and its ongoing hype do offer greater opportunity. The capability to look at and analyze more attributes makes it possible to discover more relationships, produce new insights and increase our understanding of subtle differences in the relationships between things or ideas. In the short term, however, the cost/benefits of this explosion of hype around the promise of big data and analytics may prove disproportionately beneficial. As the Chang article illustrates, more data doesn’t by itself improve the quality of the analysis. Limited capability to test and engage in the healthy scientific defense process may lead to greater manipulation, or statisticulation.

Few people capably and competently understand differences in error terms or appreciate that some reported differences just aren’t as significant as others because the method relied on averages and not the full range of the distribution. If I’ve already lost you, then perhaps you should take a look at Darrell Huff’s book.
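For anyone who would rather see the point than read about it, here is a small sketch of why an average can hide what matters. The two lists are invented numbers, standing in for some nutrient measured in tomatoes from two hypothetical farms; they are not data from the Stanford or Newcastle studies.

```python
from statistics import mean, stdev

# Hypothetical nutrient readings (arbitrary units) for five tomatoes per farm.
farm_a = [48, 49, 50, 51, 52]   # tightly clustered
farm_b = [10, 30, 50, 70, 90]   # widely spread


def summarize(values):
    """Report the average alongside the spread and range it conceals."""
    return {
        "mean": mean(values),
        "stdev": stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


for name, values in (("farm_a", farm_a), ("farm_b", farm_b)):
    s = summarize(values)
    print(f"{name}: mean={s['mean']}, stdev={s['stdev']:.1f}, range={s['min']}-{s['max']}")
```

Both farms share the same mean, so a comparison of averages reports no difference, yet any single tomato from the second farm could be far better or far worse than one from the first.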

Required reading for any published author and every student is a basic primer in statistics.

 

Social Media great for insights not prediction


An example of the share buttons common to many social web pages. Thanks to http://www.nouveller.com for the free icon pack image. The author (Benjamin Reid) releases the image into the public domain, with the following text available at the source page: “You can use them anywhere you like, absolutely anywhere, anything. No attribution, 100% free.”. (Photo credit: Wikipedia)

Is it really surprising that on social media, generally speaking, people share more emotionally linked thoughts?

What People Really Want vs. What They Share on Social Media.

For my money this is not much of an insight.  After all, humans, like many other animals, are social creatures. From birth, our lives depend on others. In time, those who bring us along and introduce us to the ways of the world nurture specific beliefs and frame our understanding of the world.  Our connections to others are vital to our survival, happiness and success.

Social media simplifies our ability to share and connect. The social impulse that compels us to take part naturally mirrors underlying, maybe even unconscious, emotions. The result is a natural association between content and intention rooted in sentiment. Following the tradition of anthropology, or design research, self-reported assertions such as our tweets or Facebook updates can prove revealing. Tracking and tallying these qualitative data crumbs outlines a wider system of associations and makes a wonderful addition to descriptive analysis. Whether linked specifically to more traditional demographic variables or not, they show characteristics and detect relationships about something or someone, but they are not proportional in their representation.

Infographic on how Social Media are being used, and how everything is changed by them. (Photo credit: Wikipedia)

So what’s the problem? Insights don’t scale. The accompanying graphic suggests that there’s added value, and maybe there is for the casual observer, but at the moment I’m not convinced.

Problem Re-framed

Last week, I shared lunch with a group of people familiar with both quantitative and qualitative research methods to talk about big data. Design, or anthropology, research methods focus on observing very small groups of subjects in natural conditions: watching people as they shop, work, make dinner, go to work, etc. The data and analysis skew to the qualitative. Watching what people do has always proved to be a more reliable predictor than asking what they think. Researchers long ago discovered the knowing vs. doing gap.

For the less statistically inclined, probability sampling is necessary but not by itself sufficient to make claims about a larger population group. Exercising diligence in selecting a random sample of people to question or observe can still produce bias or large errors in the results if only input from those who chose to respond or were readily available is included. All surveys include a margin of error due to sampling. National voter exit polls, for example, carefully sample to keep their margin of error for a 95% confidence interval low, e.g. about +/- 3%. (For further information, check out Edison Research on exit polls.) The margin of error on public opinion polls asking what people believe and for whom they plan to vote is wider than that of the post-voting surveys taken at the polls.
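Where does a figure like +/- 3% come from? A rough sketch follows, using the standard textbook formula for the margin of error of a proportion under simple random sampling. That is an assumption on my part: real exit polls use more elaborate designs, so their actual margins will differ, and the function names here are only illustrative.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level; proportion=0.5 is the
    worst case, giving the widest margin. Assumes simple random sampling.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def sample_size_for(margin, proportion=0.5, z=1.96):
    """Smallest simple random sample that achieves the requested margin."""
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


n = sample_size_for(0.03)
print(f"Respondents needed for +/- 3%: {n}")
print(f"Margin with {n} respondents: {margin_of_error(n):.2%}")
print(f"Margin with 4x as many: {margin_of_error(4 * n):.2%}")
```

Note the square root: quadrupling the sample only halves the margin, and no sample size fixes the bias that creeps in when the people who answer differ from the people who don’t.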

Diary studies illustrate the value in subjective research. Sure, the results are challenging to extend and difficult to scale, as the richness of this data does not easily lend itself to classic systems analysis. Often, in the hands of an experienced researcher, the subtle presence or absence of contextual cues leads to new insights, or to a deeper understanding of the situation or present circumstances responsible for a behavior. Researchers isolating the specific cues come closer to understanding our inner nature and then developing insights into cause and effect.

Build it and….

The inspiration implied in the phrase “if you build it they will come” suggests knowledge of what and how to build; this intuition may come from subjective research. Note, the phrase is neither strategic nor predictive of the number or timing of visitors. Contrast anecdotal indicators with an algorithm churning through significant quantities of transactions to find common elements, the correlated information. Observational data offer context, while the algorithm provides the measure of total significance.

If we’ve learned anything from the work of the behavioral economists, it is that humans are predictably irrational. Why? The relative strength of an emotion can, but doesn’t necessarily, overcome reason. Contextual elements trigger specific behaviors as well as unexpected associations and very different behaviors.

We are far from understanding how to successfully integrate the expressed wants that social media provides with analysis of objective, aggregate data.

As Steve Smith of Pegasus Capital Advisors suggests, there is great power in pushing the economic analysis up the value chain. Social media doesn’t create the transaction; its risks center on reputation, which has implications but has yet to disrupt the flow, or more accurately the allocation, of capital.

I’m looking forward to seeing the continuing evolution of social media and the teams of marketing analysts familiar with statistical sampling to help chart a new course. It would be great if they can help lead the charge toward a more robust metric of success: one that favors the quadruple bottom line and thus captures Environmental, Social, Cultural (including governance) and Economic factors.