Channel: Management – SCTR7: Data Science and Analytics

When analytics fails: fueled by randomness


As an analytics professional, it is perhaps no surprise that I believe in a world which is susceptible to analysis – a world which yields to scientific inquiry. Further, I believe that if we invest effort, if we exert what Kahneman would call our System 2, we can improve decisions and overcome our tendency (as highly evolved primates) to rely on convenient cognitive shortcuts, to turn to quick answers.


Fueled by randomness!

However, a recent article chastened me not to leap to conclusions by putting analysis before problem framing, a common omission in decision making. The logic is both challenging and simple: the world is much more random than we are comfortable admitting. Our propensity to rush to frame and explain observations lies at the root of many cognitive biases, one of the most fundamental being falsely conflating correlation with causation (a topic which I raise in my recent TEDx talk: Data Analytics and Our Magical Mind).

The message is that the first question we should ask in problem framing is not “what causes this?”, but rather, “which elements of this phenomenon are purely random versus discretely probabilistic?”

The AEON article which inspired me is entitled:  How to choose? When your reasons are worse than useless, sometimes the most rational choice is a random stab in the dark, by Michael Schulson:   http://aeon.co/magazine/world-views/is-the-most-rational-choice-the-random-one/

Schulson’s article is an enjoyable romp through a world revolving on the wheel of fortune, with us running after it, making up stories to explain its fickle perambulations.

The article starts with the story of an anthropologist perplexed by farmers in Borneo using a seemingly random system of augury involving birds to make crucial planting decisions. The anthropologist desperately wants to find some useful logic in the bird-based planting system.

However, his ultimate conclusion is that it was essentially a random process, but one which nonetheless allowed for positive outcomes. He understands that in the unpredictable jungle setting, randomly scattered plantings were, in aggregate and from a risk management perspective, better than committing everything on the strength of a misguided analysis which puts all the ‘bird eggs’ in one basket.

More than a curiosity, there is a serious basis for this principle on two particular accounts:

  1. Every which way but lose: when we lack cogent information in an environment of high uncertainty, random choices are often better than attempts to manufacture a decision based upon faulty logic, especially when the guess leads to centralization (think of investors putting all of their shares into one industry on a whim); and
  2. Planning ourselves to ruin: in some markets (framed broadly), attempts to impose a structure through the decision itself have sub-optimal outcomes (there is a rich history of disastrous public policy and ecological decisions that bear witness to this principle).
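The Borneo planting logic can be sketched in a few lines of Python. This is a toy model with invented numbers (five plots, each failing independently with an assumed 40% probability): concentrating everything in one ‘best-guess’ plot courts total ruin far more often than a random, even scatter, even though the expected harvest is the same.

```python
import random

random.seed(42)

def ruin_probability(allocation, p_fail=0.4, trials=20_000):
    """Estimate the chance a season yields nothing when each plot
    fails independently with probability p_fail."""
    ruined = 0
    for _ in range(trials):
        harvest = sum(share for share in allocation
                      if random.random() > p_fail)
        if harvest == 0:
            ruined += 1
    return ruined / trials

# All eggs in one basket: a confident guess concentrates the planting.
concentrated = ruin_probability([1.0, 0.0, 0.0, 0.0, 0.0])
# Random even scatter across five plots.
spread = ruin_probability([0.2] * 5)
```

The scatter does not raise the average yield at all; what it does is collapse the chance of catastrophic loss from roughly 40% to around 1%, which is exactly the risk-management sense in which the augury-driven spread ‘works’.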

Beyond this, the field of operations research also recognizes that in some cases purely random processes, or a combination of random and structured processes, may be demonstrably more efficient than structured methods alone. Auctions, queuing disciplines, and matching algorithms, for instance, impose a structured, rule-based approach on interactions.

While there are many markets that can be made more efficient through auctions, there are some markets that are made more efficient through randomization – a lottery approach. All else being equal, the U.S. offers a lottery system for some green cards, ensuring an equitable distribution. Similarly, Schulson suggests some judgments with arbitrary, difficult to distinguish, or ‘soft’ criteria could benefit from the introduction of lottery elements, such as school admissions or figure skating judging.

As Schulson points out, such proposals are controversial as the implication that random processes may be more fair strikes at two common modern assumptions:  “that it is always desirable (and usually possible) to eliminate uncertainty and chance from a situation; and that achievement is perfectly reflective of effort and talent.”  These assumptions, however, often crumble under close observation.

Numerous studies suggest that monkeys throwing darts at the stock listings in a newspaper, in aggregate, pick better portfolios than professional investment managers.

My own experience working with comparative simulations also backs up this principle: some phenomena reach optimality more quickly via random selection processes. In terms of analytics tools, multi-agent simulation and genetic algorithms can be applied to rapidly search for optimal outcomes in complex systems.
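A minimal sketch of that principle (the objective function and its constants are invented for illustration, not taken from the article): given a fixed budget of evaluations, pure random search often lands closer to the optimum of a function that depends strongly on only a few dimensions than an evenly spaced grid does, because the grid wastes most of its resolution on dimensions that barely matter.

```python
import itertools
import random

random.seed(0)

def loss(x):
    """Toy objective: one dimension matters a lot, the rest barely."""
    return (x[0] - 0.73) ** 2 + 0.001 * sum((xi - 0.5) ** 2 for xi in x[1:])

DIMS, POINTS_PER_DIM = 5, 4          # 4**5 = 1024 evaluations either way
grid_axis = [i / (POINTS_PER_DIM - 1) for i in range(POINTS_PER_DIM)]

# Structured search: every combination on an evenly spaced grid.
best_grid = min(loss(x) for x in itertools.product(grid_axis, repeat=DIMS))

# Random search: the same number of uniformly random points.
best_random = min(loss([random.random() for _ in range(DIMS)])
                  for _ in range(POINTS_PER_DIM ** DIMS))
```

The grid can never place a point closer to the important coordinate (0.73) than its fixed spacing allows, while the random samples scatter 1024 distinct values along that axis, so random search typically finds a markedly lower loss here.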

Similarly, in complex markets, there is a growing realization that centralized, rule-based regulatory restrictions can often cause outsized dysfunctions. Whereas the U.S. mortgage meltdown has discredited the notion that deregulation achieves optimality, there is a third way: re-architecting incentives in markets. In some cases, it is better to re-align the incentives of individuals and to allow their random interactions to proceed, rather than to impose a centralized, structured set of rules and regulations.

An example that would otherwise be humorous, were it not the daily bane of travellers (and a vast value-wasting sinkhole), is the apparent reality that boarding flights at the airport would actually be more efficient if passengers were boarded, en masse, at random, rather than in row-based groups (although a yet more efficient structured method can be demonstrated): http://www.economist.com/blogs/gulliver/2011/09/boarding-planes-efficiently . All other things being equal, airlines would simply be more profitable if they at least boarded at random (Southwest Airlines essentially proves this point).

Schulson’s article continues on an interesting tack by raising the idea of randomly selecting public officials for public administration and the legislature – the principle of sortition. The nine archons of ancient Athens are cited as an example. We could also point to the U.S. jury system as a small form of decision making via a random element – the pseudo-random jury selection process.

The article brought to mind Nassim Nicholas Taleb’s book, “Fooled by Randomness” (http://en.wikipedia.org/wiki/Fooled_by_Randomness). Taleb takes aim at the human propensity for seeing causation and for constantly attempting to attach an explanation to phenomena. He also punctures the notion of success being the direct result of talent and effort, especially in the world of high finance. In case after case, he demonstrates a world which is much more ruled by random, fickle processes than we would otherwise like to admit.

In short, we become quickly uncomfortable with the notion that we have less control than we assume. We become edgy and disconsolate at the proposition that we are subject to the whims of large, lumbering random events. Demographic change, economic fluctuations, natural disasters, political events, business mergers – the world is full of gyrations which are largely outside our control.

Still, we strive, indeed we are driven, to explain and to attribute causation, often out of a desire for comfort and a false sense of control. Kahneman reflects upon this propensity in “Thinking, Fast and Slow” (http://en.wikipedia.org/wiki/Thinking_fast_and_slow), suggesting that we evolved inbuilt mental mechanisms which lead us to jump to causal assumptions and to frame explanations based on biases.

His argument is that on the primordial savanna even if a rustle in the grass was a tiger only once in a thousand cases, the lethal accretion of those unlucky individuals who improperly ignored the unlikely signs of danger was enough, over millennia, to evolve highly charged ‘meaning antennas’.  A detailed discussion on this evolutionary perspective is available here: http://sctr7.com/2014/05/21/our-magical-minds-redux/ .

The problem, then, in modern life is that we are deluged with signs, symbols, and warnings, many of which are random and meaningless noise, but for which we nonetheless seek explanations. We are driven to connect phenomena to causes – we are all natural scientists, if you will, but often lack the requisite scientific training to distinguish signals from noise.

This is healthy chastisement for analytics professionals as a whole, myself included: applying an arsenal of analytics tools alerts us to a slew of correlated phenomena in any typical large data set. However, which of these are phantom patterns? Which reflect latent variables that we misconstrue as causal?
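The phantom-pattern problem is easy to demonstrate with pure noise (a hypothetical illustration, not real data): generate a data set with no true relationships at all and simply count how many column pairs nonetheless correlate ‘impressively’.

```python
import random

random.seed(1)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

# 40 columns of pure Gaussian noise, 100 observations each:
# by construction there are no real relationships here.
cols = [[random.gauss(0, 1) for _ in range(100)] for _ in range(40)]

# Count column pairs whose correlation looks 'interesting'.
phantom = sum(1 for i in range(40) for j in range(i + 1, 40)
              if abs(pearson_r(cols[i], cols[j])) > 0.2)
```

With 780 pairs and roughly a 5% chance of |r| > 0.2 arising by luck alone at this sample size, dozens of ‘patterns’ surface in data that contains none – the multiple-comparisons trap in miniature.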

Perhaps a good adage for the data scientist to pin to their mirror when they brush their teeth each morning is: “Good morning! The world is frighteningly, fundamentally random. Always question the apparent TRUTH as potential CHANCE… Have an uncertain day!”

Upon consideration, however, it struck me that analytics can at least assist us in identifying when we are in these types of scenarios: those which are purely random or subject to lottery-optimal solutions. Also, analytics can assist us in quantifying, categorizing, and potentially containing uncertainty in terms of probabilities. This is otherwise the fast-moving intersection of risk management and analytics in a nutshell.

Having worked with complex simulations, I am particularly enthusiastic about Monte Carlo simulation: by aggregating the many probabilities which together contribute to a larger case, it is possible to get a macro-understanding of a phenomenon and to conduct scenario analysis.
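As an illustrative sketch (all cost figures and distributions below are invented), a Monte Carlo model might aggregate a handful of uncertain components into a distribution for a project’s total cost, from which a macro-level question such as “what is the chance we blow the budget?” falls out directly:

```python
import random

random.seed(7)

def total_cost():
    """One simulated scenario: sum independently uncertain components.
    All figures are hypothetical, for illustration only."""
    design = random.triangular(80, 140, 100)    # low, high, mode
    build = random.triangular(300, 600, 400)
    testing = random.gauss(120, 25)
    overrun_event = 150 if random.random() < 0.10 else 0  # rare shock
    return design + build + testing + overrun_event

# Aggregate many micro-probabilities into a macro-level distribution.
trials = [total_cost() for _ in range(50_000)]
mean_cost = sum(trials) / len(trials)
p_over_budget = sum(t > 800 for t in trials) / len(trials)
```

The same machinery supports scenario analysis: swap in a different distribution for any component (or change the shock probability) and re-run to see how the tail risk moves.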

Once a macro-phenomenon has been more clearly understood in terms of its contributing micro-probabilities, controls can be put in place and scenarios can be explored through sensitivity analysis. Risks can be insured, mitigated, controlled, or offloaded. This is otherwise the domain of risk analytics, which finds particularly elegant methods in the world of structured finance and project finance.
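A minimal sketch of one-at-a-time sensitivity analysis on a toy cost model (all names and figures are hypothetical): swing each input between its low and high bounds while holding the others at their base values, then rank inputs by the resulting swing in the output to see which risks are worth controlling first.

```python
def delay_cost(rate, downtime_hours, penalty):
    """Toy cost model: hourly cost plus a contractual penalty that
    only bites past 24 hours. Illustrative figures only."""
    return rate * downtime_hours + penalty * (downtime_hours > 24)

base = {"rate": 500.0, "downtime_hours": 20.0, "penalty": 10_000.0}
ranges = {"rate": (300.0, 900.0),
          "downtime_hours": (8.0, 40.0),
          "penalty": (5_000.0, 20_000.0)}

# Swing one input at a time between its bounds, others held at base.
swings = {}
for name, (lo, hi) in ranges.items():
    low_case = dict(base, **{name: lo})
    high_case = dict(base, **{name: hi})
    swings[name] = abs(delay_cost(**high_case) - delay_cost(**low_case))

# The largest swing flags the input most worth measuring or controlling.
most_sensitive = max(swings, key=swings.get)
```

Note how the penalty’s swing is zero at the base downtime: sensitivity is local to the assumed operating point, which is why such results are usually re-checked under alternative base cases.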

My take-away from the article was a healthy warning to re-examine assumptions concerning causation and predictability. Many seeming patterns are random when examined over the long term, or reversion to the mean dominates over misleading short-term trends. Also, the article revived my conviction that randomness does not mean that we cannot improve a situation: by breaking down a phenomenon into more fundamental elements, we can better distinguish the components which are purely probabilistic from those which have patterned and predictable aspects. As such, there is hope in terms of being able to segment and control risks. The key is to keep the possibility of randomness in mind when attempting to explain phenomena.


