peopleanalyst

Big Data: A Revolution That Will Transform How We Live, Work, and Think

Viktor Mayer-Schönberger, Kenneth Cukier · 2013

In a sentence

Big data—the ability to analyze vast quantities of information rather than samples—is transforming how we understand the world by privileging correlation over causation, scale over exactitude, and prediction over explanation.

Big Data: A Revolution That Will Transform How We Live, Work, and Think argues that we are entering a new era in which the sheer scale of available data changes not just what we know but how we know it. Mayer-Schönberger and Cukier show how using all the data (N=all) instead of samples, accepting messiness instead of demanding precision, and embracing correlation instead of obsessing over causation unlocks enormous new economic and social value—from predicting flu outbreaks and airfare prices to preventing infrastructure failures and saving premature babies. Through vivid examples (Google Flu Trends, Farecast, Amazon recommendations, exploding manholes), the authors explain the mechanics of datafication—rendering ever more aspects of reality into quantifiable, analyzable form—and reveal how data's value increasingly lies in its reuse and option value. But they also confront the dark side: the erosion of privacy, the menace of punishing people for predicted (not actual) behavior, and the danger of a 'dictatorship of data.' The book offers a framework for governance—shifting from individual consent to data-user accountability, safeguarding human agency, and creating a new profession of 'algorithmists.' It is at once an enthusiastic primer, a business strategy guide, and a cautionary meditation on humanity's place amid a quantifiable world.

The four lenses

  • Science
  • Statistics
  • Systems
  • Strategy

The model

The model is a causal framework. Four design levers (using all data, accepting messiness, prioritizing correlation, datafication), together with a big-data mindset, drive two behavioral patterns (data reuse, prediction-based decision-making). These patterns produce outcomes (economic value, predictive accuracy) but also generate risks (privacy erosion, loss of human agency, dictatorship of data), which governance safeguards moderate.

Using All the Data (N=all) · design lever

The practice of analyzing the comprehensive or near-comprehensive dataset relating to a phenomenon rather than relying on a statistical sample, enabling granular insights into subgroups and outliers that sampling cannot reveal.
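
A minimal sketch of why the full dataset matters for small subgroups. Everything here is an illustrative assumption rather than an example from the book: the 10,000 hypothetical records, the seven positions labelled "rare", and the 1-in-100 systematic sample that stands in for conventional sampling.

```python
from collections import Counter

# Hypothetical positions of the rare subgroup among 10,000 records (assumed for illustration).
RARE_POSITIONS = (13, 257, 1_204, 3_333, 5_871, 7_046, 9_998)

def build_records(total=10_000, rare_positions=RARE_POSITIONS):
    """Build hypothetical records; ids listed in rare_positions get the 'rare' label."""
    rare = set(rare_positions)
    return [{"id": i, "group": "rare" if i in rare else "common"} for i in range(total)]

def systematic_sample(records, step=100):
    """Take every step-th record, a simple deterministic stand-in for sampling."""
    return records[::step]

def group_counts(records):
    """Count how many records fall into each group."""
    return Counter(r["group"] for r in records)

records = build_records()
full_counts = group_counts(records)                        # N=all
sample_counts = group_counts(systematic_sample(records))   # 1-in-100 sample

print("full dataset:", dict(full_counts))
print("sample:      ", dict(sample_counts))
```

With these assumed positions the sample contains no rare records at all, while the full dataset keeps all seven. A random sample might catch one or two, but still too few to analyze the subgroup on its own.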

Embracing Messiness (Imprecision Tolerance) · design lever

The willingness to accept inexactitude, inconsistency, and lower-quality data in exchange for far larger volumes, trading micro-level accuracy for macro-level insight and the ability to capture unstructured information.

Prioritizing Correlation Over Causation · design lever

The analytic orientation toward identifying statistical associations and useful proxies (knowing what) rather than insisting on understanding underlying causal mechanisms (knowing why), enabling faster and cheaper insights and predictions.

Datafication · design lever

The process of rendering aspects of the world—location, sentiment, posture, relationships, behaviors—into quantified, tabulated, analyzable data formats, distinct from mere digitization, thereby unlocking latent informational value for new uses.

Big-Data Mindset · psychological state

The cognitive orientation that recognizes latent and option value in data, imagines novel secondary uses, and frees itself from conventional thinking about what is feasible, often held by creative outsiders rather than domain incumbents.

Data Reuse and Recombination · behavioral pattern

The behavioral pattern of applying collected data to multiple purposes beyond its primary use—through basic reuse, merging datasets, extensibility, and capturing data exhaust—thereby releasing data's latent option value.

Prediction-Based Decision-Making · behavioral pattern

The behavioral shift toward augmenting or overruling human judgment with data-driven predictions and correlations, replacing intuition and subject-expertise with statistical models in operational and managerial decisions.

Economic Value Creation · outcome metric

The outcome whereby big data becomes a vital economic input and corporate asset, generating new goods, services, business models, productivity gains, and competitive advantage for those who hold and analyze data effectively.

Predictive Accuracy and Insight · outcome metric

The outcome of improved ability to forecast events, detect anomalies, identify trends, and prevent problems—such as flu spread, equipment failure, infection onset, or fire risk—through large-scale correlational analysis.

Privacy Erosion · outcome metric

The risk outcome whereby the scale and reuse of personal data, combined with the failure of anonymization, notice-and-consent, and opting out, exposes individuals to surveillance and re-identification.

Loss of Human Agency (Predictive Punishment) · outcome metric

The risk outcome whereby big-data predictions of future behavior are used to judge and punish individuals for propensities rather than actions, negating free will, individual responsibility, and the presumption of innocence.

Dictatorship of Data · outcome metric

The risk outcome of fetishizing data and predictions—becoming mindlessly bound by analytic output, collecting data for its own sake, or attributing undeserved truth to figures—leading to misuse and impoverished judgment, as exemplified by McNamara's body counts.

Governance Safeguards · contextual condition

The contextual mechanisms—shifting privacy from consent to data-user accountability, protecting human agency, employing algorithmists, and applying antitrust-style regulation—designed to contain big data's risks while enabling its benefits.

How they connect

  • Using All the Data (N=all) predicts Predictive Accuracy and Insight
  • Embracing Messiness influences Using All the Data (N=all)
  • Prioritizing Correlation Over Causation predicts Prediction-Based Decision-Making
  • Datafication predicts Data Reuse and Recombination
  • Big-Data Mindset moderates Data Reuse and Recombination
  • Data Reuse and Recombination predicts Economic Value Creation
  • Prediction-Based Decision-Making predicts Economic Value Creation
  • Prediction-Based Decision-Making predicts Predictive Accuracy and Insight
  • Data Reuse and Recombination predicts Privacy Erosion
  • Prediction-Based Decision-Making predicts Loss of Human Agency
  • Prediction-Based Decision-Making predicts Dictatorship of Data
  • Governance Safeguards moderate Privacy Erosion
  • Governance Safeguards moderate Loss of Human Agency
  • Governance Safeguards moderate Dictatorship of Data
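
The connections listed above can be read as a small directed graph. The sketch below is one possible encoding, not part of the source model: the snake_case identifiers are shorthand for the constructs, and the helper names (`build_graph`, `downstream`, `moderators_of`) are hypothetical. It treats "predicts" and "influences" as causal edges and keeps "moderates" separate, since a moderator changes the strength of an effect rather than producing it.

```python
from collections import defaultdict

# Edges transcribed from the list above as (source, relation, target).
EDGES = [
    ("use_all_data", "predicts", "predictive_accuracy"),
    ("embrace_messiness", "influences", "use_all_data"),
    ("prioritize_correlation", "predicts", "prediction_decision_making"),
    ("datafication", "predicts", "data_reuse"),
    ("big_data_mindset", "moderates", "data_reuse"),
    ("data_reuse", "predicts", "economic_value_creation"),
    ("prediction_decision_making", "predicts", "economic_value_creation"),
    ("prediction_decision_making", "predicts", "predictive_accuracy"),
    ("data_reuse", "predicts", "privacy_erosion"),
    ("prediction_decision_making", "predicts", "loss_of_human_agency"),
    ("prediction_decision_making", "predicts", "dictatorship_of_data"),
    ("governance_safeguards", "moderates", "privacy_erosion"),
    ("governance_safeguards", "moderates", "loss_of_human_agency"),
    ("governance_safeguards", "moderates", "dictatorship_of_data"),
]

def build_graph(edges):
    """Map each construct to the constructs it predicts or influences (moderators excluded)."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        if relation != "moderates":
            graph[source].append(target)
    return graph

def downstream(graph, start):
    """Return every construct reachable from start by following causal edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def moderators_of(edges, target):
    """Return the constructs that moderate the given target, sorted by name."""
    return sorted(s for s, rel, t in edges if rel == "moderates" and t == target)

graph = build_graph(EDGES)
datafication_effects = downstream(graph, "datafication")
print(sorted(datafication_effects))
print(moderators_of(EDGES, "privacy_erosion"))
```

Tracing `datafication` shows the double-edged structure of the model: the same path through data reuse reaches both economic value creation and privacy erosion, and only the latter has a moderator attached.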

A candidate measure

Big Data: A Revolution That Will Transform How We Live, Work, and Think — derived measurement candidates

Using All the Data (N=all)

ratio of analyzed to total available data; number of subcategories analyzable; frequency of sampling vs. full-dataset analysis

self-report suitability: low

Embracing Messiness (Imprecision Tolerance)

proportion of unstructured data used; stated error-tolerance thresholds; adoption of NoSQL/Hadoop tools

self-report suitability: medium

Prioritizing Correlation Over Causation

share of correlational vs. experimental methods; decision rationales citing 'what' not 'why'; proxy usage frequency

self-report suitability: medium

Datafication

count of datafied domains; volume of quantified records; sensor coverage

self-report suitability: low

Big-Data Mindset

count of new data-reuse ideas; diversity of secondary uses proposed; innovation outputs

self-report suitability: medium

Data Reuse and Recombination

number of distinct uses per dataset; count of dataset merges; revenue attributable to reuse

self-report suitability: medium

Prediction-Based Decision-Making

proportion of decisions model-driven; degree of automation; expert-vs-model override rates

self-report suitability: medium

Economic Value Creation

data-product revenue; productivity differentials (e.g., 6%); market-vs-book value gap

self-report suitability: low

Predictive Accuracy and Insight

prediction hit rate; error margin; lead-time of warning

self-report suitability: none
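
The three candidate measures for predictive accuracy could be computed along the following lines. This is a sketch under stated assumptions: all of the data is invented for illustration, "error margin" is interpreted here as mean absolute error (one of several reasonable readings), and the function names are hypothetical.

```python
from datetime import date

def hit_rate(predicted, actual):
    """Share of binary predictions that matched what actually happened."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def mean_absolute_error(predicted, actual):
    """Average absolute gap between forecast and observed values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mean_lead_time_days(warnings):
    """Average number of days between a warning and the event it anticipated."""
    gaps = [(event - warned).days for warned, event in warnings]
    return sum(gaps) / len(gaps)

# Hypothetical equipment-failure predictions versus what was observed.
failure_predicted = [True, False, True, True, False]
failure_observed = [True, False, False, True, False]

# Hypothetical weekly case forecasts versus observed counts.
forecast_cases = [120, 95, 140, 80]
observed_cases = [110, 100, 150, 85]

# Hypothetical (warning date, event date) pairs.
warnings = [
    (date(2013, 1, 1), date(2013, 1, 8)),
    (date(2013, 2, 1), date(2013, 2, 4)),
]

print(hit_rate(failure_predicted, failure_observed))         # 0.8
print(mean_absolute_error(forecast_cases, observed_cases))   # 7.5
print(mean_lead_time_days(warnings))                         # 5.0
```

All three are computed from logged predictions and observed outcomes, which fits the "none" rating above: nothing here depends on asking people what they think.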

Privacy Erosion

re-identification incident counts; volume of personal data aggregated; perceived privacy-loss surveys

self-report suitability: medium

Loss of Human Agency (Predictive Punishment)

presence of propensity-based punishment policies; prevalence of pre-crime interventions; share of decisions based on predicted vs. actual acts

self-report suitability: low

Dictatorship of Data

instances of metric fetishism; decisions made despite known data flaws; absence of data-quality scrutiny

self-report suitability: low

Governance Safeguards

presence of accountability rules; openness/certification/disprovability requirements; number of algorithmists; differential-privacy adoption

self-report suitability: low

The story

The reader

A curious manager, technologist, policymaker, or citizen who wants to understand and harness the transformative power of big data while navigating its risks.

External problem

Overwhelming volumes of data are reshaping business, science, and society, yet most people lack a framework for understanding what big data is and how to extract its value.

Internal problem

They feel whiplashed by information overload, uncertain whether big data is hype or revolution, and anxious about its threats to privacy and freedom.

Philosophical problem

It is wrong to cling to outdated assumptions of information scarcity, exactitude, and causality when a new paradigm of abundance, messiness, and correlation offers deeper understanding.

The plan

  1. Recognize the three shifts: use more data (N=all), embrace messiness, and favor correlation.
  2. Adopt a big-data mindset: see latent value in data and imagine novel reuses.
  3. Position yourself or your organization within the data value chain (data, skills, or ideas).
  4. Collect data with extensibility and option value in mind.
  5. Establish governance through accountability, protecting human agency, and expert oversight.

Success

  • You extract new value and insight from data others discard or overlook.
  • You make faster, better-informed decisions by letting data speak.
  • You anticipate and prevent problems before they occur.
  • You help build governance that captures big data's benefits while safeguarding privacy and freedom.

At stake

  • You remain trapped in small-data thinking and miss enormous value.
  • Competitors who master data outpace and displace you.
  • Society drifts into a dictatorship of data, eroding privacy, free will, and justice.
  • Predictive systems punish people for what they might do rather than what they have done.