Book Review of How Data Happened

  

BOOK REVIEW

How Data Happened: A History from the Age of Reason to the Age of Algorithms, by Chris Wiggins & Matthew L. Jones, New York: W.W. Norton & Company, 2023. 384 pages. ISBN: 978-1-324-00673-2

This book promises a comprehensive history of data, along with its technical, political, and ethical impact on individuals and authority. It starts by discussing the importance of understanding data and its role in human society. The timing was apt: on 22 March 2023, a day after the book's worldwide release, more than a thousand technology leaders and researchers, including Elon Musk, urged artificial intelligence labs to halt development of the most advanced systems, warning in an open letter that A.I. tools posed "profound risks to society and humanity" (Vallance, 2023).

The authors' inspiration, rooted in C. P. Snow's "Two Cultures" thesis, spurred a cross-disciplinary Columbia University course, and the book emerged from these diverse perspectives. Professor Matthew Jones examines the societal impact of mathematics, while Chris Wiggins explores algorithmic influence as a data scientist at the New York Times. Originating from a 2015 discussion among undergraduates, this interdisciplinary approach reimagines the evolution of statistics from its political origins to its role in shaping today's digital reality.

The authors stress that understanding history beyond the current news helps readers grasp today's situations and their outcomes. They mention scholars like Shoshana Zuboff and Cathy O'Neil, who highlight how fast and widespread automated systems replicate past inequalities (Ellinger, 2020). They propose a different view of history, focusing on decisions and debates that are often overlooked. The book hints that seeing history this way can empower those who feel overwhelmed by data-driven dominance, giving them a sense of control in today's world. Understanding this context offers insights into contemporary issues and their resolutions.

The book is divided into three sections containing thirteen chapters. In the first section, data is persuasive because it is used to support what is true. In the second section, as data becomes a widely deployed technology, data itself becomes the source of power. In the third section, the authors discuss the future of data power by presenting an analytical framework of power in terms of the state, the institution, and the individual. The authors contemplate one intellectual transition in each chapter. Instead of adopting a linear chronology, they employ a retrospective methodology, highlighting pivotal historical occurrences that have influenced the current state of affairs in politics, professions, and personal lives. For each transition, they discuss how a new technical and scientific capability was developed; who supported, advanced, or funded it; how it was contested; and how it altered the balance of power.

The historical exploration commences in 18th-century Europe, pinpointing 1770 as the year the term "statistics" entered the English lexicon, signifying the integral link between population enumeration and governance. Adolphe Quetelet, a Belgian astronomer initially interested in celestial study, navigated post-revolutionary chaos and adapted the techniques of astronomy to dissect social problems. His adaptation of astronomical methods for societal analysis laid the foundation for new understandings of society's mechanics. Later, in 1889, Francis Galton utilized these techniques to assess Britain's governance potential, pioneering mathematical tools within social physics. The book dissects these innovations, revealing the transformation of data methods, derived from celestial prediction, into tools for social inference. It critiques the presumption that science validates human hierarchy, exemplifying this through historical figures like Quetelet and Galton. The discussion then shifts to Florence Nightingale's statistical prowess, emphasizing her revolutionary data visualization strategies in healthcare management. Moreover, Professor Mahalanobis's statistical contributions shaped Indian statecraft, and the book underscores his critique of caste reification through statistics. The book paints a historical portrait, following influential figures and challenging the presumption of scientific validation in societal structures.

In the second section, the authors describe how data became a form of technology and how much of digital computation was derived from the interpretation of data streams, namely cryptography. Before Bletchley Park, Polish mathematicians had demonstrated to British and French mathematicians how to break codes using special-purpose hardware. Bletchley Park brings to mind Alan Turing, not as the father of theoretical computer science but as a pioneer of data computation, who collaborated with more than 8000 under-represented women staff to design machines that processed real-world data science problems (Smith, 2015). After World War II, the United States government and what came to be known as the military-industrial complex provided substantial funding to IBM and Bell Labs for the development of digital computation (Reed, 2012). As data evolves into a scalable technology, the book shifts to a discussion of how society began recognizing this excessive data power as a guise of state power, which had to be opposed. This discussion focuses primarily on the privacy debates of the 1970s, when individuals were concerned about the state amassing vast quantities of personal information. People were largely unconcerned about companies possessing their data; they were concerned that the state would possess all of their information. The authors open the final chapter of the second section with a quote from Allen Ginsberg's poem Howl: "I saw the best minds of my generation destroyed by madness." The line conveys a sense of the chasm between the extraordinary reach and capacity of these new technologies and the incredibly narrow furrows into which they were being pushed by vested corporate and governmental intelligence. This is reminiscent of the existential angst caused by the mismatch between the wonderful new technologies that data science is currently producing and its most prevalent use, which is to get people to click on advertisements.

In the concluding section, the authors depart from Michel Foucault's and Gordon Gekko's views on power, using the metaphor of an unstable three-player game. Gordon Gekko, from the movie "Wall Street," epitomizes the pursuit of power for personal gain. In contrast, Foucault examines power within societal structures. The authors' analogy, however, draws from William Janeway's book "Doing Capitalism in the Innovation Economy," which discusses three forces propelling technological progress: government-backed initiatives, entrepreneurial endeavors, and speculative markets. Janeway emphasizes their roles in advancing innovation and the economy, shaping the direction of power. To adequately contextualize the discourse, the authors study the sources of power, particularly emphasizing the dynamic interplay between corporate power, governmental power, and people power, where the word "data" has become shorthand for data-driven algorithmic decision-making systems. The first chapter of this section begins with the taxpayer-funded Tuskegee Experiment on the Black male population of Alabama, which compelled the U.S. government to develop a specification for ethics known as the Belmont Report in 1978. This context is helpful for understanding how ethics can be an applied field that constrains people's choices. The administrative procedure of obtaining informed consent, which was based on the three principles of beneficence, justice, and respect for persons, has facilitated the adoption of an empowering approach to conflicts in AI ethics. This approach can be roughly translated into three fairness measures, namely independence, separation, and sufficiency.

The authors then take us into the corporate world of venture capital-fueled investments in technology and advertising-driven internet platforms, where they discuss the effects of the attention economy and explore how our current algorithm-mediated reality operates in such a setting. Advising against standalone analysis, the authors stress the importance of comparing contemporary language models, such as ChatGPT, with predecessors like ELIZA, an AI spoof from 1966. ELIZA functions with fixed rules, whereas ChatGPT predicts using internet data, potentially reproducing copyrighted content without proper attribution. Understanding ChatGPT necessitates historical context. The authors also examine how venture capital enables the rapid growth and innovation of start-ups, often at the expense of ethical and social considerations, and how it creates a feedback loop of data and power among dominant platforms.

The final chapter offers insights into the future nexus of data and power, advocating for collective agency in steering toward a more equitable trajectory. Employing a historical lens, the chapter delineates the evolution of data-driven technologies, emphasizing shifts in intellectual, financial, and power landscapes. It highlights ongoing struggles between corporate, state, and individual powers across legal, ethical, and educational domains. Emphasizing the ethical and political implications, it urges interdisciplinary collaboration for fostering transparent, fair, and accountable systems.

This book provides a compelling and comprehensive journey through the history and societal impact of data, offering critical insights into its evolution and influence on politics and power. It navigates from historical origins to present-day implications, engagingly addressing ethical challenges and envisioning more equitable futures. Suited for a wide audience curious about data's role, including students, researchers, practitioners, and policymakers, it caters to both newcomers and those familiar with data concepts. The book's endnotes and references further support exploration and deeper understanding. However, this book has three major limitations. Firstly, it does not address the growth of operations research during World War II. The authors offset this shortcoming, however, with numerous references to Leo Breiman and his classic paper on the "two cultures" of statistical modeling, which paved the way for data science to join mathematical statistics as a mainstream subject (Raper, 2020). Secondly, the authors discuss developments in the United States, the United Kingdom, parts of Europe, and India, but they do not explore the field of social cybernetics pioneered by Norbert Wiener and its subsequent impact on Chinese society due to the social credit system pioneered by Qian Xuesen (Writer, 2019). Finally, the course taught by the authors includes substantive coding lessons using Google Colab notebooks, which are not provided to the readers of this book. For example, functional engagement using pre-authored Jupyter notebooks frequently results in data categorization from which novel correlational structures can subsequently emerge (Alaimo & Kallinikos, 2020). In contrast, passively reading the book restricts scholars of organization studies to rhetorical or critical perspectives.

The authors urge an introspective analysis for effective data utilization. They seek to elucidate data's scientific and regulatory constraints, power dynamics, political influences, and historical context. They stress that "data," from the Latin "datum," is not inherently "given" but sought, and that it is better understood as "capta": captured rather than given. The book challenges the notion of data as flawless, emphasizing its human-made nature and the biases it embodies in decision-making processes. Refuting the sanctity of given data, the book highlights its subjective and influenced essence.

References

Alaimo, Cristina, & Kallinikos, Jannis. (2020). Managing by Data: Algorithmic Categories and Organizing. Organization Studies, 42(9), 1385–1407.

Ellinger, Eleunthia Wong. (2020). Book Review: Shoshana Zuboff The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. Organization Studies, 41(11), 1577–1584.

Raper, Simon. (2020). Leo Breiman's "two cultures". Significance, 17(1), 34–37.

Reed, Michael I. (2012). Masters of the universe: Power and elites in organization studies. Organization Studies, 33(2), 203–221.

Smith, Christopher. (2015). The hidden history of Bletchley Park: A social and organisational history, 1939–1945. London: Palgrave Macmillan.

Vallance, Chris. (2023, March 31). Elon Musk among experts urging a halt to AI training. BBC. https://www.bbc.com/news/technology-65110030

Writer, Nicholas D. (2019). Artificial Intelligence, China, Russia, and the Global Order: Technological, Political, Global, and Creative Perspectives. Montgomery, AL: Air University Press.

-----

Reviewed by: 

Mayukh Mukhopadhyay

Executive Doctoral Scholar

Indian Institute of Management Indore

-----
