Rarely in the history of science is it possible to locate precisely when and where a particular discipline arose for the first time. Just as medieval alchemy morphed into modern chemistry over centuries, most fields seem to have evolved and transformed very gradually. Not so data science. We can say with certainty that data science emerged, in both name and practice, in the early 1960s in the United States in the heart of the military-industrial complex. To a surprisingly high degree, what was called data science then matches the definition of the term in 2008, the year Facebook started using “data scientist” as a job category. This is no accident—the material and social conditions that occasioned the rise of data science in the post-war era are the same conditions that gave birth to the social media giants of Silicon Valley. At one time these conditions were confined to the military and scientific sectors; now they have become the foundation for the economy and society as a whole.
This thesis will come as a surprise to many. We tend to think the field, or at least the buzzword and hype, was created in Silicon Valley in the last fifteen years or so, around the time Harvard Business Review anointed data scientist “the sexiest job of the 21st century.” Those who have done a little digging may push that date back to the mid-1990s, when a small group of statisticians in the US attempted to expand their field to encompass data science. Some cite Peter Naur’s suggestion to rename computer science as data science in the 1970s. Still others claim that John Tukey effectively founded the field in 1962 or thereabouts with his essay on data analysis. These events are fragments of a more complete history that needs to be told, not only to set the record straight, but also to recognize the world that connects this history and the present era. We are living in a world where it was necessary to invent data science, and the same logic that drove its invention in the 1960s is driving its flourishing today.
One reason we can be so precise about its origins is that data science is not a science in the traditional sense, a fact often cited by critics of the field. It does not have a clear object in the world whose laws are to be revealed by applying the scientific method. It is not easily divided into pure and applied branches. Instead, the name data science was invented to designate a unique combination of technologies and practices drawn from extant fields, including data processing, signal processing, operations research, statistics, computer science, machine learning, and human-computer interaction. These fields were brought together to meet the requirements of a new medium: electronic data. Known then as the data deluge and now as big data, this medium emerged from an increasingly connected system of communication and surveillance technologies associated with a slew of large-scale, post-war projects ranging from air defense and space exploration to high-energy physics and electronic records management. Although each constituent field has a long and complicated history, their combination into a single domain of expertise is historically singular.
In a series of posts, I want to tell the story of the moment in history that gave rise to data science and the world in which it was necessary to invent it. The story of this world is a history of ideas, social realities, and material conditions extending from the rise of American hegemony after WWII to the era of social media capitalism in which we currently live.
