What is Data?
"In God we trust; all others must bring data." W. Edwards Deming
Data is a collection of raw information, structured or unstructured, that represents facts, observations, or measurements and can be processed, analysed, and used to draw meaningful insights, make decisions, and solve problems.
In the context of the Artificial Intelligence (AI) ecosystem, data plays an indispensable role as the foundation for developing, training and refining AI models and evaluating their performance.
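The difference between unstructured and structured data, and the step from data to insight described in the definition above, can be sketched in a few lines of Python. This is only an illustration: the bakery notes, the `to_record` and `summarise` helpers, and the field names are all hypothetical and chosen for the example.

```python
import re

# Hypothetical observations recorded as unstructured free text.
unstructured_notes = [
    "Monday: sold 12 loaves of bread",
    "Tuesday: sold 9 loaves of bread",
    "Wednesday: sold 15 loaves of bread",
]


def to_record(note):
    """Parse one free-text note into a structured record (a dict)."""
    match = re.match(r"(\w+): sold (\d+) loaves", note)
    if match is None:
        raise ValueError(f"unrecognised note: {note!r}")
    day, count = match.groups()
    return {"day": day, "loaves_sold": int(count)}


def summarise(records):
    """Turn structured records into a small summary (the 'insight')."""
    total = sum(r["loaves_sold"] for r in records)
    best = max(records, key=lambda r: r["loaves_sold"])
    return {
        "total": total,
        "mean": total / len(records),
        "best_day": best["day"],
    }


# The same facts, now structured so they can be processed and analysed.
structured_records = [to_record(note) for note in unstructured_notes]
summary = summarise(structured_records)
print(summary)
```

The facts are the same in both forms. Once each note becomes a record with named fields, questions such as "which day sold the most?" can be answered mechanically.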
Data before the invention of writing

From humanity's earliest days, data existed in its original, oral form: people used oral history to preserve, accumulate, and disseminate information. Oral history can therefore be classified as a form of data.
Oral history refers to the practice of preserving historical information and cultural knowledge through spoken narratives, stories, and accounts passed down from one generation to another. These narratives can convey a wide range of information, such as customs, beliefs, values, events, and experiences, forming a rich qualitative data source.
Oral history is an essential complement to written records and archaeological evidence, as it often captures perspectives, voices, and experiences that may be underrepresented or omitted from other sources. This form of data is particularly valuable in studying the history of societies with limited or no written records and understanding the lived experiences and viewpoints of individuals and communities within a broader historical context.
However, it is crucial to recognize the limitations and challenges associated with oral history as a form of data. Since oral history relies on human memory and storytelling, it can be subject to biases, distortions, and inaccuracies that may arise from personal perspectives, cultural influences, and the passage of time. To mitigate these issues, researchers often use triangulation, corroboration, and critical interpretation methods to assess the reliability and validity of oral history data, ensuring a more comprehensive and nuanced understanding of the past.
Data after the invention of writing

Some of the earliest examples of systematically recorded data in human history can be traced back to the invention of writing systems, which allowed humans to record and communicate information.
One of the earliest known writing systems is the Sumerian cuneiform script, which dates back to around 3200 BCE. The script was initially used for accounting purposes, allowing the Sumerians to record data related to trade, taxes, and the allocation of resources. Cuneiform tablets, made of clay and inscribed with a stylus, contained information about economic transactions, inventories, agricultural production, and other aspects of daily life. These tablets are among the earliest surviving written records and serve as an early example of data storage and communication.
Another early example of data is the use of tally marks, which predate formal writing systems. Tally marks have been found in various forms across different cultures and have been used to record and represent numerical data, such as counts of items or events.
The Lebombo bone, discovered in the Lebombo Mountains between South Africa and Eswatini (formerly Swaziland), is an example of a tally stick that is often dated to around 35,000 BCE. This artefact contains a series of notches that some researchers have interpreted as a lunar calendar or a record of menstrual cycles, although this interpretation remains uncertain.
These early examples of data demonstrate how humans have been creating, storing, and using information for thousands of years to facilitate communication, manage resources, and understand the world around them.
Data creation has evolved over time, with various methods and technologies emerging across different eras. Here are examples of data creation in the past, present, and future:
Past:
In the past, data creation was primarily manual and analogue. For instance, consider census data collection, which has been conducted for centuries. In ancient Rome, the census was overseen by officials called censors, before whom citizens declared information about their households, their property, and their social status, and the declarations were recorded by hand. This data was then used to assess taxes, allocate resources, and determine military obligations.
Present:
With the advent of modern technology, data creation has become increasingly digital, automated, and diverse.
An example is social media platforms, where users generate vast amounts of data in the form of text, images, videos, and interactions. The platform owners then store and analyse this data for various purposes, such as targeted advertising, content recommendation, and sentiment analysis. Other examples include IoT devices and sensors, which generate real-time data about their environment and user behaviours, and electronic health records, which store and manage patient data digitally for more efficient healthcare delivery.
Future:
Data creation is expected to become even more sophisticated with technological advancements and the growing ubiquity of interconnected devices. One potential example is the development of advanced Brain-Computer Interfaces (BCIs), which could directly record and interpret neural signals from the human brain. This could generate new data types related to cognitive processes, emotions, and experiences, enabling innovative applications in mental health, communication, and human-computer interaction.
Another example is the increasing use of autonomous systems, such as drones and self-driving vehicles, which will generate massive amounts of data through their sensors, navigation systems, and decision-making processes, contributing to the broader field of artificial intelligence and machine learning.
These examples demonstrate how data creation has evolved and will continue to evolve, driven by technological advancements, changes in human behaviour, and the growing need for information to make informed decisions and solve complex problems.