Data is sometimes called the "new oil," a newly discovered source of wealth extracted from the depths of corporate and government archives. Some accountants are so excited about the potential value of data that they count it in the same way as a physical asset.
While it is true that data can enhance an organization's value, this resource has no intrinsic value. Like oil, data needs to be extracted and refined to the right quality, and it needs to be transported across information networks before it can be used to create new value. The value of data is not in the information itself, but in the transformations it undergoes.
The analogy between data and oil is only partially correct, because data, unlike oil, is an infinite resource. The same data can be used many times, sometimes for a purpose that was not originally intended.
The ability to use data for more than one purpose is one of the reasons data science has gained popularity around the boardroom table. Senior managers are looking for ways to extract value from so-called "dark data." Data scientists use these forgotten data sources to create new knowledge, make better decisions, and generate innovation.
The question that arises from this introduction is how to manage and analyze data so that it becomes a valuable resource. We will present a normative model for creating value from data using three basic principles derived from architecture.
This model is useful for data scientists as an internal check to ensure that their activities maximize value. Managers can use this model to evaluate the results of a data science project without having to understand the mathematical complexities of data science.
The 3 Basic Principles of a Data-Driven Company
Although data science is a quintessential 21st-century activity, we can draw inspiration for defining good data science from a Roman architect and engineer who lived two thousand years ago.
Vitruvius is immortalized through his book "On Architecture" (De architectura), which inspired Leonardo da Vinci to draw his famous Vitruvian Man. Vitruvius wrote that an ideal building must exhibit three qualities: utilitas, firmitas, and venustas, or utility, solidity, and aesthetics.
Buildings must be useful so that they can be used for their purpose. A house must be functional and comfortable; a theater must be designed so that everyone can see the stage. Each type of building has its own functional requirements.
Secondly, buildings must be solid in the sense that they are firm enough to withstand the forces acting on them. Finally, buildings must be aesthetically pleasing. In Vitruvius' words, buildings must resemble Venus, the Roman goddess of beauty and seduction.
Vitruvius' rules for architecture can also be applied to data science products (Lankow, J., Ritchie, J., & Crooks, R. (2012). Infographics: The Power of Visual Storytelling. Hoboken, N.J.: John Wiley & Sons, Inc.).
Data science needs to have utility; it needs to be useful in order to create value. The analysis must be solid so that it can be trusted. Data science products must also be aesthetic in order to maximize the value they provide to an organization.
1- Utility in Data Science
How do we know something is useful? The simple, but not very enlightening answer is that when something is useful, it is useful. Some philosophers interpret usefulness as the ability to provide the greatest good for the greatest number of people. This definition is quite convincing, but it requires some contextualization. What is right in one situation may not be as beneficial in another.
For a data science strategy to be successful, it must facilitate organizational goals. Data scientists are opportunistic in the approach they use to solve problems. This implies that the same data can be used for different problems, depending on the perspective taken on the available information and the problem at hand.
After digesting a research report or viewing a visualization, managers should ask themselves, "What am I doing differently today as a result?" The usefulness of data science depends on the ability of the results to positively influence reality for professionals. In other words, the outcome of data science should either reassure management that objectives have been met or provide practical ideas for solving existing problems or preventing future ones.
2- Strength in Data Science
Just as a building must be solid and not collapse, a data product must be solid in order to create business value. Robustness is where science and data meet.
The robustness of a data product is defined by the validity and reliability of the analysis, which are well-established scientific principles as shown in the figure below. The robustness of data science also requires that the results be reproducible. Finally, the data, and the process of creating data products, must be governed to ensure beneficial results.
The difference between traditional forms of business analysis and data science is the systematic approach to problem solving. The key word in the term data science is therefore not data, but science. Data science is only useful when the data answer a useful question, which is the scientific part of the process.
This systematic approach ensures that the results of data science are reliable enough to decide between alternative courses of action. Systematic data science uses the principles of scientific research, but its approach is more pragmatic.
While scientists seek general truths to explain the world, data scientists seek to pragmatically solve problems. The basic principles behind this methodical approach are the validity, reliability, and reproducibility of data, methods, and results.
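To make the reproducibility principle concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `bootstrap_mean_interval`, the seed value, and the `daily_demand` figures are all hypothetical and do not come from a real project. The idea is that any analysis relying on random sampling should fix its random seed, so that a colleague who reruns it obtains exactly the same result.

```python
import random
import statistics


def bootstrap_mean_interval(values, n_resamples=1000, seed=42):
    """Return a (low, high) 95% percentile bootstrap interval for the mean.

    Hypothetical helper for illustration; the seed makes the result repeatable.
    """
    rng = random.Random(seed)  # local seeded generator: same seed, same resamples
    means = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = rng.choices(values, k=len(values))
        means.append(statistics.fmean(resample))
    means.sort()
    # Integer arithmetic avoids floating-point surprises in the indices.
    low = means[n_resamples * 25 // 1000]
    high = means[n_resamples * 975 // 1000 - 1]
    return low, high


# Hypothetical measurements, invented purely for this example.
daily_demand = [102.0, 98.5, 110.2, 95.1, 101.7, 99.3, 104.8, 97.6]

first_run = bootstrap_mean_interval(daily_demand)
second_run = bootstrap_mean_interval(daily_demand)

# Both runs use the same data, method, and seed, so they agree exactly.
print(first_run == second_run)
```

Passing the seed as a parameter, rather than relying on global random state, keeps the analysis repeatable even when other code in the same program also draws random numbers.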
3- Aesthetics in Data Science
Vitruvius insisted that buildings, or any other structures, must be beautiful. The aesthetics of a building cause more than just a pleasant feeling. Architecturally designed places stimulate our thinking, increase our well-being, improve our productivity, and foster creativity.
While it is clear that buildings should be pleasing to the eye, the aesthetics of data products may not be so obvious. The requirement for aesthetic data science is not a call to embellish results or to obscure their ugly details.
The process of cleaning and analyzing the data is inherently complex. Presenting the results of this process is a form of storytelling that reduces this complexity to ensure that a data product is understandable.
The value chain of data science begins with reality, as described by the data. This data is converted into knowledge, which managers use to influence reality to achieve their goals. This chain that goes from reality to human knowledge contains four transformations, each with opportunities for loss of validity and reliability.
The last step in the value chain requires the user of data science results to interpret the information and draw the right conclusion about their future course of action. Reproducibility is one of the tools for minimizing the possibility of misinterpreting analyses. Another mechanism for ensuring proper interpretation is to produce aesthetic data science.
Aesthetics in data science is about creating a data product, which can be a visualization or a report, designed to enable the user to draw the right conclusions. A messy graph or an incomprehensible report limits the value that can be extracted from the information.
We hope to discuss each of these principles in more detail in the next post!