From John Snow to Dr. Strange

In this article, Elpida Bantra takes us back to 1854, when John Snow, the father of modern epidemiology, used data analysis techniques to rescue the city of London.

 

Data science is about extracting knowledge from data: discovering patterns and building insights from records and facts. Contrary to popular belief, it is the opposite of trying to prove something. Simply put, data science is not just being aware of a problem and looking for the solution. It is much more: you explore the world of big data, endeavour to identify patterns, and develop methods for putting the information to use.

With the rising popularity of data science, people have come to see data scientists in the same light as the Marvel superhero Dr. Strange. The famous superhero is known for his sharp mind and bravery, as he makes decisions with certainty when the world is in danger. That being said, even though I would love to see myself as part of his crew, data science does not exactly work like this. Data shows us the way, sets the restrictions, reveals the future and helps us choose the safest solutions, with utmost clarity and confidence. Data is all-knowing and informative; it is the wisdom box. Data defies the old theory of thinking outside the box by forcing you to use only what is in the box. People and machines alike learn from data. It is the only weapon we have that can help us make decisions and solve problems of varying scales.

Picture a scenario where a well-hidden and dangerous disease is plaguing a town and there is no solution in sight. Back in 1854, London was facing unprecedented, large-scale deaths from an unknown cause. The solution was eventually crafted by John Snow, our superhero and the father of modern epidemiology (no, really, his name was John Snow!). In an effort to identify the root of the problem, John Snow placed dots on a map and, by marking out the data spatially, traced the cholera outbreak to a water pump, saving thousands of lives. People frequenting the pump were dying from drinking cholera-contaminated water.

 

The original map by John Snow, showing the clusters of cholera cases in the London epidemic of 1854, was drawn and lithographed by Charles Cheffins. Snow’s study was a major event in the history of public health and geography. Imagine if people had not answered Snow’s surveys (data availability), if he had therefore been unable to place the dots on the map (data visualization), or if he had not used descriptive statistics (data interpretation) to correlate these deaths. Decisions against even invisible enemies can be made with the availability of data and the right tools. History is full of examples like John Snow’s discovery, where data reveals behaviours and patterns that lead to correlations even the experts could not predict.
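To make the idea of placing dots on a map and counting them concrete, here is a minimal Python sketch. It is only an illustration: the coordinates, the pump names and the helper functions are invented for this example and are not John Snow’s recorded data or his actual method. It assigns each recorded death to its nearest pump and counts the deaths per pump.

```python
from collections import Counter
from math import hypot

# Hypothetical coordinates on an arbitrary grid; NOT John Snow's recorded data.
pumps = {"pump_a": (0.0, 0.0), "pump_b": (10.0, 0.0), "pump_c": (0.0, 10.0)}
deaths = [(1.0, 1.0), (0.5, 2.0), (2.0, 0.5), (1.5, 1.5), (9.0, 1.0), (1.0, 8.0)]


def nearest_pump(point, pumps):
    """Return the name of the pump closest to `point` (straight-line distance)."""
    x, y = point
    return min(pumps, key=lambda name: hypot(x - pumps[name][0], y - pumps[name][1]))


def deaths_per_pump(deaths, pumps):
    """Count how many recorded deaths fall nearest to each pump."""
    return Counter(nearest_pump(d, pumps) for d in deaths)


counts = deaths_per_pump(deaths, pumps)
suspect, n = counts.most_common(1)[0]
print(counts)
print(f"Most deaths cluster around {suspect} ({n} of {len(deaths)})")
```

Straight-line distance is a simplification chosen to keep the sketch short. A real analysis would more likely use walking routes and would compare the counts with how many people lived near each pump.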

In the past, science was accessible only to the elite. It therefore fascinates me that today, having recognised data’s unlimited strength, people collect it for everything. The universe, economics, consumption and so many other fields are studied, and within these fields people innovate because so many tools are available to process data. We have been transported into a wonderland, the world of big data, where information is plentiful and constantly renewed. And as we often hear in superhero movies, ‘with great power comes great responsibility’. Now that we have seen how critical data is and how it can be used, it is imperative that we apply hindsight and build on existing data and insights so that organisations can gain foresight and develop strategies. That said, one thing is clear: as our understanding of data grows, so will the amount of data available. This is a challenge that data scientists and companies alike will have to overcome. With the stack of information growing rapidly, we have to find effective ways not only to store it but also to use it. Hence, we have to develop tools and models that can process massive data sets.

Global access to mathematical tools, the internet, models and data promotes progress, protects us and the environment from potential risks, and increases the quality of life. It shows what we can do by passing knowledge from one generation to the next. Data and models can drastically help decision makers to defend public health and improve people’s lives, justifying the countless hours spent creating and preserving structured data and validated, reliable models.

A good example of a real-world use of data is the study of a substance that is bioaccumulative and present in everyday products, passing through aquatic routes to the oceans. Combining data on consumers’ habits with geospatial data could help save the environment and safeguard public health. Knowing when, how and where things such as plastic or heavy metals accumulate could help us take quick and effective action. Similarly, gas emissions and the resulting air pollution problem can be handled more efficiently when we have data and a better understanding of the problem. Models can simulate realistic exposure scenarios, estimate the risk and identify the root cause.

Today we have specialist tools such as the Expert Models platform that enable organisations to make data-driven decisions through the use of structured data and predictive exposure models. Data science is constantly evolving. By using it, we may not be able to go back in time to save the planet and change history, but we can certainly predict risks and develop tactics to circumvent or solve problems.

Written by Elpida Bantra on July 10 2018
