Data science journey

Published February 4, 2019, 12:00 AM

by manilabulletin_admin



Last year, 2018, was the year when the data science hype peaked in the country. In fact, LinkedIn recently reported data scientist as the most in-demand job. In addition, based on our research, there were more than a thousand junior employees in the Philippines with the job title “data scientist.” Companies jumped on the bandwagon of building a data science practice as a panacea for their organizational and market woes.

But if you look deeper into the real job function of most of these data scientists in organizations today, they are either converting data into pretty visuals for management consumption, cleaning and rearranging disorganized data for further processing, or toying with machine learning tools, an emerging hype in artificial intelligence. Their backgrounds vary, from computer science and engineering to statistics and mathematics, and even include a registered nurse who studied analytics.

Now, if a company chanced upon a true data scientist, who is expensive nowadays, and asked him or her to perform the functions described above, he or she would most likely leave soon, because the true nature of data science is far from what most organizations understand today.

This stems from a lack of clarity and understanding of what data science is among professionals and business executives. In fact, more than 60% of them described data science as the practice of visualizing processed data for management use, based on our survey during an event.

Data science is a new field which is misunderstood by many. What is it exactly? Let’s go back to basics.

According to the Science Council, a professional body of scientists based in London, “Science is the pursuit and application of knowledge and understanding of the natural and social world following a scientific methodology based on evidence.”

Scientific methodology includes objective observation, i.e., measurement and data (possibly, although not necessarily, using mathematics as a tool); evidence; experiment and/or observation as benchmarks for testing hypotheses; induction, i.e., reasoning to establish general rules or conclusions drawn from facts or examples; repetition; critical analysis; and verification and testing, i.e., critical exposure to scrutiny, peer review, and assessment.

A scientist, therefore, is “someone who systematically gathers and uses research and evidence, making a hypothesis and testing it, to gain and share understanding and knowledge.”

Hence, a data scientist is a professional who systematically gathers, processes, and analyses data using advanced tools, and uses research and evidence, making a hypothesis and testing it, to gain and share understanding and knowledge in a specific domain or industry. In other words, he or she is a complex problem-solver; one who solves novel and ill-defined problems in complex and real-world settings brought about by the emergence of big data, i.e. structured data such as company customer information, and unstructured data, such as social media data.

An example of a complex problem solved by a data scientist was cited by Forbes recently – “What are the odds that a given customer opens a promotional email?”

“One can imagine that this probability can be estimated by way of its relationship to the customer’s particular characteristics and that this relationship can be derived from copious data — one approach might be to quantify the average behavior of all customers who share similar characteristics. Identifying this relationship and the relevant characteristics is the job of the data scientist,” as reported in Forbes. In other words, “data scientists produce mathematical models for the purposes of prediction.”
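
To make the Forbes example concrete, the approach of averaging the behavior of similar customers can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the customer records, the two characteristics used (age group and past-buyer status), and the function names are all hypothetical and do not come from Forbes or from any real data set.

```python
from collections import defaultdict

# Hypothetical records: (age_group, past_buyer, opened_email)
history = [
    ("18-34", True, True),
    ("18-34", True, True),
    ("18-34", True, False),
    ("18-34", False, False),
    ("35-54", True, True),
    ("35-54", False, False),
    ("35-54", False, False),
    ("35-54", False, True),
]


def open_rates(records):
    """Return the observed open rate for each group of similar customers."""
    sent = defaultdict(int)
    opened = defaultdict(int)
    for age_group, past_buyer, did_open in records:
        key = (age_group, past_buyer)  # customers sharing both traits
        sent[key] += 1
        opened[key] += int(did_open)
    return {key: opened[key] / sent[key] for key in sent}


def estimate_open_probability(rates, age_group, past_buyer):
    """Estimate the odds from the matching group; None if no similar customers exist."""
    return rates.get((age_group, past_buyer))


rates = open_rates(history)
print(estimate_open_probability(rates, "18-34", True))
```

With these made-up records, the estimate for a past buyer aged 18-34 is simply the share of similar customers who opened the email, two out of three. A real model would likely draw on far more data and characteristics, but the underlying idea is the same.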

Because of the difficulty of finding a true data scientist who fits the qualifications, companies can start their data science journey with some data engineers and data analysts who report to a business executive in charge of operational support. The data engineer, also known as a data architect, gathers the data, stores it, and processes it in batches or in real time for analytical or operational uses. A data analyst, on the other hand, translates these data and numbers into plain English, whether it’s sales figures, market research, logistics, or transportation costs, and performs statistical analyses to help companies make better business decisions.

But it’s important to emphasize that the business executive in charge is the one providing the research direction, hypothesis formulation and testing, and insight and understanding, because he or she has the domain knowledge of the business or industry. Hence, he or she should possess skills such as high-level critical thinking, creativity, an eye for detail, and a lack of bias in interpreting data. The data engineers and data analysts, as part of the data science team, will perform the technical and analytical jobs to support the testing and validation of the hypotheses.

After a year, once the company has gained experience and matured, it can hire a data scientist to provide the research direction and hypothesis formulation. The business executive in charge would by then have started a data governance framework to manage the overall availability, usability, integrity, and security of the data used in the enterprise, with a defined set of procedures and a plan to execute those procedures.

Data science can provide competitive advantages to organizations that know how to organize, use, and take advantage of it. But it’s a journey that can and should be learned by business executives who want to capitalize on its potential.


The author is President & CEO of Hungry Workhorse Consulting, a digital and culture transformation firm. He is the Chairman of the Information and Communications Technology Committee of the Financial Executives Institute of the Philippines. He teaches strategic management in the MBA Program of De La Salle University. The author may be emailed at [email protected]