The new pollution — data

Published May 17, 2022, 12:05 AM

by Monchito B. Ibrahim



The metaphor “data is the new oil” was popularized by The Economist in 2017. Today, it has become a cliché and is so yesterday. Yes, the exponential growth of data brought about by the ubiquitous use of digital technologies has opened many opportunities for innovation, enabled more informed decisions, and inspired very creative ways of monetizing it. But we are also beginning to see a trend where data is used to manipulate people’s minds. The election campaign, for example, has shown us how some quarters fabricated seemingly compelling stories out of made-up data just to convey the message that their candidates were well-positioned for victory. Even after the election, reading the conspiracy theories posted on social media has become one of the most entertaining ways to while away our time.

It is becoming clear that data, the new oil, is likely to become the new pollutant of the planet and, more significantly, of our minds. And this pollution can be more dangerous than we thought. What we have seen during and after the last election is just one of the early signs of something that can become a Pandora’s box of complex problems if we are not careful. The misuse of data for malicious propaganda to influence people’s minds should worry all of us.

The proper use of data starts with ensuring quality. At the start of the pandemic, we saw several countries admit that the case numbers they had initially released were inaccurate. A case in point is a European country that admitted the number of cases it initially reported publicly was significantly understated due to a simple issue: it used an old version of the Excel file format in churning out the numbers. The numbers doubled when they were recomputed using the correct tool.

To paraphrase my AAP colleague, Doc Ligot, data privacy can be a thing of the past when people spend 90 percent of their waking hours on social media. Remember how Cambridge Analytica was reported to have helped former President Donald Trump get elected to office and, in the process, pull off one of the most dramatic political upsets in modern history? Apparently, exhaustive survey methods, data modeling, and performance-optimizing algorithms were used to target tens of thousands of ads to different audiences in the months leading up to the election. Those ads were viewed billions of times. The scandal exposed what seemed to be social media’s ability to share our private lives for advertising and marketing purposes. It became even more disturbing when it was revealed that these methods could undermine democratic processes by psychologically influencing large numbers of people online.

We should also note how social media can be a petri dish for disinformation. This is usually facilitated by automation using chatbots. A good example is a story featured in the New York Times in 2018 describing how the ruling regime of an Asian country used hate speech on social media against a cultural minority, a campaign that went undetected for years. Apparently, the victims sued the social media platform for failing to prevent the incitement of violence against the minority group.

A major problem with data is bias. Data is generated by processes handled by humans and machines, and sometimes these processes are broken or flawed in the first place.

Naturally, broken processes lead to bad data, and bad data leads to biased algorithms. The use of artificial intelligence to automate most business processes, including decision-making, holds immense promise. Automating repetitive tasks usually improves the frequency and accuracy of transactions. However, AI models need to be constantly monitored and recalibrated to account for changing conditions.

If you have watched the documentary Coded Bias, you will have seen how facial recognition algorithms trained on imbalanced datasets propagated racial and gender discrimination.

Yes, data, and digital information in general, is the fuel of the new economy. It is the resource that is making more institutions agile and creating endless opportunities to generate social value. But like the carbon fuel of the old economy, we are beginning to see how it is becoming a pollutant. These harmful pollutants are spilling into the digital ecosystem and are slowly disrupting our social foundations. We need to understand the potential extent of the harm to society from this new pollutant. More importantly, we need to quickly find ways to prevent, or at least minimize, the further spread of this emerging societal cancer before it gets out of hand.

(The author is the lead convenor of the Alliance for Technology Innovators for the Nation (ATIN), vice president of the Analytics Association of the Philippines, and vice president of the UP System Information Technology Foundation.)
