Big data isn’t always as helpful as we like to think

The evolution of big data has opened the doors to human advancement in incredible ways. Using hidden algorithms, self-contained sets of rules followed by computers, to solve problems with big data allows you to decipher trends and patterns on a gigantic scale. 

Predictive modeling capabilities and microtargeting allow for a tailored and upgraded online user experience. 

Many people are inclined to trust big data because of the large sample sizes and complex algorithms used to derive various patterns. But to what extent should you believe or apply the results of these mysterious mathematical models? 

How big data can be misunderstood and misused 

Cathy O’Neil, a data scientist and former Wall Street hedge fund employee, discusses her take on big data in her new book, “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” Her experience with big data on Wall Street came right before the financial crash of 2008, when the fallout from proprietary big data projects resulted in a global economic collapse. 

Her take on the experience was that employees had no way of accounting for misused or misunderstood data because of the limited amount of information they could possess or share. 

Limited access to information was a way for many companies to protect their assets in case an employee left to work for a competing firm. At that time, because of this lack of communication and information sharing, employees couldn’t even begin to see the shaky foundations of widespread mortgage fraud holding up the entire financial system. 

But enough about housing. Where did the bias in these mathematical models originate? 

Bias in information systems 

All mathematical models start with a hunch: one that seeks to uncover a more profound logic and offer a deeper explanation of the way things are. What happens when the basis of these models is coated in bias or rests on an unreliable assumption? 

During this process, intentions may be honorable, but people are human. The determinants of a mathematical model may favor the hopes of the individual or entity encoding it. This can lead to the collection of data that appears more objective than it is. 

O’Neil points out what all of these owned and operated big data projects, or “weapons of math destruction,” have in common: they operate on a substantial scale, few people actually understand them and, because they are closed systems, they are subject to feedback loops that tend to reinforce the same unreliable assumptions. 
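To make the feedback-loop idea concrete, here is a minimal toy sketch in Python. It is not taken from O’Neil’s book; the group names, the rates and the function `run_closed_loop` are all hypothetical. It assumes a model that only approves the group it currently rates highest and only learns from the outcomes of the people it approves.

```python
def run_closed_loop(rounds=5):
    """Toy closed system: the model only learns from the cases it approves."""
    # Hypothetical ground truth: both groups succeed at the same rate.
    true_success_rate = {"A": 0.7, "B": 0.7}
    # Hypothetical starting assumption: the model wrongly rates group B lower.
    estimated_rate = {"A": 0.7, "B": 0.4}
    observations = {"A": 0, "B": 0}

    for _ in range(rounds):
        # The model approves only the group it currently rates highest.
        chosen = max(estimated_rate, key=estimated_rate.get)
        # Outcomes exist only for the approved group. The expected value
        # is used instead of random draws to keep the sketch deterministic.
        observed = true_success_rate[chosen]
        n = observations[chosen]
        # Running average of everything observed so far for that group.
        estimated_rate[chosen] = (estimated_rate[chosen] * n + observed) / (n + 1)
        observations[chosen] += 1

    return estimated_rate, observations


estimates, counts = run_closed_loop()
print(estimates)  # group B's estimate never moves from its starting value
print(counts)     # group B is never approved, so it never generates data
```

In this sketch the initial assumption about group B is never tested, so no amount of additional rounds can correct it. That is the sense in which a closed system reinforces its own starting point.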

These shared features expose many big data projects to risks and consequences comparable to those Wall Street experienced in 2008. 

What can we learn from the mistakes of big data? 

According to O’Neil, it’s possible to find models that are more transparent, with clear objectives and a self-regulating feedback structure. This unambiguous type of modeling would allow more people to understand how and why various inputs and outputs are targeted. 

Clarity would allow for a more democratic data-collecting process since biased results would be regulated by the mechanism computing them or detected by those monitoring them. 

Companies must be mindful of how they use data in day-to-day business, especially in “workforce management” or scheduling software. 

If you use only a single model to guide decision making, without being able to understand or systematically check its reliability, you risk falling into a pit of discriminatory logic that could have tragic consequences. For example, if you use personality tests or algorithms to screen applicants or current employees, their answers are translated into logic based on assumptions. That logic is not 100 percent accurate, and its errors can lead to candidate discrimination or wrongful termination. 
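One simple way to check a screening model systematically is to compare how often each group of candidates passes it. The Python sketch below is an illustration under stated assumptions, not a method described by O’Neil: the group labels, the sample data and the function names are hypothetical, and the 0.8 cutoff is borrowed from the "four-fifths" rule of thumb used in U.S. employment guidelines as a rough signal, not a legal test.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Return {group: share of candidates who passed} from (group, passed) pairs."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            passed[group] += 1
    return {group: passed[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means equal rates)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody passed in any group, so there is no gap to measure
    return min(rates.values()) / highest


# Hypothetical screening results: (group label, passed the screen?)
sample = (
    [("group_x", True)] * 6 + [("group_x", False)] * 4
    + [("group_y", True)] * 3 + [("group_y", False)] * 7
)

rates = selection_rates(sample)
ratio = impact_ratio(rates)
needs_review = ratio < 0.8  # assumed cutoff, see the note above

print(rates, ratio, needs_review)
```

A low ratio does not prove the model is discriminatory, and a high one does not prove it is fair. It only flags where a human should look more closely at the assumptions behind the scores.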

What happens when these innovative algorithms are market driven and profit motivated? When profit is the basis of mathematical computation, will that model be subject to risk or bias? In most cases, yes. For example, our obsession with big data may make cars vulnerable to hackers, a problem few people would have been concerned with in the past. 

As innovative technology becomes more connected, in this case with devices like electronically controlled steering and braking units and remote key access, the potential for additional vulnerabilities increases, too. A single defect in the technology and software could be amplified much like the financial crash of 2008. Except, unlike the 2008 stock market collapse that resulted in extensive financial loss, vehicle hacking could result in widespread death. 

As with many other forms of reasoning, big data is prone to bias and risk. While big data can prove to be groundbreaking, the models and results of these ever-evolving applications should be monitored and regulated according to industry. 

Moving forward, to enhance individual security and prevent discriminatory modeling practices, O’Neil concludes that it will be increasingly necessary to create information systems with transparency, self-regulation or required open “algorithmic audits.” With this understanding, it’s possible to begin disarming the “weapons of math destruction” that infiltrate every aspect of our existence. 

Kayla Matthews is a technology journalist and blogger. To read more posts about big data, technology and artificial intelligence, follow Kayla on Twitter or subscribe to her newsletter on Productivity Bytes.

 
