Weapons of Math Destruction

Credit scores are one of the formulas that determine our world. They often work against us, from job prospects to how long we’re on hold.

By Cathy O'Neil | Thursday, September 01, 2016
When I was little, I used to gaze at the traffic out the car window and study license plate numbers. I would reduce each one to its basic elements — the prime numbers that made it up. 45 = 3 x 3 x 5. That’s called factoring, and it was my favorite investigative pastime.
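The factoring game described above can be written in a few lines. This is just an illustrative sketch of trial division, the simplest way to reduce a number to its prime factors:

```python
# Repeated trial division: divide out each small factor until what
# remains is prime. Enough to factor any license plate number.

def prime_factors(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors

print(prime_factors(45))  # [3, 3, 5], i.e. 45 = 3 x 3 x 5
```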

My love for math eventually became a passion. I went to math camp when I was 14 and came home clutching a Rubik’s Cube to my chest. Math provided a neat refuge from the messiness of the real world. It marched forward, its field of knowledge expanding relentlessly, proof by proof. And I could add to it. I majored in math in college and went on to get my Ph.D. Eventually, I became a tenure-track professor at Barnard College, which had a combined math department with Columbia University.

And then I made a big change. I quit my job and went to work as a quantitative analyst for D. E. Shaw, a leading hedge fund. In leaving academia for finance, I carried mathematics from abstract theory into practice. The operations we performed on numbers translated into trillions of dollars sloshing from one account to another. At first I was excited and amazed by working in this new laboratory, the global economy. But in the autumn of 2008, after I’d been there for a bit more than a year, it came crashing down.

The crash made it all too clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems, but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment — all aided and abetted by mathematicians wielding magic formulas. What’s more, thanks to the extraordinary powers I loved so much, math combined with technology to multiply the chaos and misfortune, adding efficiency and scale to systems that I now recognized as flawed.

If we had been clear-headed, we all would have taken a step back to figure out how math had been misused and how we could prevent a similar catastrophe in the future. But instead, in the wake of the crisis, new mathematical techniques were hotter than ever, and expanding into still more domains. They churned 24/7 through petabytes of information, much of it scraped from social media or e-commerce websites. And increasingly, they focused not on the movements of global financial markets but on human beings — on us. Mathematicians and statisticians were studying our desires, movements and spending power. They were predicting our trustworthiness and calculating our potential as students, workers, lovers, criminals.

This was the Big Data economy, and it promised spectacular gains. A computer program could speed through thousands of résumés or loan applications in seconds and sort them into neat lists, with the most promising candidates on top. This not only saved time but also was marketed as fair and objective. After all, it didn’t involve prejudiced humans digging through reams of paper, just machines processing cold numbers. By 2010 or so, mathematics was asserting itself as never before in human affairs.

Yet I saw trouble. The math-powered applications driving the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models and algorithms encoded human prejudice, misunderstanding and bias into the software systems that increasingly managed our lives. Like gods, these mathematical models were opaque, their workings invisible to all but the highest priests in their domain: mathematicians and computer scientists. Their verdicts, even when wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed in our society, while making the rich richer.

I came up with a name for these harmful models: Weapons of Math Destruction, or WMDs for short.

And the human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.

Welcome to the dark side of Big Data.

A Fair Model
Local bankers used to stand tall in a town. They controlled the money. If you wanted a new car or a mortgage, you’d put on your Sunday best and pay a visit. And as a member of your community, this banker would probably know certain details about your life. He’d know about your churchgoing habits, or lack of them. He’d know all the stories about your older brother’s run-ins with the law. He’d know what your boss (who was also his golfing buddy) said about you as a worker. Naturally, he’d know your race and ethnicity and he’d also glance at the numbers on your application.

The first four factors often worked their way, consciously or not, into the banker’s judgment. And there’s a good chance he was more likely to trust people from his own circles. This was only human. But for millions of Americans, it meant the predigital status quo was challenging, to say the least. Outsiders, including minorities and women, were routinely locked out. They had to create an impressive financial portfolio — and then hunt for open-minded bankers.

It wasn’t fair. And then along came an algorithm, and things improved. In the 1950s, a mathematician named Earl Isaac and his engineer friend, Bill Fair, devised a model to evaluate the risk that an individual would default on a loan, and founded a company, Fair, Isaac and Company, to sell it. Their FICO score was fed by a formula that looked only at a borrower’s finances — mostly his or her debt load and bill-paying record. The score was colorblind. And it turned out to be great for banks because it predicted risk far more accurately while opening the door to millions of new customers. FICO scores are still around. They’re used by the credit agencies, including Experian, TransUnion and Equifax, which each contribute different sources of information to the FICO model to come up with their own scores.

These scores have lots of commendable, non-WMD attributes. First, they have a clear feedback loop. Credit companies can see which borrowers default on their loans, and they can match those numbers against their scores. If borrowers with high scores seem to be defaulting on loans more frequently than the model would predict, FICO and the credit agencies can tweak those models to make them more accurate. This is a sound use of statistics.
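The feedback loop described above can be sketched in a few lines. Everything here is invented for illustration — the score bands, the tolerance, and the loan data — but it shows the mechanism: compare the default rate each band predicts with the defaults actually observed, and flag bands where the model has drifted and needs retuning.

```python
# Hypothetical sketch of the credit-score feedback loop: match observed
# defaults against each score band's predicted default rate.

def check_calibration(borrowers, predicted_rate_by_band, tolerance=0.02):
    """borrowers: list of (score_band, defaulted) pairs."""
    observed = {}
    for band, defaulted in borrowers:
        total, bad = observed.get(band, (0, 0))
        observed[band] = (total + 1, bad + (1 if defaulted else 0))

    flagged = []
    for band, (total, bad) in observed.items():
        observed_rate = bad / total
        predicted = predicted_rate_by_band[band]
        if abs(observed_rate - predicted) > tolerance:
            flagged.append((band, predicted, round(observed_rate, 3)))
    return flagged

# Example: the "750+" band predicts a 1% default rate, but 5 of its
# 100 borrowers actually defaulted -- a signal to tweak the model.
loans = [("750+", False)] * 95 + [("750+", True)] * 5
print(check_calibration(loans, {"750+": 0.01}))
```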

The credit scores are also relatively transparent. FICO’s website, for example, offers simple instructions on how to improve your score. (Reduce debt, pay bills on time and stop ordering new credit cards.) Equally important, the credit-scoring industry is regulated. If you have questions about your score, you have the legal right to ask for your credit report, which includes all the information that goes into the score, including your record of mortgage and utility payments, your total debt and the percentage of available credit you’re using. Though the process can be torturously slow, if you find mistakes, you can try to have them fixed.

Since Fair and Isaac’s pioneering days, the use of scoring has proliferated wildly. Today, we’re added up in every conceivable way as statisticians and mathematicians patch together a mishmash of data, from our ZIP codes and internet surfing patterns to our recent purchases. Many of their pseudoscientific models attempt to predict our creditworthiness, giving each of us so-called e-scores, which are based on numerous variables such as our occupation, what our houses are valued at and our spending habits.

These numbers, which we rarely see, open doors for some of us, while slamming them in the face of others. Unlike the FICO scores they resemble, e-scores are arbitrary, unaccountable, unregulated and often unfair — in short, they’re WMDs.

One Virginia company offers a prime example. It provides customer-targeting services to businesses, including a service that helps manage call center traffic. In a flash, this technology races through available data on callers and places them in a hierarchy. Those at the top are deemed more profitable prospects and are quickly funneled to a human operator. Those at the bottom either wait much longer or are dispatched into an outsourced overflow center, where they’re handled largely by machines.

Credit card companies carry out similar rapid-fire calculations as soon as someone shows up on their website. They can often access data on web browsing and purchasing patterns, which provide loads of insights about the potential customer. Chances are, the person clicking for new Jaguars is richer than the one checking out a 2003 Taurus on Carfax.com. Most scoring systems also read the location of the visitor’s computer. When this is matched with real estate data, they can draw inferences about wealth. A person using a computer on San Francisco’s posh Balboa Terrace is a far better prospect than one across the bay in East Oakland.

Now consider the nasty feedback loop e-scores create. There’s a very high chance the e-scoring system will give the borrower from the rough section of East Oakland a low score. Lots of people default there. So the credit card offer popping up will be targeted to a riskier demographic. That means less available credit and higher interest rates for those who are already struggling.
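The loop above can be made concrete with a toy sketch. All the numbers are invented — the neighborhood default figures, the tiering rule, the rates — but the mechanism is the one the text describes: a bucket's past defaults set the proxy score, and the proxy score sets the terms of the offer, so the costliest credit lands on the bucket that was already struggling.

```python
# Invented illustration of the e-score loop: the neighborhood's
# historical default rate becomes the individual's proxy score, and
# the proxy score decides the credit card offer.

NEIGHBORHOOD_DEFAULT_RATE = {
    "Balboa Terrace": 0.02,  # hypothetical figures, not real data
    "East Oakland": 0.30,
}

def e_score(neighborhood):
    # The individual inherits the bucket's history, good or bad.
    return 1 - NEIGHBORHOOD_DEFAULT_RATE[neighborhood]

def offer(score):
    # Assumed tiering rule: lower scores get less credit at higher cost.
    if score >= 0.9:
        return {"limit": 10000, "apr": 0.14}
    if score >= 0.75:
        return {"limit": 3000, "apr": 0.24}
    return {"limit": 500, "apr": 0.36}

for place in NEIGHBORHOOD_DEFAULT_RATE:
    print(place, offer(e_score(place)))
```

Under these assumed numbers, the Balboa Terrace visitor is offered ten grand at 14 percent while the East Oakland visitor gets $500 at 36 percent — less available credit and higher interest for those already struggling.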

E-scores are only stand-ins for credit scores. But since companies are legally prohibited from using credit scores for marketing purposes, they make do with this sloppy substitute. There’s a certain logic to that prohibition. After all, our credit history includes highly personal data and it makes sense that we should have control over who sees it. But the consequence is that companies end up diving into largely unregulated data pools to create a parallel data marketplace. In the process, they largely avoid government oversight. They then measure success by gains in efficiency, cash flow and profits. With few exceptions, concepts like justice and transparency don’t fit into their algorithms.

Let’s compare that to the 1950s-era banker. Consciously or not, that banker was weighing various data points that had little or nothing to do with his would-be borrower’s ability to shoulder a mortgage. He looked across his desk and saw his customer’s race and drew conclusions from that. The customer’s father’s criminal record may have counted against him or her, while regular church attendance may have helped.

All of these data points were proxies. In his search for financial responsibility, the banker could have dispassionately studied the numbers (as some exemplary bankers no doubt did). But instead, he drew correlations to race, religion and family connections. In doing so, he avoided scrutinizing the borrower as an individual and instead lumped him in a group of people — what statisticians today call a bucket. “People like you,” he decided, could or couldn’t be trusted.

Fair and Isaac’s great advance was to ditch the proxies in favor of the relevant financial data, like past bill-paying behavior. They focused their analysis on the individual — not on other people with similar attributes. E-scores, by contrast, march us back in time. They analyze the individual through a veritable blizzard of proxies. In a few milliseconds, they carry out thousands of “people like you” calculations. And if enough of these “similar” people turn out to be deadbeats or, worse, criminals, that individual will be treated accordingly.

The Problem With Proxies
From time to time, people ask me how to teach ethics to a class of data scientists. I usually begin with a discussion of how to build an e-score model and ask them whether it makes sense to use “race” as an input in the model. They inevitably respond that such a question would be unfair and probably illegal. The next question is whether to use “ZIP code.” This seems fair enough, at first. But it doesn’t take long for the students to see they’re codifying past injustices into their model. When they include an attribute such as “ZIP code,” they’re expressing the opinion that the history of human behavior in that patch of real estate should determine, at least in part, what kind of loan a person who lives there should get.

In other words, the modelers for e-scores have to make do with trying to answer the question “How have people like you behaved in the past?” when ideally they would ask, “How have you behaved in the past?”
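The classroom exercise above can be sketched directly. The weights, ZIP codes and default rates below are all invented, but the point survives: start from a borrower's own record, then watch what blending in a "ZIP code" feature does to her score.

```python
# Toy model of the two questions in the text: "How have you behaved?"
# versus "How have people like you behaved?" All numbers are invented.

ZIP_DEFAULT_RATE = {"94127": 0.02, "94603": 0.30}  # hypothetical buckets

def score_without_zip(on_time_share):
    # FICO-style: only the borrower's own bill-paying record counts.
    return on_time_share

def score_with_zip(on_time_share, zip_code, zip_weight=0.5):
    # Proxy-style: blending in the neighborhood's past pulls the
    # individual toward "people like you."
    neighborhood = 1 - ZIP_DEFAULT_RATE[zip_code]
    return (1 - zip_weight) * on_time_share + zip_weight * neighborhood

punctual = 59 / 60  # a borrower who pays almost every bill on time
print(score_without_zip(punctual))        # identical wherever she lives
print(score_with_zip(punctual, "94127"))  # barely changed
print(score_with_zip(punctual, "94603"))  # dragged down by the bucket
```

The same punctual borrower scores about 0.98 on her own record anywhere, but once the ZIP-code bucket is mixed in, her score in the high-default ZIP drops to roughly 0.84 — the history of a patch of real estate deciding, in part, what kind of loan she gets.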

I should note that in the statistical universe proxies inhabit, they often work. Birds of a feather do tend to flock together. Rich people buy cruises and BMWs. All too often, poor people need a payday loan. And since these statistical models appear to work most of the time, efficiency rises and profits surge. Investors double down on scientific systems that can place thousands of people into what appear to be the correct buckets. It’s the triumph of Big Data.

But what about the person who is misunderstood and placed in the wrong bucket? That happens. And there’s no feedback to set the system straight. A statistics-crunching engine has no way to learn it dispatched a valuable potential customer to call center hell. Worse, losers in the unregulated e-score universe have little recourse to complain, much less correct the system’s error. In the realm of WMDs, they’re collateral damage. And since the whole murky system grinds away in distant server farms, they rarely find out about it. Most of them probably conclude, with reason, that life is simply unfair.

Credit Is a Virtue
In the world I’ve described so far, e-scores nourished by millions of proxies exist in the shadows, while our credit reports, packed with pertinent and relevant data, operate under rule of law. But sadly, it’s not quite that simple. All too often, credit reports serve as proxies, too.

It shouldn’t be surprising that many institutions in our society, from big companies to the government, are on the hunt for trustworthy and reliable people. So when it comes to hiring, an all-too-common approach is to consider the applicant’s credit score. If people pay their bills on time and avoid debt, employers ask, doesn’t that signal trustworthiness and dependability? It’s not exactly the same, they know. But wouldn’t there be a significant overlap?

That’s how credit reports have expanded far beyond their original turf. Creditworthiness has become an all-too-easy stand-in for other virtues. Conversely, bad credit has grown to signal a host of sins and shortcomings that have nothing to do with paying bills.

For certain applications, such a proxy might appear harmless. Some online dating services, for example, match people based on credit scores. One of them, CreditScoreDating, proclaims that “good credit scores are sexy.” We can debate the wisdom of linking financial behavior to love. But at least the customers of CreditScoreDating know what they’re getting into and why. It’s up to them.

But if you’re looking for a job, there’s an excellent chance that a missed credit card payment or late fees on student loans could be working against you. According to a survey by the Society for Human Resource Management, nearly half of America’s employers screen potential hires by looking at their credit reports. Some of them check the credit status of current employees as well, especially when they’re up for a promotion.

Before companies carry out these checks, they must first ask for permission. But that’s usually little more than a formality; at many companies, those refusing to surrender their credit data won’t even be considered for jobs. And if their credit record is poor, there’s a good chance they’ll be passed over. A 2012 survey on credit card debt in low- and middle-income families made this point all too clear. One in 10 participants reported hearing from employers that blemished credit histories had sunk their chances, and it’s anybody’s guess how many were disqualified by their credit reports but left in the dark. While the law stipulates employers must alert job seekers when credit issues disqualify them, it’s hardly a stretch to believe some of them simply tell candidates they weren’t a good fit or that others were more qualified.

The practice of using credit scores in hiring and promotions creates a dangerous poverty cycle. After all, if you can’t get a job because of your credit record, that record will likely get worse, making it even harder to land work. It’s not unlike the problem young people face when they look for their first job — and are disqualified for lack of experience. Or the plight of the longtime unemployed, who find that few will hire them because they’ve been without a job for too long. It’s a spiraling and defeating feedback loop for the unlucky people caught in it.

Employers, naturally, have little sympathy for this argument.

Good credit, they argue, is an attribute of a responsible person, the kind they want to hire. But framing debt as a moral issue is a mistake. Plenty of hardworking and trustworthy people lose jobs every day as companies fail, cut costs, or move jobs offshore. These numbers climb during recessions. And many of the newly unemployed find themselves without health insurance. All it takes is an accident or an illness for them to miss a loan payment. Even with the Affordable Care Act, which reduced the ranks of the uninsured, medical expenses remain the single biggest cause of bankruptcies in America.

This isn’t to say personnel departments across America are intentionally building a poverty trap. They no doubt believe credit reports hold relevant facts that help them make important decisions. After all, “the more data, the better” is the guiding principle of the Information Age. Yet in the name of fairness, some of this data should remain uncrunched.

Adapted from Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Copyright © 2016 by Cathy O'Neil. Published by Crown Publishers, an imprint of Penguin Random House LLC.