On a searing Thursday afternoon in September 2011, a technician reconfiguring circuits at an electrical switchyard near Yuma, Ariz., prematurely cranked open a hand-operated switch. This tiny misstep knocked out the Southwest Powerlink, a major electrical artery for the region and a key part of the entire Western grid, and sparked one of the biggest blackouts ever to strike North America.
As electricity sought new paths across the network, other lines overloaded, snapping more equipment offline. Power plants, transformers and power lines across western Arizona, Southern California and Mexico’s Baja California automatically shut off to protect themselves. The climax came at 3:38 p.m. when five lines supplying San Diego simultaneously shut down, sealing that region’s electrical fate. Power instantly zapped out for 7 million people.
For 12 hours, a swath of the Southwest was powerless. Commerce shut down as electronic transactions and cash registers failed. Without signal lights to guide traffic, streets jammed. Food spoiled. Millions of gallons of raw sewage escaped, tainting coastal estuaries and beaches. Hospitals, 911 call centers and other first responders struggled to meet demand while relying on limited backup power.
In all, the cost just to San Diego’s economy was at least $100 million. But the price tag for big blackouts can go even higher. “When we have a blackout in New York, people die, and the cost is essentially $6 billion to $10 billion per day,” New York Independent System Operator CEO Stephen Whitley said at a 2015 conference. (NYISO manages the state’s power grid.) With so much at stake, the industry places a high premium on reliability.
In theory, spontaneous blackouts should never happen. The cardinal rule of power grid design and operation, known in the industry as the N-1 criterion, holds that the system should always have enough spare capacity to sustain the loss of any single element, even one as big as the Southwest Powerlink. Grid operators build computer simulations of their grids and systematically rerun them, removing each element in turn, to confirm that power flows remain stable.
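To make that rule concrete, the screening loop can be sketched in a few dozen lines of Python. Everything here is invented for illustration: a four-bus toy network with made-up susceptances, limits and injections, solved with a simplified DC power flow rather than the full physics of a production study.

```python
# A toy N-1 contingency screen: remove each line in turn, re-solve a
# DC power flow, and flag any overloads. All network data are invented.
import numpy as np

N_BUSES = 4
# Each line: (from_bus, to_bus, susceptance, flow_limit_MW)
LINES = [
    (0, 1, 10.0, 120.0),
    (0, 2, 10.0, 120.0),
    (1, 2, 10.0, 120.0),
    (1, 3, 10.0, 150.0),
    (2, 3, 10.0, 150.0),
]
INJECTIONS = np.array([200.0, 0.0, 0.0, -200.0])  # MW: bus 0 generates, bus 3 consumes

def dc_flows(lines):
    """Solve a DC power flow; return the MW flow on each line."""
    B = np.zeros((N_BUSES, N_BUSES))
    for i, j, b, _ in lines:
        B[i, i] += b
        B[j, j] += b
        B[i, j] -= b
        B[j, i] -= b
    theta = np.zeros(N_BUSES)          # bus 0 is the angle reference
    theta[1:] = np.linalg.solve(B[1:, 1:], INJECTIONS[1:])
    return [b * (theta[i] - theta[j]) for i, j, b, _ in lines]

for k in range(len(LINES)):            # take out each element in turn
    remaining = LINES[:k] + LINES[k + 1:]
    for (i, j, _, limit), f in zip(remaining, dc_flows(remaining)):
        if abs(f) > limit:
            print(f"losing line {k} overloads line {i}-{j}: {abs(f):.0f} MW > {limit:.0f} MW")
```

On this toy grid the screen flags several single-line outages that would overload a neighboring line, so it fails the test; an operator running to the N-1 standard must adjust generation until every such check clears.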
Operating the grid in this way is the electrical equivalent of driving with a spare tire in the trunk. Yet big blackouts keep happening. They’re a tough problem to crack because power grids, often described as the world’s largest machines, are massively complex systems. Big blackouts are usually the result of multiple components acting up.
After the human error that kicked off the 2011 Southwest blackout, myriad elements of the system did not behave according to the grid operators’ models. Transformers in California’s Imperial Valley overloaded faster than expected, and an automated scheme shut off those five lines running south to San Diego, even though none of the lines was at imminent risk of overheating.
Forecasting events such as the Southwest blackout, in which a half-dozen or more components fail, is computationally intractable by brute force: there are simply too many failure combinations to check. “You’re talking about running your model for longer than the age of the universe,” says Ian Dobson, a professor of power engineering and a blackouts expert at Iowa State University. As a result, it’s hard to understand the risks facing a power grid. Without the ability to simulate the largest blackouts, power grid operators can’t foresee which combinations of human and component failures are most likely to cause them.
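Dobson’s quip is easy to verify with back-of-the-envelope arithmetic. The component count and per-scenario simulation time below are assumptions for illustration, not figures from his work:

```python
# Back-of-the-envelope arithmetic behind the "age of the universe" remark.
from math import comb

components = 20_000               # elements in a large regional grid (assumed)
scenarios = comb(components, 6)   # distinct six-element failure combinations
secs_per_run = 1e-3               # optimistic: 1 ms per simulated scenario (assumed)

years = scenarios * secs_per_run / (3600 * 24 * 365)
print(f"{scenarios:.2e} scenarios -> {years:.2e} years of computing")
# Roughly 8.9e22 scenarios and 2.8e12 years: about 200 times the age
# of the universe, even at a millisecond per scenario.
```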
After 20 years of trying to get around this computational barrier, Dobson and a pair of physicists, Ben Carreras and David Newman, have found a solution. Drawing insights from the behavior of other complex systems, the trio has created a novel simulator that can mimic the largest blackouts that a power grid is likely to experience.
Experts say there is no time to lose in bringing such tools online. The trio’s insights suggest that grids are vulnerable to bigger blackouts than any we’ve seen before. And potential triggers are multiplying as wilder weather from climate change, rising concerns over terrorism and fluctuating power from renewable energy sources heap new challenges on aging grids. “The industry needs those tools, and we need to provide them as soon as possible,” says Yuri Makarov, a blackouts researcher at the Department of Energy’s Pacific Northwest National Laboratory.
Dobson, Carreras and Newman just need to sell the power industry on heeding that warning.
The Critical Point
The trio’s hunt for the cause of big blackouts grew out of the two physicists’ research on fusion energy at the U.S. Department of Energy’s Oak Ridge National Laboratory in Tennessee. Newman, who as a teen developed a fascination with turbulence as a rafting guide in Colorado, arrived at Oak Ridge in 1993 to explore a different kind of turbulence: the plasma of fusing hydrogen atoms inside experimental fusion reactors. The earnest, freshly minted Ph.D. was teamed up with plasma physicist Carreras, a sharp-tongued Spaniard who was one of the lab’s most distinguished scientists.
Newman and Carreras made an odd couple, but an effective one. They set out to understand the unexpected instabilities that arose when scientists sparked and tried to contain nuclear fusion, the process that fuels the stars. They produced a mathematical model showing that turbulence in fusion plasmas, contrary to prevailing wisdom, bears little resemblance to the snarling rivers of Newman’s youth.
Whereas whitewater churns in response to localized conditions within the stream, the duo showed that turbulence in the superheated plasma of a fusion reactor has more to do with the total amount of energy in the system. Once the total heat grew beyond a critical point, the probability of collapse grew exponentially. In the parlance of systems theory, it was a classic complex system exhibiting “self-organized criticality,” a concept elaborated in the 1980s by theoretical physicist Per Bak. It describes how growing sand piles collapse in avalanches when the strain on the grains becomes too great: once enough sand is in the pile, a major avalanche becomes all but inevitable.
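Bak’s sand pile is simple enough to simulate directly. Here is a minimal sketch, with an arbitrary grid size and grain count:

```python
# A minimal sketch of Per Bak's sandpile model of self-organized
# criticality. Grid size and number of grains are arbitrary choices.
import random

SIZE = 20
grid = [[0] * SIZE for _ in range(SIZE)]

def drop_grain():
    """Drop one grain at random; topple until stable; return avalanche size."""
    r, c = random.randrange(SIZE), random.randrange(SIZE)
    grid[r][c] += 1
    avalanche, unstable = 0, [(r, c)]
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue
        grid[r][c] -= 4                    # the site topples...
        avalanche += 1
        if grid[r][c] >= 4:                # ...and may still be unstable
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < SIZE and 0 <= nc < SIZE:   # grains off the edge are lost
                grid[nr][nc] += 1
                unstable.append((nr, nc))
    return avalanche

sizes = [drop_grain() for _ in range(50_000)]
slides = [s for s in sizes if s > 0]
print(f"avalanches: {len(slides)}, largest: {max(slides)} topplings")
```

Tallying the avalanche sizes reproduces Bak’s signature: a heavy-tailed distribution in which small slides vastly outnumber, but never rule out, pile-spanning collapses.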
By the mid-’90s, scientists had identified similar patterns of growth and collapse in diverse natural systems, from forest fires to earthquakes. Newman and Carreras discovered that the same theory explains why plasmas have lasted no longer than a few seconds in fusion reactor tests to date. The work earned Newman a Presidential Early Career Award — the highest federal honor bestowed upon young scientists.
In 1995, Newman saw a news report on a blackout and wondered whether criticality theory might also apply to major power outages, and whether it could help prevent them. Carreras, eager for a new problem to solve, suggested they bring in a grid expert. They found one in Dobson, who had earned a reputation as an innovative power systems engineer by using advanced math to unmask unsuspected relationships between voltage drops and blackouts.

The trio first looked at the historical record of big blackouts to see if they could detect criticality’s distinctive imprint. They mined a database of North American blackouts and plotted the events by size. If big blackouts were just a random, unlucky confluence of many small failures, as grid planners and operators believed, a major grid collapse would occur only once in a thousand years or so, showing up as the slim tail on a bell curve. Instead, the plot bulged out to the right, showing that big blackouts were striking hundreds of times more often.
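That comparison is straightforward to run. The sketch below substitutes synthetic, heavy-tailed data for the outage archive, which is not reproduced here; with the real data, `sizes` would hold the megawatts or customers lost in each event.

```python
# Comparing the observed tail of event sizes against what a bell curve
# would allow. The data are synthetic (Pareto-distributed) stand-ins
# for the blackout archive.
import numpy as np

rng = np.random.default_rng(0)
sizes = (rng.pareto(2.5, 10_000) + 1) * 1_000     # synthetic event sizes

threshold = sizes.mean() + 5 * sizes.std()        # a "5-sigma" event
observed = (sizes > threshold).mean()
bell_curve = 2.9e-7                               # normal-tail P(X > mu + 5*sigma)

print(f"fraction of events beyond 5 sigma: {observed:.4f}")
print(f"bell-curve prediction:             {bell_curve:.1e}")
# The observed tail is orders of magnitude fatter than the bell curve
# predicts -- the bulge the trio saw in the historical record.
```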
For the trio, it was a strong suggestion that blackouts were, indeed, the power grid equivalent of a sand pile’s avalanche. “It’s as if there is a physical law there,” says Carreras.
Evolving Grids
In January 2000, Carreras, Dobson and Newman reported the overabundance of big blackouts at the Hawaii International Conference on System Sciences (HICSS), one of the biggest and longest-running annual gatherings for systems scientists. They speculated that blackout risk might spike when power flows on grids exceeded some threshold, the familiar critical point in systems theory. But what was pushing grids to the point of criticality? They knew power consumption was rising, while financial pressures limited the construction of new lines. Could these influences combine to put extra strain on the grid’s transmission lines, enough to reach a tipping point?
To test this theory, the trio realized they’d have to rethink power grid simulation. Existing simulators could not handle direct modeling of big blackouts because of the complexity of power grids. But what if they could create a simpler simulator that could be set in motion and observed as power levels increased over time, like Bak’s growing sand piles?
Enlisting the help of Vicky Lynch, a gifted computational scientist at Oak Ridge, they worked out a power grid simulator that left out many of the nuances of the physics that conventional grid simulators represent, and they applied it to an artificial grid less than one-hundredth the scale of the U.S. Western grid. Each run of the simulator represents a day in the life of the modeled grid, and each day any of its components can fail at random. The simulator records what, if any, blackouts occur as a result. Then the grid evolves before the next run: demand creeps upward, and lines involved in a blackout are strengthened to carry more power in future runs. “It was the simplest possible power systems model, by design,” says Dobson.
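Their published model, known as OPA, re-solves power flows and a dispatch optimization on every step. The toy below is much cruder, with every parameter invented, but it keeps the ingredients just described: random failures, cascading overloads, creeping demand and reinforcement of blackout-struck lines.

```python
# A toy evolving-grid simulator in the spirit of the trio's OPA model.
# Overload redistribution here is a crude stand-in for a real power flow,
# and every parameter is invented for illustration.
import random

N_LINES = 200
capacity = [1.0] * N_LINES
demand = 0.5                          # average load per line; grows daily

def one_day():
    """Simulate one day; return the set of lines lost to cascading."""
    load = [demand * random.uniform(0.8, 1.2) for _ in range(N_LINES)]
    failed = set()
    if random.random() < 0.1:         # an initial random component failure
        failed.add(random.randrange(N_LINES))
    while failed:
        alive = [i for i in range(N_LINES) if i not in failed]
        if not alive:
            break
        extra = sum(load[i] for i in failed) / len(alive)  # load shifts to survivors
        new = {i for i in alive if load[i] + extra > capacity[i]}
        if not new:
            break
        failed |= new                 # the cascade spreads
    return failed

sizes = []
for day in range(10_000):
    blackout = one_day()
    sizes.append(len(blackout))
    for i in blackout:                # engineers respond to each blackout
        capacity[i] *= 1.05           # by reinforcing the affected lines
    demand *= 1.0002                  # while demand creeps ever upward

print(f"grid-wide collapses: {sum(1 for s in sizes if s > N_LINES // 2)}")
```

The tug of war between rising demand and piecemeal reinforcement is what, in the trio’s full model, slowly drives the grid toward its critical loading.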
But it worked. The telltale pattern of large blackouts was there. On their artificial grid, just as in the archives, blackouts looked like growing sand piles or fusion reactors: complex systems. As expected, big blackouts spiked when simulated electricity flows exceeded a critical threshold. In January 2001, the trio was back at HICSS presenting their simulations.
The 100-Year Blackout
The grid simulations cast an entirely different light on big blackouts and posed provocative new questions for grid design and operation. The most disturbing implication came from University of Vermont grid researcher Paul Hines. Following the trio’s logic, he concluded that a blackout bigger than any we’ve seen before is probably in our future.
Using the same statistical tools that urban planners and insurance companies use to predict disasters such as earthquakes and 100-year floods based on prior patterns, Hines forecast a 100-year blackout that would knock out 186,000 megawatts of power. That is more than 23 times bigger than the Southwest blackout of 2011 and more than twice the size of North America’s biggest power failure, the August 2003 Northeast blackout that left 50 million people without power.
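The return-period arithmetic behind such an estimate is short once a tail has been fitted to the record. The event rate, tail exponent and size cutoff below are invented placeholders, not Hines’ fitted values:

```python
# Estimating an N-year event from a fitted power-law tail. All inputs
# here are illustrative assumptions, not Hines' fitted values.
events_per_year = 10      # sizable blackouts per year in the record (assumed)
alpha = 1.2               # power-law exponent of the size distribution (assumed)
s_min = 800.0             # smallest event size counted, in MW (assumed)

# For a Pareto tail, P(size > s) = (s_min / s) ** alpha. The 100-year
# event is the size exceeded once in 100 years' worth of events:
p = 1.0 / (100 * events_per_year)
s_100yr = s_min * p ** (-1.0 / alpha)
print(f"illustrative 100-year blackout: {s_100yr:,.0f} MW")
# ~253,000 MW with these made-up inputs; Hines' fit of the real
# record put the figure at 186,000 MW.
```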
The trio fleshed out an equally disturbing lesson on blackout prevention: Conventional practice, which tries to thwart even the smallest failures, may actually increase the likelihood of big ones. When they tweaked the simulator settings to reduce the chance of random line failures, the artificial grids experienced more big blackouts. Protecting the grid against small blackouts lets it run at higher and higher power levels, ultimately setting it up for a major collapse.
That may seem counterintuitive, but it’s in line with systems research that shows merely preventing failure can increase a system’s probability of collapse. Consider forest fires: Research (and history) shows that suppressing small forest fires allows kindling to build up, setting the stage for large, truly devastating conflagrations. The trio’s simulations suggest that power grids are susceptible to the same paradox.
An Inconvenient Idea
The Northeast blackout of 2003 struck at a prime moment for the trio, thrusting their theory into the media spotlight. Major newspapers and news broadcasts turned to them for help in explaining why the grid might have spontaneously collapsed. An article in the journal Nature captured their message succinctly: “Power grids are inherently prone to big blackouts. . . . Trying to make them more robust can make the problem worse.”
The idea that preventing failures might unwittingly hasten big blackouts proved wildly unpopular with power companies and engineers, who struggled to reassure a nervous public amid a crisis of confidence in the grid. Carreras believes the controversy put the trio’s research in jeopardy; he says that in the fall of 2003, the director of Oak Ridge National Laboratory told him officials in Washington were fuming over the trio’s message. Carreras suspects that’s why research funding from the U.S. Department of Energy dried up soon afterward. “We got cut off,” he says.
They found new grants, but it was a struggle. Electrical engineers reviewing grid research proposals questioned the trio’s stripped-down, low-resolution power model and took a dim view of their interdisciplinary efforts. One common refrain in grant reviews, recalls Newman, was, “What could a physicist, especially one at a podunk school, know about the power grid?”
Extreme Events
What ultimately brought the three in from the cold, a few years later, was an accelerating energy revolution in California. In 2002, the state mandated that utilities use increasing amounts of electricity from renewable sources such as wind and solar. Utility executives and state energy experts feared that these cleaner but less predictable energy sources would heighten the risk of blackouts, and to keep the power flowing, they were willing to test out-of-the-box ideas, including the trio’s.
Merwin Brown — whose grid research and development program at the California Institute for Energy and Environment financed the trio’s work from 2009 to 2011 — describes it as the research equivalent of a Hail Mary pass. “It kept being said that you really can’t analyze these cascading outages because it’s just such a huge calculation,” says Brown. “My team felt that the research program should have a few long-term, high-risk, big-payoff efforts. We said, ‘Let’s test that hypothesis.’ ”
In 2006, Brown enlisted Dobson to help craft the project. The resulting $1.16 million Extreme Events initiative would test the trio’s approach against the most advanced power grid simulator then available, operated by scientists at the Pacific Northwest National Laboratory (PNNL), to predict where and how often big blackouts struck.
For the first time, they had the chance to apply their simulator to a real power grid: the mighty Western Interconnection, one of the nation’s three main grids, which stretches from Mexico to British Columbia and east to the Rockies. Blackouts striking California could start anywhere on the Western Interconnection and cascade across it, paying no heed to state or national borders, so the whole grid had to be modeled.
Brown hoped the trio and the PNNL team would work well together, but the honeymoon was short-lived. Some members of the PNNL team arrived with the same critical view that the trio faced from grant reviewers, arguing that the trio’s approach could not be trusted, recalls Brown. Exchanges at the project’s 2009 launch got so heated that some worried it might come to blows. Newman remembers it as a new low: “I have crazy colleagues in other areas, but I’ve never seen anything quite like that before.”
Nevertheless, the project moved forward as two separate efforts, and the trio’s results, reported in March 2011, vindicated their approach. The pattern of blackouts that their simulator produced closely matched the Western grid’s record of outages. In contrast, their counterpart’s powerful but non-evolving simulator underpredicted big blackouts by about a factor of 10, according to calculations by Dobson and one of his students. (The PNNL team did not present its own comparative analysis of blackout frequency.)
The simulations also validated the link between power levels and risk: The probability of the biggest blackouts rose sharply when simulated power flows exceeded a critical point of roughly 50 percent of the grid’s capacity limits.
More real-world validation came later that year, when the Southwest blackout struck the Western grid. In the Extreme Events final report, the trio had predicted vulnerable regions within the Western Interconnection — eight areas where large blackouts repeatedly struck their simulations. One of the eight vulnerable grid segments they identified was the quintet of lines north of San Diego whose shutdown would seal the region’s fate. “It was pretty amazing,” says Newman. “The actual blackout was in September. Our prediction had been in February.”
Smart vs. Safe
Validation in California helped the trio and their approach to blackouts earn the respect of the power engineering community. Their ideas have inspired researchers to use simulations to seek out the telltale overabundance of big blackouts on power grids from Scandinavia to New Zealand to China. But changing grid operations based on the trio’s basic takeaways — that big blackouts are predictable and that running power grids cooler would exponentially reduce their incidence — has only just begun.
Most power system analysis continues to use simulators that cannot predict the biggest blackouts. But power system operators do talk more openly these days about the link between power levels and risk. Whitley, the New York Independent System Operator CEO, told attendees at the 2015 HICSS meeting about a new power-trimming procedure that New York’s system operators have developed to reduce risks when power consumption is running especially high.
Under the so-called “thunderstorm alert,” extra power-generation facilities are turned on to reduce the power flowing long distances on the state’s transmission lines. Whitley said the extra costs were justified, given the far higher costs that a blackout would impose: “If you have to run a few gas turbines for a couple of hours, who cares!” he told the assembled system scientists.
The next step for the trio is to refine tools to advise grid operators like Whitley on when and how to act to reduce blackout risk. And with power grids in the process of a redesign, the time is now, says Milorad Papic, a senior grid planner who leads an engineering society initiative on blackout analysis.
Power systems are adopting so-called smart grid technologies, such as advanced power sensors and automated switches, that could have unintended impacts on reliability, Papic notes. For example, advanced sensors are providing unprecedented real-time information on power flows that grid operators are using to monitor system stability. Whitley’s team is using stability warnings from those sensors to guide the use of their thunderstorm alert. But such real-time intelligence could also entice grid operators to allow more power to flow over existing power lines.
“You’re getting closer to limits, and overloads can propagate more quickly and generate more problems,” Papic says. Without careful study, a smarter grid could actually become a less safe grid.
Perhaps it is inevitable — based on the trio’s own research — that their insights will get short shrift until another catastrophic blackout. If they are right, we shouldn’t have to wait too long: The next big one is always just around the corner.