What Was Nate Silver’s Data Revolution?

Created

Sep 5, 2023 1:36 AM

The mania for data-based political coverage—evidenced by the arms race to build FiveThirtyEight imitators like the Washington Post’s Monkey Cage or The Upshot at the Times, the newspaper with which FiveThirtyEight was affiliated until 2013—was driven, at least in some part, by an imbalanced understanding of what the numbers were telling us. “People kind of liked it for the wrong reasons,” Silver told me. FiveThirtyEight, he explained, came out of a “specific tradition of gambling and forecasting,” which, when done properly, was mostly a tool to measure odds. But the conditions of the 2012 election, which Silver described as “boring” when compared with 2016 and 2020, and the emotional investment that Democrats had in Barack Obama turned Silver into someone they saw as an oracle who bore only good news.

This was certainly great for Silver’s career, but it wasn’t exactly the point of all those election models, nor was it a sustainable way to manage expectations. It’s likely that much of the readership of the Times was sincerely interested in poll aggregation, but there was also an undeniable appeal to prognostications that assured readers that Obama was going to defeat Mitt Romney, and that any story about debate performances or shifts in donor priorities could be summarily dismissed as a vestige of an old way of talking about elections. The problem, of course, is that when your liberal audience wants you to provide only the number that allays its fears, you actually need the Democrats to keep winning. “The minute you have a forecast where there’s less certainty, people don’t like that,” Silver said. “The minute you have a forecast that doesn’t have a Democrat winning, they don’t like that very much.”

Silver’s foray into the predictions game was via PECOTA, a baseball-forecasting model that mostly appealed to avid fantasy-baseball players. He became interested in politics in 2006, when Congress passed the Unlawful Internet Gambling Enforcement Act, which effectively shut down online poker in the United States. Silver had been playing poker professionally and began looking into why his livelihood had been taken away from him. The pursuit of an edge in gambling—whether it comes from counting cards in blackjack, game theory in poker, or modelling N.B.A. games—has always been adversarial. The stories that grip the public, like Ben Mezrich’s “Bringing Down the House,” Al Alvarez’s “The Biggest Game in Town,” or Michael Lewis’s “Moneyball” and “Liar’s Poker,” always pit the smart-guy upstarts against the slow-moving institution. Silver effectively brought the “Moneyball” formula into political journalism.

Public fights between young insurgents and the old guard often play out in the media. In baseball, team ownership quickly capitulated to the new “Moneyball” regime. But team owners don’t control who gets to be a local columnist, nor do they get to cast the voices on talking-head shows. So, as every team in the league handed its operations over to a cast of younger, nerdier dudes with charts, the data-converted masses began a coup against the old guard of sports journalists who still believed in antiquated stats like pitcher wins and R.B.I.s. The divide felt alluringly political. The quants were cast as progressives who could see the sport’s objective truths, which, at the time, tended to favor highly productive Latino players who might have been ignored or even derided by the old baseball press. A choice was presented: You could either stick with the crusty columnists who still relied on racially coded adjectives such as “cerebral” and “blue-collar” to describe players, or you could join the enlightened new order that understood that David Eckstein, the slap-hitting shortstop for the St. Louis Cardinals who received undue plaudits for his grit and hustle, was actually terrible at baseball.

The analytics revolution in the sport happened fast, but the media infighting lasted much longer. By 2010, the nerds had already won, but, if you looked at the way the quant-converted fans were carrying on, you’d have thought that Murray Chass ran every baseball team in the world and told all his players to bunt and never, ever, take ball four. The disruptors, in other words, had become the establishment, but weren’t exactly acting like it.

Silver’s data revolution has followed some of those same patterns. During FiveThirtyEight’s rise, between 2008 and 2012, Silver became the hero for everyone who was tired of traditional, bloated election punditry. “We were the insurgents within the New York Times,” Silver said. “We were implicitly critiquing their media coverage of the Presidential race.” That enviable status, in which one is both part of a venerated institution and free to point out the fossils in the room, gave Silver the trust of the Times’ reading public and also allowed those readers to air their grievances. The quants have undoubtedly improved political coverage—just like the baseball-analytics journalists, who, despite being incredibly annoying at times, were a welcome reprieve from the fusty writers who paraded around press boxes for decades with their Baseball Writers’ Association of America lanyards around their necks. But, like them, Silver ultimately became an institution unto himself; this, in turn, made him a target for criticism.

Silver believes that moment of transition happened for him when he left the Times for ESPN a decade ago. “The minute someone is, like, ‘Hey, we’re going to take these nerdy white guys and hire them a staff of thirty people,’ you’re no longer sympathetic,” he said. He said the site’s big launch was also accompanied by “a lot of bragging that was kind of stupid relative to the quality of the product.” But, like any honest gambler who can see the swings in his career with some clarity, he attributed many of FiveThirtyEight’s later troubles to the inevitable regression back to the mean. “You’re just on this crazy winning streak,” he said, referring to his proficiency in picking the 2008 and 2012 elections. “It’s like making the World Series of Poker final table in back to back years or something—where you have two cycles where you’re outperforming your expectations. Inevitably, you’re going to get shat on.”

After Donald Trump’s improbable victory over Hillary Clinton, in 2016, a lot of the same liberal readers who had once relied upon Silver as a comfort doll directed their anger toward all the quants who had given Clinton a sizable edge in the election. During the primaries, FiveThirtyEight was consistently dismissive of Trump’s chances of winning the nomination, but in the general election, Silver gave Trump the best chances of any reputable election-forecasting outfit, at twenty-nine per cent. That twenty-nine-per-cent outcome was realized, as will be true twenty-nine per cent of the time. But Silver’s critics still felt like they had been misled: the oracle seemed to have been debunked.
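The arithmetic behind that point can be made concrete with a rough sketch. A probabilistic forecast is judged over many calls, not one: if a forecaster says "twenty-nine per cent" a hundred times, the underdog should win roughly twenty-nine of them. The record below is invented purely for illustration, and `hit_rate` and `brier_score` are hypothetical helper names, not anything FiveThirtyEight publishes.

```python
def hit_rate(outcomes):
    """Share of forecasts in which the forecasted event actually happened."""
    return sum(outcomes) / len(outcomes)


def brier_score(probability, outcomes):
    """Mean squared gap between a stated probability and what happened.

    Lower is better; 0.0 would mean perfect certainty that was always right.
    """
    return sum((probability - int(o)) ** 2 for o in outcomes) / len(outcomes)


# Invented record (an assumption, not real data): 100 races in which the
# underdog was given a 29% chance, and the underdog won 29 of them.
underdog_won = [True] * 29 + [False] * 71

calibrated = hit_rate(underdog_won)            # about 0.29: well calibrated
score_29 = brier_score(0.29, underdog_won)     # score for saying "29%"
score_01 = brier_score(0.01, underdog_won)     # score for saying "1%"
```

On this made-up record, the forecaster who said twenty-nine per cent scores better than one who said one per cent, even though both "favored" the same candidate in every race.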

The number crunchers haven’t really recovered since. Poll analytics and data-driven political journalism haven’t disappeared in the past seven years, but the hope for an empirical alternative to Beltway reportage has certainly faded. But I don’t think the cooling of the public’s attachment to their favorite poll wizards had much to do with who got it right and who got it wrong. If that were true, Helmut Norpoth, a professor at SUNY Stony Brook whose model correctly called the election for Trump, would have become the next Nate Silver. Instead, it became clear that the gravity of the 2016 election and the real fear that the same liberals felt for the future of the country could no longer be processed by turning elections into a gambling proposition.

The question still remains: How do you gauge, say, Ron DeSantis’s chances to win the Republican primary? It’s obvious that the old-school political journalism that the quants rightfully tried to replace, with its cast of hustling aides and shadowy kingmakers, hasn’t improved. Just last November, that same machine was putting out stories about the incoming “red wave” in both the House and the Senate; reporters were pontificating about how John Fetterman’s poor performance in a debate against Mehmet Oz, which seemed to show the effects of his recent stroke, would doom him on Election Night.

The quants were supposed to provide a pathway out of the thicket of bad narratives, but that hope came from a fundamental misunderstanding of what the models were trying to measure. Silver was imagined to embody certainty when, in fact, his actual job was to inform the public of probabilities. “It’s very weird to become very well known for the wrong reasons,” Silver said. “People say, ‘Oh you have numbers and therefore a lot of certainty’ and they can’t quite process the fact that you can use numbers to quantify uncertainty as well.”

Silver, for his part, doesn’t quite seem to know what to do next. He is working on a book about gambling, and hopes his next media venture won’t be narrowly focussed on politics, in part because he doesn’t feel particularly invested in them. “I am a fan of the N.B.A.,” he said. “If there’s some palace intrigue about some coach, I’m listening to Bill Simmons’s or Zach Lowe’s podcasts. Or, if there’s drama in the poker world, I’m paying attention because I’m a fan of that world. Whereas, with the politics stuff, I just like the elections part.”

Split Ticket, a political-modelling Web site run by a group of twentysomethings, argues that there is a way to explain electoral politics to the broader public without falling into the oracle business. The site, which touts its ability to communicate numbers to the broader public, began during the pandemic, when Lakshya Jain, Harrison Lavelle, and Armin Thomas were talking on Twitter about polling and the upcoming 2020 election. Jain did not have a particularly distinguished start to his political prognosticating career. He had just finished a master’s program in machine learning and computer science at the University of California, Berkeley, and ran a calculation that predicted, disastrously, that Joe Biden would win four hundred and thirteen electoral votes. But, over time, Jain has improved his track record. In November, 2021, he, Lavelle, and Thomas launched Split Ticket, and were soon joined by Leon Sit and Clare Considine. Since then, they have nailed Georgia’s Senate runoff, and avoided the failed “red wave” narrative in the 2022 midterm elections.

As was true for FiveThirtyEight, Split Ticket’s most promising innovation comes from a permutation of sports analytics. Wins Above Replacement (WAR) gauges how many wins an athlete adds to a team relative to “replacement level,” which, roughly speaking, means the level of a bad player who would cost a team nothing to acquire. The hope is to provide a single number that conveys the full effect of a player’s impact on a game. Split Ticket’s version of WAR tries to do a similar thing with candidate quality by providing a “quantifiable ‘score’ for each district that displays whether the Republican or Democrat performed better relative to data-based expectations.” Just as baseball analytics adjusts for ballpark effects or the quality of a team’s fielding, Split Ticket controls for factors like the “racial composition” of a seat, incumbency, over-all voting trends within the state where the election took place, and the money that each candidate spent. A candidate, then, may be swept into office as part of trends up the ballot, but WAR attempts to isolate the candidate’s individual quality from those external factors.
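The general shape of such a metric can be sketched in a few lines. What follows is not Split Ticket's model, whose controls and weights are not given here. It is a minimal illustration that assumes a simple linear expectation: every coefficient, field name, and the example race are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Race:
    name: str
    dem_margin: float      # actual Democratic margin, in points
    partisan_lean: float   # baseline Democratic lean of the seat, in points
    dem_incumbent: int     # +1 Democratic incumbent, -1 Republican, 0 open seat
    spending_edge: float   # Democratic minus Republican share of spending, -1 to 1


# Hypothetical weights chosen for illustration only; a real model would
# estimate these from past results rather than fix them by hand.
INCUMBENCY_BONUS = 3.0   # points credited to the incumbent's party
SPENDING_WEIGHT = 5.0    # points per unit of spending edge


def expected_margin(race: Race) -> float:
    """Margin a replacement-level Democrat would be expected to get."""
    return (
        race.partisan_lean
        + INCUMBENCY_BONUS * race.dem_incumbent
        + SPENDING_WEIGHT * race.spending_edge
    )


def wins_above_replacement(race: Race) -> float:
    """Actual margin minus expected margin; positive favors the Democrat."""
    return race.dem_margin - expected_margin(race)


# An invented race: a Democrat loses by four points in a seat that leans
# ten points Republican, against a better-funded Republican incumbent.
example = Race("Hypothetical District A", -4.0, -10.0, -1, -0.2)
example_war = wins_above_replacement(example)  # 10.0 with these weights
```

The invented candidate loses the race but posts a positive score, which is the distinction the metric is meant to draw: the result on Election Night and the candidate's performance relative to the seat are two separate numbers.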

What Jain and Split Ticket hope to bring to the political conversation is a retrospective look at political outcomes which cuts against the typical postmortems from the commentariat. Jain talked about pundits who blame every Democrat election loss on the candidate’s flirtations with defunding the police in the summer of 2020. “When people talk about candidate quality in the national media through a more holistic lens, they subconsciously project their own biases onto the results and try to explain it in ways that may not necessarily comport with what the reality actually was,” Jain said. A metric like WAR, then, tries to disambiguate and detach a candidate from the convenient existing narratives.

Their formula found, for example, that even though the Democrat Charles Booker lost his Kentucky Senate election against the Republican incumbent Rand Paul, Booker actually outperformed his expectations by 8.1 per cent. The Wisconsin race between the Democrat Mandela Barnes and Ron Johnson, another Republican incumbent, included a great deal of speculation that what the Wall Street Journal called Barnes’s “progressive ties” could ultimately hurt his electability. Split Ticket found that, though Barnes did ultimately lose to Johnson, he still outperformed the theoretical replacement-level candidate by four points.

Before Amy Klobuchar ran for President, much was made of the strength she had shown in her senatorial runs, particularly in rural parts of Minnesota. The case for her candidacy was cast in quantitative terms—she was a Democrat who won her 2018 Senate race by a twenty-four-point margin in a state that Hillary Clinton had won by only two points in 2016, and she held an unusual edge among white, non-college-educated voters. Some pundits imagined that Klobuchar had special qualities that would translate to wins in other swingy Midwestern states such as Michigan and Wisconsin. But by controlling for the gap in campaign spending between Klobuchar and her opponent, as well as Klobuchar’s incumbency advantage, Split Ticket found that she had outperformed expectations only by six points—still impressive, but certainly not anything to build a Presidential campaign on.

WAR is by no means perfect. Like any model, its quality will ultimately depend on how well the Split Ticket team can adjust its controls and keep pace with a complex, changing voting public. And Jain and his colleagues can’t control how their numbers might get swept up in partisan narratives. In sports, these types of all-in-one metrics tend to gain traction with the general public when they flatter the consensus of their audience. Confirmation bias, buoyed by the political desires of one’s audience, becomes the arbiter of whether a given metric is seen as good or bad. It’s likely that Split Ticket’s success will depend on the same impulses that made people misread Silver for so many years.

But that doesn’t mean the quants aren’t doing something valuable. Our industry still has not extricated itself from the Beltway-gossip model in which a writer talks to a few people in D.C., puts his finger up to the wind, and takes what amounts to an educated guess. The truth, however unsatisfying, is that, though the polling wizards are fallible, invite a type of misreading, and perhaps do not always live up to the gravity of the moment, they’re still better than what came before. ♦