Re-assessing 2020 Senate races with data-driven assessments of candidate quality

Whenever pre-election discourse about candidate quality pops up, my mind goes to the famous Arthur Conan Doyle line that “it’s easy to be wise after the event”. It sums things up more accurately than any of us would care to admit -- pre-election polling numbers, vibes, and anecdotes often color our perception of how a candidate is actually doing, and Twitter and the media are awash with tales of how a candidate has managed to “upend conventional wisdom” en route to presumably smashing through pre-election expectations.

Then the election happens and some of those assessments are borne out, as with Jon Ossoff and Raphael Warnock flipping Georgia for Democrats, while others fall flat, like Sara Gideon losing to Susan Collins by more than eight points despite leading nearly every pre-election poll. Suddenly, narratives turn on a dime, and the “dynamic candidate” who was going to “take the state by storm” becomes “another tale of hubris”. Candidates who were royalty a week ago become persona non grata in political circles, and tales of their generosity and warmth are replaced by whispers of disorganization and complacency.

The truth is, gauging candidate quality is often a fool’s errand in which takes are grounded largely in hindsight and ignore readily available data. As polarization grows and crossover voting declines, more and more races become easily explainable by factors like a state’s presidential lean. To really quantify how “good” or “bad” a candidate's performance is, we must first establish a baseline to compare against. Many techniques may be used for this, but we’ll try to establish how a generic pair of candidates would have done in an election if given the same funding and environment, and then examine how far the actual election result deviated from that expected baseline.

The question, then, becomes what factors we should control for. A large portion of the outcome in each Senate race can be explained by easily available and quantifiable factors, especially in presidential years. To quantify how much, we can assemble a model that uses candidate and party financial activity, presidential partisanship, candidate incumbency, and state racial and educational demographics to predict the outcome of each state's Senate election. By controlling for these factors, we can get a rough idea of how a generic pair of candidates would have done if given the same resources and circumstances, and we can then assess candidate quality by examining each candidate's overperformance against expectations.

The single most important factor in determining a state’s election results is its presidential lean; in fact, we can attribute ~85% of a state’s 2020 Senate election results to its 2020 presidential lean. This makes the national environment all the more important, as it is the single greatest determining factor in how a state will vote. For example, Iowa’s Senate election was one that just about any Democratic challenger would have lost; Theresa Greenfield put up a superb performance, outperforming Biden by ~1.5 points on margin, and yet she still lost comfortably because Trump won the state by 8 points.
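One common way to read a figure like the ~85% above is as an R² value: the share of the variation in Senate results that a one-variable linear fit on presidential results accounts for. Assuming that is the sense intended, the calculation can be sketched as follows. The helper name `r_squared` and the numbers are hypothetical and invented for illustration; they are not the data behind the 85% figure.

```python
import numpy as np

def r_squared(x, y):
    """Share of the variance in y explained by a one-variable linear fit on x."""
    slope, intercept = np.polyfit(x, y, 1)      # least-squares line
    residuals = y - (slope * x + intercept)     # what the line fails to explain
    return 1.0 - residuals.var() / y.var()

# Invented two-way vote shares for six imaginary states (NOT real results).
pres_share = np.array([38.0, 44.0, 49.0, 51.0, 55.0, 62.0])
senate_share = np.array([41.0, 43.5, 50.5, 49.0, 57.0, 60.0])

print(round(r_squared(pres_share, senate_share), 2))
```

A value near 1 would mean presidential lean alone nearly determines the Senate result; whatever is left over is the room in which incumbency, spending, and candidate quality can matter.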

Senate races, however, are often won and lost on the margins, and in close states, small effects beyond presidential lean can make the difference. It is here that we consider incumbency and spending. Incumbency provides a very real boost to candidates, especially among change-averse or content voters who have grown to recognize and like the incumbent, even across partisan lines; for example, Susan Collins almost certainly has this to thank for remaining in the Senate.

Campaign spending, meanwhile, is one of the few ways in which electorates can be actively swayed and shifted, whether by targeted messaging or by turnout operations. We can roughly quantify its direct impact too; in 2020, ~20% of the variance in Senate results could be attributed to spending by campaigns and affiliated groups. Lastly, we add in controls for the demographics of a state.

Controlling for all of the above factors, we can get a better gauge of candidate quality, which can be framed as the following question: How much did a race’s results deviate from what a generic pair of candidates would have been expected to get with the same circumstances and resources? For this, we’ll assemble a multiple linear regression model with two-way presidential results, incumbency, demographic data, and fundraising numbers as inputs, regressed against the actual two-way Senate results.
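As a rough illustration of the approach just described, here is a minimal sketch using ordinary least squares, where the residual (actual minus predicted) plays the role of over/underperformance. Everything specific in it is an assumption: the helper `performance_above_expected`, the feature names, the incumbency coding, and the synthetic numbers are made up for illustration and are not the specification or data behind the figures in this piece.

```python
import numpy as np

def performance_above_expected(features, senate_share):
    """Fit ordinary least squares; return coefficients, baselines, residuals."""
    # Prepend a column of ones so the model has an intercept.
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, senate_share, rcond=None)
    expected = X @ coef                        # generic-candidate baseline
    return coef, expected, senate_share - expected  # residual = over/underperformance

# Synthetic data, invented purely for illustration (NOT real 2020 results).
rng = np.random.default_rng(0)
n_races = 30
pres_share = rng.normal(50, 8, n_races)        # assumed: two-way Democratic presidential share
incumbency = rng.choice([-1, 0, 1], n_races)   # assumed coding: R incumbent, open seat, D incumbent
spending_gap = rng.normal(0, 1, n_races)       # hypothetical spending measure
pct_college = rng.normal(32, 5, n_races)       # hypothetical demographic control
features = np.column_stack([pres_share, incumbency, spending_gap, pct_college])
senate_share = (5 + 0.9 * pres_share + 2.0 * incumbency
                + 1.5 * spending_gap + rng.normal(0, 3, n_races))

coef, expected, pae = performance_above_expected(features, senate_share)
print(np.round(pae[:5], 1))  # over/underperformance for the first five toy races
```

Under these assumptions, a positive residual means the Democratic candidate did better than the fitted baseline for that race and a negative one means worse. Because the model includes an intercept, the residuals average out to zero across all races, so the measure is always relative to the rest of the map.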

The resulting 2020 over/underperformance map is below.


The results are presented in tabular form as well.


A couple of notes of caution before we proceed:
  1. Just like baseball’s Wins Above Replacement (WAR) metric, our Performance Above Expected (PAE) metric must be contextualized with proper uncertainty and error bands; gaps of less than half a point are simply too small to draw definitive conclusions from when comparing two candidates. As an example, let’s compare Mike Espy and Jeanne Shaheen. Espy has a PAE of 6.2 and Shaheen has a PAE of 6.1. This metric is simply not granular enough to let us conclusively decide who performed better. However, what it can tell us is that both were exceptional relative to their expected baselines. On the other hand, when comparing Mark Kelly (3.5 PAE) to, say, Barbara Bollier (2.2 PAE), we can have a reasonable degree of confidence that Kelly overperformed by more than Bollier, given the separation between the two.

  2. It is difficult to ascertain how much of an election’s deviation from the baseline was down to one candidate being *bad* vs another candidate being *good*. The conventional wisdom states that Mark Kelly was an exceptional candidate while Martha McSally was a mediocre one; however, how much of Kelly’s 3.5 PAE was down to him excelling and how much of it was down to McSally underperforming? Similarly, in North Carolina, widely-derided candidate Cal Cunningham’s PAE was actually +1.6, but how much of this was down to Thom Tillis being a relatively unpopular incumbent who was likely carried over the line by Trump’s coattails?

    Teasing this separation out is a near-impossible task, and so while we’ll refer to PAE in terms of single candidates for brevity (e.g. Kelly had a 3.5 PAE, Collins had a 13.2 PAE, etc.), we encourage the reader to draw their own conclusions regarding how much of a race’s deviation was down to which candidate.
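The half-point rule of thumb from the first note can be written down as a small helper. This is only a sketch: the function name `clearer_overperformer` is hypothetical, and the 0.5 threshold is the informal cutoff mentioned above rather than a formally derived error band.

```python
def clearer_overperformer(name_a, pae_a, name_b, pae_b, min_gap=0.5):
    """Return the candidate with the clearly higher PAE, or None if too close.

    min_gap is the informal half-point cutoff from the text, not a
    statistically derived confidence interval.
    """
    gap = pae_a - pae_b
    if abs(gap) < min_gap:
        return None  # gap is inside the noise; no conclusion either way
    return name_a if gap > 0 else name_b

# PAE values quoted in the notes above.
print(clearer_overperformer("Espy", 6.2, "Shaheen", 6.1))
print(clearer_overperformer("Kelly", 3.5, "Bollier", 2.2))
```

The first comparison falls inside the half-point band and yields no winner, while the second clears it and returns Kelly, matching the reasoning in the note.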
----

Examining the 2020 Senate map through the lens of “performance vs expectations”, a few things jump out. Firstly, it is entirely likely that candidate quality was decisive in helping Democrats clinch Senate wins in Georgia and Arizona; results around the nation and the fundamentals suggest Democrats should have lost both races, but both Jon Ossoff (+2.6 PAE) and Mark Kelly (+3.5 PAE) significantly overperformed their baselines and flipped seats anyway. On the flip side, Susan Collins (+13.2 PAE) was probably the single best candidate of the 2020 cycle, and by virtue of her own candidate strength she almost certainly saved for Republicans a race that once looked like a certain Democratic pickup.


Secondly, despite the national underperformance relative to expectations (losses in North Carolina, Maine, Iowa, and Montana were all considered major disappointments by many), Democratic recruitment in key swing states was arguably quite stellar; the only swing-state battleground recruit who underperformed was Sara Gideon, and it is difficult to tell whether anyone would really have beaten Susan Collins in her race, given her exceptional candidate strength.


Steve Bullock (+9.6 PAE), Mike Espy (+6.2 PAE), Theresa Greenfield (+5.2 PAE), Jaime Harrison (+2.4 PAE), and Barbara Bollier (+2.2 PAE) all lost their races, but they performed significantly better than what Democrats might have been expected to manage given the national results. Greenfield, Bullock, and Espy, in particular, were genuinely phenomenal nominees who were simply sunk by ticket splitting declining to a record low in this election -- just about nobody could have won their races, given the magnitude by which Donald Trump won their states.


In fact, we can take this a step further; oft-ridiculed Cal Cunningham (+1.6 PAE) probably didn't cost Democrats the North Carolina seat despite his scandals. The regression suggests that he might have actually done a touch better than what a generic Democrat should have achieved against a generic Republican in the same circumstances, and while that may seem hard to believe, it's at least enough to indicate that another standard Democrat probably wouldn't have won either. Biden lost the state by over a percentage point, and winning a Senate race in a state that the party's presidential nominee lost is something no candidate other than Susan Collins managed nationally. It’s therefore likely that candidate quality was not the deciding factor in costing Democrats the North Carolina Senate race, contrary to what conventional wisdom settled on in the aftermath of November 2020.


In a similar vein, many post-election assessments of losing candidates tend to be misleading at best and misguided at worst. The tendency after the election was to mock candidates like John James (+3.5 PAE) and Amy McGrath (+7.8 PAE) for losing races that money was shoveled into at a record rate, but this isn’t necessarily something that the data supports. James came within a hair of winning Michigan and outran Trump while running against a (supposedly) strong incumbent in Gary Peters, while Amy McGrath outperformed Biden by over 6 percentage points.


In James' case, this was his second such overperformance, and so we might be more comfortable concluding that he wasn't actually as bad a candidate as his status as a two-time election loser might suggest. In fact, James overperformed polling and expectations significantly in both races.


In McGrath's case, McConnell’s unpopularity likely had a sizable amount to do with this overperformance, which can just as easily be interpreted as an 8-point Republican underperformance. That said, among the biggest criticisms of McGrath was that she spent a lot of money for nothing, and it’s worth noting that the margins in this race indicate a Democratic overperformance even after controlling for the immense amounts of money poured into Kentucky’s Senate race. The problem was simply that no Democrat could conceivably have won in Kentucky in 2020, but that isn't really down to McGrath's campaign or candidacy -- Andy Beshear himself wouldn't have won this race in a presidential year.


Lastly, although much of the focus was on the disappointing results of Democratic challengers, it was actually Democratic incumbents across the nation who significantly underperformed their partisan baselines. Only two out of the eleven Democratic incumbents running for re-election overperformed expectations: Jack Reed (+12.9 PAE) and Jeanne Shaheen (+6.1 PAE).


There could be multiple explanations for this, and the "why" is far more difficult to analyze than the "what". One theory would claim that this has to do with incumbents “coasting” and not taking their re-elections as seriously in light of overly rosy Democratic polling numbers. Another could point to a potentially less friendly down-ballot environment for 2020 Democrats. However, it’s worth noting that while many felt in the immediate 2020 aftermath that most Democratic challengers had underperformed, the likelier reality is that the underperformance was mostly among incumbent Democratic senators. Challengers generally did about as well as one could have expected given the national environment that actually materialized -- the polling misses aren’t the fault of any specific candidate!

Gauging candidate quality is still one of the most difficult tasks in election handicapping, and assessments of it impact everything from the race ratings of forecasters to funding decisions made by PACs and parties. But although we may not have as reliable a way to gauge this ex ante, we have managed to establish a decent technique by which to do it ex post, which helps us get a handle on how candidates actually performed in terms of results relative to the resources they had and the rest of the nation’s results. And that’s still a big step forward from what we had before.

 
