Much has been made over the past 8 – 12 years of who is winning the data arms race – Republicans or Democrats. Eitan Hersh’s book from 2015, Hacking the Electorate, exposes the Democrats’ wider system and culture of data to be significantly lacking when compared to headlines. Based on the information presented in the book, there can be little doubt that the TargetPoint Consulting (TPC) – Republican National Committee (RNC) Voter Scoring system has surpassed the Democratic infrastructure, which was built around the increasingly fragile Catalist-NGP-VAN arrangement.
Hacking the Electorate also shows that the further institutionalization of these systems may be the key to victories in 2016 and beyond.
The Yale professor’s findings as to the Democrats’ weakness outside of Chicago itself cast doubt on the wisdom of the Democrats’ media-praised monopolistic system. In contrast, the competitive infrastructure on the Republican side is driving steady improvements. Because the RNC’s database systems and the Data Trust serve as parallel hubs, and neither is owned by a particular campaign or a single self-interested company, innovations by campaigns and vendors party-wide are maintained and institutionalized. They are also backed by the committee’s commitment to expanding training as well as building large external support and business intelligence units in the data division.
Many of the innovations developed on the Obama campaign are now integral to the party system and TPC’s custom analytics. Like the Chicago operation, models form the core of our targeting systems, but while TPC prides itself on its accurate predictions, the TPC-RNC Voter Scoring system goes far beyond simple scores. Among other enhancements, considerable research and development is conducted to turn those scores into algorithmically optimized universes for each specific purpose, and to use existing models in ways never before conceptualized.
From driving resource allocation, to providing accurate benchmarks on tiny segments of the population that cannot be captured by traditional polling, to creating custom targeting or monitoring universes on the fly to produce or measure progress among extremely specific political or psychological types, our individual-level predictions form the building blocks from which custom solutions can be constructed.
Hersh’s analysis of the microtargeting used by nearly all Democrats in the 2008-2012 elections finds numerous severe deficiencies, most of which TPC’s methods avoid, mitigate or have entirely moved past.
For instance, Hersh tested prototypical Democratic universes and found them lacking. This is not surprising, as his well-informed imitation of Democratic practices did not use scores at all in choosing voters. Instead, he cobbled the universes together from party identifiers, voters from heavily partisan or African-American districts, and a few other basic variables. This often produced universes of absurd size, such as persuasion universes equal to as much as 70% of the population. While he noted that a few of the most advanced campaigns on the Democratic side did use models to build contact lists, he found that his universes wound up looking acceptably like such models anyway, calling those methods into question as well.
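The contrast between a rule-built universe and a score-built one can be sketched in a few lines of Python. This is an illustration only, not Hersh’s procedure or TPC’s: the `Voter` fields, the `persuasion_score` values, the selection rule, and the 20% cap are all hypothetical, chosen to show how coarse rules can sweep in most of a voter file while a ranked score allows the universe to be sized deliberately.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    voter_id: int
    registered_party: str            # "D", "R", or "" where unavailable (hypothetical field)
    heavily_partisan_district: bool  # hypothetical flag
    persuasion_score: float          # hypothetical modeled score in [0, 1]


def rule_based_universe(voters):
    """Select anyone matching coarse rules; no scores are consulted."""
    return [
        v for v in voters
        if v.registered_party == "" or not v.heavily_partisan_district
    ]


def score_based_universe(voters, max_share=0.2):
    """Rank by modeled score and keep only the top share of the file."""
    limit = int(len(voters) * max_share)
    ranked = sorted(voters, key=lambda v: v.persuasion_score, reverse=True)
    return ranked[:limit]


# Toy voter file; every value here is invented for illustration.
VOTERS = [
    Voter(1, "", False, 0.81),
    Voter(2, "D", False, 0.35),
    Voter(3, "R", False, 0.40),
    Voter(4, "", True, 0.77),
    Voter(5, "D", True, 0.10),
    Voter(6, "R", True, 0.05),
    Voter(7, "", False, 0.52),
    Voter(8, "D", False, 0.66),
    Voter(9, "R", False, 0.30),
    Voter(10, "D", True, 0.15),
]

rule_share = len(rule_based_universe(VOTERS)) / len(VOTERS)
score_share = len(score_based_universe(VOTERS)) / len(VOTERS)
print(f"rule-based share of file:  {rule_share:.0%}")
print(f"score-based share of file: {score_share:.0%}")
```

On this toy file the coarse rule selects seven of the ten voters, while the score-based cut keeps only the two highest-scoring records. A production system would of course choose the cutoff per purpose rather than use a fixed share.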
While the Obama campaign did use some experiment-informed persuadability models for overall persuasion universes in 2012, Hersh reports that its rather election-specific results are unlikely to be widely useful. Furthermore, a study of Democratic operatives using NGP-VAN indicates that models are rarely considered of primary importance by those choosing how to use data on the ground.
Hersh also notes that Democratic campaigns are starting to consider using named ballot and issue models rather than mere partisanship in order to get more accurate results. These methods have always been used by TPC, and by the RNC in target states. Only with the advent of cheap national bulk scoring from companies such as Catalist did national partisanship microtargeting arise. A look at the differences between outcomes across same-state races offers an easy demonstration of why named ballot and separate party modeling can be far superior to generic partisanship.
Even though the hardest-hitting portions of Hersh’s empirical work did not test true MicroTargeting (but a weaker substitute used by many Democratic campaigns), Hersh does bring up a number of valid criticisms of MicroTargeting in general, mostly pushing back on overwrought media accounts and the claims of Democratic operatives he interviewed for the book. Having been leaders in political MicroTargeting for over a decade, TPC data scientists can say that his criticisms generally do not reveal unknown pitfalls, but rather known limitations that skilled targeters and mature operations can and do minimize or avoid.
While it is true that there are thousands of data points available on each voter, we are well aware that the vast majority of the time only a couple dozen to a couple hundred are actually useful, and that their accuracy varies. The most recent of our monthly RNC model builds used 212 distinct variables across a number of models. The relative reliability of information based on public records and of modeled results is well known and accounted for in modeling and targeting decisions. But while the number of variables frequently used might be relatively low, many infrequently useful variables are nevertheless of notable value, particularly when producing custom models for candidates with unique characteristics, issues, or primaries.
On the Democratic side, Hersh exposes an entrenched culture of racial targeting and messaging in campaign circles, in which – outside the several states in which race is a matter of public record – true MicroTargeting is often replaced by simply targeting areas with large concentrations of minorities or relying on modeled, consumer data-based racial variables to target purely on race.
But as TPC is well aware, this data is not all that reliable, leaving many Democratic candidates making racial appeals to groups containing 30% or more of the wrong race. Even in the states where race can be properly known, this “good enough” approach misses geographical, political, or racial minorities completely, and is not a form of MicroTargeting. In fact, TPC began pushing Republican campaigns away from geographical targeting starting in 2002.
Hersh assumes that Republicans use racial targeting as well, but, while race is incorporated into many models, TPC has no reason to give it so dispositive a place in our decision-making. In the Voter Scoring system, we look for those with opinions ripe for mobilization or persuasion, regardless of skin color or background.
It is also an uncontroversial reality that party registration and data on primary vote history are the strongest predictors of voting behavior, but that does not mean that states without one or both cannot be accurately modeled.
In 2014, even in states without either party registration or primary vote history, such as Wisconsin, TPC’s verification polls conducted with the RNC found that the voters in each universe were planning to cast their ballots in proportions generally within the margin of error of our predictions. In Virginia, where TPC lacked party registration data and had only party primary voting data, our models not only tracked our follow-up polling very closely, but predicted the razor-thin margin of the race in a way no public poll did.
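For readers unfamiliar with what “within the margin of error” means for a single universe, the check can be sketched as follows. This is a minimal illustration using the standard normal-approximation margin of error for a polled proportion, not a description of TPC’s actual verification procedure. The function names and the sample figures are hypothetical.

```python
import math


def margin_of_error(polled_share, sample_size, z=1.96):
    """Normal-approximation margin of error for a proportion (z=1.96 is roughly 95%)."""
    return z * math.sqrt(polled_share * (1 - polled_share) / sample_size)


def within_margin(predicted_share, polled_share, sample_size, z=1.96):
    """True if the modeled share falls inside the poll's margin of error."""
    gap = abs(predicted_share - polled_share)
    return gap <= margin_of_error(polled_share, sample_size, z)


# Hypothetical universe: the model predicts 62% support and a
# 400-person verification poll of that universe finds 60%.
moe = margin_of_error(0.60, 400)
print(f"margin of error: +/- {moe:.1%}")
print("prediction within margin:", within_margin(0.62, 0.60, 400))
```

With a sample of 400 the margin is just under five points, so a two-point gap between prediction and poll counts as agreement, while a ten-point gap would not.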
Furthermore, party registration can be far less reliable in southern and coal states, where Republican-voting Democrats are common, especially at the federal level. Yet TPC’s models, unlike public polling, displayed pinpoint accuracy in states such as Kentucky and Arkansas, where party registration data is extremely misleading, again showing strong accuracy in each universe and predicting the blowout outcomes the public polls missed.
Hacking the Electorate is an important corrective to media narratives of the omniscient campaign and the advantages of Democratic centralization. Hersh’s analysis certainly provided food for thought and some ideas for further innovation, and our system will continue to improve, as it has over many years. That road has been long, and the progress along it is the result of a sustained commitment by the RNC leadership and staff, particularly the doubling down on data by Chairman Reince Priebus. The book also provides strong evidence that the TPC-RNC Voter Scoring system and the Republican Data Ecosystem are both on a strong methodological and institutional path.