Introduction

In any election, and especially 2020, we need to work to understand not only what happened in the presidential election but also why it happened. Throughout this class, I have mostly focused on prediction rather than inference. In this post, I will assess an important theme of the 2020 election, the urban-rural divide between Republicans and Democrats, and start the process of investigating whether it reflects a causal relationship.

The Urban Rural Divide

Since the 1980s, there has been growing evidence of an urban-rural divide between the two major parties. Cities, and more urban areas generally, tend to vote for Democratic candidates, while rural areas vote for Republicans. The phenomenon is well documented. Earlier this semester, we read chapters from Red Fighting Blue: How Geography and Electoral Rules Polarize American Politics by David Hopkins, which demonstrates the growing divide.

The growing divide between metro areas and more rural areas on the coasts of America. From chapter 6 of Red Fighting Blue.

We can also see some differentiation between the coasts (the Pacific Northwest and the Northeast) and the rest of the country. There is a much sharper divergence between urban and rural areas in the Midwest and South after the 1996 election, following Bill Clinton’s presidency.

The growing divide between metro areas and more rural areas in the South and Midwest of America. From chapter 6 of Red Fighting Blue.

In the aftermath of the election, a lot has been written about the urban-rural divide. Even before the election, FiveThirtyEight published this piece that shows the connection between how urban or rural a state is and which way it voted in 2016. The correlation is striking. NPR just published an article full of quotes from farmers and other rural residents worried about how the incoming Biden administration will handle the rural economy, which is in disarray due to the COVID-19 pandemic. In addition, if you look at maps of where COVID-19 is hitting the country hardest at the moment, it is primarily impacting rural states. Even in his victory speech, Biden was already reaching out to rural communities that did not vote for him, to begin what he referred to as a healing process.

Why It Matters

Understanding the urban-rural divide gives insight into a host of factors in people’s lives, and also shows the balance of power in American politics today. As previously mentioned, there do seem to be real differences between urban and rural areas that manifest themselves in how they vote. These differences include, but are not limited to, the economy, religion, views on race, and education. Understanding how these factors interact is key to understanding elections, and we can use elections to understand which of these factors is most important.

The urban-rural divide also controls the balance of power in American politics. Because of institutions like the Senate and the Electoral College, certain areas have a disproportionate amount of voting power relative to their population. With the current electoral divide between urban and rural areas, this gives the Republican party a huge amount of power. According to calculations by FiveThirtyEight’s Nate Silver, the Senate leans roughly 6.6 percentage points more Republican than the country as a whole. When it comes to governing strategy, this gives Republicans the option to rule as relative extremists [1] and still have a high chance of keeping control of the chamber.

The Electoral College’s bias is more complicated to unpack. It has the same state-level bias towards rural states, but its winner-take-all nature gives an incredible amount of power to cities in closely divided states. For example, in Georgia, swings in the areas around Atlanta flipped the entire state for Biden.

A Growing Divide?

One common narrative is that the urban-rural divide grew from 2016 to 2020. This is a testable hypothesis: if we look at the changes in voting behavior at a granular enough level, we should be able to see differences in the swings towards Democrats and Republicans based on how urban or rural a place is. Luckily for me, all of this data is relatively accessible.

Using county election results from 2020, as provided in class, and the same data from 2016 from the MIT Election Lab, I constructed a dataset that shows the county-level change in two-party vote share for both Democrats and Republicans. Finding measures of how urban or rural a county is was somewhat trickier, but conveniently I already had a dataset that included a host of county-level demographic data from another class [2]. The data primarily comes from the US Census Bureau. The majority of the data, including income, race, and education, comes from the most recent Census survey in 2018. It also includes population density, calculated using the land areas in the full 2010 Census and populations from the 2018 survey.
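As a minimal sketch of how the outcome variable could be computed, the snippet below works through one county. The function names and the vote counts are hypothetical and made up for illustration; they are not taken from the actual dataset or code used for this post.

```python
def two_party_share(dem_votes, rep_votes):
    """Democratic share of the two-party vote, in percent."""
    return 100.0 * dem_votes / (dem_votes + rep_votes)


def dem_change(dem_2016, rep_2016, dem_2020, rep_2020):
    """Change in Democratic two-party share, 2016 to 2020, in percentage points."""
    return two_party_share(dem_2020, rep_2020) - two_party_share(dem_2016, rep_2016)


def pop_density(population, land_area):
    """People per unit of land area (units depend on the source data)."""
    return population / land_area


# Hypothetical county with made-up vote counts (not real data).
change = dem_change(dem_2016=4000, rep_2016=6000, dem_2020=4500, rep_2020=5500)

# The Republican two-party share is 100 minus the Democratic share,
# so its change is the mirror image of the Democratic change.
rep_change = (100.0 - two_party_share(4500, 5500)) - (100.0 - two_party_share(4000, 6000))

print(change, rep_change)
```

Because the two shares always sum to 100, the Republican change is just the negative of the Democratic change, which is why looking at one party is enough.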

We can start by looking at some graphs to see if there is a correlation between changes in vote share and characteristics of particular counties. Because I am working with two-party vote shares, the change for Republicans is just the mirror image of the change for Democrats, so I will focus on changes in support for Democrats.

There is a clear, if slight, positive trend line. Just visually, we can see that places with low population density had a decrease in Democratic two-party vote share between 2016 and 2020, indicating a shift towards Republicans. As the population density increases, so does the change in two-party vote share for Democrats, indicating that more urban counties shifted towards Biden. We can take a look at the regression that generates the trend line.

Regression of Change in Democratic Two Party Vote Share on Population Density

Dependent variable: dem change

Predictors             Estimates   CI               p
(Intercept)            -0.72       -0.94 – -0.50    <0.001
pop_density [log10]    0.75        0.63 – 0.87      <0.001

Observations: 3097
R2 / R2 adjusted: 0.047 / 0.047

Indeed, there is a negative intercept with a positive slope. Because population density enters the regression on a log10 scale, one way to interpret the slope is that a tenfold increase in population density is associated with a 0.75 percentage point increase in the change in vote share; a 1 percent increase in density corresponds to only about 0.003 percentage points. Given this, it seems clear there was at least some amount of shift. We can also take a look at shifts in states that flipped between 2016 and 2020.
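To make the log10 interpretation concrete, here is a small worked calculation using the two estimates from the table above. The function names are my own, and the calculation assumes the dependent variable is measured in percentage points.

```python
import math

INTERCEPT = -0.72  # from the regression table
SLOPE = 0.75       # coefficient on log10(pop_density), from the regression table


def predicted_change(density):
    """Predicted change in Democratic two-party share at a given density."""
    return INTERCEPT + SLOPE * math.log10(density)


def effect_of_ratio(ratio):
    """Difference in predicted change when density is multiplied by `ratio`."""
    return SLOPE * math.log10(ratio)


tenfold = effect_of_ratio(10)         # equals the slope itself
one_percent = effect_of_ratio(1.01)   # roughly 0.003 percentage points

# Density at which the fitted line crosses zero, in the dataset's density units.
break_even = 10 ** (-INTERCEPT / SLOPE)

print(tenfold, one_percent, break_even)
```

Under this fit, counties below a density of roughly 9 (in whatever units the dataset uses) are predicted to have shifted towards Republicans, and counties above it towards Democrats.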

At the state level, the relationship is less clear. For Michigan and Pennsylvania, the evidence points towards a uniform swing towards the Democrats in this election (or a swing along some other dimension), as the trend line intercept and slope are both positive. Wisconsin follows a similar pattern, but less strongly. To my eye, the only one of these states that seems to fit the national pattern is Georgia. To some degree, there is outside evidence for Georgia fitting the pattern. For example, the NYTimes produced this piece that demonstrates the swings in the Atlanta suburbs, which are relatively urban counties.

Coefficient Stability

There is a counter-theory to the idea that how urban a place is drives the divide: some other factor is what really matters, and that factor is just highly correlated with how urban a county is. To take a look at this, we can see how the change in Democratic two-party vote share relates to median household income and to the percentage of residents with a bachelor’s degree.

We can see that both education and household income have a positive relationship with a swing towards Democrats in the 2020 election. These are both correlated with population density. Part of the draw of cities for most people is the higher incomes. In addition, because of the concentration of higher paying jobs, people tend to be better educated.

To take a look at how population density, education, and household income fit together, we can put them all into a regression. One thing to keep in mind is that if there were a direct relationship between population density and the change in vote share, the coefficient should not change when we control for education and household income.
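The logic of this check can be sketched with synthetic data. In the simulation below, the outcome depends only on a stand-in for income, while a stand-in for density is merely correlated with income. All variable names and numbers are invented for illustration and have nothing to do with the real county data. To control for income without a matrix library, the sketch uses the Frisch-Waugh-Lovell result: regressing the income-adjusted outcome on the income-adjusted density gives the same density coefficient as the full multiple regression.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    a, b = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


rng = random.Random(0)
n = 2000
income = [rng.gauss(0, 1) for _ in range(n)]          # synthetic confounder
density = [z + rng.gauss(0, 1) for z in income]       # correlated with income
swing = [2.0 * z + rng.gauss(0, 1) for z in income]   # depends on income only

# Regression that omits income: density picks up income's effect.
_, naive_slope = simple_ols(density, swing)

# Partial out income from both variables, then regress residual on residual.
_, controlled_slope = simple_ols(residuals(income, density), residuals(income, swing))

print(naive_slope, controlled_slope)
```

In this toy setup the density slope is clearly positive when income is left out and close to zero once income is controlled for, which is the pattern of an unstable coefficient.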

Regression of Change in Democratic Two Party Vote Share on Population Density, Education, and Income

Dependent variable: dem change

Predictors                         Estimates   CI                  p
(Intercept)                        -22.30      -27.09 – -17.51     <0.001
pop_density [log10]                -0.05       -0.16 – 0.07        0.445
median_household_income [log10]    4.38        3.33 – 5.44         <0.001
pct_bachelors                      0.10        0.09 – 0.12         <0.001

Observations: 3097
R2 / R2 adjusted: 0.260 / 0.259

As we can see, the coefficients completely change. The sign on population density flips, and the coefficient is no longer statistically significant. Interestingly, the coefficient on median household income is very large in comparison to the coefficient on the percentage with a bachelor’s degree, which may suggest a greater importance, although the two variables are measured on different scales (log10 of dollars versus percentage points), so the raw coefficients are not directly comparable. Of course, if other variables were included in the regression, the coefficients might still change.
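To see why the raw sizes of the income and education coefficients are hard to compare, it helps to translate each into the effect of a plausible change. The calculation below uses the estimates from the table; it assumes pct_bachelors is measured from 0 to 100 and that the outcome is in percentage points, neither of which is stated explicitly above.

```python
import math

INCOME_COEF = 4.38      # coefficient on log10(median_household_income), from the table
BACHELORS_COEF = 0.10   # coefficient on pct_bachelors, from the table

# Effect of multiplying median household income by a given ratio.
income_up_10_percent = INCOME_COEF * math.log10(1.10)
income_doubled = INCOME_COEF * math.log10(2.0)

# Effect of raising the bachelor's share (assumed to be on a 0-100 scale).
bachelors_up_1_point = BACHELORS_COEF * 1.0
bachelors_up_10_points = BACHELORS_COEF * 10.0

print(income_up_10_percent, income_doubled)
print(bachelors_up_1_point, bachelors_up_10_points)
```

Under these assumptions, a 10 percent rise in median income goes with about 0.18 points of swing, while a 10 point rise in the bachelor’s share goes with about 1 point, so which variable looks more important depends on how much each one actually varies across counties.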

All this suggests that the urban-rural divide serves mainly as a proxy for other, more complex factors rather than as a cause in its own right. The apparent effect of population density in the earlier regression reflected omitted variable bias rather than a causal relationship. It should also be noted that the R-squared is quite low, indicating a poor fit. The true relationship is undoubtedly significantly more complex than this simple regression, meaning that further investigation is necessary.


  1. What I mean is that Republicans can hold up popular bills, like further economic stimulus during the pandemic, or try to pass unpopular bills, like repealing the ACA, and not have to worry too much about electoral consequences.

  2. The dataset comes from a project in AC209a: Introduction to Data Science. The group project is on predicting the spread of COVID-19 at the county level, which is why I had this dataset in the first place. Nick Normandin, one of my group mates, gathered the data and did much of the cleaning on it.