Using county-level data, it depends on who’s classification system you use

Counties may not be the right basis for diagnosing the contributors to Covid.

One of the oft-repeated claims in the pandemic is the notion that cities and density are significant contributors to the risk of being infected with the Covid-19 virus. Some of this, we have argued, is based on a deep-seated (and wrong-headed) prejudice associating cities with communicable diseases (the “teeming tenement” meme).  But beyond base beliefs, what does the data show?

Because in the United States, the public health function is administered chiefly through counties, the nationwide data on the prevalence of Covid-19 cases and deaths is reported county-by-county.  And analysts (ourselves included) have used this county-level data to plot the prevalence of the disease in different parts of the country.  Our approach has been to aggregate data to the metropolitan level for the nation’s largest metro areas, based on the understanding that county units vary widely across the nation, and that the the labor, commuting and economic markets formed by metro areas are probably a more robust basis for comparing the extent of the disease nationwide.

As we noted earlier, several very good analysts, including Bill Frey at the Brookings Institution, Jed Kolko of Indeed and Bill Bishop at the Daily Yonder, have used the county-level data to look at the comparative prevalence of the disease in cities as compared to suburbs.  The style of their analyses is similar:  They look at county-level data, and classify counties as either urban, or some flavor of suburban or exurban.  They then aggregate the data for all the similarly classified counties across the nation, and compute prevalence rates (reported cases or deaths divided by population).  Jed Kolko’s analysis provides a representative example.

Translating with the Rosetta Stone

The challenge in interpreting their results comes from the fact that each of the three analysts uses a different system for classifying counties based on “urbanness,” as we explored in our earlier commentary on this subject.  We concluded that there was no right or ideal way to provide decide such a classification of counties, and noted that the three methods differ substantially in both the number of places (and people) classified as “urban” and also in which counties fall into which bins.

To illustrate the practical differences between the different definitions, we created a kind of Rosetta Stone (with data graciously provided by each of the three authors).  Let’s take a look at the city/suburb split using each of the three definitions. For this exercise, we use the USA data county level data on Covid Cases as of May 12.  As we usually do at City Observatory, our focus is on the 53 metropolitan areas in the US with a population of one million or more.

Covid-19 prevalence

The following table illustrates Covid-19 prevalence aggregated by each of the three classification systems.  The table shows the total population living in each county classification, the total number of reported cases in those counties, and computes the rate of cases (per 100,000) population.

Urban/Suburban Population, Covid-19 Cases, and Rate per 100,000,
May 12, 2020, (Metropolitan Areas with 1 million or more population).

Population Cases Rate
Brookings
1-Urban Core 99,667,019 780,762 783
2-Mature Suburb 61,943,681 178,381 288
3-Emerging Suburb 14,308,473 27,413 192
4-Exurb 5,253,680 11,915 227
NonMetro 118,497 259 219
Total, Large Metros 181,291,350 998,730 551
Kolko
1-Urban 75,949,521 622,889 820
2-SuburbanHigh 69,104,197 274,372 397
3-SuburbanLow 36,237,632 101,469 280
Total, Large Metros 181,291,350 998,730 551
Yonder
1-Central counties 90,665,117 439,086 484
2-Suburban Counties 86,288,760 551,678 639
3-Exurban 4,218,976 7,707 183
4-Rural Adjacent to Large MSA 118,497 259 219
Total, Large Metros 181,291,350 998,730 551

As you can see, one gets a very different impression of whether city or suburb rates are higher depending on which classification one uses.  Overall, the prevalence rate across categories is 551 cases per 100,000.  But the Kolko and Brookings classifications imply that the average prevalence is about twice as high in cities as in suburbs, while the Yonder classification implies the reverse, that the rate is about 30 percent higher in suburbs than in cities.

Outside of New York

The New York City metropolitan area has been the epicenter of the pandemic, and has accounted for a disproportionate share of reported cases and deaths.  Because of that concentration of cases, and the region’s large (nearly 20 million) population), it could be that this single metro area skews the totals.  In addition, there are significant differences in how the three typologies classify counties in the New York metropolitan area.  For example, the Brookings and Kolko methods classify Westchester and Nassau counties as “urban” while Yonder classifies those counties, and also Queens County, as suburban.

To filter out the effects of New York’s direct contribution to the pandemic, and to sidestep the disagreements about how to classify counties there, we construct a second table aggregating the data for the remaining 52 large metropolitan areas.  First, its worth noting that the aggregate rate of reported cases per 100,000 population drops about 40 percent, to about 354 cases per 100,000.

Urban/Suburban Population, Covid-19 Cases, and Rate per 100,000,
May 12, 2020, (Metropolitan Areas with 1 million or more population, excluding New York metro)

Population Cases Rate
Brookings
1-Urban Core 82,235,306 376,890 458
2-Mature Suburb 60,420,138 159,745 264
3-Emerging Suburb 14,042,865 25,657 183
4-Exurb 5,197,900 11,482 221
NonMetro 118,497 259 219
Total, Large Metros 162,014,706 574,033 354
Kolko
1-Urban 61,770,791 289,436 469
2-SuburbanHigh 65,521,033 199,860 305
3-SuburbanLow 34,722,882 84,737 244
Total, Large Metros 162,014,706 574,033 354
Yonder
1-Central counties 84,227,331 308,054 366
2-Suburban Counties 73,505,682 258,446 352
3-Exurban 4,163,196 7,274 175
4-Rural Adjacent to Large MSA 118,497 259 219
Total, Large Metros 162,014,706 574,033 354

Excluding New York shifts the apparent city/suburb balance in each of the three classification systems.  Brookings reports essentially the same relative gap between city and suburban rates (with city rates about double those in mature suburbs).  In the data that exclude the New York metro, Kolko still reports a higher prevalence of Covid in suburbs than in cities, although by a smaller margin (50 percent higher in cities, rather than nearly double).  The Yonder tabulation now reports a higher rate of prevalence in urban counties than suburban ones, although by a relatively small margin (366 in cities vs 352 in suburbs).

In the end, we’d recommend that anyone interested in understanding the geography of the pandemic closely read the work of Kolko, Frey and Bishop–they’re all smart analysts who paint vivid pictures with the data.  But as this analysis makes clear, drawing clear lines between urban and suburban counties is more of an art than a science, and its useful to understand the different definitions in order to be able to make sense of the conclusions. Even with the best efforts,  it is hard to use highly aggregated county data to make a clear cases about whether the pandemic is much worse in cities than in suburbs.  The devil is very much in the definitional details, as the range of estimates presented here illustrates.

In addition, it may help to look at the issue from different perspectives. As our own analysis comparing city and suburban rates within metropolitan areas shows, it matters a great deal more whether you are in a metro area with a high rate of infections, than whether you are in a city or suburb of any given metro area.  Its also clearly the case that within metros, city and suburban rates are highly correlated, and that when the virus was spreading rapidly, the average suburb was only about six days or so behind its central city in the prevalence of the virus.  Plus, when more detailed data becomes available, such as zip code level tabulations of cases and deaths, we may be able to more carefully discern the differences between cities and their suburbs.