We’ve received many questions about how we did the analysis behind our Storefront Index. This post describes our dataset, our method, and how we created our visualizations. We hope it will spur future research and new forms of visualization, much as the release of our Lost In Place data led to amazing reinterpretations of that dataset.


We used a database of businesses from Custom Lists American Business Directory. The Directory contains records from 2014 on US businesses, including their industry classification and address. Our aim was to understand how clusters of quasi-private storefront spaces contribute to active streetscapes and generate steady flows of people. We therefore filtered the business dataset on three criteria: 1) businesses in the largest metro areas; 2) businesses that have storefronts; and 3) businesses that pass a spatial filter based on clustering.

For the first filter, we simply chose businesses in the largest metro areas, based on the 2013 definitions of Core-Based Statistical Areas (CBSAs). For the second filter, we selected businesses that fit into one of 44 industry classifications that would typically have a customer-serving storefront. These include businesses like grocers, bookstores, and salons. A full list of our categories can be found on page 18 of the report.
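For readers who want to try something similar, the first two filters can be sketched in a few lines of Python. This is an illustrative sketch, not the workflow used for the report: the `Business` record, its field names, and the short code lists below are assumptions made for the example, and the real list of 44 classifications is the one in the report.

```python
from dataclasses import dataclass

# Illustrative values only. The report uses 44 industry classifications;
# these three SIC codes (grocery stores, book stores, beauty shops) are
# stand-ins, and the metro codes are example CBSA identifiers.
STOREFRONT_SIC_CODES = {"5411", "5942", "7231"}
TARGET_CBSA_CODES = {"38900", "42660"}


@dataclass
class Business:
    """Hypothetical record layout for one row of a business directory."""
    name: str
    sic_code: str
    cbsa_code: str
    address: str


def is_storefront_candidate(business: Business) -> bool:
    """Apply filter 1 (metro area) and filter 2 (storefront industry)."""
    in_target_metro = business.cbsa_code in TARGET_CBSA_CODES
    has_storefront = business.sic_code in STOREFRONT_SIC_CODES
    return in_target_metro and has_storefront


def filter_storefronts(businesses: list[Business]) -> list[Business]:
    """Keep only the businesses that pass both filters."""
    return [b for b in businesses if is_storefront_candidate(b)]
```

Calling `filter_storefronts(records)` on a list of `Business` records returns only those in a target metro with a storefront classification; everything else is dropped before the spatial step.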

Armed with a storefront business dataset, we next sought to find clusters of storefronts. For each business with a storefront, we needed to know the distance to the nearest other storefront. Our first task was to geocode the addresses, turning each address into a latitude and longitude that we could map. (Luckily, we performed this geocoding in ArcMap just before access to its geocoding API was cut off.) We then used the NEAR function in GIS software to calculate the distance, in meters, from each storefront to its nearest neighbor. To apply our filter, we kept only storefronts that had another storefront within 100 meters, which let us identify clusters of destinations that would be easily walkable.
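The clustering filter can be approximated outside of GIS software. The sketch below is a stand-in for the NEAR step, not the tool we used: it assumes the storefronts are already geocoded as `(latitude, longitude)` pairs, measures great-circle distance with the haversine formula, and compares every pair of points by brute force. The function names are our own for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_neighbor_distances(points: list[tuple[float, float]]) -> list[float]:
    """For each point, the distance in meters to the closest other point.

    Brute force (every pair is compared), so it is only practical for
    modest numbers of points. A lone point gets a distance of infinity.
    """
    distances = []
    for i, (lat, lon) in enumerate(points):
        others = (
            haversine_m(lat, lon, other_lat, other_lon)
            for j, (other_lat, other_lon) in enumerate(points)
            if j != i
        )
        distances.append(min(others, default=math.inf))
    return distances


def clustered_storefronts(points, threshold_m: float = 100.0):
    """Keep points that have another point within threshold_m meters."""
    distances = nearest_neighbor_distances(points)
    return [p for p, d in zip(points, distances) if d <= threshold_m]
```

Because every pair of points is compared, the running time grows with the square of the number of storefronts. For a metro-scale dataset, a spatial index or a GIS nearest-neighbor tool is the more practical choice.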

With our three filters applied, we created a set of map images (for the report) and an interactive map. We used a 3-mile buffer around the central business district of each metro area of interest. For the images, we highlighted these buffers in white and used only the points of our clustered storefront locations and the US Census Bureau’s Roads shapefile. For the interactive web map, we used circles to represent the 3-mile buffers, the points of our clustered storefront locations, and the Mapbox library with the Stamen Toner basemap.
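To prepare points for a web map like this one, a reader could test which storefronts fall inside the 3-mile buffer and write them out as GeoJSON. The sketch below is one way to do that under stated assumptions, not the code behind our maps: the central business district is represented by a single hypothetical `(latitude, longitude)` center point, and the helper names are ours.

```python
import json
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in meters
METERS_PER_MILE = 1609.344


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_buffer(point, center, radius_miles: float = 3.0) -> bool:
    """True if point lies within radius_miles of center (both lat, lon)."""
    distance_m = haversine_m(point[0], point[1], center[0], center[1])
    return distance_m <= radius_miles * METERS_PER_MILE


def to_geojson(points) -> str:
    """Serialize (lat, lon) points as a GeoJSON FeatureCollection string.

    GeoJSON stores coordinates as [longitude, latitude], so the order is
    swapped relative to the input tuples.
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {},
        }
        for lat, lon in points
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})
```

A typical use is `to_geojson([p for p in storefronts if within_buffer(p, cbd_center)])`, where `storefronts` and `cbd_center` are placeholders for your own data. The resulting string can be loaded as a point layer by most web mapping libraries.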

Our list of industry classifications (using Standard Industrial Classification codes) can be found here, and GeoJSON files for each metro area (using FIPS codes for each metro) can be found here.