Association Between User-Generated Commuting Data and Population-Representative Active Commuting Surveillance Data—Four Cities, 2014-2015

Place: Austin, Texas; Denver, Colorado; Nashville, Tennessee; and San Francisco, California

Overview

One of the primary concerns about data from GPS tracking apps is that their users tend to recreate or commute more often than the general population and therefore may not represent it accurately. This paper compares, neighborhood by neighborhood, the share of people commuting by active transportation as reported by the American Community Survey (a nationally representative survey) with the share recorded by Strava (a GPS tracking app), and shows that the two are strongly correlated (about 0.6 overall).

Relevance

This study is relevant for those interested in counting cyclists in their community, possibly down to the neighborhood level, and who want to know whether user-generated data are representative of the population. The results show that Strava data are strongly correlated with the American Community Survey's number and share of people commuting by bicycle, suggesting that Strava could be used to identify neighborhoods with higher concentrations of active commuters.

This paper does not address how representative Strava data are in terms of the frequency and duration of commutes. Additional research is required to determine whether Strava users are, indeed, more avid active commuters than the general population.

Location

This study evaluates four U.S. cities: Austin, Texas; Denver, Colorado; Nashville, Tennessee; and San Francisco, California.

Trail Type

This study addresses active commuting in four major cities, covering travel on both separated and non-separated paths.

Purpose

The purpose of this study is to evaluate whether user-generated data from an app like Strava are representative of the general population, and to determine whether these data can inform public health and transportation planning.

The funding for this study is not identified. The authors are from government agencies (the Centers for Disease Control and Prevention and the Agency for Toxic Substances and Disease Registry) and Strava.

Findings

  • Overall, the correlation between ACS and Strava data is 0.60 for the median number of active commuters and 0.59 for the percent of active commuters.
  • San Francisco and Denver had the highest correlations (0.58 and 0.52, respectively), while Nashville and Austin had the lowest (0.28 and 0.36, respectively).
  • The correlation was stronger in denser neighborhoods, rising from 0.40 in the least dense third of neighborhoods to 0.61 in the densest third. The authors suggest that this may reflect both the greater popularity of active commuting and the higher adoption of Strava in denser places.

Methods

The authors compared two data sources: the American Community Survey (ACS) and Strava. They evaluated data at the Census block group level; block groups contain between 600 and 3,000 residents. For each block group, they calculated the correlation between the number of active commuters recorded by Strava and the number reported by the ACS. The authors also calculated correlations at different levels of population density, with density obtained from the ACS.
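The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis: the summary does not say which correlation coefficient was used, so a Spearman rank correlation is assumed here, and the field names (`density`, `acs`, `strava`), the tertile split, and all of the numbers are hypothetical values made up for the example.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation (an assumed choice of coefficient)."""
    return pearson(ranks(x), ranks(y))


def correlation_by_density_tertile(block_groups):
    """Split block groups into density thirds and correlate within each third."""
    ordered = sorted(block_groups, key=lambda bg: bg["density"])
    n = len(ordered)
    cuts = [0, n // 3, 2 * n // 3, n]
    result = {}
    for label, lo, hi in zip(["low", "middle", "high"], cuts, cuts[1:]):
        group = ordered[lo:hi]
        result[label] = spearman(
            [bg["acs"] for bg in group],
            [bg["strava"] for bg in group],
        )
    return result


# Hypothetical block groups: population density plus active-commuter counts
# from each source. These values are invented for illustration only.
example_block_groups = [
    {"density": 100, "acs": 5, "strava": 1},
    {"density": 200, "acs": 10, "strava": 3},
    {"density": 300, "acs": 8, "strava": 2},
    {"density": 1000, "acs": 20, "strava": 6},
    {"density": 1500, "acs": 15, "strava": 9},
    {"density": 2000, "acs": 30, "strava": 12},
    {"density": 5000, "acs": 40, "strava": 20},
    {"density": 6000, "acs": 60, "strava": 35},
    {"density": 7000, "acs": 55, "strava": 30},
]

overall = spearman(
    [bg["acs"] for bg in example_block_groups],
    [bg["strava"] for bg in example_block_groups],
)
by_tertile = correlation_by_density_tertile(example_block_groups)
print("overall:", round(overall, 2))
print("by density tertile:", by_tertile)
```

A rank-based coefficient is used in the sketch because commuter counts per block group tend to be skewed; if the underlying analysis used a different coefficient, only the `spearman` function would need to be swapped out.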

Citation

Whitfield, G.P. 2016. Association Between User-Generated Commuting Data and Population-Representative Active Commuting Surveillance Data—Four Cities, 2014–2015. CDC Morbidity and Mortality Weekly Report 65(36): 959-962.