No Walk in the Park: The Viability and Fairness of Social Media Analysis for Parks and Recreation Policy Making

How to cite this study

Mashhadi, A., Winder, S.G., Lia, E.H. and Wood, S.A. 2021. No walk in the park: the viability and fairness of social media analysis for parks and recreational policy making. In Proceedings of the International AAAI Conference on Web and Social Media 15: 409-420.

Overview

This study examines the biases in social media analyses that estimate the number and demographics of visitors to urban parks. Flickr, Instagram, an on-site survey, an online/phone survey, and an AI facial recognition program are used to assess the bias that different social media platforms can introduce. The number and demographic composition of visitors are estimated from photographs using an AI facial recognition algorithm (Face++).

Relevance

This study is relevant for researchers interested in combining several research methods to improve social media analyses of park visitation. Visitation estimates can inform management decisions and improve recreational opportunities. Researchers should keep in mind that two important biases can arise in social media analyses: those that result from how the data are generated and those that are caused by the algorithms that analyze the data. For example, changes in the popularity of social media platforms may bias photo-user-days (PUDs) over time, and AI algorithms can be trained on different datasets, potentially leading to bias in the results.

Location

This study focuses on 10 randomly selected urban parks in Seattle, Washington.

Trail Type

This study evaluates 10 city parks in Seattle, Washington, representing a broad range of park types, neighborhoods, and user groups.

Purpose

The purpose of this study was to determine whether geo-located images shared publicly on social media can offer an accurate portrait of urban park visitors and their demographics. The funding for this work was provided by the Bullitt Foundation.

Findings

  • Instagram PUDs were found to be substantially higher than Flickr PUDs from January 2016 to January 2019. The number of Instagram photos increased overall while the number of photos posted on Flickr declined, potentially reflecting changes in the popularity of the platforms over time.
  • The average number of people that appear in photographs (counted by humans) is consistent between Instagram and Flickr. However, Face++ undercounts people in social media photos, leading to an underestimate of park visitation rates. 
  • According to the demographic detection feature of Face++, Flickr images contain a greater number of children. A manual inspection of some of the images indicates that Flickr is more often used to photograph sports events that include children.
  • The facial recognition algorithm fails to detect subjects in photographs of group activities. 
  • Photographs containing children are twice as likely to be under-detected by the algorithm compared to images that lack children.
  • In the intercept survey, a larger share of respondents reported their race as White than Face++ identified among people in the social media posts. This indicates that, in this study, the content of photographs captures a greater proportion of non-white visitors than traditional methods of surveying park users.
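
The photo-user-day (PUD) metric mentioned in the findings is not defined in this summary. A minimal sketch follows, assuming the commonly used definition: one PUD is one unique user posting at least one photo at a site on a given day. The record layout, park name, and user IDs are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date


def photo_user_days(photos):
    """Count photo-user-days per (park, platform).

    `photos` is an iterable of (park, platform, user_id, day) tuples.
    Several photos by the same user at the same park on the same day
    count as a single photo-user-day (assumed definition).
    """
    unique_user_days = set(photos)  # drop repeat photos by the same user/day
    counts = defaultdict(int)
    for park, platform, _user, _day in unique_user_days:
        counts[(park, platform)] += 1
    return dict(counts)


# Hypothetical records, for illustration only.
photos = [
    ("Park A", "instagram", "u1", date(2018, 7, 4)),
    ("Park A", "instagram", "u1", date(2018, 7, 4)),  # same user, same day
    ("Park A", "instagram", "u2", date(2018, 7, 4)),
    ("Park A", "flickr", "u3", date(2018, 7, 4)),
    ("Park A", "instagram", "u1", date(2018, 7, 5)),
]
pud = photo_user_days(photos)
```

Counting user-days rather than raw photos keeps a single prolific poster from inflating the visitation estimate for a park.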

Methods

This study uses counts of people in images posted on Flickr and Instagram from January 2016 to January 2019 to estimate visitation rates and visitor demographics. These demographic data were then compared to a visitor-intercept survey and a larger-scale multi-modal survey of Seattle residents. The intercept survey took place at five randomly selected park exits and asked visitors to complete a written survey in English about their activities, demographics, and experiences in parks in their neighborhood. Responses were voluntary and no compensation was provided. A total of 165 surveys were collected. The multi-modal survey, an 830-participant web and phone survey commissioned by Seattle Parks and Recreation, was filtered to include only respondents who live in the same areas as the study sites and who visit the park more than 10 times per year, reducing the sample to 72 participants.
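
The filtering of the web and phone survey can be sketched as follows. This is an illustration only: the field names (`home_area`, `visits_per_year`), the area labels, and the respondent records are hypothetical, and the study's actual matching of respondents to park sites is not described in this summary.

```python
def filter_respondents(respondents, study_areas, min_visits_per_year=10):
    """Keep respondents who live in a study area and visit parks
    more than `min_visits_per_year` times per year (strictly more)."""
    return [
        r for r in respondents
        if r["home_area"] in study_areas
        and r["visits_per_year"] > min_visits_per_year
    ]


# Hypothetical respondents, for illustration only.
respondents = [
    {"id": 1, "home_area": "north", "visits_per_year": 24},
    {"id": 2, "home_area": "north", "visits_per_year": 10},   # not more than 10
    {"id": 3, "home_area": "south", "visits_per_year": 50},   # outside study areas
    {"id": 4, "home_area": "central", "visits_per_year": 12},
]
kept = filter_respondents(respondents, study_areas={"north", "central"})
kept_ids = [r["id"] for r in kept]
```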

To study the biases in Instagram and Flickr, the number of people in photographs was counted manually by human annotators through Amazon Mechanical Turk and automatically by Face++, an AI facial recognition program. The algorithm's counts were then compared to the human-labeled counts to identify false positives, where Face++ over-counted people in the image (by mistaking a painting for a face, for example), and false negatives, where faces should have been detected but were not. Demographics were divided into categories of “white” or “non-white” and “children” or “adults.” A total of 500 pictures were analyzed.
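
The comparison of human and algorithm counts can be sketched as below. This is not the authors' code: the `LabeledImage` structure, the per-image classification rule, and all numbers are assumptions made for illustration, and the toy values do not reproduce the study's results.

```python
from dataclasses import dataclass


@dataclass
class LabeledImage:
    human_count: int    # people counted by human annotators
    algo_count: int     # faces counted by the detector
    has_children: bool  # whether annotators saw children in the image


def classify(img):
    """Label an image by how the detector's count compares to the human count."""
    if img.algo_count > img.human_count:
        return "over"   # at least one false positive
    if img.algo_count < img.human_count:
        return "under"  # at least one false negative
    return "match"


def under_detection_rate(images):
    """Share of images in which the detector found fewer people than humans did."""
    if not images:
        return 0.0
    return sum(classify(i) == "under" for i in images) / len(images)


# Invented example data, not study data.
images = [
    LabeledImage(3, 1, True),
    LabeledImage(2, 1, True),
    LabeledImage(5, 2, True),
    LabeledImage(2, 2, True),
    LabeledImage(1, 1, False),
    LabeledImage(4, 2, False),
    LabeledImage(0, 1, False),  # e.g. a painting mistaken for a face
    LabeledImage(2, 2, False),
]
rate_with = under_detection_rate([i for i in images if i.has_children])
rate_without = under_detection_rate([i for i in images if not i.has_children])
relative_rate = rate_with / rate_without  # how much likelier under-detection is
```

Comparing counts per image is a simplification: an image with one missed face and one spurious detection would be labeled a match, so a face-level comparison would be stricter.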


Added to library on November 27, 2023