Utilizing mobile phone data to inform COVID-19 response
Mobile phone data can be used to inform different aspects of COVID-19 response. At the population level, quantifying changes in human mobility or clustering can help evaluate the impact of an NPI and identify hotspots where additional or different interventions may need to be applied. At the individual level, mobile phone data may be used to understand patterns of individual contacts and enhance contact tracing.
Evaluating current interventions and monitoring their release
The most widely used application of mobile phone data in public health to date is the use of telecom geolocation data to track population movements11,12. Mobile phone operators routinely collect Call Detail Records (CDRs) that contain a timestamp and GPS location with a unique identifier for all subscribers. These data are thus typically readily available and offer high coverage for estimating the mobility patterns of individuals using their mobile devices. We note that similar time-resolved GPS location data may be passively collected through certain applications, though typically for only a subset of subscribers, which may introduce further bias.
CDRs can be used to generate a number of metrics for characterizing large, population-level mobility patterns. Origin-Destination (OD) matrices reflect the number of times a trip is made between two locations (of varying spatial resolution) in a certain period. These matrices can be analyzed over time to detect temporal trends (e.g., holidays, seasonality, weekday vs weekend) and regular hotspots of attraction. These spatial and temporal flows of individuals between locations, including the magnitude and frequency of these movements, can be used to understand the risk of importation from areas with ongoing outbreaks to areas without sustained transmission where there is a risk of reintroduction and resurgence. Aggregate flows can also be used to retrace the likely introduction and spread of an outbreak in new areas, to inform future projections of disease risk or burden across space, and to support decision making around the design and implementation of travel restrictions or increased surveillance.
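As an illustration of how an OD matrix might be tallied, the following minimal sketch assumes that CDR-like records have already been reduced to (subscriber, timestamp, region) tuples. The record layout, the region names, and the rule that a trip is any change of region between a subscriber's consecutive records are assumptions made for this example, not an operator's actual schema or a standard definition.

```python
from collections import Counter, defaultdict


def od_matrix(records):
    """Count trips between regions from (subscriber_id, timestamp, region) records.

    A trip is counted each time a subscriber's consecutive records
    (ordered by timestamp) fall in two different regions.
    """
    by_subscriber = defaultdict(list)
    for subscriber_id, timestamp, region in records:
        by_subscriber[subscriber_id].append((timestamp, region))

    trips = Counter()
    for visits in by_subscriber.values():
        visits.sort()  # order each subscriber's records by time
        for (_, origin), (_, destination) in zip(visits, visits[1:]):
            if origin != destination:
                trips[(origin, destination)] += 1
    return trips


# Hypothetical records: (subscriber, hour of day, region)
records = [
    ("a", 8, "North"), ("a", 12, "Center"), ("a", 18, "North"),
    ("b", 9, "North"), ("b", 17, "Center"),
    ("c", 10, "South"), ("c", 11, "South"),
]
matrix = od_matrix(records)
```

Only the aggregated counts in `matrix` would need to leave the operator. Computing the same matrix for successive time periods gives the temporal trends described above.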
Aggregate mobility patterns may also be critical pieces of evidence when evaluating the effectiveness of various NPIs. Most NPIs are reliant on modifying physical behavior. Monitoring the volume, frequency, and average distance of flow during interventions can be used to directly quantify the adoption and effect of these interventions, and identify areas of high potential risk to target with different interventions. There are already identified associations between reductions in population-level mobility within and between different locations and COVID-19 incidence6,10,29, though further exploration of which population-level metrics are most closely related to changes in disease risk and whether these associations are sustained throughout an outbreak is needed30. These associations would ideally be interrogated to identify individual behaviors associated with mobility measures that are also associated with individual risk of COVID-19.
The effect of NPIs can also be monitored through subscriber density metrics that combine the recorded GPS location and timestamp of CDRs to capture the real-time population density and identify potential hotspots. When using finer-scale GPS location data, these density metrics may quantify the likelihood or frequency that users came into proximal contact. A third metric derived from CDR or GPS location data, the radius of gyration, quantifies the range over which a single person may travel in a specified time period. Importantly, the data required for these applications are non-identifiable; they cannot be used to identify any given individual’s interactions, but provide population-level insight into the average clustering and movement of individuals. These metrics, along with traditional OD matrix flows, were recently employed in Italy as a way to evaluate the impact of its national lockdown31. Traffic flow between provinces and probability of colocation were reduced initially in the northern provinces, where the COVID-19 outbreak was first observed, a clear signal of reactive social distancing. As the epidemic progressed, and especially once the national lockdown was enforced, the entire country saw a reduction in traffic between provinces; however, the probability of colocation remained highly dependent on province and was likely attributable to the number of cases reported in each province. Interestingly, the average distance traveled by individuals was significantly reduced across all provinces after the initial outbreak was confirmed.
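The radius of gyration is commonly computed as the root-mean-square distance of a person's recorded locations from their center of mass. The sketch below follows that definition under two simplifying assumptions that are ours, not the cited study's: the center of mass is taken as the arithmetic mean of latitude and longitude (reasonable only for small areas away from the poles and the antimeridian), and every record is weighted equally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_of_gyration_km(points):
    """Root-mean-square distance of (lat, lon) points from their mean position."""
    if not points:
        raise ValueError("at least one location is required")
    center_lat = sum(lat for lat, _ in points) / len(points)
    center_lon = sum(lon for _, lon in points) / len(points)
    mean_squared = sum(
        haversine_km(lat, lon, center_lat, center_lon) ** 2
        for lat, lon in points
    ) / len(points)
    return math.sqrt(mean_squared)


# Hypothetical locations for one device over one day
stay_at_home = [(45.0, 9.0), (45.0, 9.0), (45.0, 9.0)]
commuter = [(0.0, 0.0), (0.0, 1.0)]
rg_home = radius_of_gyration_km(stay_at_home)
rg_commuter = radius_of_gyration_km(commuter)
```

A device that never leaves one location has a radius of zero. Averaging the per-device values across a region gives a population-level indicator that does not expose any individual trajectory.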
The use of Bluetooth data (records of proximal interactions between Bluetooth-enabled devices) to quantify physical clustering or real-time density of subscribers at small spatial scales (e.g., zip codes) and fine temporal resolution has been explored for the purposes of contact tracing (see below). The use of these data has been considered less for population-level analyses, though it offers another source of information on behavioral changes under different NPIs. When activated, mobile phones will emit a Bluetooth beacon that is detected by other activated phones. When two Bluetooth-enabled devices are within range, the date, time, distance and duration of interaction can be recorded. The frequency or number of these interactions (analyzed anonymously to form, broadly, measures of clustering or proximal interaction rates over time) may be important given the role of sustained interaction or overcrowding of individuals32,33,34 and contact structure in SARS-CoV-2 transmission35. Furthermore, Bluetooth data in combination with GPS data or a network of Bluetooth sensors can be used to quantify the amount of time people spend at home or other identified locations when lockdown measures are in place to determine if policies are effective.
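A population-level proximal interaction rate of the kind described above could be derived roughly as follows. The record layout (day, two pseudonymous device identifiers, duration in minutes) and the 15-minute minimum duration are illustrative assumptions. A device counts as active on a day only if it has at least one qualifying interaction, so devices with no contacts are not represented in the denominator.

```python
from collections import defaultdict


def daily_contact_rate(encounters, min_duration_min=15):
    """Mean number of distinct contacts per active device, for each day.

    encounters: iterable of (day, device_a, device_b, duration_min).
    Interactions shorter than min_duration_min are ignored.
    """
    contacts = defaultdict(lambda: defaultdict(set))
    for day, device_a, device_b, duration_min in encounters:
        if duration_min < min_duration_min:
            continue
        contacts[day][device_a].add(device_b)
        contacts[day][device_b].add(device_a)

    return {
        day: sum(len(peers) for peers in devices.values()) / len(devices)
        for day, devices in contacts.items()
    }


# Hypothetical interaction log
encounters = [
    ("2020-03-01", "p1", "p2", 30),
    ("2020-03-01", "p1", "p3", 20),
    ("2020-03-01", "p2", "p3", 5),   # too short, ignored
    ("2020-03-02", "p1", "p2", 45),
]
rates = daily_contact_rate(encounters)
```

Only the per-day averages in `rates` are retained; the pairwise records themselves are not needed for population-level monitoring.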
These data and measures of population-level mobility or clustering patterns would be exceedingly difficult to collect on a similar scale without mobile phone data. These data are often continuously collected, in near real-time, allowing for continued analysis as an outbreak unfolds. Importantly, though, a baseline understanding of contact or clustering patterns prior to any interventions is necessary to inform estimates of intervention impact.
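The role of the baseline can be made concrete with a short sketch that expresses a mobility indicator as a percent change from its pre-intervention mean. The daily trip counts are invented for illustration. A real analysis would likely need a weekday-matched or seasonally adjusted baseline, which this sketch does not attempt.

```python
from statistics import mean


def percent_change_from_baseline(baseline_values, current_values):
    """Percent change of each current value relative to the baseline mean."""
    baseline = mean(baseline_values)
    if baseline == 0:
        raise ValueError("baseline mean must be non-zero")
    return [100.0 * (value - baseline) / baseline for value in current_values]


# Hypothetical daily trip counts before and during an intervention
pre_intervention = [1000, 1100, 900]
during_intervention = [800, 500, 400]
changes = percent_change_from_baseline(pre_intervention, during_intervention)
```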
Facilitating contact tracing
Opt-in applications (apps)36,37,38,39,40,41,42 that rely on digital approaches to enumerate and contact individuals who may have been in proximity with someone infected with COVID-19 have been proposed to increase efficiency and decrease the very large burden of manual contact tracing programs43,44,45. By enabling rapid tracing of perhaps higher proportions of affected individuals, these apps can reduce the amount of time that a potentially infected person would have to infect others, particularly in asymptomatic or pre-symptomatic phases of infection46. Most contact tracing apps collect Bluetooth and/or GPS location data to create trails of contacts over a moving time window (14-28 days). Unlike the data needed to understand population-level, aggregated behaviors described above, these data must be linked to single individuals and capture pairwise interactions with other identifiable individuals. Once a case has been identified, they are added to a list of infected users that is queried by the other phones in the network. If the infected user is detected in the trail of contacts, then the user and their contacts are alerted, either by the app or by a public health official, to initiate isolation and quarantine.
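The matching step described above can be sketched as a trail of contacts kept over a moving time window and checked against a list of infected users. The class, its method names, and the identifiers are hypothetical and do not reproduce the protocol of any specific app; real apps typically exchange rotating pseudonymous identifiers rather than the fixed strings used here.

```python
from datetime import date, timedelta


class ContactTrail:
    """Contacts seen by one device, kept over a moving time window."""

    def __init__(self, window_days=14):
        self.window = timedelta(days=window_days)
        self._last_seen = {}  # contact identifier -> most recent date seen

    def record(self, contact_id, day):
        """Store a contact, keeping only the most recent date for each identifier."""
        previous = self._last_seen.get(contact_id)
        if previous is None or day > previous:
            self._last_seen[contact_id] = day

    def prune(self, today):
        """Drop contacts last seen before the start of the window."""
        cutoff = today - self.window
        self._last_seen = {
            cid: day for cid, day in self._last_seen.items() if day >= cutoff
        }

    def exposures(self, infected_ids, today):
        """Return the in-window contacts that appear in the infected list."""
        self.prune(today)
        return sorted(cid for cid in self._last_seen if cid in infected_ids)


trail = ContactTrail(window_days=14)
trail.record("id-17", date(2020, 4, 1))
trail.record("id-42", date(2020, 4, 10))
trail.record("id-99", date(2020, 3, 1))   # outside the window by 12 April

infected = {"id-42", "id-99"}
matches = trail.exposures(infected, today=date(2020, 4, 12))
```

Here only `id-42` is returned: `id-99` is on the infected list but the contact is older than the 14-day window, and `id-17` is in the window but not on the list.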
This contact tracing process occurs either in a centralized manner, where user information is sent to a remote computer where matching occurs, or in a decentralized manner, where the matching process occurs on the user’s phone. In order for these approaches to feed directly into public health decision making, a direct line between the developers, public health response teams, and users needs to be put in place. This will also be key to mitigating any privacy concerns, which should be dealt with in a transparent and direct manner. Although there has been little discussion to date, routinely collected, individually-identifiable Bluetooth or fine-scale GPS location data may also be used to infer and quantify high-resolution proximity network structures which may further inform contact tracing efforts, but will also raise additional privacy concerns47,48.
Frameworks to process and analyze mobile phone data
Luckily, computing resources and methods to analyze and extract these data will not likely be the limiting factor in these instances. Groups such as Flowminder and Telenor Research Group have worked for multiple years to develop more streamlined processes to analyze these data, particularly aggregate mobility data, that are able to directly interface with mobile phone operators. Flowminder has produced a suite of CDR aggregates, such as counts of active subscribers per region or counts of travelers, that can then be used to calculate indicators of mobility, such as crowdedness, population mixing, locations of interest, and intra-/inter-regional travel49. The code to extract these metrics is publicly available at50. Telenor Research Group works directly with mobile phone operators to provide researchers with spatially aggregated CDR/mobility data51. Facebook’s Data For Good program provides aggregated mobility data to researchers that come from their subscribers, and companies like Cuebiq provided mobility data for a number of COVID-19 studies that summarize the distance users travel or the proportion of users that stay at home52. These existing frameworks – not only the analyses, but also the privacy considerations and data sharing agreements – will provide standardized methods that facilitate integrating mobility data into intervention assessments.
Various forms of identifiable personal information are generated when using mobile phones, including names, identification numbers, fine spatial and temporal data on where the device was used, other users’ identification numbers who may have been detected by Bluetooth, and personal details that might be entered into an app. In light of the growing number of digital privacy concerns and regulations, one must carefully consider the exact form and use of mobile phone data being collected against the legal and ethical need to protect users’ data security and confidentiality. While maintaining user confidentiality is often seen as a hindrance to the use of mobile phone data, in that it limits the use of individual-level data and typically requires aggregation to coarse spatial and temporal resolutions, there are a number of existing frameworks that can help provide guidance for the effective, privacy-conscious use of mobile phone data53.
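The aggregation to coarse spatial and temporal resolutions mentioned above might look like the following sketch, which counts distinct subscribers per region and day and suppresses small cells. The record layout and the minimum cell size of 3 are assumptions for illustration only. Aggregation and suppression on their own are not a formal privacy guarantee, and appropriate thresholds depend on the governing data sharing agreement.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_counts(records, min_count=3):
    """Count distinct subscribers per (region, day), suppressing small cells.

    records: iterable of (subscriber_id, iso_timestamp, region).
    Cells with fewer than min_count distinct subscribers are omitted.
    """
    cells = defaultdict(set)
    for subscriber_id, iso_timestamp, region in records:
        # Coarsen the timestamp to the calendar day
        day = datetime.fromisoformat(iso_timestamp).date().isoformat()
        cells[(region, day)].add(subscriber_id)

    return {
        cell: len(subscribers)
        for cell, subscribers in cells.items()
        if len(subscribers) >= min_count
    }


# Hypothetical records
records = [
    ("a", "2020-03-01T08:15:00", "North"),
    ("b", "2020-03-01T09:40:00", "North"),
    ("c", "2020-03-01T21:05:00", "North"),
    ("a", "2020-03-01T22:00:00", "North"),  # repeat subscriber, counted once
    ("d", "2020-03-01T10:00:00", "South"),  # single subscriber, suppressed
]
released = aggregate_counts(records)
```

The output contains neither subscriber identifiers nor exact timestamps, and the single-subscriber cell for the South region is withheld.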
Exactly which model of data privacy will best suit the use of mobile phone data for COVID-19 response will depend on the exact form and proposed use of the data. As discussed above, there already exist many data processing and analysis frameworks to provide anonymized indicators of population mobility. These standard procedures, though, could result in aggregated data with insufficient spatial and temporal resolution to be effective for monitoring the spread of SARS-CoV-2. Privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR)54, offer exceptions for the use of non-anonymous data that may be needed for other response efforts. For example, opt-in applications for contact tracing may seek consent of the data subject to collect and analyze identifiable data, though the ability to scale opt-in approaches to a wide enough population and to maintain user compliance and participation remains unclear. GDPR and other regulations also provide an exception for anonymization of data to be used in public service, but the regulatory hurdles to gain this exception can be substantial and would require clear use policies and applications for these data. The potential public health benefit of using mobile phone data, particularly in forms such as those proposed for contact tracing applications, must be weighed against the possible infringements of privacy and civil liberties.