Charlottesville’s Open Data Advisory Group and Astraea are challenging local data scientists to create predictive models for activity on the Downtown Mall based on data from its Wi-Fi network. Credit: Josh Mandell, Charlottesville Tomorrow
Thousands of people work, play, dine and socialize on Charlottesville’s Downtown Mall every day — and some of them connect their phones and computers to the mall’s free Wi-Fi network. 
The $7.5 million Downtown Mall rebricking and renovation project in 2009 included a contract with Blue Ridge InterNetworks to provide Wi-Fi. Last week, the city of Charlottesville released the mall’s Wi-Fi usage data for a contest that challenges data science professionals and enthusiasts to give insights into how people use the city’s signature public space.
The open data challenge is co-sponsored by the city’s Open Data Advisory Group and Astraea, a local software startup. Astraea data scientists will judge the entries and award two $500 prizes and Nvidia Titan Xp graphics cards for the projects that incorporate the best predictive model for Wi-Fi use and the best visual “data storytelling.”
“There are issues and challenges that Charlottesville is facing that [the city government] could partner with data scientists to solve,” said Daniel Bailey, co-founder and chief technology officer at Astraea. “There is a ton of local data science talent. We want to raise awareness of valuable public datasets and harness that talent for social good.”
The creators of the winning projects also will have the opportunity to present their findings at the Tom Tom Founders Festival’s Applied Machine Learning Conference on April 12.
The open data challenge is centered on aggregated, anonymized connection data from 2017. Bailey said more than 40,000 clients connected to the network last year, accounting for more than 330,000 user sessions.
The dataset can be broken down easily to show how many connections were made at each of the mall’s nine Wi-Fi access points last year. 
“That is one way to tell where people are at [on the Downtown Mall],” said Jason Ness, the city’s business development manager. 
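As a rough illustration of that kind of breakdown, the Python sketch below tallies sessions per access point. The record layout and the access point names are invented for the example; this article does not describe the format of the city's dataset.

```python
from collections import Counter

# Hypothetical session records as (access_point, client_id) pairs.
# These names and values are made up; the real dataset may look different.
sessions = [
    ("AP-1", "client-a"),
    ("AP-2", "client-b"),
    ("AP-1", "client-c"),
    ("AP-3", "client-a"),
    ("AP-1", "client-a"),
]


def sessions_per_access_point(records):
    """Count how many sessions were recorded at each access point."""
    return Counter(access_point for access_point, _client in records)


counts = sessions_per_access_point(sessions)

# List access points from busiest to quietest.
for access_point, total in counts.most_common():
    print(access_point, total)
```

A tally like this only shows where connections were made, not how many people were nearby without connecting.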
For the predictive model challenge, entrants will be asked to predict the network’s number of clients, number of sessions and usage in kilobytes for a week in 2017 that was intentionally left out of the dataset. 
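To make the task concrete, here is a minimal Python sketch of a simple baseline: predicting each day of a withheld week from the average of the known days that fall on the same weekday. The daily counts, the dates and the function name are all invented for illustration, and this is not the contest's scoring method or any entrant's model.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_baseline(daily_sessions, held_out_days):
    """Predict each held-out day as the mean of known days on the same weekday."""
    by_weekday = defaultdict(list)
    for day, count in daily_sessions.items():
        by_weekday[day.weekday()].append(count)
    return {day: mean(by_weekday[day.weekday()]) for day in held_out_days}


# Invented daily session counts for three weeks starting Monday, May 1, 2017.
# The middle week (offsets 7 to 13) is withheld, mimicking the contest setup.
start = date(2017, 5, 1)
known = {}
for offset in range(21):
    if 7 <= offset < 14:
        continue  # this week is "left out of the dataset"
    day = start + timedelta(days=offset)
    # Made-up pattern: busier weekends, slight growth on weekdays.
    known[day] = 1500 if day.weekday() >= 5 else 900 + 10 * (offset // 7)

held_out = [start + timedelta(days=offset) for offset in range(7, 14)]
predictions = weekday_baseline(known, held_out)

for day, estimate in predictions.items():
    print(day.isoformat(), estimate)
```

A baseline like this ignores weather, events and other outside factors, so it mainly serves as a yardstick for more capable models.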
Bailey said knowledge of machine learning methods likely would be necessary to build a predictive model accurate enough to win the $500 prize. 
Machine learning is a branch of artificial intelligence that focuses on enabling computers to recognize patterns without being programmed to perform specific tasks. Astraea is developing a machine learning platform for analyzing enormous collections of images captured by Earth-observing satellites. 
Bailey said a successful machine learning model would identify relationships between Wi-Fi usage and other factors that indicate and influence the level of activity on the Downtown Mall — such as weather data, parking ticket records, bus schedules and event calendars.
Some of this data can be found through Charlottesville’s Open Data Portal, which launched in August.
Public Works Director Paul Oberdorfer said predictive models for Downtown Mall activity based on Wi-Fi data could help city parks and recreation staff schedule maintenance projects for times that would cause the least disruption for businesses.
Oberdorfer said the data projects also could help downtown businesses determine what their optimal hours of operation might be.
“[The open data challenge] could provide some businesses with analytics beyond what they are currently capable of doing,” Oberdorfer said.
Ness said Charlottesville’s open data challenge is influenced by a recent experiment by Transport for London, the managing agency for the city’s famed Underground rapid transit system, nicknamed the Tube.
In 2016, Transport for London collected information about Tube ridership through Wi-Fi connection data in a four-week pilot study. The agency concluded that depersonalized data “… removes the need for costly, time-consuming surveys and means we can provide detailed customer information for specific times of the day, on individual lines, platforms and even trains.”
“We hope to support Charlottesville on the path to becoming a ‘smart city’ — one that uses data to drive decisions,” Bailey said. “This is an opportunity for Charlottesville to lead and continue that path.”
Complete rules for the open data challenge can be found at . 

Josh Mandell graduated from Yale in 2016 and has been recognized by the Virginia Press Association with five awards for education writing, health, science and environmental writing and multimedia reporting.