How City Governments Can Establish an Effective Data Infrastructure

If a city government wants to become data-driven, it can’t make its workforce pull the extra weight. That’s why City of San Diego Chief Data Officer Maksim Pecherskiy was hired in late 2014. Pecherskiy helped build the city’s data portal to make government data available and easily usable for programmers, city officials, and the general public – all to help create an effective data infrastructure.

“Data should never be extra work — it should make your work easier,” Pecherskiy said during an ODSC Webinar. “We do everything we can to make sure that happens.”

Government workers often have to use disparate systems to make decisions and have to prioritize their work manually. Their software is usually outdated, expensive, and complicated. And their work is largely reactive rather than predictive.

San Diego’s data team wanted to use data to solve some of these painful, time-consuming problems in the simplest ways for them and for their users. Their goal was to solve each problem in a way that would make solving the next similar problem easier.

To establish any city’s data infrastructure, Pecherskiy said teams must do the following:

  1. Determine their approach.
  2. Decide what policies are in place and what policies should be changed.
  3. Determine what the data collection and use process is and how it should be changed.
  4. Work with the people who work with the data and with subject matter experts.
  5. Connect to the data source(s).
  6. Build middleware (to transfer information between the data source and the interface).
  7. Provide an interface for people and policymakers to use — whether that’s a web application, Tableau visualization, map or something else.
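Steps 5 through 7 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not San Diego's actual code: an in-memory SQLite table stands in for a city data source, and the `streetlights` table and the function names `connect_source`, `fetch_records`, and `render_csv` are hypothetical.

```python
import csv
import io
import sqlite3


def connect_source():
    """Step 5: connect to a data source (an in-memory SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE streetlights (id INTEGER, status TEXT)")
    conn.executemany(
        "INSERT INTO streetlights VALUES (?, ?)",
        [(1, "working"), (2, "out"), (3, "working")],  # made-up sample rows
    )
    return conn


def fetch_records(conn):
    """Step 6: middleware that turns source rows into plain dictionaries."""
    rows = conn.execute("SELECT id, status FROM streetlights ORDER BY id")
    return [{"id": row[0], "status": row[1]} for row in rows]


def render_csv(records):
    """Step 7: a minimal 'interface' that serves the records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "status"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_csv(fetch_records(connect_source())))
```

Keeping the middleware separate from both ends means the same records could feed a web application, a visualization, or a map without touching the source connection.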

In San Diego, Pecherskiy took inventory of what data the city had before he created a process for releasing it to the public. Governments collect data about everything from streetlights to trash cans to police cars, and they keep it in a number of different places and formats: SQL databases, desktops, shared drives, Oracle databases, and more.

Pecherskiy had to ensure the portal could access the city’s various data sources and could be automated to refresh that data and metadata on a schedule.
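One simple way to think about a scheduled refresh is a check for which datasets are due. The sketch below is only an illustration under assumed inputs; the dataset names, the refresh intervals, and the `datasets_due` function are hypothetical and do not come from the city's portal.

```python
from datetime import datetime, timedelta


def datasets_due(last_refreshed, intervals, now):
    """Return the names of datasets whose refresh interval has elapsed.

    last_refreshed: dataset name -> datetime of the last refresh
    intervals:      dataset name -> timedelta between refreshes
    now:            the current time, passed in so the check is testable
    """
    return sorted(
        name
        for name, refreshed_at in last_refreshed.items()
        if now - refreshed_at >= intervals[name]
    )


if __name__ == "__main__":
    # Hypothetical datasets and schedules.
    last = {
        "streetlights": datetime(2024, 1, 1, 0, 0),
        "budget": datetime(2024, 1, 1, 12, 0),
    }
    every = {"streetlights": timedelta(days=1), "budget": timedelta(days=7)}
    print(datasets_due(last, every, datetime(2024, 1, 2, 0, 0)))
```

A scheduler would run a check like this periodically and re-pull only the datasets it returns.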

His team built a custom database using an Amazon Simple Storage Service (S3) bucket and open source code from Philadelphia’s data officer. It provides data in various formats, including shapefiles and CSVs, so users of all ability levels can gain the insight they want. Pecherskiy and his team also run analyses for city departments that want to dig deeper than their own skill sets allow.

When working with government data, it is crucial that the information be actionable for city officials making decisions. So San Diego’s data team created a continuously updated performance dashboard that makes it easy for the mayor and council members to evaluate the city’s performance and decide where to allocate funds. A budget portal helps residents and public officials evaluate, forecast, and analyze budget policy decisions.

The goal in San Diego’s data strategy is to “let machines do what they do best so humans can do what they do best.”

For instance, the data team evaluated parking meter utilization to determine where meters weren’t worth the time and money of maintenance. In another case, data workers rode along with the city’s nine primary delivery trucks to optimize their routes so they could spend less time stuck in traffic.
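The article does not say how the team measured utilization. As a rough sketch, assume it is paid minutes divided by the minutes a meter was available, with a hypothetical 10 percent cutoff for flagging meters for review. The meter IDs and figures below are made up.

```python
def utilization(paid_minutes, available_minutes):
    """Share of available time that was paid for (assumed definition)."""
    if available_minutes <= 0:
        return 0.0  # a meter with no available time counts as unused
    return paid_minutes / available_minutes


def low_use_meters(meters, threshold=0.1):
    """Return IDs of meters whose utilization falls below the threshold.

    meters: meter ID -> (paid_minutes, available_minutes)
    threshold: hypothetical cutoff, not a figure from the city
    """
    return sorted(
        meter_id
        for meter_id, (paid, available) in meters.items()
        if utilization(paid, available) < threshold
    )


if __name__ == "__main__":
    sample = {"A": (30, 600), "B": (300, 600), "C": (0, 0)}
    print(low_use_meters(sample))
```

A real analysis would also weigh maintenance cost against meter revenue, which this sketch leaves out.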

Conflict mitigation tasks, when done manually by city departments, take 10 to 20 hours of work a month, and the results vary with the judgment of whoever is behind the computer. To solve that, the data team set up an Apache Airflow automation system that does the same work mechanically each night in three seconds.

When a city combines its raw data with static analysis, dynamic interaction, and human services, its resources can be used more effectively and result in better outcomes.

Key Takeaways:

  • Data should make life easier, not harder, when used as part of a city system, and those in charge of setting up that data system can’t impose more work on those who collect the data.
  • City data teams can follow a general seven-step plan to implement an effective data infrastructure in their governments.

Paxtyn Merten

Paxtyn is a student at Northeastern University studying journalism and data science.