ODSC East US Attendees Visualization

Everyone – well, almost everyone – likes maps, especially maps with interesting data. The final product of an analysis hides the mountain of work that goes into its creation. In this case, all the blood, sweat, and tears come from those versed in the intricacies of geospatial data analysis. So, do you have to become an expert in the field if you want to create your own cool maps? Fortunately, that isn’t the case. There are a number of packages in several languages – folium for Python, Leaflet for JavaScript, and R’s rCharts, to name a few – that lower the barrier to entry. There are also several products that lower this barrier even further, providing a wealth of options that make the process a smooth ride. One of these is CartoDB, whose very own Santiago Giraldo will be at ODSC East to give a talk on data visualization. With some data wrangling and CartoDB’s software, it’s easy to construct a visual representation of ODSC East’s attendees within the United States.

A simple aggregation in R with dplyr produced attendee counts per state. From there I moved over to CartoDB and uploaded the data for the next, often dreaded, task: geocoding.
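For readers who want to see what this kind of per-state aggregation looks like, here is a minimal sketch. The original work was done in R with dplyr; this is a Python stand-in that uses only the standard library. The column names (`attendee_id`, `state`) and the sample rows are invented for illustration and are not the actual ODSC registration data.

```python
import csv
import io
from collections import Counter

# Hypothetical registration export: one row per attendee (made-up sample data).
SAMPLE_CSV = """attendee_id,state
1,MA
2,MA
3,NY
4,CA
5,MA
6,NY
"""


def count_attendees_by_state(csv_text):
    """Return (state, count) pairs sorted by descending count, then by state."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row["state"].strip().upper()
        for row in reader
        if row["state"].strip()  # skip rows with a blank state
    )
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_csv(rows):
    """Serialize the aggregate as a two-column CSV, ready to upload to a mapping tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["state", "attendees"])
    writer.writerows(rows)
    return buffer.getvalue()


state_counts = count_attendees_by_state(SAMPLE_CSV)
print(to_csv(state_counts))
```

A one-row-per-state table like this is a convenient shape for state-level geocoding, since each row only needs to be matched to a single state polygon.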

After only a couple of missteps, the geocoding at the state level was complete. A couple of clicks produced a choropleth map with a suitable color scheme and labels showing the number of attendees per state. The last task was to make the location of ODSC East, the Boston Convention and Exhibition Center, stand out.
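Under the hood, a choropleth assigns each region to a color class based on its value. The sketch below shows one common scheme, equal-interval classification, in Python. It is an illustration only: the classification CartoDB applied to this map is not stated here, and the counts and the light-to-dark palette are hypothetical.

```python
# Hypothetical attendee counts per state (made-up numbers for illustration).
COUNTS = {"MA": 120, "NY": 45, "CA": 30, "TX": 5}

# An arbitrary light-to-dark palette; any sequential color ramp would do.
PALETTE = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]


def equal_interval_class(value, low, high, n_classes):
    """Map a value in [low, high] to a class index in 0..n_classes-1."""
    if high == low:
        return 0  # all values identical: everything falls in one class
    width = (high - low) / n_classes
    index = int((value - low) / width)
    # The maximum value would land at index n_classes, so clamp it to the top class.
    return min(index, n_classes - 1)


low, high = min(COUNTS.values()), max(COUNTS.values())
classes = {
    state: equal_interval_class(count, low, high, len(PALETTE))
    for state, count in COUNTS.items()
}
colors = {state: PALETTE[index] for state, index in classes.items()}

for state in COUNTS:
    print(state, COUNTS[state], colors[state])
```

With skewed data like event attendance, where the host state dominates, equal intervals push most states into the lightest class; quantile-based classes are a common alternative when that hides too much detail.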

After getting the coordinates of the venue from Google Maps, some more poking around put it front and center in CartoDB. You can see the resulting map here.

Gordon Fleetwood

Gordon studied Math before immersing himself in Data Science. Originally a die-hard Python user, he watched R's tidyverse ecosystem gradually subsume his workflow until only scikit-learn remained untouched. He is fascinated by the elegance of robust data-driven decision making in all areas of life, and is currently involved in applying these techniques to the EdTech space.
