How Data Scientists Can Enable Small Businesses to Utilize Their Data

There were 6 million businesses in the United States in 2016. Industry giants like Walmart and Amazon have hundreds of thousands or millions of employees on their own.

Still, only 0.3 percent of U.S. companies have more than 500 employees. The vast majority — 79 percent — have fewer than 10.

Even companies with just 1.5 full-time employees need the benefits of data science, said datalab.cc founder Barton Poulson during his lecture at ODSC East 2017. Data-centered recommendations that raise a company’s operating budget by even 0.1 percent can make a huge difference to its bottom line. And when a company’s operating budget is just $60,000 a year, those extra funds can go far.

Medium and small companies and nonprofits often collect data, but rarely employ a chief data officer or data staff. This means there is immense opportunity for data scientists and enthusiasts to do something meaningful for those organizations, Poulson said.

Poulson detailed five steps for data scientists to begin working more with the “other 99 percent” of companies:

  • Acknowledge these other companies
  • Learn about how they operate
  • Adapt to them
  • Connect with them
  • Enable them to do data work on their own

Most small and medium businesses are not designed for rapid growth: They are designed for service, sustainability, and a steady income. Their biggest daily technical concerns are spreadsheets, CRM, CMS, social media marketing, email marketing, web analytics, and SEO. By and large, these companies only use Excel for their data-related tasks.


Drew Conway’s data science Venn diagram

That changes the kind of analysis and recommendations data scientists can and should make for them. Poulson pointed out that Drew Conway’s data science Venn diagram shows the intersection of coding, statistics, and domain expertise, but he said people in data science tend to focus on just two parts of it: coding and machine learning (which lies at the intersection of statistics and coding). Domain expertise is all but forgotten, even though it is a defining feature of data science.


When working for small and medium businesses, that domain knowledge is vital, and machine learning and coding are seldom part of the picture. Data scientists working as consultants must therefore adapt to serve these businesses well.

Small and medium companies often want to use their data to answer simple, yes-or-no questions, like whether they should sell a certain product or stay open certain days. Thus it is important that data workers do the simplest possible analysis that will answer that question adequately. By and large, this analysis will include just sums, counts, and percentage calculations in Excel, plus tables, pivot tables, and subgroup analysis using filters.
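The same sums, counts, percentages, and pivot-style subgroup breakdowns can be sketched in a few lines of plain Python for readers who prefer a script to a spreadsheet. This is only an illustration, not something shown in the talk: the sales records, the column names (day, product, amount), and the helper names summarize and pivot are all invented for the example.

```python
from collections import defaultdict

# Hypothetical sales records, invented for illustration: one row per sale.
SALES = [
    {"day": "Sat", "product": "coffee", "amount": 120.0},
    {"day": "Sat", "product": "pastry", "amount": 80.0},
    {"day": "Sun", "product": "coffee", "amount": 30.0},
    {"day": "Sun", "product": "pastry", "amount": 20.0},
    {"day": "Mon", "product": "coffee", "amount": 150.0},
]


def summarize(rows, key):
    """Group rows by `key`; return count, sum, and percent of total revenue."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["amount"]
        counts[row[key]] += 1
    grand_total = sum(totals.values())
    return {
        group: {
            "count": counts[group],
            "total": totals[group],
            "percent": round(100 * totals[group] / grand_total, 1)
            if grand_total
            else 0.0,
        }
        for group in totals
    }


def pivot(rows, row_key, col_key):
    """Sum `amount` for every (row_key, col_key) pair, like a small pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_key]][row[col_key]] += row["amount"]
    return {r: dict(cols) for r, cols in table.items()}


if __name__ == "__main__":
    # Share of revenue per day: a starting point for "should we open Sundays?"
    for day, stats in summarize(SALES, "day").items():
        print(day, stats)
    # Subgroup analysis: filter to the weekend, then break down by product.
    weekend = [r for r in SALES if r["day"] in ("Sat", "Sun")]
    print(pivot(weekend, "day", "product"))
```

With this made-up data, Sunday accounts for a small share of revenue, which is the kind of plain figure an owner can weigh against the cost of staying open. The numbers inform the yes-or-no decision; they do not make it.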

To communicate these analyses in data visualizations, consultants should stick to three charts for nearly all of their viz: bar charts, scatter plots, and line charts. Simple charts without distracting interactivity or video are better for these situations.


Avoid complex data viz when working with small businesses.

Once data analysts understand how smaller companies and nonprofits work and adapt to them, they can connect and work with them. Doing so can be beneficial to the data scientist, too, because it opens a space to practice using different kinds of data and answering different questions.

It also provides an opportunity to network. Analysts working at large organizations could even be paid by their own employer to consult for nonprofits, with the work counted as a charitable donation, Poulson said.

In the end, data analysts looking to support small and medium businesses and nonprofits should help them learn to do data analysis on their own so they become self-sufficient. This can be achieved through thorough documentation, including how-to videos, and by saving files in generic formats (.txt and .csv files) in a location accessible to the organization.
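One possible way to follow the generic-formats advice is to write each result as a .csv file with a short plain-text note beside it. The sketch below uses only Python's standard library; the function name export_results, the file names, and the wording of the note are assumptions made for illustration, not a method described in the talk.

```python
import csv
from pathlib import Path


def export_results(rows, out_dir, name="results"):
    """Save rows (a non-empty list of dicts sharing the same keys) as CSV,
    plus a plain-text note listing the columns and the row count."""
    if not rows:
        raise ValueError("rows must contain at least one record")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Column order follows the keys of the first record.
    columns = list(rows[0].keys())
    csv_path = out / f"{name}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)

    # A short note the organization can read without any special software.
    note_path = out / f"{name}_README.txt"
    note_path.write_text(
        f"File: {csv_path.name}\n"
        f"Rows: {len(rows)}\n"
        f"Columns: {', '.join(columns)}\n"
        "Open the .csv file in Excel or any spreadsheet program.\n",
        encoding="utf-8",
    )
    return csv_path, note_path
```

Calling export_results(summary_rows, "shared_folder"), where summary_rows is a list of dictionaries and "shared_folder" stands in for whatever location the organization can reach, would leave two files that open in Excel and any text editor.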

Groups that already do work like this include datakind.org, Data Analysts for Social Good, and the Michigan Ann Arbor Data Dive, a hackathon where participants use nonprofits’ data. Otherwise, enthusiastic analysts can connect with a company or nonprofit they’re familiar with or do so through a Small Business Development Center.

Key takeaways:

  • The vast majority of U.S. companies are small and medium-sized, and they need to use data in different ways than companies known for hiring data analysts do.
  • Data analysts should learn how these smaller businesses operate, adapt to them, and provide them with tools to perform their own analytics to have the most positive impact.
  • Simple is almost always better with small businesses: simple analysis tools, simple charts, simple answers, and simple solutions.

Paxtyn Merten

Paxtyn is a student at Northeastern University studying journalism and data science.