How to Speak Data in Plain English

The ability to analyze and model data has improved dramatically over the past decade, but our ability to communicate it has not. Data science bootcamps and programs have sprung up promising 6-figure jobs in exchange for learning to code. For many analysts, though, coding is only the start. There’s a person on the other end of the project who wants to make business sense of the analysis and can’t follow along when we talk about analytics concepts like p-values or Random Forest models. The most important soft skill a data analyst can have is to be an effective communicator: to learn how to translate between “data-speak” and “business-speak.”

Below are 3 ways to improve the way we communicate data to a non-technical audience:

  1. Be Concise
  2. Be Simple
  3. Be Relatable

I’m in the midst of reading The Wizard of Lies, a book that details the rise and fall of Bernie Madoff’s $65B Ponzi scheme. For readers not familiar with finance, a Ponzi scheme is a fraud in which a manager pays “returns” to existing investors out of new clients’ money rather than from successfully investing in stocks or bonds.

Going through the book raises the question: How did no one pick up on this before it grew to a $65B problem and came toppling down?

The truth is, they did. Madoff was arrested in 2008, but as early as 1999, the SEC (Securities and Exchange Commission; the “Wall Street police”) received detailed reports about how Madoff’s performance couldn’t possibly be real. Why didn’t they listen and act? There are many reasons, but I suspect one might be the way the information was communicated.

Be Concise

Screenshot of SEC Complaint; Red Flag #5

In 2005, the same person who filed the 1999 complaint wrote and filed another report. It listed 29 red flags explaining why he thought Madoff was a fraud. On the surface, it might make sense that the more red flags there are, the more overwhelming the evidence of guilt.

However, let’s take a look at just one of the red flags, shown in the screenshot above (albeit one of the more number-intensive ones). Do you think you could interpret it? Could you keep track of all the numbers and figure out what they mean? Could you maintain your focus for the 24 red flags that follow afterward? Personally, my eyes and attention wane by the third or fourth sentence.

Humans are not designed to retain and process so much information at once. The 3 most impactful reasons for action can be more powerful than a list of all 29. What would you think about the case against Madoff if you were to only see the following 3 statements?

  • Over a 7-year time horizon, a fund comparable to Madoff’s had an objectively more conservative investment strategy yet had ~9x more months with negative returns and 34% lower returns per year.
  • Given the market performance, the likelihood of Madoff achieving his returns is less than 1 in 1T.
  • To have executed on his stated investing strategy, Madoff would need to have bought a higher dollar value of financial products than existed in the entire market in which he operated.

Fewer takeaways and data points force a reader to pay more attention and assign more weight to what you’ve written. You can be persuasive by valuing the “less is more” philosophy.

Lastly, notice how we don’t need to include technical jargon to make our point. In the second bullet, most non-technical readers won’t care how we arrived at the 1 in 1T (assuming the analysis is sound) but it will have its intended impact. Including technical detail for a non-technical audience adds unnecessary complexity. We want them to focus on the takeaway and why it matters rather than bog them down with the analytical process.

Takeaway on how to speak data: Don’t drown a non-technical audience in data or process. Give your most compelling 2–3 findings or reasons for action and provide backup data as requested.

Be Simple

You should always be asking yourself, “What is the easiest way to explain this finding?” Present your findings to someone who has no context about the problem and see if they can understand what you’re saying. Write what you want to say, then write it again with only 50% of the words. Empathize with your audience by considering whether you could understand what you’re writing or saying if you had no knowledge of analytics.

Let’s take the following statement in the complaint to the SEC…

“During the 87-month span analyzed, Madoff was down only 3 months versus GATEX being down 26 months. GATEX earned an annualized return of 10.27% during the period studied vs. 15.62% for Bernie Madoff…”

…and suggest an alternate way that might be easier to follow:

Over the 7 years studied, $100,000 invested with Madoff would have grown to $286,000, while the same amount invested with its competitor would have grown to only $203,000. Further, Madoff’s performance suggests that he was 90% more effective at avoiding negative returns than comparable funds.

Image for post

We could also consider including a simple chart like the above to emphasize the point.

What did we change to simplify the interpretation of the data? First, we changed months to years, since it’s more common to think of that amount of time in years. Then, instead of an annual return, we expressed performance in dollar values, which makes it easier for people to compare the two funds and appreciate the magnitude of the difference as it compounds over time. Finally, we presented the data on negative returns in a format that I believe better emphasizes the difference in performance and should make someone question how the difference could be 90%.
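The conversion described above is simple arithmetic and can be reproduced in a few lines. The following is a minimal sketch in Python, assuming the annualized returns quoted in the complaint compound over the full 87-month window (7.25 years) and a hypothetical $100,000 starting balance. The function names are illustrative and do not come from the complaint.

```python
def grow(principal, annual_return, months):
    """Value of `principal` after `months`, compounding at `annual_return` per year."""
    years = months / 12
    return principal * (1 + annual_return) ** years


def share_of_down_months_avoided(fund_down, benchmark_down):
    """Fraction by which the fund had fewer negative months than the benchmark."""
    return 1 - fund_down / benchmark_down


MONTHS = 87          # span analyzed in the SEC complaint
PRINCIPAL = 100_000  # assumed starting investment, for illustration only

madoff = grow(PRINCIPAL, 0.1562, MONTHS)        # 15.62% annualized
gatex = grow(PRINCIPAL, 0.1027, MONTHS)         # 10.27% annualized
avoided = share_of_down_months_avoided(3, 26)   # 3 down months vs. 26

print(f"Madoff: ${madoff:,.0f}")
print(f"GATEX:  ${gatex:,.0f}")
print(f"Negative months avoided: {avoided:.0%}")
```

Rounded to the nearest thousand, the two balances come out to roughly $286,000 and $203,000. The share of negative months avoided is about 88%, which the plain-English version rounds to 90%.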

We should remove numbers where possible and add interpretation. The author mentions 3 negative months vs. 26 and 10.27% annual returns vs. 15.62%, but what he’s really saying is that Madoff’s performance is uncharacteristically better than his competitor’s. For a non-technical audience, just give that answer and have the supporting data ready if necessary, rather than giving the numbers and assuming they’ll come to the same answer.

Takeaway on how to speak data: Avoid jargon, be empathetic, and where possible show — rather than tell — your data. Give an answer; don’t give data and assume the person will arrive at the same answer.

Be Relatable

A compelling way to communicate information is to relate the data to something your intended audience already knows well.

Should the SEC be familiar with hedge funds and the language of the industry? Sure, they should, but maybe the people assigned to the case were new to the agency. Did you know that the SEC’s turnover in 1999 (the year the first report was received) was ~15% for Compliance Examiners and another 14% in 2000? This implies that as much as ~30% of the staff may have been new when trying to act on the report. The author was a highly technical trader, while the examiners were competent attorneys with a working knowledge of finance.

Reframing the data means knowing your audience and considering a way to frame the problem in a context that they’ll appreciate and understand. As Securities investigators, the examiners might not have fully understood the Options market, but they should understand the basic principles of how a stock market works. The report author mentions a red flag about the Options market not being big enough to support Madoff’s strategy. An alternative way to communicate this could have been:

  • Madoff’s actions are equivalent to buying $1B or more of GE stock when the company is only worth $500M

If you want to take it outside of the finance industry, you could say:

  • In order for Madoff’s strategy to be legitimate, it would be as if a plane flew 150 customers to a destination when the plane could only fit 100 customers.

Another point of re-framing could be in the likelihood of achieving his monthly returns (the 1 in 1T). You could add context like: You would be 60,000x more likely to win the lottery than achieve the same returns as Madoff.
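A comparison like this one is just a ratio of two probabilities. Below is a minimal sketch in Python, assuming the 1-in-1T figure and the 60,000x multiple quoted above. The article does not state which lottery odds it used, so the code derives the odds that the multiple would imply rather than using any real lottery's figures. The helper name is hypothetical.

```python
from fractions import Fraction


def times_more_likely(p_reference, p_event):
    """How many times more likely the reference outcome is than the event."""
    return p_reference / p_event


P_MADOFF = Fraction(1, 10**12)  # "1 in 1T" chance of Madoff's returns
MULTIPLE = 60_000               # multiple quoted in the text

# Lottery odds implied by the two numbers above (an inference, not a sourced figure).
implied_lottery = P_MADOFF * MULTIPLE
one_in = 1 / implied_lottery

print(f"Implied lottery odds: about 1 in {float(one_in):,.0f}")
print(f"Check: {times_more_likely(implied_lottery, P_MADOFF)}x more likely")
```

Using exact fractions avoids floating-point noise when working with very small probabilities. In practice you would start from the real odds of a benchmark your audience knows and compute the multiple from there.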

We’re trying to show that the scale of what he was doing did not match the constraints of the market in terms of size or performance. We want someone to say “That doesn’t / can’t make sense” as we give the comparison. To do so, we need to remove the technical jargon and frame the problem in a way that emphasizes the imbalance in magnitude rather than simply reporting the facts. You can and should report the facts, but help a reader interpret those facts where possible by including comparisons, benchmarks, or alternative interpretations that relate to your key point.

Takeaway on how to speak data: Reframing the data is a helpful way to communicate the essence of what your data is saying and to engage your audience by moving the problem into a sphere that they better understand and can relate to.

Conclusion on How to Speak Data

Communication is the single most important way to influence and persuade with data. It means having a strong enough grasp of the data and modeling to perform sound analysis while being able to vary your communication style to match the sophistication of your audience.

We can’t expect that someone can (or should) have the same level of knowledge as us. Take time to orient your audience to your findings by properly explaining charts and giving them the time to process what you’ve shared. When I present, I oftentimes forget that I’ve been looking at my presentation and findings for a week or more while my audience is seeing it for the first time.

Finally, if you don’t know your audience well enough to gauge their analytics sophistication, assume it’s low and work your way up from there. While it may initially appear insulting, it’s significantly easier to scale up your technical explanations than to start too technical and lose your audience early.

Original post here – reposted with permission.

Jordan Bean

Jordan is an analytics professional with an interest in data storytelling, visualization, and simplifying complex topics into interpretable messages. He's currently pursuing a Master's degree in Business Analytics at Wake Forest University while working in analytics for Liberty Mutual Insurance. Prior to that, he worked in consulting for Private Equity firms on strategy and buyouts. Feel free to connect at https://www.linkedin.com/in/jordanbean/ to talk through any thoughts on articles or breaking into the data field.