When a newer field emerges within data science and the research is still hard to grasp, it’s sometimes best to talk to the experts and pioneers of that field. Recently, we spoke with Elijah Meeks, data visualization expert and Chief Innovation Officer at Noteable, about complex data visualizations and their uses. You can listen to the full Lightning Interview here, and read the transcript for the first few questions with Elijah Meeks below.
What is complex data visualization?
It’s a really interesting question and a really interesting concept. The way I got into data visualization was by starting with geospatial data visualization, like maps, cartography, and GIS and that’s a very well-established field. Fundamentally what I mean is not bar charts, line charts, pie charts, histograms, and all the other stuff.
When you think about it, maps are extremely complex data visualizations, but we don’t think of them that way. Here in America, you take a geography course in the sixth or seventh grade, and you’re exposed to this very abstract representation of the world. There are multiple ways you could represent the world, not just the Cartesian plane where you place points or polygons.
I think that because I started out in GIS that gave me a bit more of an expectation for what people could handle and for what tools should be able to do, because if you look at a tool like ArcGIS or QGIS or something like that, there’s so much complexity there. There’s an expectation that people are going to invest not just in learning the complexity, but also learning how to read it and that kind of gives you a target for what you can do with other forms that maybe people are less familiar with.
I came to Stanford as someone who had a background in GIS, and I did a lot of GIS work and more general cartography early on. There was a day when somebody wanted me to work on a project, and they said, “We want you to work on it because you’re the map guy.” In the back of my head, something didn’t sound right about that, and I bristled at the concept because I didn’t want to be the expert in one thing.
I didn’t have the language for it at the time, but I wanted to be considered an information designer, not just a person who specialized in one tool. So at that point, I got this bug in my ear to go after network visualization, so I actually went from being the map guy to being this network visualization enthusiast. So I got up to speed on network analysis, got involved with Gephi which is one of these open-source tools for looking at complex graphs or complex networks, and did a lot of work there.
I went from being the map guy to being the network guy, and then I moved on to the more generic question of how we approach the visual display of quantitative information, with bar charts, line charts, and things like that.
What I found when I became the network guy was the scariest thing about data visualization, especially complex data visualization. This is a challenge that most people I think don’t recognize until they really start to use it, which is that if you take this really complex data set and you create a network out of it, and you put it in front of experts, experts are going to see patterns in that and they are going to be excited by it, even if it turns out you made a fundamental mistake in how you were building the network itself.
What I discovered was that people are really attracted to and excited by complex visualization in general, especially networks, and then they become very scared and very disillusioned as soon as they find out you were wrong. That pattern-seeking ability that makes human beings so powerful can be misused or miswired by the wrong chart.
I think that’s one of the reasons why people are so nervous about leveraging complex data visualization. It’s not so much that you can’t read it, which is one of the problems, or that you’re not familiar with it. It’s that even when you can read it, you can find things in it without being well prepared to understand the caveats, because it can be such an exciting visual form and you can see the patterns that you want to see. We run into that with all kinds of data visualization.
Why is there a need for complex data visualization?
If you’re trying to tell a story, you want to speak in a language your audience understands. What is a story if I’m speaking to you in a language that you don’t comprehend? The same thing applies to data visualization.
What I really mean is that there’s a numerically precise category of data visualizations. We hear about this in data visualization research, when people talk about optimization and about being able to measure lengths more easily than angles. When people get so uncomfortable about pie charts, what we’re really talking about is numerical precision. A line chart, a bar chart, or a histogram is very numerically precise, but what that means is you can only make decisions based on numerical data. Someone who’s a little unfamiliar with the forms of data might say, “Well, what other kind of data is there?”
I think the best example of that really is network data, even though it’s very complex. For instance, I could take all of the people that you’re connected to based on the meetings you’ve had over the last few years. I could look at your calendar and look at who’s involved in meetings, and I could make a bar chart of the person who had the most meetings, and that would give me some indication of effort and power in an organization, but I could take that same dataset and I could actually connect you to all the people that you were in meetings with and I could connect them to all the people that they were in meetings with and I could from that discover that there are some people who have very few meetings but they’re in meetings with the people who have meetings with powerful people.
You can find out that people are connected to each other in interesting ways that indicate power beyond the raw magnitude of the time they’re spending in meetings. That’s a topological kind of data, and you’re only going to see topological patterns in a topological representation. So when we boil things down to numerically precise representations, what we’re boiling away is the geographic, topological, hierarchical, or textual patterns of the data that we couldn’t display, because those aren’t amenable to bar charts, line charts, and other numerically precise forms.
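To make the meeting example concrete, here is a minimal Python sketch. It is an illustration only, not something from the interview: the calendar data, the names, and the "average contact activity" measure are all hypothetical. It computes the number a bar chart would show (meetings per person) and then one simple topological measure from the same data (how busy each person's direct contacts are).

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical calendar: each entry lists the attendees of one meeting.
MEETINGS = [
    ["ceo", "vp_a", "vp_b"],
    ["ceo", "vp_a"],
    ["ceo", "vp_b"],
    ["vp_a", "eng1"],
    ["vp_a", "eng2"],
    ["vp_b", "eng3"],
    ["vp_b", "eng4"],
    ["advisor", "ceo"],
    ["eng1", "eng2"],
    ["eng3", "eng4"],
]


def meeting_counts(meetings):
    """The bar-chart view: how many meetings each person attended."""
    return Counter(person for attendees in meetings for person in attendees)


def build_network(meetings):
    """The network view: link every pair of people who shared a meeting."""
    network = defaultdict(set)
    for attendees in meetings:
        for a, b in combinations(attendees, 2):
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def average_contact_activity(meetings):
    """Illustrative topological measure (an assumption of this sketch):
    the mean meeting count of each person's direct contacts."""
    counts = meeting_counts(meetings)
    network = build_network(meetings)
    return {
        person: sum(counts[c] for c in contacts) / len(contacts)
        for person, contacts in network.items()
    }


counts = meeting_counts(MEETINGS)
activity = average_contact_activity(MEETINGS)

# Rank people by raw meeting count, then show the network measure beside it.
for person in sorted(counts, key=counts.get, reverse=True):
    print(f"{person:8s} meetings={counts[person]}  "
          f"avg contact activity={activity[person]:.1f}")
```

In this made-up data, the advisor sits at the bottom of the meeting-count ranking but has the busiest contacts, which is the kind of pattern a bar chart of meeting counts alone would hide. A production analysis would more likely use an established centrality measure from a graph library rather than this hand-rolled one.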
There are a lot of other kinds of data, like how people move through systems, or how different processes aggregate in a way that indicates the performance of a system, which we use a flame graph for. I think these examples get lost when people try to boil them down to a numerical metric that somehow represents those values.
How to learn more about complex data visualization
It’s easy to create bar charts and graphs, but a lot more goes into creating complex data visualization or designs that tell the complete story. By attending ODSC East 2023 this May 9th-11th and checking out the data visualization track, you’ll learn tangible skills to make your data shine and make the most impact. Register now while tickets are 70% off for a limited time!