Recently, Andreas von der Heydt, Merchandising VP at Chewy, shared an image on LinkedIn that generated a lot of buzz about data storytelling. The image compared raw data to LEGO bricks and a fully assembled LEGO house to a data story.
I had seen a variation of the same image several years ago. I was able to trace the origin of the first four steps to a visual created by Hot Butter Studio co-founders Brandon Rossen and Karyn Lurie. Their image focused on infographics and was meant to be read as “an infographic is data sorted, arranged, and presented visually.” Von der Heydt cropped out the infographic part and then added a fifth step with a LEGO house (Creator 5198 Apple Tree House) and the caption, “Explained with a story.”
While I like the overall concept, there are a couple of minor flaws and a major omission that must be addressed to better represent the process of moving from raw data to a data story. I do believe Von der Heydt was on to something with his analogy, as it clearly resonated with many people. As a LEGO fan and an advocate for data storytelling, I felt duty-bound to re-examine the analogy and develop it a little more.
With my re-interpretation of the analogy, I went in a slightly different direction but retained the same number of steps. In brief, my five steps are collecting the data, preparing it, visualizing it, analyzing it, and finally telling the data story.
The journey of going from raw data to a data story is a process. Successful data storytelling doesn’t begin at step five—it begins right at the beginning with the data you collect. Let’s start building a deeper understanding of these crucial steps by reviewing each one in more detail.
Note: All of the visuals in this post were based on the LEGO Creator Small Cottage set (271 pieces).
Today, most organizations collect a lot of data. Similarly, over the course of multiple birthday and holiday gifts, a household can accumulate tons of LEGO pieces. Like LEGO bricks, data comes in various forms and can be used to build all kinds of things. If you leave data or LEGO bricks in their raw form, they don’t serve a purpose other than to collect dust and take up storage space. It’s only when they’re combined that they begin to transform into something meaningful or useful.
In the same way, you’ll have data from a wide variety of source systems in your department or organization. Whether the data remains siloed in these systems or is aggregated in a data lake or data warehouse, your pile of data will continue to expand over time.
Rather than storing the assorted LEGO pieces in a random pile, it’s better to organize them by color, shape, size, or function. During this process, you can remove non-LEGO items or even broken LEGO pieces from the pile. Depending on what you’re attempting to build, you may need to combine LEGO pieces from more than one LEGO set.
Before you can use the data you’ve collected, it must go through a similar process of cleansing, organizing, and combining. A significant amount of time and effort can be spent on just making data usable before it can be visualized, analyzed, and turned into data stories.
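For readers who like to see the bricks themselves, here is a minimal sketch of what cleansing, organizing, and combining might look like in Python. Everything in it is an assumption made up for illustration: the two source systems (`crm_rows` and `order_rows`), the field names, and the helper functions are hypothetical, and a real data preparation pipeline would involve far more than this.

```python
from collections import defaultdict

# Hypothetical raw records from two made-up source systems.
crm_rows = [
    {"customer_id": "C1", "region": " west "},
    {"customer_id": "C2", "region": "East"},
    {"customer_id": "C2", "region": "East"},   # duplicate piece
    {"customer_id": None, "region": "North"},  # "broken" piece with no ID
]
order_rows = [
    {"customer_id": "C1", "amount": "120.50"},
    {"customer_id": "C2", "amount": "80"},
    {"customer_id": "C1", "amount": "n/a"},    # unusable amount
]


def clean_customers(rows):
    """Cleanse: drop rows without an ID, tidy region text, remove duplicates."""
    cleaned = {}
    for row in rows:
        if not row["customer_id"]:
            continue
        cleaned[row["customer_id"]] = row["region"].strip().title()
    return cleaned


def clean_orders(rows):
    """Cleanse: keep only orders whose amount parses as a number."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append((row["customer_id"], amount))
    return cleaned


def revenue_by_region(customers, orders):
    """Combine and organize: join orders to customers, then total by region."""
    totals = defaultdict(float)
    for customer_id, amount in orders:
        region = customers.get(customer_id)
        if region is not None:
            totals[region] += amount
    return dict(totals)


customers = clean_customers(crm_rows)
orders = clean_orders(order_rows)
summary = revenue_by_region(customers, orders)
print(summary)
```

Even in this toy version, most of the code is spent sorting and discarding pieces rather than building anything, which mirrors how much effort data preparation tends to take.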
Now, at this point, you could go rummaging through these sorted piles of LEGO bricks and start to create something in an ad-hoc fashion. However, it will be time-consuming to comb through the bricks even when they have been organized into piles by color or function.
In this image, I have organized the LEGO bricks in a more methodical manner by size, shape, function, and color. For example, I have put all the windows and doors at the top next to each other, and all the slanted roof pieces in descending order of size. The bricks have also been spread out so it’s easier to determine what you have to work with, and you can quickly pinpoint the bricks you need as you’re building.
Similarly, once you have clean data, raw data tables won’t be as useful as reports with data charts and graphs that provide better visual context. Data visualization can help you to see the data more clearly and easily explore the information to find potential insights. For example, a well-designed dashboard can help you examine the data from both a high-level (breadth) and a more detailed (depth) perspective.
At this stage, when you begin analyzing the data for answers to specific business questions, you should have a clear purpose and a narrow focus. When you build a LEGO creation, you must have a clear idea of what you’re building. Are you creating a car, a boat, or a plane? Yes, your 3-in-1 LEGO set may allow you to build all three vehicles but not all three at the same time.
When you’re building a LEGO creation, either from a set of instructions or just from your imagination, you’ll always build it in stages. Along the way, you may decide to discard an unwanted subassembly or rebuild it a different way. Likewise, much of your analysis work won’t be fruitful and will need to be discarded. When you do identify a key observation or insight, it is similar to one of these subassemblies that form a part of your desired LEGO structure (e.g., a roof assembly or a wall with a door for your house).
Most of the time with LEGOs, you’re building something for yourself. However, there are situations when you might construct something for someone else such as a sibling, parent, or friend. In contrast, when we perform data analysis we are most often performing it to benefit others—our manager, team, department, company, and so on. The more you know about your key stakeholders’ interests or needs, the more targeted your analysis can be and the more valuable your insights will be.
In both cases, your intended audience will inform what you build with LEGOs and the focus of your analysis and subsequent data story. For example, if you were looking to build something out of LEGOs for a younger brother who loves skateboarding, his passion would guide what you create with the LEGO pieces. Yes, you can build anything with LEGOs, but having a target audience in mind will focus your efforts.
Similarly, as you start exploring the data, you may uncover a variety of interesting findings. Some of your observations and insights will be more important or relevant to your audience than others. If you understand their business goals or problems, it can guide your analysis and prioritize which insights you focus on—and ultimately, what goes into your data story.
Even if you’ve assembled the LEGO bricks into some interesting but basic subassemblies, you won’t have a cohesive experience yet. However, when you bring all the separate parts together to create a house with a skateboarding rail and a relatable skateboarder LEGO minifigure (i.e., humanizing the data), you now have something compelling that your younger brother will want to play with.
Similarly, a set of observations and insights will be incomplete if they don’t have an overarching narrative that binds them together. When you add relevant context and meaning to the numbers, your audience will be engaged and enlightened by your insights. As you explain your insights with data stories, you better prepare the audience to make informed decisions and take action.
Data storytelling is the final step at the end of a multi-step process. The quality of your data stories will depend on what happens at each preceding step. If we fail to tell the data story effectively, all the prior work can go for naught. You’ll notice I intentionally included someone’s hand playing with the LEGO house because, ultimately, we need our data stories to be embraced to drive action. I firmly believe effective data stories can inspire change.
As I mentioned at the beginning, I like the LEGO data story analogy. However, I feel Von der Heydt’s image has some flaws that I’ve tried to address in my version. Here are the three main challenges I uncovered while examining the original image.
In my rendition of the LEGO data story image, I used the analogy in an exclusively abstract way and not a literal one. The first and last steps in my version are almost identical to the original one created by Von der Heydt. However, the first challenge I have with his version is its co-mingling of literal and abstract examples. In particular, the transition from a colorful column chart (literal) to the house (abstract) in my view isn’t consistent or harmonious. You either need to be literal or abstract, and by mixing the approaches, you lose something in translation. I did have fun coming up with the sample visualizations below, but they're too literal for this analogy.
The second challenge I have with the LEGO data story image is the weakness of the Arranged step. In its original context of infographics, I can see how the arrangement of visuals in an infographic is an important layout consideration. However, in the new data story context, the Arranged step doesn’t tie to a key step in the process and feels unnecessary. In fact, you could argue it isn’t that different from the Presented Visually step (bricks assembled by color). For my version, I wanted to ensure each step reinforced a major step in the process.
My greatest issue with the original image is that it is missing a crucial step: analysis. Just because you’ve visualized data, it doesn’t mean you have found an insight. If you only have interesting observations but no insights, you aren’t ready to tell a data story yet. You need to progress from ‘what’ to ‘why,’ which is only possible with analysis. Without a central insight or main takeaway, your data story will always be incomplete (imagine a real story with no climax). By not explicitly focusing on analysis, Von der Heydt’s image is missing a vital component.
Despite these issues, I still like the LEGO data storytelling analogy. It combines a passion from my childhood with another from my adult life. Both LEGO building and data storytelling demand planning, creativity, attention to detail, and problem-solving skills. As I mention in my book, analogies such as this one can be powerful mental shortcuts that can quickly generate shared understanding. American attorney Dudley Field Malone once stated, “One good analogy is worth three hours of discussion.”
I hope my small contribution to the LEGO data storytelling analogy can help advance further understanding of this important topic. I know I’m not the only LEGO maniac turned data storyteller, so hopefully it resonates with others.