The ultimate guide to analyzing data: Part 1
How to monitor business performance and make sense of data
Whether you’re in Marketing, Operations or Product, data can help you make better decisions. Unfortunately, most companies are better at collecting data than making sense of it.
Based on my experience turning data into actionable insights at Rippling, Meta and Uber, I’ve put together this step-by-step playbook on how to leverage data to understand business performance and identify risks and opportunities in a structured way.
I’m including a ton of hands-on examples inspired by hard-earned lessons over the last 10 years so you don’t have to make the same mistakes as me.
Since there is a lot to cover, this will be a two-part post. In this part, I’ll go over the following:
What metrics to track: How to establish the revenue equation and driver tree for your business
How to track: How to set up monitoring and avoid common pitfalls. We’ll cover how to choose the right time horizon, deal with seasonality, master cohorted data and more!
Part 2 covers the following:
Extracting insights: How to identify issues and opportunities in a structured and repeatable way. We’ll go over the most common types of trends you’ll come across, and how to make sense of them.
Sounds simple enough, but the devil is in the details, so let’s dive into them one by one.
Chapter 1: Establishing your business’ revenue equation and key drivers
The very first step is to figure out what metrics to track. By starting with the revenue equation for your business, you can make sure you focus on the metrics that actually drive business outcomes.
The exact equation depends on the type of business. Lenny Rachitsky and Dan Hockenmaier wrote a great piece that shows the equations for the most common B2B and B2C business models which you can use as a starting point.
Start with the high-level equation (e.g. “Revenue = Impressions * CPM / 1000” for an ads-based business) and then break each part down further to get to the underlying drivers.
The resulting driver tree, with the output at the top and inputs at the bottom, tells you what drives results in the business and what dashboards you need to build so that you can do end-to-end investigations.
Example: Here is a (partial) driver tree for an ads-based B2C product:
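If it helps to see the same idea in code rather than a diagram, here’s a minimal sketch of a partial driver tree expressed as nested Python functions. The specific drivers and numbers (DAU, time spent, ad load, CPM) are illustrative assumptions, not the exact tree from the chart:

```python
# Illustrative, partial driver tree for an ads-based B2C product.
# Output (revenue) at the top, underlying drivers below. All numbers are made up.

def impressions_per_user(time_spent_min: float, impressions_per_min: float) -> float:
    """Impressions per user, broken down into time spent and ad load."""
    return time_spent_min * impressions_per_min

def daily_ad_revenue(dau: float, time_spent_min: float,
                     impressions_per_min: float, cpm: float) -> float:
    """Revenue = Impressions * CPM / 1000, with impressions broken down further."""
    impressions = dau * impressions_per_user(time_spent_min, impressions_per_min)
    return impressions * cpm / 1000

# Example: 2M DAU, 30 min/day, 0.5 ads per minute, $5 CPM -> $150k/day
print(daily_ad_revenue(dau=2_000_000, time_spent_min=30,
                       impressions_per_min=0.5, cpm=5.0))
```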
Things to note:
Ambiguity: There isn’t just one right revenue equation or driver tree. For example, impressions per user could be expressed as a function of sessions instead of time spent. Which one makes the most sense depends on how your product and business model work.
Completeness: Having every single input represented in the driver tree is not the goal. To monitor high-level performance of the business, keep it focused on the key drivers and only go a few levels deep. Once you perform deep dives (more on that next week), you can expand it ad-hoc in the area of interest.
Understanding leading and lagging metrics
The equation for your business might make it seem like the inputs translate immediately into the outputs, but in reality most of the business outcomes you care about take time to materialize.
The most obvious example is a Marketing & Sales funnel: You generate leads, they turn into qualified opportunities, and finally the deal closes. Depending on your business and the type of customer, this can take many months.
In other words, if you are looking at an outcome metric such as revenue, you are often looking at the result of actions you took weeks or months earlier.
As a rule of thumb, the further down you go in your driver tree, the more of a leading indicator a metric is; the further up you go, the more of a lagging metric you’re dealing with.
Quantifying the lag
It’s worth looking at historical conversion windows to understand what degree of lag you are dealing with.
That way, you’ll be better able to work backwards (if you see revenue fluctuations, you’ll know how far back to go to look for the cause) as well as project forward (you’ll know how long it will take for the impact of initiatives to materialize).
In my experience, developing rules of thumb (e.g. does it take a day or a month, on average, for a new user to become active?) will get you 80-90% of the value, so there is no need to over-engineer this.
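If you want to put a rough number on the lag, a few lines of pandas are usually enough. Here’s a sketch, assuming a hypothetical users table with signup_date and activation_date columns:

```python
import pandas as pd

# Hypothetical users table; activation_date is NaT for users who never activated
users = pd.DataFrame({
    "signup_date": pd.to_datetime(["2024-01-02", "2024-01-05", "2024-01-10", "2024-01-12"]),
    "activation_date": pd.to_datetime(["2024-01-03", "2024-01-20", None, "2024-01-13"]),
})

# Days from signup to activation, ignoring users who haven't activated yet
lag_days = (users["activation_date"] - users["signup_date"]).dt.days.dropna()

# A median plus a "90% of users are done by day X" figure is usually all you need
print("median lag (days):", lag_days.median())
print("p90 lag (days):", lag_days.quantile(0.9))
```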
Chapter 2: Setting up monitoring and avoiding common pitfalls
So you have your driver tree; how do you use this to monitor the performance of your business?
The first step is setting up a dashboard to monitor the key metrics. I am not going to dive into a comparison of the various BI tools you could use (I might do that in a separate post in the future). Everything I’m talking about in this post can easily be done in Google Sheets, so your choice of BI software won’t be a limiting factor.
Instead, I want to focus on a few best practices that will help you make sense of the data and avoid common pitfalls.
1. Choosing the appropriate time frame for each metric
While you want to pick up on trends as early as possible, you need to be careful not to fall into the trap of looking at overly granular data and trying to draw insights from what is mostly noise.
Consider the time horizon of the activities you’re measuring and whether you’re able to act on the data:
Real-time data is useful for a B2C marketplace like Uber because
1) transactions have a short lifecycle (an Uber ride is typically requested, accepted and completed within less than an hour), and
2) Uber has the tools to respond in real time (e.g. surge pricing, incentives, driver comms).
In contrast, in a B2B SaaS business, daily Sales data is going to be noisy and less actionable due to long deal cycles.
You’ll also want to consider the time horizon of the goals you are setting against the metric. If your teams have monthly goals, then the default view for these metrics should be monthly.
The main problem with monthly metrics (or even longer time periods) is that you have few data points to work with and you have to wait a long time until you get an updated view of performance.
One compromise is to plot metrics on a rolling average basis: this way, you pick up on the latest trends while removing a lot of the noise by smoothing the data.
Example: Looking at the monthly numbers on the left-hand side, we might conclude that we’re in a solid spot to hit the April target; looking at the 30-day rolling average, however, we notice that revenue generation fell off a cliff (and we should dig into this ASAP).
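If you’re working in pandas rather than a spreadsheet, the rolling average is a one-liner. A sketch with placeholder file and column names:

```python
import pandas as pd

# Hypothetical daily revenue export with columns: date, revenue
daily = (pd.read_csv("daily_revenue.csv", parse_dates=["date"])
           .set_index("date")
           .sort_index())

# A 30-day rolling average smooths out day-to-day noise while keeping recent trends visible
daily["revenue_30d_avg"] = daily["revenue"].rolling(window=30).mean()

# Plot raw daily revenue against the smoothed trend (requires matplotlib)
daily[["revenue", "revenue_30d_avg"]].plot()
```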
2. Setting benchmarks
In order to derive insights from metrics, you need to be able to put a number into context.
The simplest way is to benchmark the metric over time: Is the metric improving or deteriorating? Of course, it’s even better if you have an idea of the exact level you want the metric to be at.
If you have an official goal set against the metric, great. But even if you don’t, you can still figure out whether you’re on track or not by deriving implied goals.
Example: Let’s say you have a quota for your Sales team, but you don’t have an official goal for how much pipeline you need to generate to hit quota.
In this case, you can look at the historical ratio of open pipeline to quota (“Pipeline Coverage”), and use this as your benchmark.
Be aware: By doing this, you are implicitly assuming that performance will remain steady (in this case, that the team is converting pipeline to revenue at a steady rate).
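Here’s what deriving that implied goal could look like in pandas, with made-up numbers:

```python
import pandas as pd

# Hypothetical history: open pipeline at the start of each quarter vs. the quota for that quarter
history = pd.DataFrame({
    "quarter": ["2023Q1", "2023Q2", "2023Q3", "2023Q4"],
    "open_pipeline": [2_400_000, 2_100_000, 2_800_000, 2_600_000],
    "quota": [800_000, 750_000, 900_000, 850_000],
})

# Pipeline coverage = open pipeline / quota; the historical average becomes the benchmark
history["coverage"] = history["open_pipeline"] / history["quota"]
benchmark = history["coverage"].mean()  # roughly 3x in this made-up data

# Implied pipeline goal for the current quarter, assuming conversion rates stay steady
current_quota = 1_000_000
print(f"implied pipeline goal: ${benchmark * current_quota:,.0f}")
```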
3. Accounting for seasonality
In almost any business, you need to account for seasonality to interpret data correctly. In other words, does the metric you’re looking at have repeating patterns by time of day / day of week / time of month / calendar month?
Example: Look at this monthly trend of new ARR in a B2B SaaS business:
If you look at the drop in new ARR in July and August in this simple bar chart, you might freak out and start an extensive investigation.
However, if you plot each year on top of the others, you can spot the seasonality pattern: there is an annual summer lull, and you can expect business to pick up again in September:
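One way to build that year-over-year overlay in pandas (file and column names are placeholders):

```python
import pandas as pd

# Hypothetical monthly new ARR export with columns: month (a date), new_arr
arr = pd.read_csv("monthly_new_arr.csv", parse_dates=["month"])
arr["year"] = arr["month"].dt.year
arr["month_of_year"] = arr["month"].dt.month

# Pivot so each year becomes its own line, indexed by calendar month
overlay = arr.pivot_table(index="month_of_year", columns="year", values="new_arr")

# Plotting the years on top of each other makes the summer lull easy to spot
overlay.plot()
```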
But seasonality doesn’t have to be monthly; it could be that certain weekdays have stronger or weaker performance, or you typically see business picking up towards the end of the month.
Example: Let’s assume you want to look at how your Sales team is doing in the current month (April). It’s the 15th business day of the month and you’ve brought in $26k so far against a goal of $50k. Ignoring seasonality, it looks like you’re going to miss since you only have 6 business days left.
However, you know that the team tends to bring a lot of deals over the finish line at the end of the month.
In this case, we can plot cumulative sales and compare against prior months to make sense of the pattern. This allows us to see that we’re actually in a solid spot for this time of the month since the trajectory is not linear.
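A sketch of that cumulative month-to-date comparison in pandas, assuming a hypothetical closed-deals table with close_date and amount columns:

```python
import numpy as np
import pandas as pd

# Hypothetical table of closed deals with columns: close_date, amount
deals = pd.read_csv("closed_deals.csv", parse_dates=["close_date"])
deals["month"] = deals["close_date"].dt.to_period("M")

# Business day of the month on which each deal closed (1 = first business day)
month_start = deals["close_date"].values.astype("datetime64[M]").astype("datetime64[D]")
close_day = deals["close_date"].values.astype("datetime64[D]")
deals["business_day"] = np.busday_count(month_start, close_day) + 1

# Cumulative sales by business day, one column per month, for an apples-to-apples pacing view
pacing = (deals.groupby(["month", "business_day"])["amount"].sum()
               .groupby(level="month").cumsum()
               .unstack(level="month"))
pacing.plot()
```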
4. Dealing with “baking” metrics
One of the most common pitfalls in analyzing metrics is to look at numbers that have not had sufficient time to “bake”, i.e. reach their final value.
Here are a few of the most common examples:
User acquisition funnel: You are measuring the conversion from traffic to signups to activation; you don’t know how many of the more recent signups will still convert in the future
Sales funnel: Your average deal cycle lasts multiple months and you do not know how many of your open deals from recent months will still close
Retention: You want to understand how well a given cohort of users is retaining with your business
In all of these cases, the performance of recent cohorts looks worse than it actually is because the data is not complete yet. If you don’t want to wait, you generally have three options for dealing with this problem:
Option 1: Cut the metric by time period
The most straightforward way is to cut aggregate metrics by time period (e.g. first-week conversion, second-week conversion, etc.). This allows you to get an early read while keeping the comparison apples-to-apples and avoiding a bias towards older cohorts.
You can then display the result in a cohort heatmap. Here’s an example for an acquisition funnel tracking conversion from signup to first transaction:
This way, you can see that on an apples-to-apples basis, our conversion rate is getting worse (our week-1 CVR dropped from > 20% to c. 15% in recent cohorts). By just looking at the aggregate conversion rate (the last column) we wouldn’t have been able to distinguish an actual drop from incomplete data.
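Building such a cohort heatmap takes only a few lines of pandas. A sketch with hypothetical table and column names:

```python
import pandas as pd

# Hypothetical signups table: signup_date, first_transaction_date (NaT if no transaction yet)
users = pd.read_csv("signups.csv", parse_dates=["signup_date", "first_transaction_date"])
users["cohort_week"] = users["signup_date"].dt.to_period("W")

# Full weeks from signup to first transaction (NaN for users who haven't converted yet)
weeks_to_convert = (users["first_transaction_date"] - users["signup_date"]).dt.days // 7

cohort_sizes = users.groupby("cohort_week").size()

# Share of each cohort converting within 1, 2, 3, 4 weeks of signing up.
# Note: cells for cohorts younger than w weeks are not fully "baked" yet.
heatmap = pd.DataFrame({
    f"week {w}": users[weeks_to_convert < w].groupby("cohort_week").size() / cohort_sizes
    for w in range(1, 5)
}).fillna(0)

print(heatmap)  # rows = signup cohorts, columns = cumulative conversion rate
```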
Option 2: Change the metric definition
In some cases, you can change the definition of the metric to avoid looking at incomplete data.
For example, instead of looking at how many of the deals that entered the pipeline in March have closed so far, you could look at how many of the deals that closed in March were won vs. lost. This number will not change over time, whereas you might have to wait months for the final performance of the March deal cohort.
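In code, the two definitions might look like this (hypothetical deals table with created_date, closed_date, and an outcome column):

```python
import pandas as pd

# Hypothetical deals table: created_date, closed_date (NaT if still open), outcome ("won"/"lost")
deals = pd.read_csv("deals.csv", parse_dates=["created_date", "closed_date"])
march = pd.Period("2024-03", freq="M")

# "Baking" metric: share of deals created in March that have been won so far (keeps changing)
created_in_march = deals[deals["created_date"].dt.to_period("M") == march]
print("won so far:", (created_in_march["outcome"] == "won").mean())

# Redefined metric (stable): win rate among deals that closed in March
closed_in_march = deals[deals["closed_date"].dt.to_period("M") == march]
print("March win rate:", (closed_in_march["outcome"] == "won").mean())
```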
Option 3: Forecasting
Based on past data, you can project where the final performance of a cohort will likely end up. The more time passes and the more actual data you gather, the more the forecast will converge to the actual value.
But be careful: forecasting cohort performance is easy to get wrong. E.g. if your B2B business has low win rates, a single deal might meaningfully change the performance of a cohort, and forecasting this accurately is very difficult.
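To make the idea concrete, here is one simple (and deliberately naive) way to project a young cohort’s final conversion: scale its early numbers by the average maturation of older, fully baked cohorts. This is just a sketch with made-up numbers, not a recommendation for businesses where a single deal can swing the result:

```python
import pandas as pd

# Hypothetical cohort heatmap: cumulative conversion by weeks since signup.
# The first two cohorts are fully baked; the newest one only has week-1 data so far.
heatmap = pd.DataFrame(
    {"week 1": [0.20, 0.22, 0.15], "week 2": [0.28, 0.30, None], "week 3": [0.31, 0.33, None]},
    index=["2024-01", "2024-02", "2024-03"],
)

# Average maturation of baked cohorts: how much week-1 conversion grows by week 3
baked = heatmap.dropna()
uplift = (baked["week 3"] / baked["week 1"]).mean()

# Naive projection of the newest cohort's final (week-3) conversion
projected = heatmap.loc["2024-03", "week 1"] * uplift
print(f"projected week-3 conversion: {projected:.1%}")  # ~22.9% with these made-up numbers
```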
Further Reading
This is Part 1 of a two-part post.
Check out Part 2 where we start leveraging the data to extract insights!