User-centric performance metrics

Philip Walton

We've all heard how important performance is. But when we talk about performance, and about making websites "fast", what specifically do we mean?

The truth is that performance is relative:

A site might be fast for one user (on a fast network with a powerful device) but slow for another user (on a slow network with a low-end device).
Two sites might finish loading in the exact same amount of time, but one can seem to load faster if it loads content progressively, instead of waiting until the end to display anything.
A site might appear to load quickly, but then respond slowly, or not at all, to user interaction.

When talking about performance, it's important to be precise and to describe it in terms of metrics: objective criteria that can be measured quantitatively. It's equally important to make sure the metrics you measure are useful.

Metrics

Historically, web performance has been measured with the load event. However, even though load is a well-defined moment in a page's lifecycle, that moment doesn't necessarily correspond with anything the user cares about.

For example, a server could respond with a minimal page that "loads" immediately, but then defers fetching content or displaying anything on the page until several seconds after the load event fires. Such a page technically has a fast load time, but that time doesn't correspond to how a user experiences the page loading.

Over the past few years, members of the Chrome team, in collaboration with the W3C Web Performance Working Group, have been working to standardize a set of new APIs and metrics that more accurately measure how users experience the performance of a web page.

To help ensure the metrics are relevant to users, we frame them around a few key questions:

Is it happening?	Did the navigation start successfully? Has the server responded?
Is it useful?	Has enough content rendered that users can engage with it?
Is it usable?	Can users interact with the page, or is it busy?
Is it delightful?	Are the interactions smooth and natural, free of lag and jank?

How metrics are measured

Performance metrics are generally measured in one of two ways:

In the lab: using tools to simulate a page load in a consistent, controlled environment
In the field: on real users actually loading and interacting with the page

Neither of these options is necessarily better or worse than the other. In fact, you generally want to use both to ensure good performance.

In the lab

Testing performance in the lab is essential when developing new features. Before a feature ships to production, you can't measure its performance on real users, so lab testing is the best way to catch performance regressions before release.

In the field

On the other hand, while testing in the lab is a reasonable proxy for performance, it isn't necessarily reflective of how all users experience your site.

The performance of a site can vary dramatically based on a user's device capabilities and network conditions. It can also vary based on whether (or how) a user is interacting with the page.

Page loads are also not always deterministic. For example, sites that load personalized content or ads can experience vastly different performance characteristics from user to user. A lab test won't capture those differences.

The only way to truly know how your site performs for your users is to actually measure its performance as those users load and interact with it. This type of measurement is commonly called Real User Monitoring (RUM).
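Because field data varies so much from user to user, it usually has to be summarized before it can be compared or tracked. The following is a minimal sketch of one way to do that, assuming samples have already been collected from real page loads. The `FieldSample` shape, the `summarize` helper, the sample values, and the choice of the nearest-rank percentile method are all illustrative assumptions, not part of any standard API.

```typescript
// Hypothetical shape for one field sample; not part of any standard API.
interface FieldSample {
  metric: string;  // for example "LCP"
  valueMs: number; // measured value in milliseconds
}

// Returns the p-th percentile (0 < p <= 100) using the nearest-rank method.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("percentile() needs at least one value");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarizes all samples of one metric at a chosen percentile.
function summarize(samples: FieldSample[], metric: string, p: number): number {
  const values = samples
    .filter((s) => s.metric === metric)
    .map((s) => s.valueMs);
  return percentile(values, p);
}

// Made-up sample data for illustration only.
const samples: FieldSample[] = [
  { metric: "LCP", valueMs: 1200 },
  { metric: "LCP", valueMs: 1800 },
  { metric: "LCP", valueMs: 2400 },
  { metric: "LCP", valueMs: 5200 },
  { metric: "FCP", valueMs: 900 },
];

console.log(summarize(samples, "LCP", 50)); // 1800
console.log(summarize(samples, "LCP", 75)); // 2400
```

Reporting a higher percentile rather than an average is one way to keep the slowest experiences visible instead of letting fast page loads hide them.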

Types of metrics

Several types of metrics are relevant to how users perceive performance:

Perceived load speed: how quickly a page can load and render all of its visual elements to the screen.
Load responsiveness: how quickly a page can load and execute any JavaScript code required for components to respond quickly to user interaction.
Runtime responsiveness: how quickly a page can respond to user interaction after it loads.
Visual stability: do elements on the page shift in ways that users don't expect, potentially interfering with their interactions?
Smoothness: do transitions and animations render at a consistent frame rate and flow fluidly from one state to the next?

Given all these types of performance metrics, it's hopefully clear that no single metric is sufficient to capture all the performance characteristics of a page.

Important metrics to measure

First Contentful Paint (FCP): The time from when the page starts loading to when any part of the page's content is rendered on the screen. (lab, field)
Largest Contentful Paint (LCP): The time from when the page starts loading to when the largest text block or image element is rendered on the screen. (lab, field)
Interaction to Next Paint (INP): A metric that observes the latency of every tap, click, or keyboard interaction made with the page and, based on the number of interactions, reports the worst (or close to worst) interaction latency as a single, representative value to describe the page's overall responsiveness. (lab, field)
Total Blocking Time (TBT): The total amount of time between FCP and Time to Interactive (TTI) where the main thread was blocked for long enough to prevent input responsiveness. (lab)
Cumulative Layout Shift (CLS): The cumulative score of all unexpected layout shifts that happen between when the page starts loading and when its lifecycle state changes to hidden. (lab, field)
Time to First Byte (TTFB): The time it takes for the network to respond to a user request with the first byte of a resource. (lab, field)
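To make a few of these definitions concrete, here is a simplified sketch of the arithmetic behind TBT, CLS, and INP. It works on plain data rather than live browser entries. The `Task` and `LayoutShift` shapes, the 50 ms blocking threshold, the plain sum used for CLS, and the rule of skipping one outlier per 50 interactions for INP are assumptions made for illustration; production tooling may apply more detailed rules.

```typescript
// Hypothetical, simplified records; real browser entries carry more fields.
interface Task { startMs: number; durationMs: number; }
interface LayoutShift { value: number; hadRecentInput: boolean; }

// Assumption: only the part of a task beyond 50 ms counts as blocking.
const BLOCKING_THRESHOLD_MS = 50;

// TBT: sum the blocking portion of every task that starts between FCP and TTI.
function totalBlockingTime(tasks: Task[], fcpMs: number, ttiMs: number): number {
  return tasks
    .filter((t) => t.startMs >= fcpMs && t.startMs < ttiMs)
    .reduce((sum, t) => sum + Math.max(0, t.durationMs - BLOCKING_THRESHOLD_MS), 0);
}

// CLS (simplified): add up the shifts that were not caused by recent user input.
function cumulativeLayoutShift(shifts: LayoutShift[]): number {
  return shifts
    .filter((s) => !s.hadRecentInput)
    .reduce((sum, s) => sum + s.value, 0);
}

// INP (simplified): skip one outlier per 50 interactions, then take the worst remaining.
function interactionToNextPaint(latenciesMs: number[]): number | undefined {
  if (latenciesMs.length === 0) return undefined;
  const sorted = [...latenciesMs].sort((a, b) => b - a);
  const skip = Math.min(Math.floor(latenciesMs.length / 50), sorted.length - 1);
  return sorted[skip];
}

// Made-up inputs for illustration only.
const tasks: Task[] = [
  { startMs: 1000, durationMs: 30 },  // short task, contributes 0
  { startMs: 1200, durationMs: 120 }, // contributes 70
  { startMs: 2000, durationMs: 70 },  // contributes 20
  { startMs: 6000, durationMs: 300 }, // after TTI, ignored
];
const shifts: LayoutShift[] = [
  { value: 0.25, hadRecentInput: false },
  { value: 0.5, hadRecentInput: true }, // expected shift, ignored
  { value: 0.125, hadRecentInput: false },
];

console.log(totalBlockingTime(tasks, 900, 5000));       // 90
console.log(cumulativeLayoutShift(shifts));             // 0.375
console.log(interactionToNextPaint([40, 80, 350, 120])); // 350
```

With only four interactions, the INP sketch reports the single worst latency. Once a page has 50 or more interactions, it skips the very worst ones, which matches the "worst (or close to worst)" wording in the definition.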

This list includes metrics measuring many of the various performance aspects relevant to users, but it doesn't include everything. For example, runtime responsiveness and smoothness aren't covered.

In some cases, new metrics will be introduced to cover missing areas, but in other cases, the best metrics are ones specifically tailored to your site.
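As one illustration of a site-specific measurement for an uncovered area such as smoothness, the sketch below estimates how many frames in an animation ran noticeably long. It assumes a 60 Hz display and that frame timestamps have already been collected (for example, from `requestAnimationFrame` callbacks). The `slowFrameRatio` name and the 1.5x slack factor are hypothetical choices, not an established metric.

```typescript
// Assumption: a 60 Hz display, so each frame has a budget of about 16.7 ms.
const FRAME_BUDGET_MS = 1000 / 60;

// Returns the fraction of frame intervals that exceeded the budget by more
// than `slackFactor`. `timestampsMs` holds one timestamp per rendered frame.
function slowFrameRatio(timestampsMs: number[], slackFactor = 1.5): number {
  if (timestampsMs.length < 2) return 0;
  let slow = 0;
  for (let i = 1; i < timestampsMs.length; i++) {
    const delta = timestampsMs[i] - timestampsMs[i - 1];
    if (delta > FRAME_BUDGET_MS * slackFactor) slow++;
  }
  return slow / (timestampsMs.length - 1);
}

// Made-up timestamps: four intervals of 16, 17, 50, and 17 ms.
console.log(slowFrameRatio([0, 16, 33, 83, 100])); // 0.25
```

A ratio like this is only meaningful relative to your own baseline, which is why it works better as a custom measurement than as a general-purpose metric.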

Custom metrics

The performance metrics listed here are good for getting a general understanding of the performance characteristics of most sites on the web. They also give sites a common set of metrics for comparing their performance against competitors.

However, there are times when a specific site is unique in some way that requires additional metrics to capture the full performance picture. For example, the LCP metric is intended to measure when a page's main content has finished loading, but there might be cases where the largest element isn't part of the page's main content, making LCP irrelevant.

To address such cases, the Web Performance Working Group has also standardized lower-level APIs that can be useful for implementing your own custom metrics.

See the Custom Metrics guide to learn how to use these APIs to measure performance characteristics specific to your site.
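As a small taste of what a custom metric can look like, the sketch below uses the User Timing API (`performance.mark()` and `performance.measure()`) to time a piece of work that matters to a particular page. The text above does not name specific APIs, so the choice of User Timing is an assumption. The mark names, the `hero-render` measure name, and the helper functions are hypothetical, and the code assumes an environment where `performance.measure()` returns the created entry.

```typescript
// Hypothetical mark names; pick names that describe your own page's main content.
const START_MARK = "hero:start";
const END_MARK = "hero:rendered";

// Records the moment the work you care about begins.
function markHeroStart(): void {
  performance.mark(START_MARK);
}

// Records the end of that work and returns the elapsed time in milliseconds.
function measureHero(): number {
  performance.mark(END_MARK);
  const measure = performance.measure("hero-render", START_MARK, END_MARK);
  return measure.duration;
}

markHeroStart();
// ...the page renders its main content here...
const heroRenderMs = measureHero();
console.log(`hero-render took ${heroRenderMs.toFixed(1)} ms`);
```

A value measured this way can be reported alongside the standard metrics, which helps when LCP does not line up with the content your users actually came for.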