Amount of data needed

This section can help you build a sense of how much data you need. The guidance about the amount of data needed is rough and directional because the true answer depends on what the data is like. The most accurate way to assess this is to run the model and evaluate the width of the credible intervals.

  • Data size is the number of geos times the number of time points.

  • These time points and geos are not independent. For example, 1,000 data points in a marketing mix modeling (MMM) setting isn't the same as something like 1,000 coin flips or 1,000 randomly assigned participants in an experiment.

Also see the sections for national models and geo models.

Amount of data for national models

In Meridian's national model, the effects are modeled with model parameters, each with independent priors. For national models, a key confidence check metric is the number of data points per model parameter. For example, if you have 12 media channels, six controls, and eight knots, the total is 26 parameters. (Ignore Adstock and Hill parameters for simplicity.) With two years of weekly data (104 data points), you have four data points per parameter. This sample size is too low to estimate the model reliably. (Additionally, insufficient variation in the media spend adversely impacts national models.) For more information about knots, see How the knots argument works.
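The arithmetic in this example can be sketched in a few lines of Python. The function name `data_points_per_parameter` is a hypothetical helper for illustration, not part of Meridian, and it ignores Adstock and Hill parameters in the same way the example does.

```python
def data_points_per_parameter(n_time_points, n_media, n_controls, n_knots):
    """Rough ratio for a national model (Adstock and Hill parameters ignored)."""
    n_parameters = n_media + n_controls + n_knots
    return n_time_points / n_parameters


# Two years of weekly data, 12 media channels, six controls, eight knots.
ratio = data_points_per_parameter(
    n_time_points=104, n_media=12, n_controls=6, n_knots=8
)
print(f"{ratio:.1f} data points per parameter")
```

A low ratio from a check like this is only a warning sign. The width of the credible intervals after fitting remains the more accurate assessment.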

Because it is difficult to get enough data for a national model, you can do the following:

  • Reduce the scope of the MMM. You can estimate fewer media channels (either by dropping a channel with low spend or by combining channels), use fewer knot parameters to estimate time effects (if you aren't using the default knots=1 setting), and remove any extraneous controls. However, don't remove important confounders.

  • Get much more data. For example, use three years of weekly data instead of two. Adding more data will reduce the variance in inference, but might make the inference less relevant.

  • Alternatively, consider adding geo granularity to your data and using a geo model instead of lowering the scope or adding more data.

Consider the previous hypothetical example for the national model. You can combine the 12 media channels into three and lower your knots to two. You might also recognize that one of your controls explains the KPI but not the media, which means that it is not a true confounder and you can remove it, leaving five controls. If you also use three years' worth of weekly data, you then have 156 data points to estimate 10 parameters. This is roughly 15 data points per parameter, and now you might be able to glean some directional information from the MMM.
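To compare the two setups side by side, the following sketch recomputes both ratios from the counts in the text. The dictionary layout and the scenario labels are illustrative choices, not Meridian inputs.

```python
# Counts taken from the two hypothetical national-model setups in the text.
scenarios = {
    "original": {"weeks": 104, "media": 12, "controls": 6, "knots": 8},
    "reduced scope": {"weeks": 156, "media": 3, "controls": 5, "knots": 2},
}

ratios = {}
for name, s in scenarios.items():
    # Adstock and Hill parameters are ignored for simplicity.
    n_params = s["media"] + s["controls"] + s["knots"]
    ratios[name] = s["weeks"] / n_params
    print(f"{name}: {s['weeks']} points / {n_params} parameters = {ratios[name]:.1f}")
```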

Amount of data for geo models

The number of data points per effect estimated by the geo model remains an important metric for checking confidence. However, counting the number of effects using the number of model parameters is not as straightforward in the geo model as it is in the national model. Complexity arises because the geo hierarchy shares information across geos, which makes the geo-level parameters dependent rather than independent. The amount of information shared across these geos plays a role.

For example, 105 geos and three years of weekly data yield $105 \times 156 = 16,380$ data points. If you estimate 12 media channels, six controls, and 100 knots, you can evaluate data sufficiency through two lenses (ignore Adstock and Hill parameters for simplicity):

  • Strict (No-Pooling) View: If you assume no information sharing across geos, you must estimate $(12 \times 105) + (6 \times 105) + 100 + (105 - 1) = 2,094$ parameters. (You multiply by 105 because media and controls have geo-level parameters.) This yields about 8 data points per parameter and represents a strict lower bound.
  • Lenient (Perfect-Pooling) View: If you assume perfect information sharing (each media and control channel has one common parameter), the model has $12 + 6 + 100 + (105 - 1) = 222$ parameters. This yields about 74 data points per parameter, which is the calculation used by the Data-to-parameter ratio in the EDA package.

Each view counts the same media and control parameters differently:

  • Media parameters: Counted as $12 \times 105 = 1,260$ parameters in the strict view (independent geo-level parameters) but only $12$ parameters in the lenient view (one common national parameter).
  • Control parameters: Counted as $6 \times 105 = 630$ parameters in the strict view (independent geo-level parameters) but only $6$ parameters in the lenient view (one common national parameter).

Both views count the knots and geo parameters the same way:

  • Knots: Counted as $100$ parameters, one for each knot.
  • Geo parameters: Counted as $105 - 1 = 104$ parameters, one for each of the 105 geos except the baseline geo.
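The two views can be reproduced with a short calculation. This is a minimal sketch of the counting described above, using hypothetical variable names. It is not the EDA package's implementation, and it ignores Adstock and Hill parameters.

```python
n_geos, n_weeks = 105, 156
n_media, n_controls, n_knots = 12, 6, 100

n_data_points = n_geos * n_weeks
n_geo_params = n_geos - 1  # one per geo, excluding the baseline geo

# Strict (no-pooling) view: media and controls get one parameter per geo.
strict_params = (n_media + n_controls) * n_geos + n_knots + n_geo_params

# Lenient (perfect-pooling) view: one common parameter per media or control.
lenient_params = n_media + n_controls + n_knots + n_geo_params

strict_ratio = n_data_points / strict_params
lenient_ratio = n_data_points / lenient_params

print(f"data points: {n_data_points}")
print(f"strict:  {strict_params} parameters, {strict_ratio:.1f} points each")
print(f"lenient: {lenient_params} parameters, {lenient_ratio:.1f} points each")
```

Only the media and control terms differ between the two counts, so the gap between the ratios grows with the number of geos.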

In reality, because Meridian uses partial pooling (hierarchical modeling), the actual effective "data points per parameter" is somewhere in between the strict 8 and the lenient 74. The actual amount of information shared depends on how similar the parameters are across geos, which is determined by the data and the hierarchical variance parameters (eta_m and xi_c).

The only way to determine the hierarchical variance parameters (eta_m and xi_c) is by actually fitting the model. For this reason, we avoid prescribing a single "correct" minimum ratio. Instead:

  • The strict calculation serves as a useful thought exercise to understand the potential complexity and the worst-case scenario.
  • The EDA package uses the lenient calculation as a practical guardrail to identify severe data scarcity, where the model would be under-determined even under perfect pooling.

If you are having difficulty getting enough data for a geo-level model, we recommend combining media channels or dropping a media channel with low spend. Alternatively, you can put a more regularizing prior on the hierarchical variance terms eta_m and xi_c, for example, HalfNormal(0.1), which encourages sharing information across geos.
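To get a feel for how tight a HalfNormal(0.1) prior is, the sketch below summarizes a half-normal distribution using only the Python standard library. It relies on two standard facts: the mean of a half-normal with scale $\sigma$ is $\sigma\sqrt{2/\pi}$, and its CDF is $2\Phi(x/\sigma) - 1$. The comparison scale of 1.0 is an arbitrary wider prior chosen for illustration, and `half_normal_summary` is a hypothetical helper, not a Meridian function.

```python
from math import pi, sqrt
from statistics import NormalDist


def half_normal_summary(scale):
    """Return the mean and 95th percentile of a HalfNormal(scale) distribution."""
    mean = scale * sqrt(2 / pi)
    # Solve 2 * Phi(x / scale) - 1 = 0.95 for x.
    q95 = scale * NormalDist().inv_cdf((1 + 0.95) / 2)
    return mean, q95


for scale in (0.1, 1.0):
    mean, q95 = half_normal_summary(scale)
    print(f"HalfNormal({scale}): mean = {mean:.3f}, 95% of mass below {q95:.3f}")
```

A prior that concentrates eta_m and xi_c near zero moves the model toward the perfect-pooling end of the range described earlier. How far it moves still depends on the data.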

Can I use campaign-level data?

The Meridian model is focused on the channel level. We generally don't recommend running at the campaign level because MMM is a macro tool that works well at the channel level. If you use distinct campaigns that have hard starts and stops, you risk losing the memory of the Adstock. If you are interested in more granular insights, we recommend data-driven multi-touch attribution for your digital channels.