Geo-level model considerations
Geo selection
When you are selecting geos, consider the following guidance:
- Drop the smallest geos by total KPI first. Smaller geos contribute less to ROI, yet they can still have a high influence on model fit, particularly when there is a single residual variance for all geos (`unique_sigma_for_each_geo = False` in `ModelSpec`). For US advertisers using designated market area (DMA) as the geographic unit, a rough guideline is to model the top 50-100 DMAs by population size. This generally includes the vast majority of the KPI units, while excluding most of the noisier small DMAs that might impair model fit and convergence.
- When each geo has its own residual variance (`unique_sigma_for_each_geo = True` in `ModelSpec`), noisier geos have less impact on model fit. However, this option can make convergence difficult for some datasets because it adds so much flexibility to the model. If MCMC sampling does converge under this option, it might be worth plotting geo population size against the mean residual standard deviation (the `sigma` parameter); in most cases, you would expect to see a fairly monotone pattern. If you don't see this pattern, it might be better to set `unique_sigma_for_each_geo = False` and use a smaller subset of geos.
- If you want the model to represent 100% of your KPI units, you can aggregate smaller geos into larger regions. However, this option comes with several caveats:
  - Recognize that geo-level modeling is a big advantage, and this advantage grows with the number of geographically separated treatment units. For more information, see National-level versus geo-level modeling.
  - Different geo aggregation grouping methods can lead to different MMM results.
  - Media execution variables, such as impressions or cost, can usually be summed across geos. However, some control variables, such as temperature, are less straightforward to aggregate.
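As an illustration of the first guideline, the following sketch ranks geos by total KPI and keeps only the largest ones. The data and the cutoff `top_n` are hypothetical; Meridian itself doesn't provide this helper.

```python
import numpy as np

# Hypothetical geo-level data: total KPI summed over the modeling window.
geos = np.array(["A", "B", "C", "D", "E"])
total_kpi = np.array([900.0, 15.0, 400.0, 8.0, 120.0])

# Keep the top N geos by total KPI (N=3 here for illustration).
top_n = 3
keep = np.argsort(total_kpi)[::-1][:top_n]
selected_geos = geos[keep]

# Share of KPI retained by the selected geos.
retained_share = total_kpi[keep].sum() / total_kpi.sum()
```

In this toy example the three largest geos retain over 98% of total KPI units, which is the pattern you would hope to see before dropping the remainder.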
National-level media in a geo-level model
When most media are available at the geo level, but one or two are only available at the national level, we recommend imputing the national-level media at the geo level and running a geo-level model. One naive imputation method is to approximate the geo-level media variable from its national-level value, using the proportion of the population in the geo relative to the total population. Although it is preferable to have accurate geo-level data so that imputation isn't necessary, imputation can still yield useful information about the model parameters. For more information, see section 4.4 of Geo-level Bayesian Hierarchical Media Mix Modeling.
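The population-proportional imputation described above can be sketched as follows (the inputs are hypothetical; this is not a Meridian API):

```python
import numpy as np

# Hypothetical inputs: national media values per time period and geo populations.
national_media = np.array([1000.0, 1200.0, 900.0])  # shape (n_times,)
geo_population = np.array([5e6, 3e6, 2e6])          # shape (n_geos,)

# Each geo receives a share of the national value equal to its population share.
population_share = geo_population / geo_population.sum()  # (n_geos,)
geo_media = np.outer(population_share, national_media)    # (n_geos, n_times)

# Sanity check: the imputed values sum back to the national totals.
assert np.allclose(geo_media.sum(axis=0), national_media)
```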
Model settings
Set the `max_lag` parameter
The Meridian model allows media at time $t$ to affect the KPI at times $t, t + 1, \dots , t + L$, where the integer $L$ is a hyperparameter set by the user through `max_lag` in `ModelSpec`. Media can potentially have a long effect that goes beyond `max_lag`. However, the lagged effect of media converges toward zero, due to the model's assumption of geometric decay.
In practice, `max_lag` is used to truncate how long media can have an effect, which has benefits including improved model convergence, reasonable model runtimes, and maximized data usage (reduced variance). Keeping `max_lag` in the 2-10 range gives a good balance of these advantages and disadvantages.
Increasing `max_lag` doesn't necessarily mean that ROI estimates will also increase. One reason is that if media at time $t$ can affect the KPI at time $t + L$, this can take away from the effect of media at times $t + 1, \dots , t + L$ on the KPI at time $t + L$.
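To illustrate why truncation at `max_lag` is reasonable, the following sketch computes normalized geometric-decay weights for lags $0, \dots, L$. The retention rate `alpha` is a hypothetical value, and Meridian's actual Adstock implementation may differ in its details.

```python
import numpy as np

def adstock_weights(alpha: float, max_lag: int) -> np.ndarray:
    """Normalized geometric-decay weights for lags 0..max_lag."""
    w = alpha ** np.arange(max_lag + 1)
    return w / w.sum()

# With alpha = 0.5, each additional lag carries half the previous lag's
# weight, so weights beyond a moderate max_lag are negligible.
w = adstock_weights(alpha=0.5, max_lag=4)
```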
Set custom priors using past experiments
Meridian requires passing distributions for ROI calibration. Although setting custom priors using results from previous experiments is a sound approach, there are many nuances to consider before proceeding. For example:
- The timing of the experiment in relation to the MMM time window: If the experiment was conducted either before or after the MMM time window, the results might not be directly applicable.
- The duration of the experiment: A short experiment might not effectively capture the long-term effects of marketing.
- The complexity of the experiment: If the experiment involved a mixture of channels, the results might not provide clear insights into the performance of individual channels.
- Estimand differences: The estimands used in experiments can differ from those used in the MMM. For example, the MMM counterfactual is zero spend, whereas some experiments might have a different counterfactual, such as reduced spend.
- Population differences: The population targeted in the experiment might not be the same as the population considered in the MMM.
We recommend setting the custom priors based on your belief in the effectiveness of a channel. A prior belief can be informed by many things, including experiments or other reliable analyses. Use the strength of that prior belief to inform the standard deviation of the prior:
- If you have a strong belief in the effectiveness of a channel, you can apply an adjustment factor to the standard deviation of the prior to reflect your confidence. For example, suppose you have run several experiments for a particular channel and all of them yielded similar ROI point estimates, or you have historical data from previous MMM analyses that support the channel's effectiveness. In this case, you could set a smaller standard deviation for the prior so that the distribution doesn't vary widely. This tighter distribution indicates strong confidence in the experimental results.
- Conversely, the experimental results might not translate directly to the MMM, given some of the nuances listed earlier. In this case, you could set a larger standard deviation for the prior, depending on your level of skepticism.
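The adjustment described above amounts to scaling the experiment's standard error before building the prior. A minimal sketch, where the experiment values and adjustment factors are all hypothetical:

```python
# Hypothetical experiment result: ROI point estimate and standard error.
experiment_mean = 2.0
experiment_se = 0.5

# Tighten the prior when confidence in the experiment is high;
# widen it when the experiment may not translate to the MMM.
confident_factor = 0.5
skeptical_factor = 2.0

tight_prior_std = experiment_se * confident_factor  # stronger prior
wide_prior_std = experiment_se * skeptical_factor   # weaker prior
```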
You should consider using the `roi_calibration_period` argument in `ModelSpec`. For more information, see Set the ROI calibration period.
When setting the prior, the `LogNormal` distribution is a common choice. The following sample code can be used to transform an experiment's mean and standard error into the `LogNormal` prior distribution's parameters:
```python
import numpy as np

def estimate_lognormal_dist(mean, std):
  """Reparameterizes a lognormal distribution in terms of its mean and std."""
  mu_log = np.log(mean) - 0.5 * np.log((std / mean)**2 + 1)
  std_log = np.sqrt(np.log((std / mean)**2 + 1))
  return [mu_log, std_log]
```
However, if the results from previous experiments are near zero, you should consider whether your prior beliefs are accurately represented by a non-negative distribution, such as the `LogNormal` distribution. We highly recommend plotting the prior distribution to confirm that it matches your intuitions before proceeding with the analysis. The following sample code shows how to get the reparameterized `LogNormal` parameters, define the distribution, and draw samples from it:
```python
import tensorflow as tf
import tensorflow_probability as tfp

# Example experiment results: ROI point estimate and its standard error.
mean, std = 2.0, 0.5

# Get reparameterized LogNormal distribution parameters.
mu_log, std_log = estimate_lognormal_dist(mean, std)
mu_log = tf.convert_to_tensor(mu_log, dtype=tf.float32)
std_log = tf.convert_to_tensor(std_log, dtype=tf.float32)

# Define the LogNormal distribution.
lognormal_dist = tfp.distributions.LogNormal(mu_log, std_log)

# Draw 10,000 samples.
lognormal_samples = lognormal_dist.sample(10000).numpy()
```
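As a quick sanity check on the reparameterization, samples drawn under the transformed parameters should roughly recover the original mean and standard error. A NumPy-only sketch, using the same hypothetical experiment values as above:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical experiment results: ROI point estimate and standard error.
mean, std = 2.0, 0.5

# Same reparameterization as estimate_lognormal_dist above.
mu_log = np.log(mean) - 0.5 * np.log((std / mean) ** 2 + 1)
std_log = np.sqrt(np.log((std / mean) ** 2 + 1))

samples = rng.lognormal(mu_log, std_log, size=200_000)

# Sample moments should be close to the targets (2.0 and 0.5).
assert np.isclose(samples.mean(), mean, atol=0.02)
assert np.isclose(samples.std(), std, atol=0.02)
```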