Geo-level model considerations
Geo selection
When you are selecting geos, consider the following guidance:
- Drop the smallest geos by total KPI first. Smaller geos contribute less to ROI, yet they can still have a high influence on model fit, particularly when there is a single residual variance for all geos (`unique_sigma_for_each_geo = False` in `ModelSpec`). For US advertisers using designated market area (DMA) as the geographic unit, a rough guideline is to model the top 50-100 DMAs by population size. This generally includes the vast majority of the KPI units, while excluding most of the noisier small DMAs that might impair model fit and convergence.
- When each geo has its own residual variance (`unique_sigma_for_each_geo = True` in `ModelSpec`), noisier geos have less impact on model fit. However, this option can make convergence difficult for some datasets because it adds so much flexibility to the model. If MCMC sampling does converge under this option, it might be worth plotting geo population size against the mean residual standard deviation (the `sigma` parameter); in most cases, you would expect to see a fairly monotone pattern. If you don't see this pattern, it might be better to set `unique_sigma_for_each_geo = False` and use a smaller subset of geos.
- If you want the model to represent 100% of your KPI units, you can aggregate smaller geos into larger regions. However, this option comes with several caveats:
  - Recognize that geo-level modeling is a big advantage, and this advantage grows with the number of geographically separated treatment units. For more information, see National-level versus geo-level modeling.
  - Different geo aggregation grouping methods can lead to different MMM results.
  - Media execution variables, such as impressions or cost, can usually be summed across geos. However, some control variables, such as temperature, are less straightforward to aggregate.
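As an illustration of the first guideline, the following sketch ranks geos by total KPI and keeps only the largest ones. The data and the cutoff `top_n` are hypothetical; Meridian itself doesn't provide this helper.

```python
import numpy as np

# Hypothetical geo-level data: total KPI summed over the modeling window.
geos = np.array(["A", "B", "C", "D", "E"])
total_kpi = np.array([900.0, 15.0, 400.0, 8.0, 120.0])

# Keep the top N geos by total KPI (N=3 here for illustration).
top_n = 3
keep = np.argsort(total_kpi)[::-1][:top_n]
selected_geos = geos[keep]

# Share of KPI retained by the selected geos.
retained_share = total_kpi[keep].sum() / total_kpi.sum()
```

In this toy example the three largest geos retain over 98% of total KPI units, which is the pattern you would hope to see before dropping the remainder.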
National-level media in a geo-level model
When most media are available at the geo level, but one or two are only available at the national level, we recommend imputing the national-level media at the geo level and running a geo-level model. One naive imputation method is to approximate the geo-level media variable from its national-level value, using the proportion of the population in the geo relative to the total population. Although it is preferable to have accurate geo-level data so that imputation isn't necessary, imputation can still yield useful information about the model parameters. For more information, see section 4.4 of Geo-level Bayesian Hierarchical Media Mix Modeling.
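The population-proportional imputation described above can be sketched as follows (the inputs are hypothetical; this is not a Meridian API):

```python
import numpy as np

# Hypothetical inputs: national media values per time period and geo populations.
national_media = np.array([1000.0, 1200.0, 900.0])  # shape (n_times,)
geo_population = np.array([5e6, 3e6, 2e6])          # shape (n_geos,)

# Each geo receives a share of the national value equal to its population share.
population_share = geo_population / geo_population.sum()  # (n_geos,)
geo_media = np.outer(population_share, national_media)    # (n_geos, n_times)

# Sanity check: the imputed values sum back to the national totals.
assert np.allclose(geo_media.sum(axis=0), national_media)
```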
Model settings
Set the `max_lag` parameter
The Meridian model allows media at time $t$ to affect the KPI at times $t, t + 1, \dots , t + L$, where the integer $L$ is a hyperparameter set by the user through `max_lag` in `ModelSpec`. Media can potentially have a long effect that goes beyond `max_lag`. However, the lagged effect of media converges toward zero, due to the model's assumption of geometric decay.
In practice, `max_lag` is used to truncate how long media can have an effect, which has benefits including improved model convergence, reasonable model runtimes, and maximized data usage (reduced variance). Keeping `max_lag` in the 2-10 range gives a good balance of these advantages and disadvantages.
Increasing `max_lag` doesn't necessarily mean that ROI estimates will also increase. One reason is that if media at time $t$ can affect the KPI at time $t + L$, this can take away from the effect of media at times $t + 1, \dots , t + L$ on the KPI at time $t + L$.
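To illustrate why truncation at `max_lag` is reasonable, the following sketch computes normalized geometric-decay weights for lags $0, \dots, L$. The retention rate `alpha` is a hypothetical value, and Meridian's actual Adstock implementation may differ in its details.

```python
import numpy as np

def adstock_weights(alpha: float, max_lag: int) -> np.ndarray:
    """Normalized geometric-decay weights for lags 0..max_lag."""
    w = alpha ** np.arange(max_lag + 1)
    return w / w.sum()

# With alpha = 0.5, each additional lag carries half the previous lag's
# weight, so weights beyond a moderate max_lag are negligible.
w = adstock_weights(alpha=0.5, max_lag=4)
```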
Set custom priors using past experiments
Meridian requires passing distributions for ROI calibration. Although setting custom priors using results from previous experiments is a sound approach, there are many nuances to consider before proceeding. For example:
- The timing of the experiment in relation to the MMM time window: If the experiment was conducted either before or after the MMM time window, the results might not be directly applicable.
- The duration of the experiment: A short experiment might not effectively capture the long-term effects of marketing.
- The complexity of the experiment: If the experiment involved a mixture of channels, the results might not provide clear insights into the performance of individual channels.
- Estimand differences: The estimands used in experiments can differ from those used in the MMM. For example, the MMM counterfactual is zero spend, whereas some experiments might have a different counterfactual, such as reduced spend.
- Population differences: The population targeted in the experiment might not be the same as the population considered in the MMM.
We recommend setting the custom priors based on your belief in the effectiveness of a channel. A prior belief can be informed by many things, including experiments or other reliable analyses. Use the strength of that prior belief to inform the standard deviation of the prior:
- If you have a strong belief in the effectiveness of a channel, you can apply an adjustment factor to the standard deviation of the prior to reflect your confidence. For example, suppose you have run several experiments for a particular channel and all of them yielded similar ROI point estimates, or you have historical data from previous MMM analyses that support the channel's effectiveness. In this case, you could set a smaller standard deviation for the prior so that the distribution doesn't vary widely. This tighter distribution indicates strong confidence in the experimental results.
- Conversely, the experimental results might not translate directly to the MMM, given some of the nuances listed earlier. In this case, you could set a larger standard deviation for the prior, depending on your level of skepticism.
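The adjustment described above amounts to scaling the experiment's standard error before building the prior. A minimal sketch, where the experiment values and adjustment factors are all hypothetical:

```python
# Hypothetical experiment result: ROI point estimate and standard error.
experiment_mean = 2.0
experiment_se = 0.5

# Tighten the prior when confidence in the experiment is high;
# widen it when the experiment may not translate to the MMM.
confident_factor = 0.5
skeptical_factor = 2.0

tight_prior_std = experiment_se * confident_factor  # stronger prior
wide_prior_std = experiment_se * skeptical_factor   # weaker prior
```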
You should consider using the `roi_calibration_period` argument in `ModelSpec`. For more information, see Set the ROI calibration period.
When setting the prior, the `LogNormal` distribution is a common choice. The following sample code can be used to transform an experiment's mean and standard error into the `LogNormal` prior distribution's parameters:
```python
import numpy as np

def estimate_lognormal_dist(mean, std):
  """Reparameterizes a lognormal distribution in terms of its mean and std."""
  mu_log = np.log(mean) - 0.5 * np.log((std / mean)**2 + 1)
  std_log = np.sqrt(np.log((std / mean)**2 + 1))
  return [mu_log, std_log]
```
However, if the results from previous experiments are near zero, you should consider whether your prior beliefs are accurately represented by a non-negative distribution, such as the `LogNormal` distribution. We highly recommend plotting the prior distribution to confirm that it matches your intuitions before proceeding with the analysis. The following sample code shows how to get the reparameterized `LogNormal` parameters, define the distribution, and draw samples from it:
```python
import tensorflow as tf
import tensorflow_probability as tfp

# Example experiment results: ROI point estimate and its standard error.
mean, std = 2.0, 0.5

# Get reparameterized LogNormal distribution parameters.
mu_log, std_log = estimate_lognormal_dist(mean, std)
mu_log = tf.convert_to_tensor(mu_log, dtype=tf.float32)
std_log = tf.convert_to_tensor(std_log, dtype=tf.float32)

# Define the LogNormal distribution.
lognormal_dist = tfp.distributions.LogNormal(mu_log, std_log)

# Draw 10,000 samples.
lognormal_samples = lognormal_dist.sample(10000).numpy()
```
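As a quick sanity check on the reparameterization, samples drawn under the transformed parameters should roughly recover the original mean and standard error. A NumPy-only sketch, using the same hypothetical experiment values as above:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical experiment results: ROI point estimate and standard error.
mean, std = 2.0, 0.5

# Same reparameterization as estimate_lognormal_dist above.
mu_log = np.log(mean) - 0.5 * np.log((std / mean) ** 2 + 1)
std_log = np.sqrt(np.log((std / mean) ** 2 + 1))

samples = rng.lognormal(mu_log, std_log, size=200_000)

# Sample moments should be close to the targets (2.0 and 0.5).
assert np.isclose(samples.mean(), mean, atol=0.02)
assert np.isclose(samples.std(), std, atol=0.02)
```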