The primary goal of marketing mix modeling (MMM) is the accurate estimation of causal marketing effects. However, directly validating the quality of causal inference is difficult and requires well-designed experiments. These experiments must be executed correctly and must have the same estimand as the MMM. Since you are using an MMM, experiments are likely not practical. For this reason, the causal inference cannot be directly assessed. Instead, you have to rely on indirect measures.
We recommend that you make modeling decisions that make sense for the goal of causal inference, and not for minimizing prediction error. Consider these guidelines:
Make sure that your control variable set includes all important confounding variables, which impact both media execution and the response. For more information, see Selecting control variables.
Be careful about including control variables that are not actually confounders. Too many variables can increase the risk of overfitting and model misspecification bias.
Only add media variables for which you are interested in learning the causal effect.
Model time using the advice in Choosing the number of knots for time effects in the model, and don't necessarily try to model time with as many knots as you can.
This process requires some self-reflection from you as the advertiser, but it will most likely lead to the best model fit. Because you planned your own media strategy, you probably know, or have a strong sense of, which variables affected your planning around media execution.
The results need to make sense. Results that don't make sense include unusually low baselines that are often negative, and one media channel dominating all other media channels. Meridian has out-of-sample prediction metrics which are useful as a preliminary check to make sure the model structure is appropriate and not extremely overparameterized.
About out-of-sample prediction metrics
The goal in marketing mix modeling (MMM) is causal inference, and not necessarily to minimize out-of-sample prediction metrics. It can be safer to have a model that includes all confounding variables and allows enough flexibility in the model structure to get unbiased, causal estimates (such as ROI), even if this means the model is overfit.
It is still a good idea to check the out-of-sample fit to make sure your model
structure is appropriate and not extremely overparameterized, but the
out-of-sample prediction metrics shouldn't be the primary way model fit is
assessed. Out-of-sample fit can be evaluated using the holdout_id argument in
ModelSpec and the predictive_accuracy method of Analyzer.
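As a minimal sketch of preparing a holdout sample, the following builds a boolean mask with NumPy. It assumes that holdout_id accepts a boolean array of shape (n_geos, n_times) in which True marks a held-out observation; confirm this against the ModelSpec documentation before relying on it. The sizes, the 20% fraction, and the seed are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical dataset dimensions; in practice take these from your input data.
n_geos, n_times = 5, 104
test_fraction = 0.2  # hold out roughly 20% of time periods per geo
n_holdout = int(round(test_fraction * n_times))

rng = np.random.default_rng(seed=1)

# Start with nothing held out, then mark random time periods per geo.
holdout_id = np.zeros((n_geos, n_times), dtype=bool)
for geo in range(n_geos):
    held_out_times = rng.choice(n_times, size=n_holdout, replace=False)
    holdout_id[geo, held_out_times] = True

# Assumption: this mask would then be passed as ModelSpec(holdout_id=holdout_id).
print(holdout_id.shape, holdout_id.sum(axis=1))
```

Sampling without replacement keeps the number of held-out periods identical across geos, which makes the train and holdout metrics easier to compare.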
Access the posterior distribution draws for all parameters
The posterior distribution draws of all parameters can be accessed from the
Meridian model object through the inference_data attribute.
For example, to access values of the posterior for alpha (Adstock decay
parameter) for each media channel from a Meridian model object named mmm, you
can access the property mmm.inference_data.posterior.alpha_m. From there you
can obtain any percentile from the posterior distribution using syntax similar
to this example for a 75th percentile:
np.percentile(mmm.inference_data.posterior.alpha_m, 75, axis=(0, 1))
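To illustrate what the axis argument does, here is a small self-contained sketch that uses simulated draws in place of a fitted model. The array shape (chains, draws, media channels) and the Beta-distributed values are assumptions made for the example; they stand in for mmm.inference_data.posterior.alpha_m. Note that np.percentile takes its percentile argument on a 0 to 100 scale.

```python
import numpy as np

# Hypothetical stand-in for mmm.inference_data.posterior.alpha_m:
# 4 chains x 500 draws x 3 media channels.
rng = np.random.default_rng(seed=0)
alpha_m_draws = rng.beta(2.0, 2.0, size=(4, 500, 3))

# Reduce over the chain axis (0) and the draw axis (1), which leaves
# one value per media channel. The second argument is on a 0-100 scale.
alpha_m_p75 = np.percentile(alpha_m_draws, 75, axis=(0, 1))
alpha_m_p50 = np.percentile(alpha_m_draws, 50, axis=(0, 1))

print(alpha_m_p75.shape)  # one entry per media channel
```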
Additionally, you can obtain the mean and credible interval of specified width
for the estimated posterior distribution of a parameter. For example, the mean
and 90% credible interval for alpha_m can be obtained using the following
syntax:
Analyzer.get_mean_and_ci(mmm.inference_data.posterior.alpha_m, 0.9)
The posterior distributions for all other parameters can be accessed in a similar manner.
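For intuition about what a mean and credible interval summarize, the sketch below computes them directly with NumPy on simulated draws. The helper mean_and_ci is hypothetical and is not Meridian's implementation; it assumes an equal-tailed interval taken over the chain and draw axes, and Meridian's own method may differ in interval type and output layout.

```python
import numpy as np

def mean_and_ci(draws, confidence_level=0.9):
    """Hypothetical helper: posterior mean and equal-tailed interval.

    Reduces over axes 0 and 1 (assumed to be chain and draw) and returns
    an array whose last axis holds [mean, lower, upper].
    """
    tail = (1.0 - confidence_level) / 2.0
    mean = draws.mean(axis=(0, 1))
    lower = np.quantile(draws, tail, axis=(0, 1))
    upper = np.quantile(draws, 1.0 - tail, axis=(0, 1))
    return np.stack([mean, lower, upper], axis=-1)

# Simulated stand-in for posterior draws: 4 chains x 500 draws x 3 channels.
rng = np.random.default_rng(seed=0)
alpha_m_draws = rng.beta(2.0, 2.0, size=(4, 500, 3))

summary = mean_and_ci(alpha_m_draws, 0.9)
print(summary.shape)  # one row per channel: mean, lower, upper
```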
Methods for model fit and results
The Analyzer class has several methods providing many of the post-modeling
quantities of interest. Some examples include:
Analyzer.media_summary_metrics provides a summary table organized by channel. The metrics of interest include: impressions, spend, incremental_impact, pct_of_contribution, ROI, effectiveness, and mROI.
Analyzer.incremental_impact lets you explore the impact of a specific media source.
Analyzer.roi provides estimates of ROI and can be customized by media, time, and geo.
Analyzer.marginal_roi provides estimates of marginal ROI and can be customized by media, time, and geo.
Analyzer.response_curves returns the data used in plotting the response curves.
Analyzer.predictive_accuracy returns metrics for predictive accuracy, including R-squared, MAPE, and wMAPE.
Analyzer.expected_outcome returns either the prior or posterior expected outcome.
For a full list of methods and more information about their potential specifications, including selecting specific geos or times, see the analyzer.py code.
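As a reminder of what the predictive accuracy metrics measure, the sketch below implements common textbook definitions of R-squared, MAPE, and wMAPE in plain Python. These function names and formulas are assumptions for illustration; Meridian's exact definitions (for example, how values are aggregated across geos and time) may differ, so treat analyzer.py as the source of truth.

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.05 means 5%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def wmape(actual, predicted):
    """Absolute errors weighted by the size of the actual values."""
    total_abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_abs_error / sum(abs(a) for a in actual)

# Toy values, made up for the example.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(r_squared(actual, predicted), mape(actual, predicted), wmape(actual, predicted))
```

wMAPE gives large observations more weight than MAPE does, so the two can diverge noticeably when the outcome varies widely in scale across geos or time periods.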