Default prior distributions

This section describes the default prior distributions for the Meridian model. All prior distributions are specified by the prior_distribution argument, which accepts a PriorDistribution object. Each parameter has its own argument in the PriorDistribution constructor, and the joint prior distribution assumes that all of the priors are independent.

Distributions can be specified either as a vector (such as tfp.distributions.Normal([1, 2, 3], [1, 1, 2])) or as a scalar (such as tfp.distributions.Normal(1, 2)). All scalar distributions are broadcast to the length of the parameter vector that they represent.
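
To make the broadcasting rule concrete, the following is a minimal pure-Python sketch of what "broadcast to the length of the parameter vector" means. It is not Meridian or TensorFlow Probability code: the helper broadcast_prior_parameter and the value n_channels are hypothetical names introduced only for illustration.

```python
def broadcast_prior_parameter(value, n_params):
    """Hypothetical helper: expand a scalar to n_params entries, or validate a vector."""
    if isinstance(value, (int, float)):
        # A scalar is repeated once per element of the parameter vector.
        return [float(value)] * n_params
    values = [float(v) for v in value]
    if len(values) != n_params:
        raise ValueError(f"expected {n_params} values, got {len(values)}")
    return values


n_channels = 3  # hypothetical number of channels

# Scalar case, analogous to the loc in tfp.distributions.Normal(1, 2).
scalar_loc = broadcast_prior_parameter(1, n_channels)
# Vector case, analogous to the loc in tfp.distributions.Normal([1, 2, 3], [1, 1, 2]).
vector_loc = broadcast_prior_parameter([1, 2, 3], n_channels)

print(scalar_loc)
print(vector_loc)
```

In this sketch the scalar case gives every channel the same parameter value, while the vector case keeps one value per channel.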


Parameter: \(b_k\)

Default Prior: Normal(0, 5)

Figure: probability density for the Normal(0, 5) distribution.


  • Uninformative prior governing how large an effect time can have.
  • Uninformative because you want the flexibility to allow time to have a strong effect.
  • Can be learned from the data given multiple geos per time period, and also multiple time periods per knot when the number of knots is low.


Parameter: \(\tau_g\)

Default Prior: Normal(0, 5)

Figure: probability density for the Normal(0, 5) distribution.


  • Uninformative prior governing differences between geos.
  • Uninformative because you want the flexibility to allow geo to have a strong effect.
  • Can be learned from the data given multiple time periods per geo.

roi_m and roi_rf

Parameter: \(\text{ROI}_m,\text{ROI}_n^{(rf)}\)

Default Prior: LogNormal(0.2, 0.9)

Figure: probability density for the LogNormal(0.2, 0.9) distribution.


  • This prior says that, a priori, the mean ROI across channels is 1.83, 50% of ROIs are greater than 1.22, 80% are between 0.5 and 6.0, 95% are between 0.25 and 9.0, and 99% are less than 10.0.
  • By default, each channel is assigned the same ROI prior.
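
The summary figures quoted for the ROI prior can be checked directly from the log-normal parameters. The sketch below uses only the Python standard library rather than TensorFlow Probability, and assumes that LogNormal(0.2, 0.9) is parameterized by the mean and standard deviation of the underlying normal, as in tfp.distributions.LogNormal. The names roi_cdf, prior_mean, and so on are illustrative.

```python
import math
from statistics import NormalDist

# Assumption: 0.2 and 0.9 are the mean and standard deviation of log(ROI).
MU, SIGMA = 0.2, 0.9
log_roi = NormalDist(MU, SIGMA)


def roi_cdf(x):
    """P(ROI <= x) under the log-normal prior."""
    return log_roi.cdf(math.log(x))


prior_mean = math.exp(MU + SIGMA**2 / 2)        # mean of a log-normal
prior_median = math.exp(MU)                     # 50% of ROIs exceed this value
mass_05_to_6 = roi_cdf(6.0) - roi_cdf(0.5)      # prior mass between 0.5 and 6.0
mass_025_to_9 = roi_cdf(9.0) - roi_cdf(0.25)    # prior mass between 0.25 and 9.0
mass_below_10 = roi_cdf(10.0)                   # prior mass below 10.0

for name, value in [
    ("mean", prior_mean),
    ("median", prior_median),
    ("P(0.5 < ROI < 6.0)", mass_05_to_6),
    ("P(0.25 < ROI < 9.0)", mass_025_to_9),
    ("P(ROI < 10.0)", mass_below_10),
]:
    print(f"{name}: {value:.2f}")
```

Under these assumptions the computed values round to the figures quoted above (1.83, 1.22, 0.80, 0.95, and 0.99). Note that the 80% and 95% ranges are not equal-tailed intervals: they are the prior mass between the stated endpoints.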

beta_m and beta_rf

Parameter: \(\beta_m,\beta_n^{(rf)}\)

Default Prior: HalfNormal(5)

Figure: probability density for the HalfNormal(5) distribution.


  • Uninformative prior on the hierarchical mean of geo-level media effects (beta_gm and beta_grf) for impression media channels and for reach and frequency media channels, respectively.
  • Uninformative because the interpretation of beta_m can vary widely given transformations, scaling, and kind of media execution.

eta_m and eta_rf

Parameter: \(\eta_m,\eta_n^{(rf)}\)

Default Prior: HalfNormal(1)

Figure: probability density for the HalfNormal(1) distribution.


Moderate regularization encourages pooling across geos. This leads to lower-variance estimates at the cost of increased bias, and allows the model to use the data more efficiently.


Parameter: \(\gamma_c\)

Default Prior: Normal(0, 5)

Figure: probability density for the Normal(0, 5) distribution.


Uninformative because of the wide range of control variables that a model might include.


Parameter: \(\xi_c\)

Default Prior: HalfNormal(5)

Figure: probability density for the HalfNormal(5) distribution.


  • Uninformative to allow a wide range of geo variation in control variable effects.
  • By default, pooling is weaker for control effects than for media effects because control effects are simple linear effects (without the complexity of Hill and Adstock transformations).

alpha_m and alpha_rf

Parameter: \(\alpha_m,\alpha_n^{(rf)}\)

Default Prior: Uniform(0, 1)

Figure: probability density for the Uniform(0, 1) distribution.

Rationale: Uninformative to allow data to inform the decay rate.


Parameter: \(ec_m\)

Default Prior: TruncatedNormal(0.8, 0.8, 0.1, 10). This is the conditional distribution \(X|0.1 < X < 10\), where \(X \sim N(0.8,0.8)\).

Figure: probability density for the TruncatedNormal(0.8, 0.8, 0.1, 10) distribution.


  • The data is scaled such that when \(ec=1\), half-saturation happens at the median of the non-zero media units per capita across geos and time. \(ec=k\) means that half-saturation happens at \(k\) times the median value.
  • This prior has a mean near one, which is a reasonable a priori assumption about where half-saturation happens.
  • The truncation keeps the parameter within a reasonable range for parameter identifiability.
  • If a channel is heavily under-saturated (\(ec > 10\)) or heavily over-saturated (\(ec < 0.1\)), the data contains little information about the half-saturation point anyway. In such cases, the ec_m parameter determines the shape of the response curve, but shouldn't be interpreted as an accurate estimate of half-saturation.
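
The statement that this prior has a mean near one can be checked with the closed-form mean of a truncated normal. The sketch below uses only the Python standard library and assumes the four arguments are (loc, scale, low, high), with the second 0.8 being a standard deviation, as in tfp.distributions.TruncatedNormal. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: TruncatedNormal(0.8, 0.8, 0.1, 10) means loc, scale, low, high.
LOC, SCALE, LOW, HIGH = 0.8, 0.8, 0.1, 10.0

std_normal = NormalDist()        # standard normal, mean 0 and sd 1
a = (LOW - LOC) / SCALE          # standardized lower truncation point
b = (HIGH - LOC) / SCALE         # standardized upper truncation point

# Probability mass of the untruncated normal that falls inside [LOW, HIGH].
mass = std_normal.cdf(b) - std_normal.cdf(a)

# Mean of a normal truncated to [LOW, HIGH].
truncated_mean = LOC + SCALE * (std_normal.pdf(a) - std_normal.pdf(b)) / mass

print(f"mass kept by truncation: {mass:.3f}")
print(f"truncated mean: {truncated_mean:.3f}")
```

Under these assumptions, truncating at 0.1 removes roughly the lowest fifth of the untruncated normal, which pulls the mean up from 0.8 to about 1.07.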


Parameter: \(ec_n^{(rf)}\)

Default Prior: LogNormal(0.7, 0.4) + 1

Figure: probability density for the LogNormal(0.7, 0.4) + 1 distribution.

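# TensorFlow Probability syntax
    tfp.distributions.TransformedDistribution(
        tfp.distributions.LogNormal(0.7, 0.4),
        tfp.bijectors.Shift(1.0))  # Shift(1.0) encodes the "+ 1" in the default prior above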


  • Moderately informative to prevent non-identification with slope_rf.
  • Set in conjunction with the slope_rf prior so that the prior distribution for optimal frequency has a mean of 2.1 and 90% CI of [1.0, 4.4]. This is considered to be a reasonable range of optimal frequency.


Parameter: \(\text{slope}_m\)

Default Prior: Deterministic(1)


  • Difficult to learn from the data because of identifiability issues.
  • Deterministic(1) fixes the slope at 1, which restricts the model to concave Hill curves.
  • The budget optimization algorithm produces a global optimum when Hill curves are concave. Changing this prior can lead to non-concave Hill curves, in which case budget optimization is no longer guaranteed to produce a global optimum.
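
The link between the slope and concavity can be illustrated numerically. This section does not state the Hill function itself, so the sketch below assumes the commonly used form \(x^{slope} / (x^{slope} + ec^{slope})\). The functions hill and second_difference are illustrative names, not Meridian API.

```python
def hill(x, ec, slope):
    """Assumed Hill saturation curve: x^slope / (x^slope + ec^slope)."""
    return x**slope / (x**slope + ec**slope)


def second_difference(f, x, h=1e-3):
    """Central second difference; its sign matches the sign of f''(x)."""
    return f(x + h) - 2.0 * f(x) + f(x - h)


ec = 1.0
x_low = 0.2  # a point well below the half-saturation point

# slope = 1: negative curvature at low x (diminishing returns from the start).
curv_slope_1 = second_difference(lambda x: hill(x, ec, 1.0), x_low)
# slope = 2: positive curvature at low x, so the curve is S-shaped, not concave.
curv_slope_2 = second_difference(lambda x: hill(x, ec, 2.0), x_low)

print("slope=1 curvature at x=0.2 is negative:", curv_slope_1 < 0)
print("slope=2 curvature at x=0.2 is positive:", curv_slope_2 > 0)
# Under the assumed form, the curve reaches half of its maximum at x = ec.
print("hill(ec, ec, 1) =", hill(ec, ec, 1.0))
```

With a slope of 1 the assumed curve has diminishing returns everywhere, while a slope greater than 1 introduces a convex region at low spend, which is what breaks the global-optimum guarantee described above.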


Parameter: \(\text{slope}_n^{(rf)}\)

Default Prior: LogNormal(0.7, 0.4)

Figure: probability density for the LogNormal(0.7, 0.4) distribution.


  • Moderately informative to prevent non-identification with ec_rf.
  • Set in conjunction with the ec_rf prior so that the prior distribution for optimal frequency has a mean of 2.1 and 90% CI of [1.0, 4.4]. This is considered to be a reasonable range of optimal frequency.


Parameter: \(\sigma_g\)

Default Prior: HalfNormal(5)

Figure: probability density for the HalfNormal(5) distribution.


Uninformative because residual variance varies widely by advertiser.