Default prior distributions

This section describes the default prior distributions for the Meridian model. All prior distributions are specified by the prior_distribution argument, which accepts a PriorDistribution object. Each parameter has its own argument in the PriorDistribution constructor, and the joint prior distribution assumes that all of the priors are independent.

Distributions can be specified either as a vector, such as tfp.distributions.Normal([1, 2, 3], [1, 1, 2]), or as a scalar, such as tfp.distributions.Normal(1, 2). All scalar distributions are broadcast to the length of the parameter vector that they represent.
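The broadcasting rule can be illustrated with a small standard-library sketch. The helper broadcast_normal_prior is hypothetical and exists only for this illustration; it is not part of Meridian or TensorFlow Probability, which handle broadcasting internally.

```python
from statistics import NormalDist


def broadcast_normal_prior(loc, scale, n_params):
    """Hypothetical helper: return one Normal prior per parameter.

    Scalar arguments are repeated n_params times; vector arguments
    must already have length n_params.
    """
    locs = list(loc) if isinstance(loc, (list, tuple)) else [loc] * n_params
    scales = list(scale) if isinstance(scale, (list, tuple)) else [scale] * n_params
    if len(locs) != n_params or len(scales) != n_params:
        raise ValueError("vector priors must match the parameter length")
    return [NormalDist(mu, sigma) for mu, sigma in zip(locs, scales)]


# Scalar form: the same Normal(1, 2) prior is used for all three parameters.
scalar_prior = broadcast_normal_prior(1, 2, n_params=3)

# Vector form: each parameter gets its own location and scale.
vector_prior = broadcast_normal_prior([1, 2, 3], [1, 1, 2], n_params=3)
```

In this sketch, the scalar form yields three identical priors, while the vector form assigns a separate prior to each element.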

knot_values

Parameter: \(b_k\)

Default Prior: Normal(0, 5)

Figure: probability density for the normal distribution with mean=0 and scale=5

Rationale:

  • Uninformative prior governing how large an effect time can have.
  • Uninformative because you want the flexibility to allow time to have a strong effect.
  • Can be learned from the data given multiple geos per time period, and also multiple time periods per knot when the number of knots is low.

tau_g_excl_baseline

Parameter: \(\tau_g\)

Default Prior: Normal(0, 5)

Figure: probability density for the normal distribution with mean=0 and scale=5

Rationale:

  • Uninformative prior governing differences between geos.
  • Uninformative because you want the flexibility to allow geo to have a strong effect.
  • Can be learned from the data given multiple time periods per geo.

roi_m and roi_rf

Parameter: \(\text{ROI}_m,\text{ROI}_n^{(rf)}\)

Default Prior: LogNormal(0.2, 0.9)

Figure: probability density for the log-normal distribution

Rationale:

  • This prior says that, a priori, the mean ROI across channels is 1.83, 50% of ROIs are greater than 1.22, 80% are between 0.5 and 6.0, 95% are between 0.25 and 9.0, and 99% are less than 10.0.
  • By default, each channel is assigned the same ROI prior.
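The summary statistics quoted for the ROI prior can be checked with a short standard-library sketch. It assumes that LogNormal(0.2, 0.9) means log(ROI) follows a normal distribution with mean 0.2 and standard deviation 0.9, which matches the TensorFlow Probability parameterization. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

MU, SIGMA = 0.2, 0.9  # assumed: log(ROI) ~ Normal(mean=0.2, sd=0.9)
log_roi = NormalDist(MU, SIGMA)


def roi_cdf(x):
    """P(ROI <= x) under the log-normal prior."""
    return log_roi.cdf(math.log(x))


prior_mean = math.exp(MU + SIGMA**2 / 2)        # mean of a log-normal
prior_median = math.exp(MU)                     # 50% of ROIs lie above this
mass_05_to_6 = roi_cdf(6.0) - roi_cdf(0.5)      # share between 0.5 and 6.0
mass_025_to_9 = roi_cdf(9.0) - roi_cdf(0.25)    # share between 0.25 and 9.0
mass_below_10 = roi_cdf(10.0)                   # share below 10.0

print(round(prior_mean, 2), round(prior_median, 2))
print(round(mass_05_to_6, 2), round(mass_025_to_9, 2), round(mass_below_10, 2))
```

The same approach can be used to sanity-check a custom ROI prior before passing it to the model.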

beta_m and beta_rf

Parameter: \(\beta_m,\beta_n^{(rf)}\)

Default Prior: HalfNormal(5)

Figure: probability density for the half-normal distribution with scale=5

Rationale:

  • Uninformative prior on the hierarchical mean of geo-level media effects (beta_gm and beta_grf) for impression media channels and for reach and frequency media channels, respectively.
  • Uninformative because the interpretation of beta_m can vary widely given transformations, scaling, and kind of media execution.

eta_m and eta_rf

Parameter: \(\eta_m,\eta_n^{(rf)}\)

Default Prior: HalfNormal(1)

Figure: probability density for the half-normal distribution with scale=1

Rationale:

Moderate regularization encourages pooling across geos. This leads to lower variance estimates at the cost of increased bias, and allows the model to use the data more efficiently.

gamma_c

Parameter: \(\gamma_c\)

Default Prior: Normal(0, 5)

Figure: probability density for the normal distribution with mean=0 and scale=5

Rationale:

Uninformative because control variables, and therefore their effects, can vary widely.

xi_c

Parameter: \(\xi_c\)

Default Prior: HalfNormal(5)

Figure: probability density for the half-normal distribution with scale=5

Rationale:

  • Uninformative to allow a wide range of geo variation in control variable effects.
  • By default, pooling is weaker for control effects than for media effects because control effects are simple linear effects (without the complexity of Hill and Adstock transformations).
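To make the difference in pooling strength concrete, the following standard-library sketch compares the 95th percentiles of HalfNormal(1), the eta_m default, and HalfNormal(5), the xi_c default. It relies on the fact that a half-normal variable is the absolute value of a zero-mean normal variable; the function name half_normal_quantile is illustrative only.

```python
from statistics import NormalDist


def half_normal_quantile(p, scale):
    """Quantile of HalfNormal(scale), the distribution of |Z| * scale
    where Z is standard normal."""
    return scale * NormalDist().inv_cdf((1 + p) / 2)


# Upper 95% bound on the geo-level variation allowed a priori.
eta_q95 = half_normal_quantile(0.95, scale=1.0)  # media effects
xi_q95 = half_normal_quantile(0.95, scale=5.0)   # control effects

print(round(eta_q95, 2), round(xi_q95, 2))
```

The wider prior on xi_c places far less restriction on geo-level variation, which corresponds to the weaker pooling described for control effects.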

alpha_m and alpha_rf

Parameter: \(\alpha_m,\alpha_n^{(rf)}\)

Default Prior: Uniform(0, 1)

Figure: probability density for the standard uniform distribution

Rationale: Uninformative to allow data to inform the decay rate.

ec_m

Parameter: \(ec_m\)

Default Prior: TruncatedNormal(0.8, 0.8, 0.1, 10). This is the conditional distribution \(X|0.1 < X < 10\), where \(X \sim N(0.8,0.8)\).

Figure: probability density for a truncated normal distribution

Rationale:

  • The data is scaled such that when \(ec=1\), the half-saturation happens at the median of the non-zero media units per capita across geos and time. \(ec=k\) means that the half-saturation happens at \(k\) times the median value.
  • This prior has mean near one, which is a reasonable a priori assumption of where the half-saturation happens.
  • The truncation is done to keep the parameter within a reasonable range for parameter identifiability.
  • If a channel is heavily under-saturated (\(ec > 10\)) or heavily over-saturated (\(ec < 0.1\)), the data contains little information about the half-saturation point anyway. In such cases, the ec_m parameter determines the shape of the response curve, but shouldn't be interpreted as an accurate estimate of half-saturation.
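The claim that the ec_m prior has a mean near one can be checked with the closed-form mean of a truncated normal distribution. The sketch below uses only the standard library and assumes that the second argument of TruncatedNormal(0.8, 0.8, 0.1, 10) is a standard deviation, as in the TensorFlow Probability parameterization.

```python
from statistics import NormalDist

LOC, SCALE = 0.8, 0.8   # assumed: location and standard deviation
LOW, HIGH = 0.1, 10.0   # truncation bounds

std_normal = NormalDist()
a = (LOW - LOC) / SCALE    # standardized lower bound
b = (HIGH - LOC) / SCALE   # standardized upper bound

# Probability mass that the untruncated normal places on (LOW, HIGH).
mass_kept = std_normal.cdf(b) - std_normal.cdf(a)

# Mean of the truncated normal: loc + scale * (pdf(a) - pdf(b)) / mass_kept.
truncated_mean = LOC + SCALE * (std_normal.pdf(a) - std_normal.pdf(b)) / mass_kept

print(round(truncated_mean, 2))
```

Truncating at 0.1 removes the lower tail, which pulls the mean above the untruncated location of 0.8.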

ec_rf

Parameter: \(ec_n^{(rf)}\)

Default Prior: LogNormal(0.7, 0.4) + 0.1

Figure: probability density for a transformed log-normal distribution

# TensorFlow Probability syntax
import tensorflow_probability as tfp

tfp.distributions.TransformedDistribution(
    tfp.distributions.LogNormal(0.7, 0.4),
    tfp.bijectors.Shift(0.1)
)

Rationale:

  • Moderately informative to prevent non-identification with slope_rf.
  • Set in conjunction with the slope_rf prior so that the prior distribution for optimal frequency has a mean of 2.1 and 90% CI of [1.0, 4.4]. This is considered to be a reasonable range of optimal frequency.
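The effect of the Shift(0.1) bijector in the TensorFlow Probability syntax above can be sketched with the standard library: every quantile of the underlying log-normal distribution moves up by 0.1, so the prior has no mass below 0.1. The function name ec_rf_quantile is illustrative only, and the sketch assumes that LogNormal(0.7, 0.4) is parameterized by the mean and standard deviation of the log.

```python
import math
from statistics import NormalDist

MU, SIGMA, SHIFT = 0.7, 0.4, 0.1  # assumed: log-scale mean and sd, plus the shift


def ec_rf_quantile(p):
    """Quantile of the shifted log-normal prior: SHIFT + exp(MU + SIGMA * z_p)."""
    z_p = NormalDist().inv_cdf(p)
    return SHIFT + math.exp(MU + SIGMA * z_p)


ec_rf_q05 = ec_rf_quantile(0.05)
ec_rf_median = ec_rf_quantile(0.50)
ec_rf_q95 = ec_rf_quantile(0.95)

print(round(ec_rf_q05, 2), round(ec_rf_median, 2), round(ec_rf_q95, 2))
```

These are quantiles of the ec_rf prior itself, not of the implied optimal frequency, which also depends on slope_rf.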

slope_m

Parameter: \(\text{slope}_m\)

Default Prior: Deterministic(1)

Rationale:

  • Difficult to learn from the data because of identifiability issues.
  • Deterministic(1) restricts the model to concave Hill curves.
  • The budget optimization algorithm produces a global optimum when Hill curves are concave. Changing this prior can lead to non-concave Hill curves and budget optimization can no longer produce a global optimum.
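The link between the slope and concavity can be checked numerically. The sketch below assumes the standard Hill form, x^slope / (x^slope + ec^slope), which this section does not state explicitly, and tests concavity through second differences on an evenly spaced grid. The function names are illustrative only.

```python
def hill(x, ec, slope):
    """Assumed standard Hill form: x**slope / (x**slope + ec**slope)."""
    return x**slope / (x**slope + ec**slope)


def second_differences(values):
    """Discrete analogue of the second derivative on an evenly spaced grid."""
    return [values[i - 1] - 2 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]


xs = [0.05 * i for i in range(1, 101)]  # media units from 0.05 to 5.0

slope_1_curve = [hill(x, ec=1.0, slope=1.0) for x in xs]
slope_2_curve = [hill(x, ec=1.0, slope=2.0) for x in xs]

# A curve is concave on the grid when every second difference is negative.
is_concave_slope_1 = all(d < 0 for d in second_differences(slope_1_curve))
is_concave_slope_2 = all(d < 0 for d in second_differences(slope_2_curve))

print(is_concave_slope_1, is_concave_slope_2)
```

Under this assumed form, a slope of 1 gives a curve that is concave everywhere on the grid, while a slope of 2 gives an S-shaped curve that is convex at low media levels. In both cases the curve reaches half of its maximum at x = ec, which is the half-saturation point.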

slope_rf

Parameter: \(\text{slope}_n^{(rf)}\)

Default Prior: LogNormal(0.7, 0.4)

Figure: probability density for a log-normal distribution

Rationale:

  • Moderately informative to prevent non-identification with ec_rf.
  • Set in conjunction with the ec_rf prior so that the prior distribution for optimal frequency has a mean of 2.1 and 90% CI of [1, 4.4], a reasonable range of optimal frequency.

sigma

Parameter: \(\sigma_g\)

Default Prior: HalfNormal(5)

Figure: probability density for the half-normal distribution with scale=5

Rationale:

Uninformative because residual variance varies widely by advertiser.