Mathematical notation reference

This guide is a reference of the mathematical notation used in Meridian.

It is designed to help you interpret the equations Meridian uses to estimate the causal impact of your treatment variables and to perform budget optimization.

Base variables and input data
Data state and transformation notation
Index variables (subscripts)
Model parameters
Hyperparameters
Time-varying parameters
Model specification
Conditionals & logic
Causal inference & budget optimization

Base variables and input data

These symbols represent the inputs to Meridian's model equations in their fully scaled and transformed state. The letter indicates the metric, and the bracketed superscript defines the specific type or category of data.

Symbol	Description
$y$	KPI: The response (target, dependent) variable of the model. It can be revenue, sales units, conversions, or anything else that the treatment variables may have a causal effect upon.
$z^{[C]}$	Control variables: Variables in the model that aren't treatment variables (for example, weather or price). These are used to estimate the baseline outcome.
$x^{[M]}$	Paid media variables: The media execution level (for example, clicks or spend) for paid media channels.
$r^{[RF]}$	Paid reach: The number of unique individuals exposed to paid media.
$f^{[RF]}$	Paid frequency: The number of paid impressions per unique viewer.
$x^{[OM]}$	Organic media variables: The media execution level (for example, newsletter opens) for organic media channels.
$r^{[ORF]}$	Organic reach: The number of unique individuals exposed to organic media.
$f^{[ORF]}$	Organic frequency: The number of organic impressions per unique viewer.
$x^{[N]}$	Non-media treatment variables: The execution level for non-media interventions (for example, promotions or pricing).
$p$	Population: The population size of each geo, used to scale data so small and large regions are comparable.
$u$	Unit values: Currency values used to convert raw units into spend for ROI calculations.

Data state and transformation notation

In the Meridian framework, variables go through a transformation function before entering the model. Special markers (like dots and daggers) indicate which stage of the transformation the data is in.

Symbol	Description	Example
$\ddot{(\cdot)}$	Raw input data (double-dot): The "as-is" data provided by the user before any scaling occurs.	$\ddot{y}$ represents the raw KPI count for a region.
$(\cdot)^\dagger$	Population scaled (dagger): The intermediate data state. This is the raw data divided by the geo's population ($p_ {g}$).	$y^\dagger_ {g,t} = \ddot{y}_ {g,t} / p_ {g}$
$(\cdot)$	Fully transformed variable: The final transformed data used in the model equations. For a KPI, this is the dagger variable centered to mean zero and scaled to standard deviation one.	$y$ is the final sales value the model learns from.
$L(\cdot)$	Transformation function: The specific linear transformation function applied to convert the raw units into the fully scaled units. Specific details for each transformation performed in Meridian can be found in the Input data section.	$y = L^{[Y]}(\ddot{y})$
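
As a rough illustration of the three data states for a KPI, the sketch below takes raw values ($\ddot{y}$), divides them by each geo's population ($y^\dagger$), then centers and scales the result ($y$). The function names and toy numbers are hypothetical, and pooling all geos and time periods when computing the mean and standard deviation is an assumption made for this example; the Input data section describes the transformations Meridian actually applies.

```python
from statistics import mean, pstdev

def population_scale(raw_kpi, population):
    """Dagger state: divide each geo's raw series by that geo's population."""
    return [[value / pop for value in series]
            for series, pop in zip(raw_kpi, population)]

def standardize(scaled):
    """Fully transformed state: center to mean 0 and scale to std dev 1.

    Assumption: the mean and standard deviation are pooled over all geos
    and time periods.
    """
    flat = [value for series in scaled for value in series]
    center, spread = mean(flat), pstdev(flat)
    return [[(value - center) / spread for value in series] for series in scaled]

# Toy inputs (hypothetical): raw_kpi[g][t] for 2 geos and 3 time periods.
raw_kpi = [[120.0, 150.0, 90.0], [30.0, 45.0, 60.0]]
population = [1000.0, 500.0]

y_dagger = population_scale(raw_kpi, population)  # y-dagger[g][t]
y = standardize(y_dagger)                         # y[g][t]
```

After population scaling, the two geos are on a comparable per-capita scale even though their raw counts differ by a factor of several.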

Index variables (subscripts)

Indexes are the "coordinates" for data arrays, telling you exactly which slice of data is being referenced. Meridian attaches subscripts to the base variables to specify dimensions like geo and time (for example, $x^{[M]}_ {g,t,i}$).

Symbol	Description	Example
$g$	Geography: Indexes specific geographical units ($1, \dots, G$).	$g$ = New York or London.
$t$	Time: Indexes specific time periods ($1, \dots, T$).	$t$ = Week 10 of the time period used to train the MMM.
$i$	Variable index: A universal index used to specify a particular channel or treatment within a category.	$i = 3$ refers to the 3rd paid media channel.
$G$	Total geographies: Total number of geographical units.	$G = 50$ for a US state-level model.
$T$	Total time periods: Total number of time periods.	$T = 104$ for two years of weekly data.
$N_ {C}$	Total controls: Total number of control variables.	$N_ {C} = 3$ (for example, price, weather, holidays).
$N_ {M}$	Total paid media: Total number of paid media variables without R&F.	$N_ {M} = 4$ (for example, TV, Radio, Print, Search).
$N_ {RF}$	Total paid R&F: Total number of paid media variables with R&F.	$N_ {RF} = 2$ (for example, Facebook, YouTube).
$N_ {OM}$	Total organic media: Total number of organic media variables without R&F.	$N_ {OM} = 2$ (for example, SEO, Social posts).
$N_ {ORF}$	Total organic R&F: Total number of organic media variables with R&F.	$N_ {ORF} = 1$ (for example, Organic Newsletter).
$N_ {N}$	Total non-media treatments: Total number of non-media treatment variables.	$N_ {N} = 2$ (for example, in-store promotions, coupons).
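
One way to picture the subscripts is as positions in a nested array. The sketch below is only an illustration with made-up sizes and values: the name `x_media` and the helper `lookup` are hypothetical, and the helper converts the 1-based indexes used in this notation to Python's 0-based indexing.

```python
G, T, N_M = 2, 3, 4  # toy sizes: geos, time periods, paid media channels

# x_media[g][t][i] holds the value for geo g, time period t, channel i.
# The values are arbitrary and encode their own position for readability.
x_media = [[[float(100 * g + 10 * t + i) for i in range(N_M)]
            for t in range(T)]
           for g in range(G)]

def lookup(array, g, t, i):
    """Return x_{g,t,i} using the 1-based indexes from the notation."""
    return array[g - 1][t - 1][i - 1]

# x_{2,3,3}: second geo, third time period, third paid media channel.
value = lookup(x_media, g=2, t=3, i=3)
```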

Model parameters

These are the "learned" parameters and coefficients (denoted by Greek letters) that the model estimates from the data.

Symbol	Description
$\theta$	Theta: General term for any unobservable parameter the model is estimating.
$\tau_ {g}$	Tau (Geo intercepts): Geo effects, which represent the average KPI of each geo relative to the baseline geo.
$\mu_ {t}$	Mu (Time-varying intercepts): Time effects derived from the knot values.
$b_ {k}$	Knot parameter: The estimated knot value at knot $k$.
$\beta^{[M]}_ {i}, \beta^{[RF]}_ {i},$ $\beta^{[OM]}_ {i}, \beta^{[ORF]}_ {i}$	Beta (Hierarchical media effects): A parameter for the hierarchical distribution of geo-level media effects. When the media effects distribution is set to normal, it is the hierarchical mean. When set to log-normal, it is the hierarchical parameter for the mean of the underlying, log-transformed normal distribution.
$\beta^{[M]}_ {g,i}, \beta^{[RF]}_ {g,i},$ $\beta^{[OM]}_ {g,i}, \beta^{[ORF]}_ {g,i}$	Beta (Geo-level media effects): The specific media effect coefficient for a channel $i$ within geo $g$, drawn from the hierarchical distribution.
$\gamma^{[C]}_ {i}, \gamma^{[N]}_ {i}$	Gamma (Hierarchical control mean): The hierarchical mean of the coefficient on a control or non-media channel. Hierarchy is defined over geos.
$\sigma$	Sigma (Residual standard deviation): The standard deviation of noise.
$\eta$	Eta (Media hierarchical variance): A parameter for the hierarchical distribution of geo-level media effects. When the media effects distribution is set to normal, it is the hierarchical standard deviation. When set to log-normal, it is the hierarchical parameter for the standard deviation of the underlying, log-transformed normal distribution.
$\xi$	Xi (Control & non-media hierarchical variance): The hierarchical standard deviation of the coefficient on a control or non-media channel. Hierarchy is defined over geos.
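$\alpha$	Alpha (Adstock decay rate): A value between 0 and 1 that controls how quickly a channel's effect fades across the lag window. Under geometric decay, the weight on lag $s$ is proportional to $\alpha^{s}$, so values closer to 1 mean the effect persists longer.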
$\mathtt{ec}$	Half-saturation: The media level at which a channel reaches 50% of its maximum possible effect.
$\mathtt{slope}$	Slope: Controls the shape of the response curve. $\mathtt{slope} \leq 1$ creates a strictly concave curve; $\mathtt{slope} > 1$ creates an "S-curve".
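
To make the roles of $\alpha$, $\mathtt{ec}$, and $\mathtt{slope}$ concrete, here is a minimal sketch. It assumes a geometric adstock with normalized weights and a standard Hill curve; the function names, the treatment of periods before the start of the data as zero, and the toy inputs are all assumptions for illustration. The Model specification page gives the exact forms Meridian uses.

```python
def adstock(x, alpha, max_lag):
    """Geometric adstock (assumed form): a weighted average of the current
    and lagged media values, with weights proportional to alpha**s."""
    weights = [alpha ** s for s in range(max_lag + 1)]
    total = sum(weights)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for s, w in enumerate(weights):
            if t - s >= 0:  # assumption: media before the data starts is zero
                acc += w * x[t - s]
        out.append(acc / total)
    return out

def hill(x, ec, slope):
    """Hill saturation (assumed form): 0 at x = 0, 0.5 at x = ec, and
    approaching 1 as x grows."""
    if x <= 0:
        return 0.0
    return 1.0 / (1.0 + (x / ec) ** (-slope))

# Toy example: a single burst of media in the first period.
media = [1.0, 0.0, 0.0, 0.0]
carried_over = adstock(media, alpha=0.5, max_lag=2)
saturated = [hill(v, ec=0.3, slope=1.0) for v in carried_over]
```

With `alpha=0.5` the burst still contributes in the two following periods, and it drops out once the lag exceeds `max_lag`. Applying the Hill curve after adstock, as done here, is itself a choice made for this sketch.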

Hyperparameters

These are fixed parameters set before the model is trained, acting as structural inputs rather than learned coefficients.

Symbol	Description
$L$	Maximum lag duration: A fixed hyperparameter representing the maximum number of weeks an ad is assumed to affect sales.
$K$	Total knots: The total number of knots used to model the time-varying intercept.
$s_ {k}$	Knot location: The specific time period where the $k$-th knot is located.

Time-varying parameters

Meridian uses knots to model time effects. Rather than estimating a unique time effect for every single time period, the model estimates values at specific anchor points (knots) and interpolates the values for the periods in between.

These symbols represent the notation mechanics used to calculate that interpolation.

Symbol	Description
$b_ {k}$	Knot parameter: The estimated knot value at knot $k$.
$\ell(t)$	Lower knot index: The index of the nearest preceding knot for a given time $t$.
$u(t)$	Upper knot index: The index of the nearest succeeding knot for a given time $t$.
$w(t)$	Time weight: The interpolation weight for time $t$, calculated based on its distance between the neighboring knot locations ($s_ {\ell(t)}$ and $s_ {u(t)}$).
$\mu_ {t}$	Time-varying intercept: The resulting time effect for time $t$, calculated as the weighted average: $\mu_ {t} = w(t)b_ {\ell(t)} + (1-w(t))b_ {u(t)}$.
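
The interpolation can be sketched in a few lines. This example assumes the weight is $w(t) = (s_ {u(t)} - t) / (s_ {u(t)} - s_ {\ell(t)})$, which is the choice that makes the weighted average above a linear interpolation between neighboring knots. The function name, the toy knots, and the handling of times outside the knot range (clamping to the nearest knot value) are assumptions for illustration.

```python
from bisect import bisect_right

def time_effect(t, knot_locations, knot_values):
    """Return mu_t by interpolating between the knots around time t.

    knot_locations: sorted knot positions s_k.
    knot_values: knot values b_k, one per knot.
    """
    # Assumption: outside the knot range, reuse the nearest knot value.
    if t <= knot_locations[0]:
        return knot_values[0]
    if t >= knot_locations[-1]:
        return knot_values[-1]
    upper = bisect_right(knot_locations, t)  # u(t): first knot after t
    lower = upper - 1                        # l(t): last knot at or before t
    s_lo, s_hi = knot_locations[lower], knot_locations[upper]
    w = (s_hi - t) / (s_hi - s_lo)           # assumed form of w(t)
    return w * knot_values[lower] + (1 - w) * knot_values[upper]

# Toy knots (hypothetical): three knots at periods 1, 5, and 9.
knot_locations = [1, 5, 9]
knot_values = [0.0, 4.0, 2.0]
mu = [time_effect(t, knot_locations, knot_values) for t in range(1, 10)]
```

At a knot location the weight on that knot is 1, so $\mu_ {t}$ equals the knot value itself; halfway between two knots the two values are averaged equally.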

Model specification

For the full mathematical equation that combines these inputs and parameters into the Meridian model, refer to the Model specification page.

Conditionals & logic

These symbols represent dependencies, mathematical logic, or statistical relationships.

Symbol	Description	Example
$\mid$	The pipe: Read as "Given that." Indicates conditional probability or expectation.	$P(\theta \mid \text{data})$ means the probability of the parameters given the observed data.
$I_ {\lbrace \dots \rbrace}$	Indicator function: A logical switch. It equals 1 if the condition inside is `true`, and 0 otherwise.	$I_ {i}^{[C]} = 1$ if population scaling is used for control variable $i$, and 0 otherwise.
$\sim$	Tilde operator: Read as "is distributed as." Links a parameter to its statistical prior distribution. (Note: This operator is distinct from the tilde accent $\overset \sim Y$ used to denote potential outcomes).	$\gamma^{[C]}_ {i} \sim \text{Normal}(0, 5)$ means the parameter follows a normal distribution with mean 0 and standard deviation 5.
$\lbrace \dots \rbrace$	Braces: Denotes a set, vector, or multi-dimensional array of variables.	$\lbrace x_ {g,t,i} \rbrace$ represents the entire array of observed media execution, and $\lbrace q_ {t-s} \rbrace^L_ {s=0}$ represents a sequence over a time lag.
$\forall$	For all: The universal quantifier. It means the equation or condition applies to every value in a specific set.	$\forall g,t$ means the condition applies to all geographic regions and all time periods.
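
For readers who think in code, the following sketch shows loose programming analogues of three of these symbols: the indicator function, the "is distributed as" operator, and the universal quantifier. All names and values are hypothetical, and drawing random numbers here only illustrates what a prior such as $\text{Normal}(0, 5)$ describes; it is not how Meridian fits the model.

```python
import random

def indicator(condition):
    """I_{...}: 1 if the condition holds, 0 otherwise."""
    return 1 if condition else 0

# gamma ~ Normal(0, 5): draws from a normal with mean 0 and std dev 5.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
gamma_draws = [rng.gauss(0.0, 5.0) for _ in range(1000)]

# "for all g, t": a condition that must hold for every geo and time period.
G, T = 2, 3
population_scaled = [[0.12, 0.15, 0.09], [0.06, 0.09, 0.12]]  # toy values
all_non_negative = all(population_scaled[g][t] >= 0
                       for g in range(G) for t in range(T))
```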

Causal inference & budget optimization

These symbols are used to define counterfactual scenarios, generate response curves, and calculate optimal budget allocations.

Symbol	Description
$\overset \sim Y^{(\lbrace x^{(1)} \rbrace)}$	Potential outcome: The hypothetical outcome (e.g., sales) that would occur under a specific scenario. The tilde ($\overset \sim Y$) denotes it is a potential outcome, and the superscript ($\lbrace x^{(1)} \rbrace$) denotes the specific media execution scenario being tested.
$x^{(1)}, x^{(0)}$	Counterfactual scenarios: Used to compare different media execution realities. Typically, $x^{(1)}$ represents historical execution, and $x^{(0)}$ represents a baseline (e.g., zero spend on a specific channel).
$b_ {i}$	Budget: The total budget allocated to a specific channel $i$ during budget optimization.
$\omega$	Spend scaling factor: A multiplier used to scale historical spend up or down. Used mathematically to generate response curves or calculate marginal ROI.
$f^*$	Target/optimal frequency: The optimal average ad frequency solved for during reach and frequency optimization.
$(j)$	MCMC draw superscript: Denotes a specific simulation "draw" (one of thousands of possible answers generated by the model) used to calculate the posterior mean of the expected outcome.
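
The sketch below illustrates how the draw superscript $(j)$ and the two scenarios $x^{(1)}$ and $x^{(0)}$ fit together: the expected outcome is computed under each scenario for every draw, and the differences are averaged across draws. The per-draw numbers and the spend figure are invented for illustration, the function name is hypothetical, and treating ROI as incremental outcome divided by spend is an assumption of this example.

```python
from statistics import mean

def posterior_mean_incremental(outcome_x1, outcome_x0):
    """Average, over MCMC draws j, of the expected outcome under x^(1)
    minus the expected outcome under x^(0)."""
    diffs = [y1 - y0 for y1, y0 in zip(outcome_x1, outcome_x0)]
    return mean(diffs)

# Toy values (hypothetical): one expected outcome per draw (j), per scenario.
expected_outcome_x1 = [1050.0, 1100.0, 980.0, 1070.0]  # historical execution
expected_outcome_x0 = [900.0, 950.0, 880.0, 870.0]     # channel set to zero

incremental = posterior_mean_incremental(expected_outcome_x1, expected_outcome_x0)

channel_spend = 100.0                # toy spend for the channel
roi = incremental / channel_spend    # assumed definition of ROI
```

Keeping the per-draw differences before averaging also makes it possible to report an interval around the incremental outcome, not only its mean.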
