The outcome is the primary metric of interest: the quantity on which Meridian measures the causal effect of treatment variables. This is typically revenue, but when the KPI is not revenue and revenue_per_kpi data is not available, Meridian defines the outcome to be the KPI itself.
Colloquially, you can define return on investment (ROI) as the incremental outcome generated by a media channel divided by its cost. This implies that media has a causal effect on outcome that you want to estimate. To do this in a principled way, you need to define incremental outcome using the language of causal inference.
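As a minimal sketch of the colloquial definition, the snippet below divides an incremental outcome by a channel's cost. The function name `roi` and the numbers are hypothetical and chosen only for illustration; they are not part of Meridian's API.

```python
def roi(incremental_outcome: float, cost: float) -> float:
    """Return the incremental outcome generated per unit of cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return incremental_outcome / cost


# Hypothetical channel: 12,000 of incremental revenue for 4,000 of spend.
channel_roi = roi(12_000.0, 4_000.0)
print(channel_roi)
```

The hard part is the numerator: the incremental outcome is never observed directly and has to be defined and estimated causally, which is what the rest of this section sets up.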
For demonstration purposes, consider the case where no paid or organic media channels have reach and frequency data. Using the notation from Input data, you have an observed array of transformed media units \(\{x^{[M]}_{g,t,i}\}\), organic media units \(\{x^{[OM]}_{g,t,i}\}\), and non-media treatments \(\{x^{[N]}_{g,t,i}\}\), the entirety of which is denoted by \(\{x_{g,t,i}\}\). This set includes values for all paid and organic media channels and non-media treatments, for all \(g=1,\dots,G\) and \(t=-\infty,\dots,T\), although in practice you only need to consider \(t=1-L,2-L,\dots,T\), where \(L\) is the assumed maximum lag of media effects. (For the purposes of this discussion, refer to units on the transformed scale \(x_{g,t,i}\) instead of the raw scale \(\overset{\cdot \cdot}{x}_{g,t,m}\). There is a one-to-one correspondence between raw and transformed units, so it makes no practical difference.)
Even if an advertiser's actual execution was \(\{x_{g,t,i}\}\), you can imagine what the outcome might have been if the advertiser had instead executed a different media array, such as \(\{x^{(\ast)}_{g,t,i}\}\). Denote this outcome as the set of random variables \(\{ \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(\ast)} \}) }\}\). In the causal inference literature, the random variables in this set are called potential outcomes, and the set of values \(\{ x^{(\ast)}_{g,t,i} \}\) is called a counterfactual scenario.
In the causal inference literature, it is common to see notation like \(Y^{(1)}\) and \(Y^{(0)}\) representing potential outcomes under treatment and control counterfactual scenarios. MMM is similar but slightly more complex because the potential outcomes are a two-dimensional array of values, and the treatment is a three-dimensional array of values. Note that not every potential outcome in the array \(\{ \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(\ast)} \}) }\}\) actually depends on all values in the array \(\{ x^{(\ast)}_{g,t,i} \}\). For example, media in a given time period cannot affect past sales. However, this notation is preferred because it is simpler than trying to denote exactly which media values each potential outcome depends on for each time period.
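The point that a potential outcome does not depend on every media value can be illustrated with a toy carryover function for a single geo and channel. This is only a sketch under stated assumptions: the geometric weights, the names `carryover`, `MAX_LAG`, and `DECAY`, the zero-based time index, and the treatment of pre-sample media as zero are all hypothetical and are not Meridian's actual model.

```python
from typing import List


def carryover(x: List[float], t: int, max_lag: int, decay: float) -> float:
    """Toy media effect at time index t for one geo and one channel.

    Sums media over the window t - max_lag .. t with geometric weights.
    Periods before the start of the list are treated as zero media.
    """
    return sum(
        decay ** lag * x[t - lag]
        for lag in range(max_lag + 1)
        if t - lag >= 0
    )


MAX_LAG = 1  # hypothetical maximum lag L
DECAY = 0.5  # hypothetical carryover weight per period

x_actual = [1.0, 0.0, 0.0, 0.0]
x_star = [1.0, 0.0, 5.0, 0.0]  # counterfactual: extra media at time index 2

effect_actual = [carryover(x_actual, t, MAX_LAG, DECAY) for t in range(4)]
effect_star = [carryover(x_star, t, MAX_LAG, DECAY) for t in range(4)]

print(effect_actual)
print(effect_star)
```

The two scenarios differ only at time index 2, so the toy effects at indices 0 and 1 are identical, and only indices 2 and 3 (within the lag window) change.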
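For any two counterfactual media scenarios, such as \(\{ x^{(1)}_{g,t,i} \}\) and \(\{ x^{(0)}_{g,t,i} \}\), you could define the actual incremental outcome as:
\[\sum_{g,t} \left( \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(1)} \}) } - \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(0)} \}) } \right)\]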
However, this quantity is not estimable because data cannot provide any information about the joint distribution of \(\overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(1)} \}) }\) and \(\overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(0)} \}) }\). It is only possible to observe one potential outcome, namely \(\overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,i} \right\}\right) }\). (Note that intuitively, as \(\{ x^{(1)}_{g,t,i} \}\) becomes arbitrarily close to \(\{ x^{(0)}_{g,t,i} \}\), the potential outcomes \(\overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(1)} \}) }\) and \(\overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(0)} \}) }\) should approach the same value, but this intuition is not enough to specify the joint distribution more generally.)
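Instead, for any two counterfactual media scenarios, \(\{ x^{(1)}_{g,t,i} \}\) and \(\{ x^{(0)}_{g,t,i} \}\), define incremental outcome as:
\[\text{IncrementalOutcome}\left( \{ x^{(1)}_{g,t,i} \}, \{ x^{(0)}_{g,t,i} \} \right) = E\left( \sum_{g,t} \left( \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(1)} \}) } - \overset \sim Y_{g,t}^{ (\{ x_{g,t,i}^{(0)} \}) } \right) \Bigg| \{z_{g,t,i}\} \right)\]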
where \(\{z_{g,t,i}\}\) denotes the observed values for a set of control variables. This shorthand notation is used to indicate that the expectation is conditional upon the control random variables taking on these values. Using an MMM regression model and a carefully selected set of control variables, this conditional expectation is estimable. For more information, see ROI, mROI, and response curves.
Typically, the sum is taken over \(g=1,\dots,G\) and \(t=1,\dots,T\). However, you can also define incremental outcome over any subset of these values.
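The following sketch shows how the definition above could be evaluated once a model for the conditional expectation is available. It assumes a deliberately simple linear mean function with known coefficients, which is a stand-in for illustration and not Meridian's regression model. All names (`expected_outcome`, `incremental_outcome`, `beta`, `gamma`, `intercept`) and all numbers are hypothetical.

```python
from typing import List, Optional, Sequence

Array3 = List[List[List[float]]]  # indexed [geo][time][variable]


def expected_outcome(x: Array3, z: Array3, beta: Sequence[float],
                     gamma: Sequence[float], intercept: float) -> List[List[float]]:
    """Toy conditional mean of the outcome per geo and time, given controls z."""
    return [
        [
            intercept
            + sum(b * v for b, v in zip(beta, x[g][t]))
            + sum(c * v for c, v in zip(gamma, z[g][t]))
            for t in range(len(x[g]))
        ]
        for g in range(len(x))
    ]


def incremental_outcome(x1: Array3, x0: Array3, z: Array3,
                        beta: Sequence[float], gamma: Sequence[float],
                        intercept: float,
                        geos: Optional[Sequence[int]] = None,
                        times: Optional[Sequence[int]] = None) -> float:
    """Sum the expected outcome difference between scenarios x1 and x0.

    By default the sum runs over all geos and times; pass index lists
    to restrict it to a subset.
    """
    y1 = expected_outcome(x1, z, beta, gamma, intercept)
    y0 = expected_outcome(x0, z, beta, gamma, intercept)
    total = 0.0
    for g in (range(len(y1)) if geos is None else geos):
        for t in (range(len(y1[g])) if times is None else times):
            total += y1[g][t] - y0[g][t]
    return total


# Hypothetical data: 2 geos, 2 time periods, 2 channels, 1 control variable.
x_obs = [[[1.0, 2.0], [3.0, 0.0]], [[2.0, 1.0], [0.0, 4.0]]]
z_obs = [[[0.5], [1.0]], [[1.5], [0.0]]]
beta, gamma, intercept = [2.0, 0.5], [3.0], 10.0

# Counterfactual scenario: channel 0 is set to zero everywhere.
x_zero = [[[0.0] + row[1:] for row in geo] for geo in x_obs]

full = incremental_outcome(x_obs, x_zero, z_obs, beta, gamma, intercept)
geo0 = incremental_outcome(x_obs, x_zero, z_obs, beta, gamma, intercept, geos=[0])
last = incremental_outcome(x_obs, x_zero, z_obs, beta, gamma, intercept, times=[1])
print(full, geo0, last)
```

Both scenarios are evaluated at the same observed control values, so only the media change contributes to the difference. In this additive toy model the control terms cancel exactly; that cancellation is a property of the simplified mean function assumed here.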