# Incremental KPI definition

Informally, you can define return on investment (ROI) as the incremental revenue generated by a media channel divided by the cost of that channel. This definition presumes that media has a causal effect on revenue or other revenue-generating KPIs, and that effect is what you want to estimate. To do this in a principled way, you need to define incremental KPIs using the language of causal inference.

Consider the case where no media channels have reach and frequency data. Using the notation from Input data, you have an observed array of transformed media units $$\{x_{g,t,m}\}$$. This set includes values for all $$g=1,\dots,G$$; $$m=1,\dots,M$$; and $$t=-\infty,\dots,T$$, although in practice you only need to consider $$t=1-L,2-L,\dots,T$$, where $$L$$ is the assumed maximum lag of media effects. (This discussion refers to media units on the transformed scale $$x_{g,t,m}$$ instead of the raw scale $$\overset{\cdot \cdot}{x}_{g,t,m}$$. There is a one-to-one correspondence between raw and transformed units, so the choice makes no practical difference.)

Even if an advertiser's actual media execution was $$\{x_{g,t,m}\}$$, you can imagine what the KPI outcome might have been if the advertiser had instead executed a different media array, such as $$\{x^{(\ast)}_{g,t,m}\}$$. Denote this KPI outcome as the set of random variables $$\{ \overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(\ast)} \}) }\}$$. In the causal inference literature, the random variables in this set are called potential outcomes, and the set of media values $$\{ x^{(\ast)}_{g,t,m} \}$$ is called a counterfactual scenario.

In the causal inference literature, it is common to see notation like $$Y^{(1)}$$ and $$Y^{(0)}$$ representing potential outcomes under treatment and control counterfactual scenarios. MMM is similar but slightly more complex because the potential outcomes are a two-dimensional array of values, and the treatment is a three-dimensional array of values. Note that not every potential outcome in the array $$\{ \overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(\ast)} \}) }\}$$ actually depends on all values in the array $$\{ x^{(\ast)}_{g,t,m} \}$$. For example, media in a given time period cannot affect past sales. However, this notation is preferred because it is simpler than trying to denote exactly which media values each potential outcome depends on for each time period.
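To make this dependence structure concrete, the following sketch uses a hypothetical toy outcome model (geometric decay over lags 0 through `L`, with made-up coefficients; it is not the model defined elsewhere in this documentation). It changes media in a single time period and reports which expected outcomes move. All names, such as `expected_outcome` and `decay`, are assumptions for illustration.

```python
import numpy as np

G, T, M, L = 2, 8, 3, 2  # geos, time periods, channels, maximum lag (toy sizes)

def expected_outcome(x, beta, decay=0.5):
    """Toy expected KPI, shape (G, T), from media x of shape (G, T + L, M).

    Axis 1 of x covers t = 1-L, ..., T, so array index i holds t = i + 1 - L.
    The outcome at time t uses media from t-L, ..., t only.
    """
    weights = decay ** np.arange(L + 1)      # weights for lags 0, 1, ..., L
    out = np.zeros((G, T))
    for t_idx in range(T):                   # t_idx = 0 corresponds to t = 1
        for lag in range(L + 1):
            # Media at time t - lag sits at array index t_idx + L - lag.
            out[:, t_idx] += weights[lag] * (x[:, t_idx + L - lag, :] @ beta)
    return out

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, 0.5])             # made-up channel coefficients
x_actual = rng.uniform(size=(G, T + L, M))

# Counterfactual scenario: raise channel 0 in every geo at t = 4 only.
t_changed = 4
x_star = x_actual.copy()
x_star[:, t_changed - 1 + L, 0] += 1.0

diff = expected_outcome(x_star, beta) - expected_outcome(x_actual, beta)
affected_t = np.flatnonzero(np.abs(diff).sum(axis=0) > 1e-12) + 1
print(affected_t)  # [4 5 6]
```

In this toy setup, only periods 4 through 6 (that is, `t_changed` through `t_changed + L`) respond to the change. Earlier periods are unaffected, even though the notation indexes every potential outcome by the full media array.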

For any two counterfactual media scenarios, such as $$\{ x^{(1)}_{g,t,m} \}$$ and $$\{ x^{(0)}_{g,t,m} \}$$, you could define the actual incremental KPI as:

$$\sum\limits _{g,t} \left( \overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,m}^{(1)} \right\} \right) } - \overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,m}^{(0)} \right\} \right) } \right)$$

However, this quantity is not estimable because data cannot provide any information about the joint distribution of $$\overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(1)} \}) }$$ and $$\overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(0)} \}) }$$. It is only possible to observe one potential outcome, namely $$\overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,m} \right\}\right) }$$. (Note that intuitively, as $$\{ x^{(1)}_{g,t,m} \}$$ becomes arbitrarily close to $$\{ x^{(0)}_{g,t,m} \}$$, the potential outcomes $$\overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(1)} \}) }$$ and $$\overset \sim Y_{g,t}^{ (\{ x_{g,t,m}^{(0)} \}) }$$ should approach the same value, but this intuition is not enough to specify the joint distribution more generally.)
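A small simulation illustrates the point. It assumes, purely for illustration, that a single pair of potential outcomes is bivariate normal with fixed marginal distributions, and it varies only the correlation between them. That correlation is exactly what data cannot reveal, because the two outcomes are never observed together. The means, variances, and correlations below are made up.

```python
import numpy as np

def simulate_difference(rho, n=200_000, seed=0):
    """Draw (Y1, Y0) with fixed marginals and correlation rho; return Y1 - Y0."""
    rng = np.random.default_rng(seed)
    mean = [110.0, 100.0]                 # hypothetical E[Y1] and E[Y0]
    cov = [[25.0, 25.0 * rho],            # both marginal variances are 25
           [25.0 * rho, 25.0]]
    y = rng.multivariate_normal(mean, cov, size=n)
    return y[:, 0] - y[:, 1]

for rho in (-0.9, 0.0, 0.9):
    d = simulate_difference(rho)
    print(f"rho={rho:+.1f}  mean diff={d.mean():6.2f}  sd diff={d.std():5.2f}")
```

The mean of the difference is close to 10 for every correlation, while its spread changes substantially (the variance is $$50(1-\rho)$$ in this setup). The marginals pin down the expected difference but not the distribution of the realized difference.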

Instead, for any two counterfactual media scenarios, such as $$\{ x^{(1)}_{g,t,m} \}$$ and $$\{ x^{(0)}_{g,t,m} \}$$, define incremental KPI as:

$$\text{IncrementalKPI} \left( \left\{ x^{(1)}_{g,t,m} \right\}, \left\{ x^{(0)}_{g,t,m} \right\} \right) = E \left( \sum\limits_{g,t} \left( \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(1)} \right\}\right) } - \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(0)} \right\}\right) } \right) \Bigg| \left\{ z_{g,t,c} \right\} \right)$$

where $$\{z_{g,t,c}\}$$ denotes the observed values for a set of control variables. This shorthand notation is used to indicate that the expectation is conditional upon the control random variables taking on these values. Using an MMM regression model and a carefully selected set of control variables, this conditional expectation is estimable. For more information, see ROI, mROI, and response curves.
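One way to see why the joint distribution of the two potential outcomes is no longer needed: by linearity of expectation, the conditional expectation of the sum separates into terms that each involve a single potential outcome.

$$E \left( \sum\limits_{g,t} \left( \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(1)} \right\}\right) } - \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(0)} \right\}\right) } \right) \Bigg| \left\{ z_{g,t,c} \right\} \right) = \sum\limits_{g,t} \left( E \left( \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(1)} \right\}\right) } \Bigg| \left\{ z_{g,t,c} \right\} \right) - E \left( \overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,m}^{(0)} \right\}\right) } \Bigg| \left\{ z_{g,t,c} \right\} \right) \right)$$

Each term on the right depends only on the conditional distribution of one potential outcome, which is the kind of quantity a regression model with suitable control variables is used to estimate.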

Typically, the sum is taken over $$g=1,\dots,G$$ and $$t=1,\dots,T$$. However, you can also define incremental KPI for any subset of these values.
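As a minimal sketch of summing over a subset, suppose you have some function that returns the conditional expected outcome for every geo and time period under a given media scenario. The helper `incremental_kpi`, the toy linear model `toy_expected`, and all coefficients below are hypothetical and stand in for whatever fitted model you use. For brevity, the toy model has no lagged effects.

```python
import numpy as np

def incremental_kpi(expected_fn, x1, x0, z, geos=None, times=None):
    """Sum of E[Y | x1, z] - E[Y | x0, z] over the selected geos and time periods.

    expected_fn(x, z) must return an array of shape (G, T). `geos` and `times`
    are optional 0-based index sequences; None selects every index.
    """
    diff = expected_fn(x1, z) - expected_fn(x0, z)
    g_idx = np.arange(diff.shape[0]) if geos is None else np.asarray(geos)
    t_idx = np.arange(diff.shape[1]) if times is None else np.asarray(times)
    return diff[np.ix_(g_idx, t_idx)].sum()

# Hypothetical toy model: media has shape (G, T, M), controls have shape (G, T, C).
beta = np.array([2.0, 1.0])      # made-up media coefficients
gamma = np.array([0.5])          # made-up control coefficient

def toy_expected(x, z):
    return 10.0 + x @ beta + z @ gamma

G, T = 3, 4
z = np.ones((G, T, 1))
x1 = np.ones((G, T, 2))          # scenario: one unit of each channel everywhere
x0 = x1.copy()
x0[:, :, 0] = 0.0                # counterfactual: channel 0 switched off

total = incremental_kpi(toy_expected, x1, x0, z)
last_two = incremental_kpi(toy_expected, x1, x0, z, times=[2, 3])
print(total, last_two)  # 24.0 12.0
```

Because the control term is identical in both scenarios, it cancels in the difference for this additive toy model. Restricting `times` to the last two periods halves the total here only because the toy effect is constant across time.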
