meridian.analysis.analyzer.Analyzer

Runs calculations to analyze the raw data after fitting the model.

meridian.analysis.analyzer.Analyzer(
    meridian: (meridian.model.model.Meridian | None) = None,
    *,
    model_context: (meridian.model.context.ModelContext | None) = None,
    inference_data: (az.InferenceData | None) = None
)

Attributes
`inference_data`
`model_context`

Methods

`adstock_decay`

adstock_decay(
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> pd.DataFrame

Calculates adstock decay for paid media, RF, and organic media channels.

Args
`confidence_level`	Confidence level for prior and posterior credible intervals, represented as a value between zero and one.

Returns
Pandas DataFrame containing the `channel`, `time_units`, `distribution`, `ci_hi`, `ci_lo`, and `mean` for the Adstock function.
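
As a rough illustration of how `mean`, `ci_lo`, and `ci_hi` relate to `confidence_level`, the sketch below summarizes a list of draws with an equal-tailed interval. The helpers `quantile` and `summarize_draws` are hypothetical names, and the equal-tailed convention is an assumption; the library's own computation may differ.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def summarize_draws(draws, confidence_level=0.9):
    """Returns the mean and equal-tailed interval bounds for a list of draws."""
    ordered = sorted(draws)
    tail = (1 - confidence_level) / 2  # probability mass left in each tail
    return {
        "mean": sum(ordered) / len(ordered),
        "ci_lo": quantile(ordered, tail),
        "ci_hi": quantile(ordered, 1 - tail),
    }


# Toy draws 0.0, 0.1, ..., 1.0 standing in for sampled adstock weights.
summary = summarize_draws([i / 10 for i in range(11)], confidence_level=0.8)
```

A higher `confidence_level` leaves less mass in each tail, so the interval between `ci_lo` and `ci_hi` widens.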

`baseline_summary_metrics`

baseline_summary_metrics(
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    non_media_baseline_values: (Sequence[float] | None) = None,
    use_kpi: bool = False,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Returns baseline summary metrics.

Args
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing a subset of times to include. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected outcome is summed over all of the regions.
`aggregate_times`	Boolean. If `True`, the expected outcome is summed over all of the time periods.
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float which means that the fixed value will be used as baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.
`use_kpi`	Boolean. If `True`, the baseline summary metrics are calculated using KPI. If `False`, the metrics are calculated using revenue.
`confidence_level`	Confidence level for media summary metrics credible intervals, represented as a value between zero and one.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
An `xr.Dataset` with coordinates `metric` (`mean`, `median`, `ci_low`, `ci_high`) and `distribution` (`prior`, `posterior`), containing the following data variables: `baseline_outcome`, `pct_of_contribution`.

`compute_incremental_outcome_aggregate`

compute_incremental_outcome_aggregate(
    use_posterior: bool,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    use_kpi: bool = False,
    include_non_paid_channels: bool = True,
    non_media_baseline_values: (Sequence[float] | None) = None,
    **kwargs
) -> meridian.backend.Tensor

Aggregates the incremental outcome of the media channels.

Args
`use_posterior`	Boolean. If `True`, then the incremental outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	Optional `DataTensors` container with optional tensors: `media`, `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments` and `revenue_per_kpi`. If `None`, the incremental outcome is calculated using the `InputData` provided to the Meridian object. If `new_data` is provided, the incremental outcome is calculated using the new tensors in `new_data` and the original values of the remaining tensors. For example, `compute_incremental_outcome_aggregate(new_data=DataTensors(media=new_media))` computes the incremental outcome using `new_media` and the original values of `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments` and `revenue_per_kpi`. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`use_kpi`	Boolean. If `True`, the summary metrics are calculated using KPI. If `False`, the metrics are calculated using revenue.
`include_non_paid_channels`	Boolean. If `True`, then non-media treatments and organic effects are included in the calculation. If `False`, then only the paid media and RF effects are included.
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float which means that the fixed value will be used as baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.
`**kwargs`	kwargs to pass to `incremental_outcome`, which could contain selected_geos, selected_times, aggregate_geos, aggregate_times, batch_size.

Returns
A Tensor with the same dimensions as `incremental_outcome` except the size of the channel dimension is incremented by one, with the new component at the end containing the total incremental outcome of all channels.
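
The shape of the return value can be pictured with plain nested lists. This is only a sketch: it assumes geos and times have already been aggregated, so the toy data has shape `(n_chains, n_draws, n_channels)`, and `append_channel_total` is a hypothetical helper rather than part of the library.

```python
def append_channel_total(incremental_outcome):
    """Appends the sum over channels as one extra entry on the last axis.

    `incremental_outcome` is a nested list shaped (n_chains, n_draws, n_channels).
    """
    return [
        [channels + [sum(channels)] for channels in chain]
        for chain in incremental_outcome
    ]


# Toy values: 2 chains, 2 draws, 3 channels.
toy_outcome = [
    [[10.0, 20.0, 5.0], [12.0, 18.0, 6.0]],
    [[9.0, 21.0, 4.0], [11.0, 19.0, 7.0]],
]
with_total = append_channel_total(toy_outcome)
```

Each innermost list now holds four entries: the three per-channel values followed by their total.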

`cpik`

cpik(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates the cost per incremental KPI distribution for each channel.

The CPIK numerator is the total spend on the channel. The CPIK denominator is the change in expected KPI when one channel's spend is set to zero, leaving all other channels' spend unchanged.

If new_data=None, this method calculates CPIK conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the CPIK numerator is the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Note that CPIK is simply 1/ROI, where ROI is obtained from a call to the roi method with use_kpi=True.
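
The reciprocal relationship between CPIK and KPI-based ROI can be checked with toy numbers. The functions and values below are hypothetical and operate on plain lists of draws for a single channel; they do not call the library.

```python
def cpik_per_draw(spend, incremental_kpi_draws):
    """Cost per incremental KPI for one channel, one value per parameter draw."""
    return [spend / kpi for kpi in incremental_kpi_draws]


def roi_per_draw(spend, incremental_kpi_draws):
    """KPI-based ROI for one channel, one value per parameter draw."""
    return [kpi / spend for kpi in incremental_kpi_draws]


# Toy numbers: total channel spend and three draws of incremental KPI.
channel_spend = 1000.0
incremental_kpi = [200.0, 250.0, 400.0]

cpik = cpik_per_draw(channel_spend, incremental_kpi)
roi = roi_per_draw(channel_spend, incremental_kpi)
```

Because the reciprocal is taken draw by draw, the mean of the CPIK distribution is generally not the reciprocal of the mean ROI.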

Args
`use_posterior`	Boolean. If `True` then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	Optional. `DataTensors` containing `media`, `media_spend`, `reach`, `frequency`, `rf_spend` and `revenue_per_kpi` data. If provided, the CPIK is calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the CPIK is calculated using the original values of all the tensors. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`selected_geos`	Optional. Contains a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the `new_data` args, if provided. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected KPI is summed over all of the regions.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
Tensor of CPIK values with dimensions `(n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels))`. The `n_geos` dimension is dropped if `aggregate_geos=True`.

`expected_outcome`

expected_outcome(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    inverse_transform_outcome: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates either prior or posterior expected outcome.

This calculates E(Outcome|Media, RF, Organic media, Organic RF, Non-media treatments, Controls) for each posterior (or prior) parameter draw, where Outcome refers to either revenue if use_kpi=False, or kpi if use_kpi=True. When revenue_per_kpi is not defined, use_kpi cannot be False.

If new_data=None, this method calculates expected outcome conditional on the values of the independent variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument, as long as the new tensors' dimensions match. For example,

new_data=DataTensors(reach=new_reach, frequency=new_frequency)
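
The override rule (tensors set in `new_data` replace the originals, unset ones fall back to the original data, and dimensions must match) can be sketched with a small stand-in container. `ToyTensors` and `fill_from_original` are hypothetical names used for illustration only and are not the library's `DataTensors` implementation.

```python
from dataclasses import dataclass, fields
from typing import Optional, Sequence


@dataclass
class ToyTensors:
    """Hypothetical stand-in for a container of optional data tensors."""
    media: Optional[Sequence[float]] = None
    reach: Optional[Sequence[float]] = None
    frequency: Optional[Sequence[float]] = None


def fill_from_original(new_data, original):
    """Uses each tensor from `new_data` if set, else the original tensor."""
    resolved = {}
    for field in fields(ToyTensors):
        new_value = getattr(new_data, field.name)
        old_value = getattr(original, field.name)
        if new_value is None:
            resolved[field.name] = old_value
            continue
        # The new tensor must have the same length as the one it replaces.
        if old_value is not None and len(new_value) != len(old_value):
            raise ValueError(f"`{field.name}` does not match the original length.")
        resolved[field.name] = new_value
    return ToyTensors(**resolved)


original = ToyTensors(media=[1.0, 2.0], reach=[10.0, 20.0], frequency=[2.0, 3.0])
override = ToyTensors(reach=[15.0, 25.0], frequency=[1.5, 2.5])
resolved = fill_from_original(override, original)
```

Here `resolved.media` keeps the original values while `reach` and `frequency` take the new ones.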

In principle, expected outcome could be calculated with other time dimensions (for future predictions, for instance). However, this is not allowed with this method because of the additional complexities this introduces:

Corresponding price (revenue per KPI) data would also be needed.
If the model contains weekly effect parameters, then some method is needed to estimate or predict these effects for time periods outside of the training data window.

Args
`use_posterior`	Boolean. If `True`, then the expected outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	An optional `DataTensors` container with optional new tensors: `media`, `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments`, `revenue_per_kpi`, `controls`. If `None`, expected outcome is calculated conditional on the original values of the data tensors that the Meridian object was initialized with. If `new_data` argument is used, expected outcome is calculated conditional on the values of the tensors passed in `new_data` and on the original values of the remaining unset tensors. For example, `expected_outcome(new_data=DataTensors(reach=new_reach, frequency=new_frequency))` calculates expected outcome conditional on the original `media`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments`, `revenue_per_kpi`, and `controls` tensors and on the new given values for `reach` and `frequency` tensors. The new tensors' dimensions must match the dimensions of the corresponding original tensors from `input_data`.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing a subset of dates to include. The values accepted here must match time dimension coordinates from `InputData.time`. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected outcome is summed over all regions.
`aggregate_times`	Boolean. If `True`, the expected outcome is summed over all time periods.
`inverse_transform_outcome`	Boolean. If `True`, returns the expected outcome in the original KPI or revenue (depending on what is passed to `use_kpi`), as it was passed to `InputData`. If `False`, returns the outcome after transformation by `KpiTransformer`, reflecting how it is represented within the model.
`use_kpi`	Boolean. If `use_kpi = True`, the expected KPI is calculated; otherwise the expected revenue `(kpi * revenue_per_kpi)` is calculated. It is required that `use_kpi = True` if `revenue_per_kpi` is not defined or if `inverse_transform_outcome = False`.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
Tensor of expected outcome (either KPI or revenue, depending on the `use_kpi` argument) with dimensions `(n_chains, n_draws, n_geos, n_times)`. The `n_geos` and `n_times` dimensions are dropped if `aggregate_geos=True` or `aggregate_times=True`, respectively.

Raises
`NotFittedModelError`	If `sample_posterior()` (for `use_posterior=True`) or `sample_prior()` (for `use_posterior=False`) has not been called prior to calling this method.

`expected_vs_actual_data`

expected_vs_actual_data(
    aggregate_geos: bool = False,
    aggregate_times: bool = False,
    use_kpi: bool = False,
    split_by_holdout_id: bool = False,
    non_media_baseline_values: (Sequence[float] | None) = None,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> xr.Dataset

Calculates the data for the expected versus actual outcome over time.

Args
`aggregate_geos`	Boolean. If `True`, the expected, baseline, and actual are summed over all of the regions.
`aggregate_times`	Boolean. If `True`, the expected, baseline, and actual are summed over all of the time periods.
`use_kpi`	If `True`, calculate the incremental KPI. Otherwise, calculate the incremental revenue using the revenue per KPI (if available).
`split_by_holdout_id`	Boolean. If `True` and `holdout_id` exists, the data is split into `'Train'`, `'Test'`, and `'All Data'` subsections.
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float which means that the fixed value will be used as baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.
`confidence_level`	Confidence level for expected outcome credible intervals, represented as a value between zero and one. Default: `0.9`.

Returns
A dataset with the expected, baseline, and actual outcome metrics.

`filter_and_aggregate_geos_and_times`

filter_and_aggregate_geos_and_times(
    tensor: meridian.backend.Tensor,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    flexible_time_dim: bool = False,
    has_media_dim: bool = True
) -> meridian.backend.Tensor

Filters and/or aggregates geo and time dimensions of a tensor.

Args
`tensor`	Tensor with dimensions `[..., n_geos, n_times]` or `[..., n_geos, n_times, n_channels]`, where `n_channels` is the number of either media channels, RF channels, all paid channels (media and RF), or all channels (media, RF, non-media, organic media, organic RF).
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included. The selected geos should match those in `InputData.geo`.
`selected_times`	Optional list of times to include. This can either be a string list containing a subset of time dimension coordinates from `InputData.time` or a boolean list with length equal to the time dimension of the tensor. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the tensor is summed over all geos.
`aggregate_times`	Boolean. If `True`, the tensor is summed over all time periods.
`flexible_time_dim`	Boolean. If `True`, the time dimension of the tensor is not required to match the number of time periods in `InputData.time`. In this case, if using `selected_times`, it must be a boolean list with length equal to the time dimension of the tensor.
`has_media_dim`	Boolean. Only used if `flexible_time_dim=True`. Otherwise, this is assumed based on the tensor dimensions. If `True`, the tensor is assumed to have a media dimension following the time dimension. If `False`, the last dimension of the tensor is assumed to be the time dimension.

Returns
A tensor with filtered and/or aggregated geo and time dimensions.
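
The filter-then-sum behavior can be sketched on a small `(n_geos, n_times)` nested list. The function below is a hypothetical simplification: it has no leading or channel dimensions, and it accepts only a boolean mask for `selected_times` (the string-date form is omitted).

```python
def filter_and_aggregate(values, geo_names, selected_geos=None, selected_times=None,
                         aggregate_geos=True, aggregate_times=True):
    """Filters a (n_geos, n_times) nested list, then optionally sums each axis."""
    keep_geo = [selected_geos is None or name in selected_geos for name in geo_names]
    n_times = len(values[0])
    keep_time = selected_times if selected_times is not None else [True] * n_times

    filtered = [
        [v for v, keep in zip(row, keep_time) if keep]
        for row, keep in zip(values, keep_geo) if keep
    ]
    if aggregate_geos:
        # Sum over geos, leaving one value per selected time period.
        per_time = [sum(column) for column in zip(*filtered)]
        return sum(per_time) if aggregate_times else per_time
    if aggregate_times:
        # Sum over time, leaving one value per selected geo.
        return [sum(row) for row in filtered]
    return filtered


geos = ["north", "south", "west"]
outcome = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
total = filter_and_aggregate(outcome, geos, ["north", "west"], [True, False, True])
per_geo = filter_and_aggregate(outcome, geos, aggregate_geos=False)
```

Filtering happens before aggregation, so excluded geos and time periods never contribute to the sums.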

`get_aggregated_impressions`

get_aggregated_impressions(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    optimal_frequency: (Sequence[float] | None) = None,
    include_non_paid_channels: bool = True
) -> meridian.backend.Tensor

Computes aggregated impressions values in the data across all channels.

Args
`new_data`	An optional `DataTensors` object containing the new `media`, `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, and `non_media_treatments` tensors. If `new_data` argument is used, then the aggregated impressions are computed using the values of the tensors passed in the `new_data` argument and the original values of all the remaining tensors. If `None`, the existing tensors from the Meridian object are used.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the tensors in the `new_data` argument, if provided. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected outcome is summed over all of the regions.
`aggregate_times`	Boolean. If `True`, the expected outcome is summed over all of the time periods.
`optimal_frequency`	An optional list with dimension `n_rf_channels`, containing the optimal frequency per channel, that maximizes posterior mean ROI. Default value is `None`, and historical frequency is used for the metrics calculation.
`include_non_paid_channels`	Boolean. If `True`, the organic media, organic RF, and non-media channels are included in the aggregation.

Returns
A tensor with the shape `(n_selected_geos, n_selected_times, n_channels)` (or `(n_channels,)` if geos and times are aggregated) with aggregate impression values per channel.

`get_aggregated_spend`

get_aggregated_spend(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    include_media: bool = True,
    include_rf: bool = True
) -> xr.DataArray

Gets the aggregated spend based on the selected geos and time.

Args
`new_data`	An optional `DataTensors` object containing the new `media`, `media_spend`, `reach`, `frequency`, `rf_spend` tensors. If `None`, the existing tensors from the Meridian object are used. If `new_data` argument is used, then the aggregated spend is computed using the values of the tensors passed in the `new_data` argument and the original values of all the remaining tensors. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included. The selected geos should match those in `InputData.geo`.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in KPI data. By default, all time periods are included.
`include_media`	Whether to include spends for paid media channels that do not have R&F data.
`include_rf`	Whether to include spends for paid media channels with R&F data.

Returns
An `xr.DataArray` with the coordinate `channel`, containing the data variable `spend`.

Raises
`ValueError`	If `include_media` and `include_rf` are both `False`.

`get_historical_spend`

get_historical_spend(
    selected_times: (Sequence[str] | None) = None,
    include_media: bool = True,
    include_rf: bool = True
) -> xr.DataArray

Deprecated. Gets the aggregated historical spend based on the time.

Args
`selected_times`	The time periods for which to get the historical spend. If `None`, the historical spend is aggregated over all time points.
`include_media`	Whether to include spends for paid media channels that do not have R&F data.
`include_rf`	Whether to include spends for paid media channels with R&F data.

Returns
An `xr.DataArray` with the coordinate `channel`, containing the data variable `spend`.

Raises
`ValueError`	If `include_media` and `include_rf` are both `False`.

`get_rhat`

get_rhat() -> Mapping[str, meridian.backend.Tensor]

Computes the R-hat values for each parameter in the model.

Returns
A dictionary of R-hat values, keyed by parameter name, where each value holds the R-hat values for that parameter.

Raises
`NotFittedModelError`	If `sample_posterior()` has not been called before calling this method.
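
For intuition about what an R-hat value measures, the sketch below computes the classic (non-split) Gelman-Rubin statistic for a single scalar parameter from plain lists of draws. This textbook formula is an assumption chosen for illustration; the library's own computation may use a different variant, and `gelman_rubin_rhat` is a hypothetical name.

```python
import math
import statistics


def gelman_rubin_rhat(chains):
    """Classic (non-split) R-hat for one scalar parameter.

    `chains` is a list of equally long lists of draws, one list per chain.
    """
    n_draws = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance of the chain means, scaled by the draw count.
    between = n_draws * statistics.variance(chain_means)
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return math.sqrt(pooled / within)


# Two chains exploring the same region versus two chains stuck far apart.
rhat_mixed = gelman_rubin_rhat([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
rhat_stuck = gelman_rubin_rhat([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
```

Values close to 1 suggest the chains agree; values well above 1, as in the second case, suggest they have not converged to the same distribution.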

`hill_curves`

hill_curves(
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    n_bins: int = 25
) -> pd.DataFrame

Estimates Hill curve tables used for plotting each channel's curves.

Args
`confidence_level`	Confidence level for prior and posterior credible intervals, represented as a value between zero and one. Default is `0.9`.
`n_bins`	Number of equal-width bins to include in the histogram for the plotting. Default is `25`.

Returns
Hill curves `pd.DataFrame` with columns:

`channel`: `media` or `rf` channel name.
`media_units`: Media (for `media` channels) or average frequency (for `rf` channels) units.
`distribution`: Indication of `posterior` or `prior` draw.
`ci_hi`: Upper bound of the credible interval of the value of the Hill function.
`ci_lo`: Lower bound of the credible interval of the value of the Hill function.
`mean`: Point-wise mean of the value of the Hill function per draw.
`channel_type`: Indication of a `media` or `rf` channel.
`scaled_count_histogram`: Scaled count of media units or average frequencies within the bin.
`count_histogram`: Count value of media units or average frequencies within the bin.
`start_interval_histogram`: Media unit or average frequency starting point for a histogram bin.
`end_interval_histogram`: Media unit or average frequency ending point for a histogram bin.
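
To make "the value of the Hill function" concrete, here is a minimal sketch that assumes the common two-parameter form `x**slope / (x**slope + ec**slope)`. The functional form, the parameter names `ec` and `slope`, and the numbers are assumptions for illustration and are not taken from this page.

```python
def hill(media_units, ec, slope):
    """Hill saturation: x^slope / (x^slope + ec^slope), a value in [0, 1)."""
    if media_units <= 0:
        return 0.0
    return media_units**slope / (media_units**slope + ec**slope)


# Evaluate the curve on a small grid of media units for one parameter draw.
grid = [0.0, 1.0, 2.0, 4.0, 8.0]
curve = [hill(x, ec=2.0, slope=1.5) for x in grid]
```

Under this form the curve reaches one half exactly where `media_units` equals `ec`. Evaluating it for every parameter draw and summarizing across draws would give per-point `mean`, `ci_lo`, and `ci_hi` values like those in the table.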

`incremental_outcome`

incremental_outcome(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    non_media_baseline_values: (Sequence[float] | None) = None,
    scaling_factor0: float = 0.0,
    scaling_factor1: float = 1.0,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    media_selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    inverse_transform_outcome: bool = True,
    use_kpi: bool = False,
    by_reach: bool = True,
    include_non_paid_channels: bool = True,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates either the posterior or prior incremental outcome.

This calculates the media outcome of each media channel for each posterior or prior parameter draw. Incremental outcome is defined as:

E(Outcome|Treatment_1, Controls) minus E(Outcome|Treatment_0, Controls)

For paid & organic channels (without reach and frequency data), Treatment_1 means that media execution for a given channel is multiplied by scaling_factor1 (1.0 by default) for the set of time periods specified by media_selected_times. Similarly, Treatment_0 means that media execution is multiplied by scaling_factor0 (0.0 by default) for these time periods.

For paid & organic channels with reach and frequency data, either reach or frequency is held fixed while the other is scaled, depending on the by_reach argument.

For non-media treatments, Treatment_1 means that the variable is set to historical values. Treatment_0 means that the variable is set to its baseline value for all geos and time periods. Note that the scaling factors (scaling_factor0 and scaling_factor1) are not applicable to non-media treatments.

"Outcome" refers to either revenue if use_kpi=False, or kpi if use_kpi=True. When revenue_per_kpi is not defined, use_kpi cannot be False.

If new_data=None, this method computes incremental outcome using media, reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi tensors that the Meridian object was initialized with. This behavior can be overridden with the new_data argument. For example, new_data=DataTensors(media=new_media) calculates incremental outcome using the new_media tensor and the original values of reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi tensors.

The calculation in this method depends on two key assumptions made in the Meridian implementation:

Additivity of media effects (no interactions).
Additive changes on the model KPI scale correspond to additive changes on the original KPI scale. In other words, the intercept and control effects do not influence the media effects. This assumption currently holds because the outcome transformation only involves centering and scaling (for example, it applies no log transformation).
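
The definition above, together with the additivity assumption, can be sketched with a toy additive model. Everything here is hypothetical: the saturating effect function, the parameter values, and the `toy_` function names stand in for the fitted model and are not the library's implementation.

```python
def saturating_effect(media, beta, half_saturation):
    """Toy per-channel effect: beta * media / (media + half_saturation)."""
    return beta * media / (media + half_saturation)


def toy_expected_outcome(media_by_channel, params, baseline=100.0):
    """Additive toy model: baseline plus the sum of channel effects."""
    return baseline + sum(
        saturating_effect(media_by_channel[name], *params[name])
        for name in media_by_channel
    )


def toy_incremental_outcome(media_by_channel, params, scaling_factor0=0.0,
                            scaling_factor1=1.0):
    """Per-channel E(outcome | treatment_1) minus E(outcome | treatment_0)."""
    result = {}
    for name, media in media_by_channel.items():
        # Scale only this channel; all other channels keep their values.
        treatment_1 = dict(media_by_channel, **{name: media * scaling_factor1})
        treatment_0 = dict(media_by_channel, **{name: media * scaling_factor0})
        result[name] = (toy_expected_outcome(treatment_1, params)
                        - toy_expected_outcome(treatment_0, params))
    return result


media = {"tv": 30.0, "search": 10.0}
params = {"tv": (50.0, 30.0), "search": (20.0, 10.0)}  # (beta, half_saturation)
increments = toy_incremental_outcome(media, params)
```

Because the toy model is additive, the baseline and the other channels cancel in each difference, and the per-channel increments sum to the expected outcome minus the baseline.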

Args
`use_posterior`	Boolean. If `True`, then the incremental outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	Optional `DataTensors` container with optional tensors: `media`, `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments` and `revenue_per_kpi`. If `None`, the incremental outcome is calculated using the `InputData` provided to the Meridian object. If `new_data` is provided, the incremental outcome is calculated using the new tensors in `new_data` and the original values of the remaining tensors. For example, `incremental_outcome(new_data=DataTensors(media=new_media))` computes the incremental outcome using `new_media` and the original values of `reach`, `frequency`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments` and `revenue_per_kpi`. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float which means that the fixed value will be used as baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.
`scaling_factor0`	Float. The factor by which to scale the counterfactual scenario "Media_0" during the time periods specified in `media_selected_times`. Must be non-negative and less than `scaling_factor1`.
`scaling_factor1`	Float. The factor by which to scale "Media_1" during the selected time periods specified in `media_selected_times`. Must be non-negative and greater than `scaling_factor0`.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in `new_data` if time is modified in `new_data`, or `input_data.n_times` otherwise. The incremental outcome corresponds to incremental KPI generated during the `selected_times` arg by media executed during the `media_selected_times` arg. Note that if `use_kpi=False`, then `selected_times` can only include the time periods that have `revenue_per_kpi` input data. By default, all time periods are included where `revenue_per_kpi` data is available.
`media_selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in KPI data or number of time periods in the `new_data` args, if provided. If `new_data` is provided, `media_selected_times` can select any subset of time periods in `new_data`. If `new_data` is not provided, `media_selected_times` selects from `InputData.time`. The incremental outcome corresponds to incremental KPI generated during the `selected_times` arg by treatment variables executed during the `media_selected_times` arg. For each channel, the incremental outcome is defined as the difference between expected KPI when treatment variables execution is scaled by `scaling_factor1` and `scaling_factor0` during these specified time periods. By default, the difference is between treatment variables at historical execution levels, or as provided in `new_data`, versus zero execution. Defaults to include all time periods.
`aggregate_geos`	Boolean. If `True`, then incremental outcome is summed over all regions.
`aggregate_times`	Boolean. If `True`, then incremental outcome is summed over all time periods.
`inverse_transform_outcome`	Boolean. If `True`, returns the expected outcome in the original KPI or revenue (depending on what is passed to `use_kpi`), as it was passed to `InputData`. If `False`, returns the outcome after transformation by `KpiTransformer`, reflecting how it is represented within the model.
`use_kpi`	Boolean. If `use_kpi = True`, the expected KPI is calculated; otherwise the expected revenue `(kpi * revenue_per_kpi)` is calculated. It is required that `use_kpi = True` if `revenue_per_kpi` data is not available or if `inverse_transform_outcome = False`.
`by_reach`	Boolean. If `True`, then the incremental outcome is calculated by scaling the reach and holding the frequency constant. If `False`, then the incremental outcome is calculated by scaling the frequency and holding the reach constant. Only used for channels with RF data.
`include_non_paid_channels`	Boolean. If `True`, then non-media treatments and organic effects are included in the calculation. If `False`, then only the paid media and RF effects are included.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
Tensor of incremental outcome (either KPI or revenue, depending on the `use_kpi` argument) with dimensions `(n_chains, n_draws, n_geos, n_times, n_channels)`. If `include_non_paid_channels=True`, then `n_channels` is the total number of media, RF, organic media, organic RF, and non-media channels. If `include_non_paid_channels=False`, then `n_channels` is the total number of media and RF channels. The `n_geos` and `n_times` dimensions are dropped if `aggregate_geos=True` or `aggregate_times=True`, respectively.

Raises
`NotFittedModelError`	If `sample_posterior()` (for `use_posterior=True`) or `sample_prior()` (for `use_posterior=False`) has not been called prior to calling this method.
`ValueError`	If `new_data` argument contains tensors with modified time dimension and not all treatment variables are provided in `new_data` with matching time dimensions.

`marginal_roi`

View source

marginal_roi(
    incremental_increase: float = 0.01,
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    by_reach: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates the marginal ROI prior or posterior distribution.

The marginal ROI (mROI) numerator is the change in expected outcome (kpi or kpi * revenue_per_kpi) when one channel's spend is increased by a small fraction. The mROI denominator is the corresponding small fraction of the channel's total spend.
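As a rough illustration of the numerator and denominator described above, the sketch below computes mROI for a single channel. The response function `toy_outcome` and all numbers are hypothetical stand-ins and are not part of Meridian; only the ratio structure follows the text.

```python
def toy_outcome(spend: float) -> float:
    """Hypothetical saturating response: outcome generated by a channel's spend."""
    return 1000.0 * spend / (spend + 500.0)


def marginal_roi(spend: float, incremental_increase: float = 0.01) -> float:
    """mROI = change in outcome for a small spend increase / that spend increase."""
    # Numerator: change in outcome when spend grows by a small fraction.
    numerator = toy_outcome(spend * (1.0 + incremental_increase)) - toy_outcome(spend)
    # Denominator: the same small fraction of the channel's total spend.
    denominator = incremental_increase * spend
    return numerator / denominator


mroi_low_spend = marginal_roi(100.0)
mroi_high_spend = marginal_roi(2000.0)
print(f"mROI at spend=100:  {mroi_low_spend:.3f}")
print(f"mROI at spend=2000: {mroi_high_spend:.3f}")
```

Because the toy response saturates, the mROI at the higher spend level is smaller than at the lower one.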

If new_data=None, this method calculates marginal ROI conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the mROI denominator is based on the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Args
`incremental_increase`	Small fraction by which each channel's spend is increased when calculating its mROI numerator. The mROI denominator is this fraction of the channel's total spend. Only used if marginal is `True`.
`use_posterior`	If `True` then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	Optional. DataTensors containing `media`, `media_spend`, `reach`, `frequency`, `rf_spend` and `revenue_per_kpi` data. If provided, the marginal ROI is calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the marginal ROI is calculated using the original values of all the tensors. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`selected_geos`	Optional. Contains a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the `new_data` args, if provided. By default, all time periods are included.
`aggregate_geos`	If `True`, the expected revenue is summed over all of the regions.
`by_reach`	Used for a channel with reach and frequency. If `True`, returns the mROI by reach for a given fixed frequency. If `False`, returns the mROI by frequency for a given fixed reach.
`use_kpi`	If `False`, then revenue is used to calculate the mROI numerator. Otherwise, uses KPI to calculate the mROI numerator.
`batch_size`	Maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
Tensor of mROI values with dimensions `(n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels))`. The `n_geos` dimension is dropped if `aggregate_geos=True`.

`negative_baseline_probability`

View source

negative_baseline_probability(
    non_media_baseline_values: (Sequence[float] | None) = None,
    use_posterior: bool = True,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> np.floating

Calculates either prior or posterior negative baseline probability.

This calculates either the prior or posterior probability that the baseline, aggregated over the supplied time window, is negative.

The baseline is calculated by computing expected_outcome with the following assumptions: 1) media is set to all zeros, 2) reach is set to all zeros, 3) organic_media is set to all zeros, 4) organic_reach is set to all zeros, 5) non_media_treatments is set to the counterfactual values according to the non_media_baseline_values argument, 6) controls are set to historical values.
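The probability itself can be pictured as the share of draws whose aggregated baseline falls below zero. The following is a minimal sketch under that reading, assuming the baseline draws have already been computed and flattened across chains; the function name and the data are hypothetical and this is not Meridian's implementation.

```python
def negative_baseline_share(baseline_draws: list[list[float]]) -> float:
    """Share of draws whose baseline, summed over the time window, is negative.

    baseline_draws[i][t] is the baseline outcome of draw i in time period t.
    """
    totals = [sum(draw) for draw in baseline_draws]
    negative_count = sum(1 for total in totals if total < 0)
    return negative_count / len(totals)


# Four hypothetical draws over three time periods.
draws = [
    [5.0, -1.0, 2.0],   # sums to 6.0
    [-4.0, 1.0, 1.0],   # sums to -2.0
    [0.5, 0.5, -0.5],   # sums to 0.5
    [-1.0, -1.0, 1.5],  # sums to -0.5
]
prob = negative_baseline_share(draws)
print(prob)
```

Note that a draw can contain negative values in individual periods and still count as non-negative, because the check is applied after aggregating over the time window.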

Args
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float denoting a fixed value that will be used as the baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.
`use_posterior`	Boolean. If `True`, then the expected outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing a subset of dates to include. The values accepted here must match time dimension coordinates from `InputData.time`. By default, all time periods are included.
`use_kpi`	Boolean. If `use_kpi = True`, the expected KPI is calculated; otherwise the expected revenue `(kpi * revenue_per_kpi)` is calculated. It is required that `use_kpi = True` if `revenue_per_kpi` is not defined or if `inverse_transform_outcome = False`.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
A float representing the prior or posterior negative baseline probability over the supplied time window.

Raises
`NotFittedModelError`	If `sample_posterior()` (for `use_posterior=True`) or `sample_prior()` (for `use_posterior=False`) has not been called prior to calling this method.

`optimal_freq`

View source

optimal_freq(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    max_frequency: (float | None) = None,
    freq_grid: (Sequence[float] | None) = None,
    use_posterior: bool = True,
    use_kpi: bool = False,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> xr.Dataset

Calculates the optimal frequency that maximizes posterior mean ROI.

For this optimization, historical spend is used and fixed, and frequency is restricted to be constant across all geographic regions and time periods. Reach is calculated for each geographic area and time period such that the number of impressions remains unchanged as frequency varies. Meridian solves for the frequency at which posterior mean ROI is optimized.
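The search described above can be sketched as a grid search in which impressions and spend stay fixed and reach is derived from each candidate frequency. The Hill-like effect in `toy_roi`, the `slope` parameter, and all numbers are hypothetical and chosen only for illustration; Meridian's actual model differs.

```python
def toy_roi(frequency: float, impressions: float, spend: float, slope: float) -> float:
    """Hypothetical ROI for one draw of the model parameter `slope`."""
    # Reach is chosen so that reach * frequency stays equal to impressions.
    reach = impressions / frequency
    # Hypothetical S-shaped per-person effect of frequency (half-saturation at 3).
    effect_per_person = frequency**slope / (frequency**slope + 3.0**slope)
    return reach * effect_per_person / spend


def mean_roi(frequency, slope_draws, impressions, spend):
    """Mean ROI across draws at one candidate frequency."""
    rois = [toy_roi(frequency, impressions, spend, s) for s in slope_draws]
    return sum(rois) / len(rois)


def best_frequency(freq_grid, slope_draws, impressions, spend):
    """Grid value that maximizes the mean ROI across draws."""
    return max(freq_grid, key=lambda f: mean_roi(f, slope_draws, impressions, spend))


# Grid from 1.0 to 6.0 in increments of 0.1.
freq_grid = [round(1.0 + 0.1 * i, 1) for i in range(51)]
slope_draws = [1.8, 2.0, 2.2]  # hypothetical parameter draws
best = best_frequency(freq_grid, slope_draws, 1_000_000.0, 5_000.0)
print(best)
```

The maximization is over the mean across draws, not over each draw separately, which matches the "posterior mean ROI" wording in the description.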

If new_data=None, this method calculates the optimal frequency conditional on the values of the paid RF variables that the Meridian object was initialized with. The user can override this historical data through the new_data argument. For example,

new_data = DataTensors(reach=new_reach, frequency=new_frequency)

Args
`new_data`	Optional `DataTensors` object containing `rf_impressions`, `rf_spend`, and `revenue_per_kpi`. If provided, the optimal frequency is calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the historical data used to initialize the Meridian object is used. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`max_frequency`	Maximum frequency value used to calculate the frequency grid. If `None`, the maximum frequency value is calculated from the historic frequency (maximum value of Meridian.input_data, not `new_data`). If `freq_grid` is provided, this argument has no effect.
`freq_grid`	List of frequency values. The ROI of each channel is calculated for each frequency value in the list. By default, the list includes numbers from `1.0` to the maximum frequency in increments of `0.1`.
`use_posterior`	Boolean. If `True`, posterior optimal frequencies are generated. If `False`, prior optimal frequencies are generated.
`use_kpi`	Boolean. If `True`, the counterfactual metrics are calculated using KPI. If `False`, the counterfactual metrics are calculated using revenue.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in `new_data` if time is modified in `new_data`, or `input_data.n_times` otherwise. By default, all time periods are included.
`confidence_level`	Confidence level for prior and posterior credible intervals, represented as a value between zero and one.

Returns
An xarray Dataset which contains: Coordinates: `frequency`, `rf_channel`, `metric` (`mean`, `median`, `ci_lo`, `ci_hi`). Data variables: `optimal_frequency`: The frequency that optimizes the posterior mean of ROI. `roi`: The ROI for each frequency value in `freq_grid`. `optimized_incremental_outcome`: The incremental outcome based on the optimal frequency. `optimized_effectiveness`: The effectiveness based on the optimal frequency. `optimized_roi`: The ROI based on the optimal frequency. `optimized_mroi_by_reach`: The marginal ROI with a small change in reach and fixed frequency at the optimal frequency. `optimized_mroi_by_frequency`: The marginal ROI with a small change around the optimal frequency and fixed reach. `optimized_cpik`: The CPIK based on the optimal frequency.

Raises
`NotFittedModelError`	If `sample_posterior()` (for `use_posterior=True`) or `sample_prior()` (for `use_posterior=False`) has not been called prior to calling this method.
`ValueError`	If there are no channels with reach and frequency data.

`predictive_accuracy`

View source

predictive_accuracy(
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Calculates R-Squared, MAPE, and wMAPE goodness of fit metrics.

R-Squared, MAPE (mean absolute percentage error), and wMAPE (weighted mean absolute percentage error) are calculated on the revenue scale (KPI * revenue_per_kpi) when revenue_per_kpi is specified, or the KPI scale when revenue_per_kpi = None. This is the same scale as what is used in the ROI numerator (incremental outcome).

Prediction errors in wMAPE are weighted by the actual revenue (KPI * revenue_per_kpi) when revenue_per_kpi is specified, or weighted by the KPI scale when revenue_per_kpi = None. This means that percentage errors when revenue is high are weighted more heavily than errors when revenue is low.

R-Squared, MAPE and wMAPE are calculated both at the model-level (one observation per geo and time period) and at the national-level (aggregating KPI or revenue outcome across geos so there is one observation per time period).

R-Squared, MAPE, and wMAPE are calculated for the full sample. If the model object has any holdout observations, then R-squared, MAPE, and wMAPE are also calculated for the Train and Test subsets.
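For reference, the three metrics can be written out using their textbook definitions. This is a minimal sketch, not Meridian's code: the function names and sample values are hypothetical, and the library's implementation may differ in details such as how holdout observations are handled.

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual: list[float], predicted: list[float]) -> float:
    """Unweighted mean of the absolute percentage errors."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def wmape(actual: list[float], predicted: list[float]) -> float:
    """Percentage errors weighted by the actual values.

    Weighting |a - p| / a by a reduces to sum(|a - p|) / sum(a).
    """
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / sum(actual)


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(r_squared(actual, predicted), mape(actual, predicted), wmape(actual, predicted))
```

In this sample the largest observation is predicted exactly, so wMAPE comes out lower than MAPE: the errors fall on the periods that carry less weight.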

Args
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing a subset of dates to include. By default, all time periods are included.
`use_kpi`	Whether to use KPI or revenue scale for the predictive accuracy metrics.
`batch_size`	Integer representing the maximum draws per chain in each batch. By default, `batch_size` is `100`. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
An xarray Dataset containing the computed `R_Squared`, `MAPE`, and `wMAPE` values, with coordinates `metric`, `geo_granularity`, `evaluation_set`, and accompanying data variable `value`. If `holdout_id` exists, the data is split into `'Train'`, `'Test'`, and `'All Data'` subsections, and the three metrics are computed for each.

`response_curves`

View source

response_curves(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    spend_multipliers: (list[float] | None) = None,
    use_posterior: bool = True,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    by_reach: bool = True,
    use_optimal_frequency: bool = False,
    use_kpi: bool = False,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Generates an xarray.Dataset of response curves.

Response curves are calculated in aggregate across geos and time periods, assuming the historical flighting pattern across geos and time periods for each media channel.

A list of multipliers is applied to each media channel's total historical spend within selected_geos and selected_times to obtain the x-axis values. The y-axis values are the incremental outcome generated by each channel within selected_geos and selected_times under the counterfactual where media units in each geo and time period are scaled by the corresponding multiplier. (Media units for time periods prior to selected_times are also scaled by the multiplier.)
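The construction of the x-axis and y-axis values can be sketched for one channel as follows. The saturating function `toy_incremental_outcome` and the data are hypothetical; the sketch only mirrors the idea that every media unit is scaled by the same multiplier while the flighting pattern is kept.

```python
def toy_incremental_outcome(media_units: list[float]) -> float:
    """Hypothetical saturating outcome, summed over geo/time cells."""
    return sum(50.0 * u / (u + 20.0) for u in media_units)


def response_curve(
    media_units: list[float],
    total_spend: float,
    spend_multipliers: list[float],
) -> list[tuple[float, float]]:
    """Returns (spend, incremental outcome) points, one per multiplier."""
    points = []
    for multiplier in spend_multipliers:
        # Scale every cell by the same factor, preserving the flighting pattern.
        scaled_units = [multiplier * u for u in media_units]
        points.append((multiplier * total_spend, toy_incremental_outcome(scaled_units)))
    return points


# Hypothetical media units for four geo/time cells and their total spend.
curve = response_curve([10.0, 30.0, 0.0, 20.0], 1200.0, [0.0, 0.5, 1.0, 1.5, 2.0])
for spend, outcome in curve:
    print(f"spend={spend:7.1f}  incremental_outcome={outcome:6.2f}")
```

A multiplier of `1.0` reproduces the historical spend level, and a multiplier of `0.0` gives the origin of the curve.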

Args
`new_data`	Optional `DataTensors` object with optional new tensors: `media`, `reach`, `frequency`, `media_spend`, `rf_spend`, `revenue_per_kpi`, `times`. If provided, the response curves are calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the response curves are calculated using the original values of all the tensors. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods and the `time` tensor must be provided.
`spend_multipliers`	List of multipliers. Each channel's total spend is multiplied by these factors to obtain the values at which the curve is calculated for that channel.
`use_posterior`	Boolean. If `True`, posterior response curves are generated. If `False`, prior response curves are generated.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing a subset of dates to include. If `new_data` is provided with modified time periods, then `selected_times` must be a subset of `new_data.times`. Otherwise, `selected_times` must be a subset of `self._model_context.input_data.time`. By default, all time periods are included.
`by_reach`	Boolean. For channels with reach and frequency. If `True`, plots the response curve by reach. If `False`, plots the response curve by frequency.
`use_optimal_frequency`	If `True`, uses the optimal frequency to plot the response curves. Defaults to `False`.
`use_kpi`	A boolean flag indicating whether to use KPI instead of revenue to generate the response curves. Defaults to `False`.
`confidence_level`	Confidence level for prior and posterior credible intervals, represented as a value between zero and one.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
An `xarray.Dataset` containing the data needed to visualize response curves.

`rhat_summary`

View source

rhat_summary(
    bad_rhat_threshold: float = 1.2
) -> pd.DataFrame

Computes a summary of the R-hat values for each parameter in the model.

Summarizes the Gelman & Rubin (1992) potential scale reduction for chain convergence, commonly referred to as R-hat. This convergence diagnostic measures the degree to which variance (of the means) between chains exceeds what you would expect if the chains were identically distributed. Values close to 1.0 indicate convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems (Brooks & Gelman, 1998).

References
Andrew Gelman and Donald B. Rubin. Inference from Iterative Simulation Using Multiple Sequences. Statistical Science, 7(4):457-472, 1992.
Stephen P. Brooks and Andrew Gelman. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics, 7(4), 1998.
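To make the diagnostic concrete, the sketch below computes the classic Gelman and Rubin potential scale reduction for a single scalar parameter. It is an illustration of the textbook formula only; the estimator Meridian actually uses may differ (for example, by splitting chains), and the function name and data are hypothetical.

```python
import math
import statistics


def classic_rhat(chains: list[list[float]]) -> float:
    """Classic potential scale reduction for one parameter.

    chains[c][i] is draw i of chain c; all chains must have equal length.
    """
    n_draws = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance of the chain means, scaled by chain length.
    between = n_draws * statistics.variance(chain_means)
    # Pooled estimate of the marginal posterior variance.
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return math.sqrt(pooled / within)


well_mixed = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
diverged = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(classic_rhat(well_mixed), classic_rhat(diverged))
```

With the default `bad_rhat_threshold` of `1.2`, the second pair of chains would be flagged, since their means sit far apart relative to the within-chain spread.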

Args
`bad_rhat_threshold`	The threshold for determining which R-hat values are considered bad.

Returns
A DataFrame with the following columns: `n_params`: The number of respective parameters in the model. `avg_rhat`: The average R-hat value for the respective parameter. `max_rhat`: The maximum R-hat value for the respective parameter. `percent_bad_rhat`: The percentage of R-hat values for the respective parameter that are greater than `bad_rhat_threshold`. `row_idx_bad_rhat`: The row indices of the R-hat values that are greater than `bad_rhat_threshold`. `col_idx_bad_rhat`: The column indices of the R-hat values that are greater than `bad_rhat_threshold`.

Raises
`NotFittedModelError`	If `self.sample_posterior()` is not called before calling this method.
`ValueError`	If the number of dimensions of the R-hat array for a parameter is not `1` or `2`.

`roi`

View source

roi(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates ROI prior or posterior distribution for each media channel.

The ROI numerator is the change in expected outcome (kpi or kpi * revenue_per_kpi) when one channel's spend is set to zero, leaving all other channels' spend unchanged. The ROI denominator is the total spend of the channel.
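The counterfactual behind the ROI numerator can be sketched with a toy model in which one channel's spend is set to zero while the others are left unchanged. The function `toy_expected_outcome`, the channel names, and the coefficient draws are hypothetical and exist only to show how a distribution of ROI values arises from parameter draws.

```python
def toy_expected_outcome(spend_by_channel: dict[str, float], beta: dict[str, float]) -> float:
    """Hypothetical expected outcome: a baseline plus concave channel effects."""
    baseline = 100.0
    return baseline + sum(beta[ch] * s**0.5 for ch, s in spend_by_channel.items())


def channel_roi(channel: str, spend_by_channel: dict[str, float], beta: dict[str, float]) -> float:
    """(Outcome with all channels - outcome with `channel` at zero) / channel spend."""
    counterfactual = dict(spend_by_channel)
    counterfactual[channel] = 0.0  # all other channels are left unchanged
    numerator = toy_expected_outcome(spend_by_channel, beta) - toy_expected_outcome(
        counterfactual, beta
    )
    return numerator / spend_by_channel[channel]


spend = {"tv": 400.0, "search": 100.0}
# Two hypothetical parameter draws; each yields one ROI value per channel.
beta_draws = [{"tv": 30.0, "search": 25.0}, {"tv": 34.0, "search": 20.0}]
tv_roi_draws = [channel_roi("tv", spend, beta) for beta in beta_draws]
print(tv_roi_draws)
```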

If new_data=None, this method calculates ROI conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the ROI denominator is the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Args
`use_posterior`	Boolean. If `True`, then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.
`new_data`	Optional. DataTensors containing `media`, `media_spend`, `reach`, `frequency`, `rf_spend`, and `revenue_per_kpi` data. If provided, the ROI is calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the ROI is calculated using the original values of all the tensors. If any of the tensors in `new_data` is provided with a different number of time periods than in `InputData`, then all tensors must be provided with the same number of time periods.
`selected_geos`	Optional. Contains a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the `new_data` args, if provided. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected revenue is summed over all of the regions.
`use_kpi`	If `False`, then revenue is used to calculate the ROI numerator. Otherwise, uses KPI to calculate the ROI numerator.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.

Returns
Tensor of ROI values with dimensions `(n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels))`. The `n_geos` dimension is dropped if `aggregate_geos=True`.

`summary_metrics`

View source

summary_metrics(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    marginal_roi_by_reach: bool = True,
    marginal_roi_incremental_increase: float = 0.01,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    optimal_frequency: (Sequence[float] | None) = None,
    use_kpi: bool = False,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE,
    include_non_paid_channels: bool = False,
    non_media_baseline_values: (Sequence[float] | None) = None
) -> xr.Dataset

Returns summary metrics.

If new_data=None, this method calculates all the metrics conditional on the values of the data variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example, to override the media, frequency, and non-media treatments data variables, the user can pass the following new_data argument:

new_data = DataTensors(
    media=new_media,
    frequency=new_frequency,
    non_media_treatments=new_non_media_treatments)

Note that if new_data is provided with a different number of time periods than in InputData, pct_of_contribution is not defined because expected_outcome() is not defined for new time periods.

Note that mroi and effectiveness metrics are not defined (math.nan) for the aggregate "All Paid Channels" channel dimension.

Args
`new_data`	Optional `DataTensors` object with optional new tensors: `media`, `media_spend`, `reach`, `frequency`, `rf_spend`, `organic_media`, `organic_reach`, `organic_frequency`, `non_media_treatments`, `controls`, `revenue_per_kpi`. If provided, the summary metrics are calculated using the values of the tensors passed in `new_data` and the original values of all the remaining tensors. If `None`, the summary metrics are calculated using the original values of all the tensors. If `new_data` is provided with a different number of time periods than in `InputData`, then all tensors, except `controls`, must have the same number of time periods.
`marginal_roi_by_reach`	Boolean. Marginal ROI (mROI) is defined as the return on the next dollar spent. If this argument is `True`, the assumption is that the next dollar spent only impacts reach, holding frequency constant. If this argument is `False`, the assumption is that the next dollar spent only impacts frequency, holding reach constant. Used only when `include_non_paid_channels` is `False`.
`marginal_roi_incremental_increase`	Small fraction by which each channel's spend is increased when calculating its mROI numerator. The mROI denominator is this fraction of the channel's total spend. Used only when `include_non_paid_channels` is `False`.
`selected_geos`	Optional list containing a subset of geos to include. By default, all geos are included.
`selected_times`	Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the tensors in the `new_data` argument, if provided. By default, all time periods are included.
`aggregate_geos`	Boolean. If `True`, the expected outcome is summed over all of the regions.
`aggregate_times`	Boolean. If `True`, the expected outcome is summed over all of the time periods. Note that if `False`, ROI, mROI, Effectiveness, and CPIK are not reported because they do not have a clear interpretation by time period.
`optimal_frequency`	An optional list with dimension `n_rf_channels`, containing the optimal frequency per channel, that maximizes posterior mean ROI. Default value is `None`, and historical frequency is used for the metrics calculation.
`use_kpi`	Boolean. If `True`, the summary metrics are calculated using KPI. If `False`, the metrics are calculated using revenue.
`confidence_level`	Confidence level for summary metrics credible intervals, represented as a value between zero and one.
`batch_size`	Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing `batch_size`. The calculation will generally be faster with larger `batch_size` values.
`include_non_paid_channels`	Boolean. If `True`, non-paid channels (organic media, organic reach and frequency, and non-media treatments) are included in the summary, but only the metrics independent of spend are reported. If `False`, only the paid channels (media, reach and frequency) are included, and the summary also contains the metrics dependent on spend. Default: `False`.
`non_media_baseline_values`	Optional list of shape `(n_non_media_channels,)`. Each element is a float which means that the fixed value will be used as baseline for the given channel. It is expected that they are scaled by population for the channels where `model_spec.non_media_population_scaling_id` is `True`. If `None`, the `model_spec.non_media_baseline_values` is used, which defaults to the minimum value for each non_media treatment channel.

Returns

An `xr.Dataset` with coordinates: `channel`, `metric` (`mean`, `median`, `ci_low`, `ci_high`), `distribution` (`prior`, `posterior`) and contains the following non-paid data variables: `incremental_outcome`, `pct_of_contribution`, `effectiveness`, and the following paid data variables: `impressions`, `pct_of_impressions`, `spend`, `pct_of_spend`, `CPM`, `roi`, `mroi`, `cpik`. The paid data variables are only included when `include_non_paid_channels` is `False`. Note that `roi`, `mroi`, `cpik`, and `effectiveness` metrics are not reported when `aggregate_times=False` because they do not have a clear interpretation by time period.
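The `metric` coordinate summarizes a distribution of draws into a point estimate and a credible interval. The sketch below shows one common way to do this, assuming an equal-tailed interval at the given confidence level with linear interpolation between order statistics; the helper names, dictionary keys, and the `0.9` level are illustrative choices and are not taken from Meridian.

```python
import statistics


def quantile(sorted_values: list[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def summarize(draws: list[float], confidence_level: float = 0.9) -> dict[str, float]:
    """Mean, median, and equal-tailed credible interval of the draws."""
    ordered = sorted(draws)
    tail = (1.0 - confidence_level) / 2.0  # probability left in each tail
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "ci_lo": quantile(ordered, tail),
        "ci_hi": quantile(ordered, 1.0 - tail),
    }


roi_draws = [float(i) for i in range(101)]  # hypothetical draws 0, 1, ..., 100
summary = summarize(roi_draws, confidence_level=0.9)
print(summary)
```

Raising `confidence_level` widens the interval, since less probability is left in each tail.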
