Reads data from a Pandas DataFrame.
Inherits From: InputDataLoader
meridian.data.load.DataFrameDataLoader(
df: pd.DataFrame,
coord_to_columns: CoordToColumns,
kpi_type: str,
media_to_channel: (Mapping[str, str] | None) = None,
media_spend_to_channel: (Mapping[str, str] | None) = None,
reach_to_channel: (Mapping[str, str] | None) = None,
frequency_to_channel: (Mapping[str, str] | None) = None,
rf_spend_to_channel: (Mapping[str, str] | None) = None,
organic_reach_to_channel: (Mapping[str, str] | None) = None,
organic_frequency_to_channel: (Mapping[str, str] | None) = None
)
This class reads input data from a Pandas DataFrame. The coord_to_columns attribute stores a mapping from the target InputData coordinates and array names to the DataFrame column names, for cases where they differ. The fields are:
- geo, time, kpi, revenue_per_kpi, population (single column)
- controls (multiple columns)
- (1) media, media_spend (multiple columns)
- (2) reach, frequency, rf_spend (multiple columns)
- non_media_treatments (multiple columns, optional)
- organic_media (multiple columns, optional)
- organic_reach, organic_frequency (multiple columns, optional)
The DataFrame must include group (1) or group (2), but does not need to include both.
In addition, each media channel must appear in either (1) or (2), but not in both.
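The per-channel rule can be checked before constructing the loader. The following is a minimal sketch, not part of Meridian: the helper name `check_channel_groups` is hypothetical, and it assumes the channel names are taken from the values of the `media_to_channel` and `reach_to_channel` mappings.

```python
from __future__ import annotations

from collections.abc import Mapping


def check_channel_groups(
    media_to_channel: Mapping[str, str] | None,
    reach_to_channel: Mapping[str, str] | None,
) -> set[str]:
    """Hypothetical pre-check: returns all channel names or raises ValueError."""
    # Channel names are the mapping values; a missing mapping means "no channels".
    media_channels = set((media_to_channel or {}).values())
    rf_channels = set((reach_to_channel or {}).values())

    # At least one of the two groups has to be present.
    if not media_channels and not rf_channels:
        raise ValueError("Provide media (1) or reach/frequency (2) channels.")

    # A single channel may not be listed in both groups.
    overlap = media_channels & rf_channels
    if overlap:
        raise ValueError(f"Channels listed in both groups: {sorted(overlap)}")

    return media_channels | rf_channels
```

For instance, `check_channel_groups({'impressions_tv': 'tv'}, {'reach_yt': 'yt'})` passes, while mapping a column in each group to the same channel name raises a `ValueError`.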
Important:
- Time column values must be formatted in yyyy-mm-dd date format.
- In a national model, geo and population are optional. If population is provided, it is reset to a default value of 1.0.
- If media data is provided, then media_to_channel and media_spend_to_channel are required. If reach and frequency data is provided, then reach_to_channel, frequency_to_channel, and rf_spend_to_channel are required.
- If organic_reach and organic_frequency data is provided, then organic_reach_to_channel and organic_frequency_to_channel are required.
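The dependency between data fields and mapping arguments can be restated as a small lookup table. This is an illustrative sketch only: `REQUIRED_MAPPINGS` and `missing_mappings` are hypothetical names, not part of Meridian, and the loader performs its own validation.

```python
from __future__ import annotations

# Hypothetical table restating the notes above: data field -> required arguments.
REQUIRED_MAPPINGS = {
    'media': ('media_to_channel', 'media_spend_to_channel'),
    'reach': ('reach_to_channel', 'frequency_to_channel', 'rf_spend_to_channel'),
    'organic_reach': ('organic_reach_to_channel', 'organic_frequency_to_channel'),
}


def missing_mappings(provided_fields: set[str], provided_mappings: set[str]) -> list[str]:
    """Lists the mapping arguments that are required but were not supplied."""
    missing = []
    for field, required in REQUIRED_MAPPINGS.items():
        if field in provided_fields:
            # Keep the documented order of the required argument names.
            missing.extend(name for name in required if name not in provided_mappings)
    return missing
```

Calling `missing_mappings({'media'}, {'media_to_channel'})` would report that `media_spend_to_channel` is still needed.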
Example:
from meridian.data.load import CoordToColumns, DataFrameDataLoader

# df = [...]
coord_to_columns = CoordToColumns(
geo='dmas',
time='dates',
kpi='conversions',
revenue_per_kpi='revenue_per_conversions',
controls=['control_income'],
population='populations',
media=['impressions_tv', 'impressions_fb', 'impressions_search'],
media_spend=['spend_tv', 'spend_fb', 'spend_search'],
reach=['reach_yt'],
frequency=['frequency_yt'],
rf_spend=['rf_spend_yt'],
non_media_treatments=['price', 'discount'],
organic_media=['organic_impressions_blog'],
organic_reach=['organic_reach_newsletter'],
organic_frequency=['organic_frequency_newsletter'],
)
media_to_channel = {
'impressions_tv': 'tv',
'impressions_fb': 'fb',
'impressions_search': 'search',
}
media_spend_to_channel = {
'spend_tv': 'tv', 'spend_fb': 'fb', 'spend_search': 'search'
}
reach_to_channel = {'reach_yt': 'yt'}
frequency_to_channel = {'frequency_yt': 'yt'}
rf_spend_to_channel = {'rf_spend_yt': 'yt'}
organic_reach_to_channel = {'organic_reach_newsletter': 'newsletter'}
organic_frequency_to_channel = {'organic_frequency_newsletter': 'newsletter'}
data_loader = DataFrameDataLoader(
df=df,
coord_to_columns=coord_to_columns,
kpi_type='non-revenue',
media_to_channel=media_to_channel,
media_spend_to_channel=media_spend_to_channel,
reach_to_channel=reach_to_channel,
frequency_to_channel=frequency_to_channel,
rf_spend_to_channel=rf_spend_to_channel,
organic_reach_to_channel=organic_reach_to_channel,
organic_frequency_to_channel=organic_frequency_to_channel,
)
data = data_loader.load()
Methods
load
load() -> meridian.data.input_data.InputData
Reads data from a DataFrame and returns an InputData object.
__eq__
__eq__(
other
)
Returns whether self == value.