Monitoring Cleanup

1. Overview

If you have enabled monitoring, data will be saved to a BigQuery table for logging and retry purposes.

You may wish to periodically delete data from this table to reduce costs and to comply with data privacy regulations.

To do this, you can run the tcrm_monitoring_cleanup DAG.

2. Running Cleanup Manually

To run the cleanup DAG manually, navigate to the DAGs section of Airflow and click the play (Trigger DAG) button next to tcrm_monitoring_cleanup. By default, all data older than 50 days will be removed from the monitoring table.

3. Running Cleanup on a Schedule

Cleanup can also be set to run repeatedly on a schedule.

For information on how to do this, see section "3.3.2 Schedule a DAG" in the Installation Guide.

4. Running Cleanup After Another DAG

Cleanup can also be set to run automatically after another DAG finishes.

To enable this, create an Airflow variable (Admin -> Variables -> Create) whose name follows this format:

{dag_name}_enable_monitoring_cleanup (Example: tcrm_bq_to_ga_enable_monitoring_cleanup).

Set the value of this variable to 1 to run cleanup automatically after this DAG finishes, or set it to 0 to turn off automatic cleanup.

Note that there is no need to create this variable when running from monitoring_cleanup_dag, since this DAG will run the cleanup operation by definition.
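
The naming rule described in this section can be illustrated with a short Python sketch. The helper names cleanup_variable_name and cleanup_enabled are hypothetical and are not part of TCRM. Treating a missing or non-numeric value as "disabled" is an assumption made for this example.

```python
def cleanup_variable_name(dag_name: str) -> str:
    """Build the Airflow variable name for a DAG, following the
    {dag_name}_enable_monitoring_cleanup pattern."""
    return f"{dag_name}_enable_monitoring_cleanup"


def cleanup_enabled(raw_value) -> bool:
    """Interpret the variable value: 1 enables cleanup, 0 disables it.

    Assumption (not from TCRM): missing or non-numeric values are
    treated as disabled.
    """
    try:
        return int(raw_value) == 1
    except (TypeError, ValueError):
        return False


if __name__ == "__main__":
    name = cleanup_variable_name("tcrm_bq_to_ga")
    print(name, cleanup_enabled("1"))
```

If you prefer to create the variable from code rather than from the Airflow UI, Airflow's Variable.set(key, value) can store the value under the name built above.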

5. Customizing Days To Live

By default, the cleanup DAG will remove all data older than 50 days from the BigQuery monitoring table. To change how long data is kept before the cleanup operator deletes it, set the Airflow variable monitoring_data_days_to_live to the number of days you want data to live.
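
To make the effect of the days-to-live setting concrete, the following Python sketch computes the cutoff timestamp for a given value. It only illustrates the arithmetic. The function name cleanup_cutoff is hypothetical, the use of UTC is an assumption, and the actual cleanup operator may compute its cutoff differently.

```python
from datetime import datetime, timedelta, timezone

# Default stated in this guide; override via monitoring_data_days_to_live.
DEFAULT_DAYS_TO_LIVE = 50


def cleanup_cutoff(days_to_live=DEFAULT_DAYS_TO_LIVE, now=None):
    """Return the timestamp before which monitoring data counts as expired.

    Assumption: timestamps are compared in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=int(days_to_live))


if __name__ == "__main__":
    # With monitoring_data_days_to_live set to 30, data older than this
    # timestamp would be eligible for removal.
    print(cleanup_cutoff(30))
```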