End-user privacy is at the core of everything that Ads Data Hub does; it's the foundation that our platform is built upon. To help maintain that privacy and help our customers with regulatory compliance, we impose certain checks and restrictions, designed to help prevent data about individual users from being transmitted in the data that you get out of the platform. Here is an overview of the checks, with more detail in the sections below:
- Static checks. Static checks examine the statements in your queries to look for obvious and immediate privacy concerns, such as:
  - Exporting user identifiers, or any function of user identifiers.
  - Using blocklisted functions over fields that contain user-level data.
- Data access budget. Your data access budget limits the total number of times that you can access a given piece of data. Users approaching the end of their budget will be notified via the privacy message DATA_ACCESS_BUDGET_IS_NEARLY_EXHAUSTED. You may monitor the budget using the data access budget entry point or by observing budget notifications in the UI.
- Aggregation requirements. Aggregation requirements ensure that every row contains a large enough number of users to protect end-user privacy.
- Difference checks. Difference checks compare results from the job that you're running to your previous results, and also compare rows within the same result set. They are designed to help prevent you from gathering information about individual users by comparing data from multiple sets of users that meet our aggregation requirements. Difference check violations can be triggered by changes to your underlying data between two jobs.
When a result doesn't pass privacy checks, Ads Data Hub will display or return a privacy message informing you that rows were filtered. The filtered data can be anything from a single row to an entire result set. To ensure that your reporting totals remain accurate, use a filtered row summary to count data from dropped rows.
At the core of Ads Data Hub's privacy checks is the user aggregation threshold. For most queries, you can only receive reporting data on 50 or more users. However, queries that only access clicks and conversions can be used to report on 10 or more users. (Users with null IDs don't count towards this aggregation threshold.)
For example, a row for campaign 125 that aggregates results from 48 users would be filtered from the final results, because 48 is below the 50-user minimum. Filtered rows are those omitted from the results due to privacy restrictions.
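The threshold rule can be illustrated with a minimal Python sketch. This is not how Ads Data Hub is implemented; it only models the rule as described in the text. The `Row` type, the function names, and the data for campaigns 123 and 124 are hypothetical, and counting *distinct* non-null user IDs is an assumption of the sketch.

```python
from dataclasses import dataclass

# Thresholds taken from the text: 50 users for most queries,
# 10 users for queries that only access clicks and conversions.
DEFAULT_MIN_USERS = 50
CLICKS_CONVERSIONS_MIN_USERS = 10


@dataclass(frozen=True)
class Row:
    """Hypothetical result row: one campaign and the users aggregated into it."""
    campaign_id: int
    user_ids: tuple  # None models a null user ID


def counted_users(row):
    """Count distinct non-null user IDs (null IDs don't count toward the threshold)."""
    return len({uid for uid in row.user_ids if uid is not None})


def apply_aggregation_threshold(rows, clicks_and_conversions_only=False):
    """Split rows into (kept, filtered) according to the user minimum."""
    minimum = (CLICKS_CONVERSIONS_MIN_USERS if clicks_and_conversions_only
               else DEFAULT_MIN_USERS)
    kept = [r for r in rows if counted_users(r) >= minimum]
    filtered = [r for r in rows if counted_users(r) < minimum]
    return kept, filtered


# Campaign 125 with 48 users comes from the text; the other rows are invented.
rows = [
    Row(123, tuple(range(400))),
    Row(124, tuple(range(1000, 1250))),
    Row(125, tuple(range(2000, 2048)) + (None, None, None)),  # 48 users + 3 null IDs
]

kept, filtered = apply_aggregation_threshold(rows)
print([r.campaign_id for r in kept])      # [123, 124]
print([r.campaign_id for r in filtered])  # [125]
```

Note that campaign 125 has 51 entries but only 48 counted users, because the three null IDs are ignored. Under the 10-user threshold for clicks-and-conversions-only queries, the same row would be kept.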
Difference checks help ensure that users can't be identified through the comparison of multiple sufficiently aggregated results. When comparing a job's results to previous results, Ads Data Hub looks for vulnerabilities on the level of individual users. Because of this, even results from different campaigns, or results that report the same number of users, can be filtered if they have a large number of overlapping users.
On the other hand, two aggregated result sets may have the same number of users, and so appear identical, but not share any individual users. Such results are privacy-safe, so they wouldn't be filtered.
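To see why user overlap matters more than user counts, consider the following Python sketch. It is only an illustration of the idea, not Ads Data Hub's actual check: the function name `differing_users` and all user IDs are hypothetical.

```python
def differing_users(current, previous):
    """Number of users that appear in exactly one of the two results."""
    return len(set(current) ^ set(previous))


# Hypothetical user IDs.
first = set(range(0, 100))       # 100 users
disjoint = set(range(100, 200))  # also 100 users, none shared with `first`
one_more = first | {999}         # `first` plus a single extra user

# Same user count, no shared users: nothing about an individual leaks.
print(len(disjoint) == len(first), differing_users(disjoint, first))  # True 200

# Both results pass a 50-user threshold, but subtracting one from the
# other isolates a single user.
print(differing_users(one_more, first))  # 1
print(one_more - first)                  # {999}
```

The second pair is the case difference checks are meant to catch: each result is sufficiently aggregated on its own, yet comparing them reveals data about one user.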
Ads Data Hub uses data from your historical results when considering the vulnerability of a new result. This means that running the same query over and over again creates more data for difference checks to use when considering a new result's vulnerability. Additionally, the underlying data can change, leading to privacy check violations on queries thought to be stable.
When your job-level results differ adequately, but an individual row is similar to a row in a previous job, Ads Data Hub will filter the similar row. For example, if the row containing campaign 123 in a second result differs from the same row in the previous result by a single user, that row will be filtered.
If the sum of the users in all rows in a result set is similar to that from a previous job, Ads Data Hub will filter the entire result set. In that case, all results from the second job will be filtered.
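The row-level and job-level cases can be sketched in Python as follows. This is a simplified model under stated assumptions, not the real algorithm: `MIN_DIFFERING_USERS` is a hypothetical threshold (the text doesn't give one), the job-level comparison is modeled on the union of users across rows, identical results are treated as safe, and all data is invented.

```python
MIN_DIFFERING_USERS = 2  # hypothetical value, chosen only for this sketch


def too_similar(current, previous, min_diff=MIN_DIFFERING_USERS):
    """True if two user sets differ, but by fewer than min_diff users.

    Assumption: identical sets are treated as safe, since they reveal
    nothing new.
    """
    return 0 < len(current ^ previous) < min_diff


def difference_check(current, previous, min_diff=MIN_DIFFERING_USERS):
    """Return the rows of `current` that survive the check.

    Both arguments map a campaign ID to the set of users in that row.
    """
    all_current = set().union(*current.values())
    all_previous = set().union(*previous.values())
    if too_similar(all_current, all_previous, min_diff):
        return {}  # job-level: the entire result set is filtered
    # Row-level: drop any row that is too similar to a previous row.
    return {
        campaign: users
        for campaign, users in current.items()
        if not any(too_similar(users, prev, min_diff) for prev in previous.values())
    }


previous = {123: set(range(100)), 124: set(range(100, 300))}

# Campaign 123 gains a single user; campaign 124 gains 100 users.
second = {123: set(range(101)), 124: set(range(100, 400))}
print(sorted(difference_check(second, previous)))  # [124]

# The whole job differs from the previous job by a single user.
third = {123: set(range(100)), 124: set(range(100, 301))}
print(difference_check(third, previous))  # {}
```

In the first run the job-level totals differ by 100 users, so only the row for campaign 123 is dropped. In the second run the job as a whole differs by one user, so nothing is returned.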
Filtered row summary
Filtered row summaries tally data that was filtered due to privacy checks. Data from filtered rows is summed and added to a catch-all row. While the filtered data can't be further analyzed, it provides a summary of how much data was filtered from the results.
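As a rough illustration of the idea, the following Python sketch sums the data from filtered rows into a catch-all row. It is a conceptual model only, not the product's implementation: the row layout, the `summarize_filtered` function, the use of `None` as the catch-all campaign ID, and all figures are hypothetical.

```python
def summarize_filtered(rows, min_users=50):
    """Drop rows below min_users and add one catch-all row with their summed impressions."""
    kept = [r for r in rows if r["users"] >= min_users]
    filtered = [r for r in rows if r["users"] < min_users]
    if filtered:
        kept.append({
            "campaign_id": None,  # catch-all row for filtered data
            "impressions": sum(r["impressions"] for r in filtered),
        })
    return kept


# Hypothetical result rows before privacy filtering.
rows = [
    {"campaign_id": 123, "users": 400, "impressions": 1200},
    {"campaign_id": 124, "users": 250, "impressions": 800},
    {"campaign_id": 125, "users": 48, "impressions": 90},
    {"campaign_id": 126, "users": 12, "impressions": 30},
]

result = summarize_filtered(rows)
for row in result:
    print(row["campaign_id"], row["impressions"])

# The reporting total is preserved even though two rows were dropped.
print(sum(r["impressions"] for r in result))  # 2120
```

The catch-all row carries the 120 impressions from campaigns 125 and 126, so the total matches the unfiltered data, but those impressions can no longer be attributed to a specific campaign.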
If your SQL is valid but might trigger excessive filtering, the query advisor surfaces actionable advice during the query development process, to help you avoid undesirable results.
Triggers include the following patterns:
- Joining aggregated subqueries
- Joining unaggregated data with potentially different users
- Recursively defined temp tables
To use the query advisor:
- UI. Recommendations will be surfaced in the query editor, above the query text.
- API. Use the