Cookie matching

At a high level, cookie matching is the process by which an advertiser or vendor associates cookies in their domain with cookies in Google's domain. Matching these cookies lets you connect first-party data that you own with Google ad data (tracked via Google, DoubleClick, and YouTube IDs) on that same user, so that you can incorporate CRM data and better understand user behavior. By combining this data via privacy-centric joins, you can:

  • Target audiences based on specific items abandoned in shopping carts, if those users have interacted with your ads and domain.
  • Determine which ads lead to longer sessions on your domain.
  • Analyze purchase history joined to post-campaign data.

Limitations and end-user privacy

Though powerful, cookie matching comes with some limitations:

  • Joins between *_match and non-*_match tables are prohibited.
  • It requires engineering work from both you and Google.
  • It’s unlikely that you’ll be able to match all of your Google ad data. Match rates are subject to a number of factors, and vary by use-case and client-side setup. Match rates are often lower than users expect. Users are only eligible for cookie matching if they've interacted with your domain and your ads.
  • Google begins to populate your match tables when they’re set up. Depending on the frequency with which users visit your site and receive your matching pixel, it may take months before your match tables contain holistic, stable data on your users.
  • You won’t be able to associate individual users to multiple devices, unless those users are signed in, or you have some way of connecting users across devices.
  • You can’t match a single user using multiple cookies, as would be the case when a user clears their cookies.
  • Jobs that are run on match tables are subject to the same aggregation requirements as other jobs in Ads Data Hub. Because match rates and aggregation requirements compound, a low match rate combined with infrequent visits to your domain could make it difficult to obtain data (see footnote 1).
  • Consistent with Google’s policies on end-user privacy, you:
    • Are prohibited from matching a given user’s signed-in and signed-out data.
    • Can’t match signed-in data spanning multiple accounts.
    • Are unable to match data to users who have opted out of ads personalization.
  • For iOS events, you can only match data originating from apps on iOS 14.5+ from users who have granted permission under Apple's App Tracking Transparency framework.
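
The interaction between match rate and aggregation requirements mentioned in the list above comes down to simple arithmetic. The following is a minimal sketch, assuming the 50-user aggregation threshold used in this page's footnote example; the function name `users_needed_per_row` is hypothetical and is not part of any Ads Data Hub API.

```python
import math
from fractions import Fraction


def users_needed_per_row(match_rate: float, threshold: int = 50) -> int:
    """Estimate how many users a row needs before matching so that the
    matched subset still meets the aggregation threshold.

    threshold=50 is taken from the footnote example; treat it as an assumption.
    """
    if not 0 < match_rate <= 1:
        raise ValueError("match_rate must be greater than 0 and at most 1")
    # Convert to an exact fraction to avoid floating-point rounding surprises.
    rate = Fraction(match_rate).limit_denominator(1_000_000)
    return math.ceil(threshold / rate)


print(users_needed_per_row(0.2))  # 250, as in the footnote: 50 / 0.2 = 250
```

A lower match rate raises this number quickly, which is why infrequently visited domains may struggle to return rows.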

In order for Google to populate your match tables, you must serve a match tag on every page of your domain where you’re interested in matching advertising data. Where you place the pixel will depend on your advertising goals. For example, you may want to attempt to match every user who visits your domain (requiring pixels on nearly all pages), or you could match converting users (requiring pixels on a conversion page). Generally, a more widespread pixel will lead to higher match rates.

The match tag is a transparent 1x1 pixel, containing your cookie matching profile ID and an encoded user or cookie ID:

<img src="https://cm.g.doubleclick.net/pixel?google_nid=adh_customername&google_hm=Q29va2llIG51bWJlciAxIQ" />

This match tag is what initiates communication between you and the Google cookie matching services.
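
As an illustration of how the tag above is put together, here is a minimal server-side sketch in Python. It assumes the ID is sent as URL-safe Base64 with the trailing padding removed (as the sample tag and the encoding section of this page suggest); `build_match_tag` is a hypothetical helper name, and `adh_customername` is the placeholder profile ID from the sample tag.

```python
import base64
from urllib.parse import urlencode

PIXEL_ENDPOINT = "https://cm.g.doubleclick.net/pixel"


def build_match_tag(profile_id: str, raw_id: bytes) -> str:
    """Return the <img> match tag for a cookie matching profile ID and a raw ID."""
    # URL-safe Base64 encode the ID, then strip the '=' padding.
    encoded = base64.urlsafe_b64encode(raw_id).rstrip(b"=").decode("ascii")
    query = urlencode({"google_nid": profile_id, "google_hm": encoded})
    return f'<img src="{PIXEL_ENDPOINT}?{query}" />'


# The raw bytes b"Cookie number 1!" reproduce the sample tag shown above.
print(build_match_tag("adh_customername", b"Cookie number 1!"))
```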

Step-by-step overview

  1. A user visits a page with a match tag.
  2. The match tag initiates a series of redirects to DoubleClick, Google, and YouTube matching services. The requests contain that user’s ID or cookie from your website, plus the Google cookie in each of the matching service’s ID spaces.
  3. A transparent 1x1 pixel is returned to the browser to confirm that the request was fulfilled.

This process is shown in the following diagram:

[Diagram: a series of redirects between the browser and the matching services]

Setup

The process for setting up cookie matching in Ads Data Hub is as follows:

  1. Contact your account representative and communicate your interest in cookie matching. They will discuss your goals and give you more information on deploying the tracking pixel on your domain.
  2. Ads Data Hub specialists will initiate another conversation to discuss the technical requirements and use-cases.
  3. While you’re deploying the tracking pixel and error endpoint, Google will create your match tables.

Once you've completed these steps, no immediate action is required. Google will populate your match tables daily (see footnote 2), so you’ll need to allow enough time to pass before your tables contain enough data to provide meaningful matches and meet aggregation requirements. This depends on how frequently users visit your site; a site with daily visitors will reach this point much faster than one with monthly visitors. As the number of net-new matches slows down, your match tables will contain more comprehensive data.

Querying the match tables

After your match tables contain comprehensive data, you’re ready to run queries against them. Each table in the Ads Data Hub schema containing a user_id field is accompanied by a *_match table. For example, for the adh.google_ads_impressions table, Ads Data Hub also generates a match table called adh.google_ads_impressions_match containing your user IDs. These tables contain a subset of the rows available in the original tables, where there is a match on the user_id.

The match tables contain an additional column:

  • external_cookie: Your cookie, stored as bytes.

It’s important to consider the field’s type when writing your queries. SQL comparison operators expect that the literals you’re comparing are of the same type. Since you likely don't store your cookie as bytes, you may need to encode the data in order for comparisons to work. Additionally, string comparisons in SQL are sensitive to capitalization, so you may need to encode strings on both sides of your comparison to ensure that they can be accurately compared.

Encoding user IDs

Encode user IDs client-side

To ensure different ID formats can be safely transmitted via URL, all IDs need to be URL-safe Base64 encoded before being sent. The URL-safe Base64 decoded ID will be made available in Ads Data Hub in the external_cookie field, so you’ll need to undo any transformations you applied before encoding to derive your original ID.

If your ID is always 24 characters (or bytes) or less, you can include the URL-safe Base64 encoded ID in a pixel, as shown in example 1. If your ID is greater than 24 characters (or bytes) then you will need to transform it to a representation that's 24 bytes or less. In some cases (like the GUID in example 2) it's a matter of converting to the byte-representation. In other cases you may need to send Google a subset (or a hash) of your ID. Note that, in any case, you'll need to ensure that you can write a SQL JOIN which will convert the ID in your first-party table in the same way.
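
The length rule above can be sketched as follows. This is only an illustration under stated assumptions: it treats the ID as UTF-8 text, and for IDs longer than 24 bytes it uses a SHA-256 digest truncated to 24 bytes, which is one possible choice of "hash or subset" and not a method this page prescribes. The helper names `to_match_id` and `encode_for_pixel` are hypothetical.

```python
import base64
import hashlib

MAX_ID_BYTES = 24  # length limit described above


def to_match_id(raw_id: str) -> bytes:
    """Return a representation of raw_id that is at most 24 bytes long."""
    data = raw_id.encode("utf-8")
    if len(data) <= MAX_ID_BYTES:
        return data  # short enough to send as-is
    # Assumption: truncated SHA-256 as the deterministic transformation.
    return hashlib.sha256(data).digest()[:MAX_ID_BYTES]


def encode_for_pixel(match_id: bytes) -> str:
    """URL-safe Base64 encode the bytes and strip the '=' padding."""
    return base64.urlsafe_b64encode(match_id).rstrip(b"=").decode("ascii")


print(encode_for_pixel(to_match_id("abcdef123456789")))
print(len(to_match_id("a-much-longer-identifier-that-exceeds-the-limit")))  # 24
```

Whichever transformation you pick, you would need to apply the same hash and truncation to the ID column of your first-party table in your SQL JOIN.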

Example 1

Your user ID value is always under the 24-byte length limit. Ads Data Hub recommends that you simply send your user ID directly to Ads Data Hub (after encoding it as URL-safe Base64 for URL transport purposes).

var userId = 'abcdef123456789';
// Encode the string (or number) in normal base64.
var userIdBase64 = btoa(userId);

// Ensure that the uploaded user IDs use web-safe Base64 encoding.
userIdBase64 = userIdBase64.replace(/\+/g, '-').replace(/\//g, '_')
    .replace(/=+$/, '');

// After encoding the user ID correctly, you can create the request tag and
// insert it into the DOM.
var imgElement = document.createElement('img');
imgElement.src =
    'https://cm.g.doubleclick.net/pixel?google_nid=adh_customername&google_hm='
    + userIdBase64;
document.body.appendChild(imgElement);

Example 2

You assign a universal unique identifier (UUID) value as a user ID, such as: 123e4567-e89b-12d3-a456-426655440000.

Ads Data Hub recommends the following transformations when matching:

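  1. Start with the UUID formatted as a 36-character string.
  2. Remove the dashes and hexadecimal-decode the UUID.
  3. The UUID is now formatted as 16 bytes.
  4. URL-safe Base64 encode the bytes.
  5. The UUID is now formatted as a URL-safe string that can be sent in the match tag.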

This could be implemented with the following code:

JavaScript

var userId = '123e4567-e89b-12d3-a456-426655440000';

// A helper function for converting a hex string to a byte array.
function strToBytes(str) {
        for (var bytes = [], i = 0; i < str.length; i += 2) {
          bytes.push(parseInt(str.substr(i, 2), 16));
        }
        return bytes;
}

// Remove the formatting dashes from the UUID.
userId = userId.replace(/-/g, '');

// Encode the hex string as a byte array.
var userIdBytes = strToBytes(userId);

// Encode the byte array in normal base64.
var userIdBase64 = btoa(String.fromCharCode(...new Uint8Array(userIdBytes)));

// Ensure that the uploaded user IDs use web-safe Base64 encoding.
userIdBase64 = userIdBase64.replace(/\+/g, '-').replace(/\//g, '_').replace(
    /=+$/, '');

// After encoding the UUID correctly, you can create the request tag and
// insert it into the DOM.
var imgElement = document.createElement('img');
imgElement.src =
    'https://cm.g.doubleclick.net/pixel?google_nid=adh_customername&google_hm='
    + userIdBase64;
document.body.appendChild(imgElement);

Python

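import base64

user_id = '123e4567-e89b-12d3-a456-426655440000'
# Remove the formatting dashes and hex-decode the UUID into 16 raw bytes.
user_id_as_bytes = bytes.fromhex(user_id.replace('-', ''))
# URL-safe Base64 encode, then strip the '=' padding as the JavaScript version does.
user_id_base64 = base64.urlsafe_b64encode(user_id_as_bytes).rstrip(b'=').decode('ascii')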

If there is a match with a Google user ID, the external_cookie field contains your ID as a byte value. To reconstruct your original ID, the following transformation is required:

  1. external_cookie is formatted as bytes.
  2. Hexadecimal encode external_cookie.
  3. external_cookie is formatted as a string.
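
The reverse transformation above can be sketched in Python as follows. It assumes the ID was sent as the 16 raw bytes of a UUID, as in example 2; `external_cookie_to_uuid` is a hypothetical helper name.

```python
import uuid


def external_cookie_to_uuid(external_cookie: bytes) -> str:
    """Rebuild the dashed UUID string from the bytes in external_cookie."""
    # Hexadecimal-encode the bytes to get the 32-character hex form.
    hex_text = external_cookie.hex()
    # uuid.UUID re-inserts the dashes; it raises ValueError if the input
    # is not exactly 32 hex digits.
    return str(uuid.UUID(hex=hex_text))


cookie = bytes.fromhex("123e4567e89b12d3a456426655440000")
print(external_cookie_to_uuid(cookie))  # 123e4567-e89b-12d3-a456-426655440000
```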

Encode user IDs in Ads Data Hub

If you store the UUID string in a field in your first-party data, you’ll need to convert it to bytes, as in the above example, in order to successfully join your data.

The following example shows how to encode your UUID and join it on the external cookie field:

JOIN my_data ON imp.external_cookie = FROM_HEX(REPLACE(my_data.uuid, '-', ''))

Note that you cannot cast an integer to bytes. If your user ID is an integer (as in example 1 above), you would need to first cast as a string:

JOIN my_data ON imp.external_cookie = CAST(CAST(my_data.user_id AS STRING) AS BYTES)

Remember that the encoding required to match your data will be specific to how you store it, and how you encoded it before sending it to Ads Data Hub.
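
Before deploying, it can help to check offline that the bytes you expect to land in external_cookie equal the bytes your JOIN expression produces. The sketch below is an assumption-laden illustration, not an Ads Data Hub tool: it mirrors the two SQL expressions above in Python, and all function names are hypothetical.

```python
import base64


def decode_google_hm(value: str) -> bytes:
    """Undo the URL-safe Base64 encoding, restoring any stripped '=' padding."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)


def join_key_from_string(user_id: str) -> bytes:
    """Mirror CAST(user_id AS BYTES) on a string: the UTF-8 bytes of the text."""
    return user_id.encode("utf-8")


def join_key_from_uuid(uuid_text: str) -> bytes:
    """Mirror FROM_HEX(REPLACE(uuid, '-', ''))."""
    return bytes.fromhex(uuid_text.replace("-", ""))


# Example 1: the Base64 value sent for 'abcdef123456789'.
print(decode_google_hm("YWJjZGVmMTIzNDU2Nzg5") == join_key_from_string("abcdef123456789"))
```

If the comparison is False for a sample of your own IDs, the client-side encoding and the SQL-side conversion are out of sync.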

Learn more about string functions in BigQuery SQL.

Sample query

The following example joins first-party data with adh.google_ads_impressions_match, then joins these results with adh.google_ads_impressions in a second query.

SELECT
  imp.campaign_id as campaign_id,
  sum(my_data.recent_orders) as orders,
  AVG(my_data.lifetime_value) as ltv
FROM
  adh.google_ads_impressions_match as imp
LEFT JOIN
  my_data ON imp.external_cookie = my_data.company_guest_id_bytes
GROUP BY
  campaign_id

With the previous query's results saved as previous_results, you can now join google_ads_impressions. This adds campaigns with no matched impressions to your results, with their orders and ltv values set to 0.

SELECT
  campaign_id,
  COALESCE(orders, 0) as orders,
  COALESCE(ltv, 0) as ltv
FROM (SELECT DISTINCT campaign_id
   FROM adh.google_ads_impressions)
LEFT JOIN previous_results USING (campaign_id)
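
To make the logic of the two queries easier to follow, here is a small in-memory sketch in Python. The rows are made-up toy data standing in for the tables, the function names are hypothetical, and the sketch ignores the aggregation checks that Ads Data Hub applies to real jobs.

```python
from collections import defaultdict

# Hypothetical toy rows standing in for the tables in the queries above.
impressions_match = [(1, b"a"), (1, b"b"), (2, b"c")]  # (campaign_id, external_cookie)
impressions = [1, 2, 3]                                # campaign_id of each impression
my_data = {b"a": (2, 100.0), b"b": (1, 50.0)}          # cookie -> (recent_orders, lifetime_value)


def first_query():
    """LEFT JOIN the match rows to my_data, then aggregate per campaign."""
    grouped = defaultdict(list)
    for campaign_id, cookie in impressions_match:
        grouped[campaign_id].append(my_data.get(cookie))  # None if no first-party row
    results = {}
    for campaign_id, rows in grouped.items():
        found = [row for row in rows if row is not None]  # SUM and AVG skip NULLs
        orders = sum(row[0] for row in found) if found else None
        ltv = sum(row[1] for row in found) / len(found) if found else None
        results[campaign_id] = (orders, ltv)
    return results


def second_query(previous_results):
    """LEFT JOIN every campaign to the earlier results, replacing NULL with 0."""
    output = {}
    for campaign_id in sorted(set(impressions)):
        orders, ltv = previous_results.get(campaign_id, (None, None))
        output[campaign_id] = (0 if orders is None else orders, 0 if ltv is None else ltv)
    return output


print(second_query(first_query()))  # {1: (3, 75.0), 2: (0, 0), 3: (0, 0)}
```

Campaign 3 appears only in the second step because it has impressions but no rows in the match table.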

  1. Example: a 20% match rate effectively means that you need 250 users per row to meet the 50-user aggregation threshold, as 50 / 0.2 = 250.

  2. Matches made on a particular day may experience a delay of up to 48 hours before they appear in your tables.