Decrypting Advertiser Identifiers for Ad Networks

Ad Networks that use JavaScript tags to fill ads through Authorized Buyers are eligible to receive advertiser identifiers for both Android and iOS devices. The information is sent through the %%EXTRA_TAG_DATA%% or %%ADVERTISING_IDENTIFIER%% macro in the JavaScript tag managed by Authorized Buyers. The rest of this section focuses on extracting %%EXTRA_TAG_DATA%%. For the %%ADVERTISING_IDENTIFIER%% macro, see Remarketing with IDFA or Advertising ID, which describes the encrypted MobileAdvertisingId protocol buffer that can be decrypted analogously.

Timeline

The Ad Network updates their JavaScript in-app tags through the Authorized Buyers UI, adding in the %%EXTRA_TAG_DATA%% macro as explained below.
At serving time, the app requests an ad from Authorized Buyers through the Google Mobile Ads SDK, while securely passing the advertiser identifier.
The app receives back the JavaScript tag, with the %%EXTRA_TAG_DATA%% macro filled in with the encrypted Ad Network protocol buffer containing that identifier.
The app runs this tag, making a call to the Ad Network for the winning ad.
In order to use (monetize) this information, the Ad Network must process the protocol buffer:
1. Decode the websafe string back into a bytestring with WebSafeBase64.
2. Decrypt it using the scheme outlined below.
3. Deserialize the protocol buffer and obtain the advertiser ID from ExtraTagData.advertising_id or ExtraTagData.hashed_idfa.

Dependencies

The WebSafeBase64 encoder.
A crypto library that supports SHA-1 HMAC, such as OpenSSL.
The Google protocol buffer compiler.

Decode websafe string

Because the information sent through the %%EXTRA_TAG_DATA%% macro must be sent through a URL, Google servers encode it with web-safe base64 (RFC 3548).

Before attempting decryption, you must therefore decode the ASCII characters back into a bytestring. The sample C++ code below is based on the OpenSSL Project's BIO_f_base64() and is part of Google's sample decryption code.

string AddPadding(const string& b64_string) {
  if (b64_string.size() % 4 == 3) {
    return b64_string + "=";
  } else if (b64_string.size() % 4 == 2) {
    return b64_string + "==";
  }
  return b64_string;
}

// Adapted from http://www.openssl.org/docs/man1.1.0/crypto/BIO_f_base64.html
// Takes a web safe base64 encoded string (RFC 3548) and decodes it.
// Normally, web safe base64 strings have padding '=' replaced with '.',
// but we will not pad the ciphertext. We add padding here because
// openssl has trouble with unpadded strings.
string B64Decode(const string& encoded) {
  string padded = AddPadding(encoded);
  // convert from web safe -> normal base64.
  int32 index = -1;
  while ((index = padded.find_first_of('-', index + 1)) != string::npos) {
    padded[index] = '+';
  }
  index = -1;
  while ((index = padded.find_first_of('_', index + 1)) != string::npos) {
    padded[index] = '/';
  }

  // base64 decode using openssl library.
  const int32 kOutputBufferSize = 256;
  char output[kOutputBufferSize];

  BIO* b64 = BIO_new(BIO_f_base64());
  BIO_set_flags(b64, BIO_FLAGS_BASE64_NO_NL);
  BIO* bio = BIO_new_mem_buf(const_cast<char*>(padded.data()),
                             padded.length());
  bio = BIO_push(b64, bio);
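  int32 out_length = BIO_read(bio, output, kOutputBufferSize);
  BIO_free_all(bio);
  // BIO_read returns a value <= 0 on failure; never build a string from a
  // negative length.
  if (out_length <= 0) {
    return string();
  }
  return string(output, out_length);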
}
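
For environments where linking OpenSSL only for base64 is inconvenient, the same decoding step can be sketched with the standard library alone. This is a minimal illustration, not part of Google's sample code: the names DecodeChar and WebSafeB64Decode are hypothetical, and the function simply stops at the first padding character ('=' or '.') rather than validating padding.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Maps one base64 character to its 6-bit value, or -1 if it is invalid.
// Accepts the web-safe alphabet ('-', '_') and the standard one ('+', '/').
int DecodeChar(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '-' || c == '+') return 62;
  if (c == '_' || c == '/') return 63;
  return -1;
}

// Decodes a web-safe base64 string of any length, padded or unpadded.
// Returns false if an invalid character is found.
bool WebSafeB64Decode(const std::string& encoded, std::string* decoded) {
  decoded->clear();
  uint32_t buffer = 0;  // Accumulates 6-bit groups until a byte is complete.
  int bits = 0;         // Number of not-yet-emitted bits held in buffer.
  for (char c : encoded) {
    if (c == '=' || c == '.') break;  // Padding marks the end of the data.
    const int value = DecodeChar(c);
    if (value < 0) return false;
    buffer = (buffer << 6) | static_cast<uint32_t>(value);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      decoded->push_back(static_cast<char>((buffer >> bits) & 0xFF));
    }
  }
  return true;
}
```

Unlike the fixed 256-byte output buffer in B64Decode, this sketch grows the output string as needed, so it has no upper limit on the message size.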

Structure of encrypted bytestring

Once you've decoded the ASCII characters back into a bytestring, you're ready to decrypt it. The encrypted bytestring contains 3 sections:

initialization_vector: 16 bytes.
ciphertext: series of 20-byte sections.
integrity_signature: 4 bytes.

{initialization_vector (16 bytes)}{ciphertext (20-byte sections)}{integrity_signature (4 bytes)}
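
As an illustration of this layout, the decoded bytestring can be split into its three parts with a few substr() calls. The following is a sketch only; MessageParts, SplitFinalMessage, kIvSize, and kSigSize are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sizes taken from the layout described above.
constexpr std::size_t kIvSize = 16;  // initialization_vector
constexpr std::size_t kSigSize = 4;  // integrity_signature

struct MessageParts {
  std::string initialization_vector;
  std::string ciphertext;
  std::string integrity_signature;
};

// Splits the decoded message into IV, ciphertext, and signature.
// Returns false if the message is too short to hold the IV and signature.
bool SplitFinalMessage(const std::string& final_message, MessageParts* parts) {
  if (final_message.size() < kIvSize + kSigSize) return false;
  const std::size_t cipher_len = final_message.size() - kIvSize - kSigSize;
  parts->initialization_vector = final_message.substr(0, kIvSize);
  parts->ciphertext = final_message.substr(kIvSize, cipher_len);
  parts->integrity_signature = final_message.substr(kIvSize + cipher_len);
  return true;
}
```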

The ciphertext byte array is divided into multiple 20-byte sections, with the exception that the very last section may contain between 1 and 20 bytes inclusive. For each section of the original byte_array, the corresponding 20-byte ciphertext is generated as:

<byte_array <xor> HMAC(encryption_key, initialization_vector || counter_bytes)>

where || is concatenation.

Definitions

Variable	Details
`initialization_vector`	16 bytes - unique to the impression.
`encryption_key`	32 bytes - provided at account setup.
`integrity_key`	32 bytes - provided at account setup.
`byte_array`	A serialized `ExtraTagData` object, in 20-byte sections.
`counter_bytes`	Byte value showing the ordinal number of the section, see below.
`final_message`	Total byte array sent through the `%%EXTRA_TAG_DATA%%` macro (minus WebSafeBase64 encoding).

Operators	Details
`hmac(key, data)`	SHA-1 HMAC of `data`, keyed with `key`.
`a || b`	string `a` concatenated with string `b`.

Calculate counter_bytes

counter_bytes marks the order of each 20-byte section of the ciphertext. Note that the last section may contain between 1 and 20 bytes inclusive. To fill counter_bytes with the correct value when running your hmac() function, count the 20-byte sections (including the remainder) and use the following reference table:

Section number	`counter_bytes` value
0	None
1 … 256	1 byte. The value increments from 0 to 255 sequentially.
257 … 512	2 bytes. The value of the first byte is 0, the value of the second byte increments from 0 to 255 sequentially.
513 … 768	3 bytes. The value of the first two bytes is 0, the value of the last byte increments from 0 to 255 sequentially.
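
The table can be expressed as a small helper that returns the counter_bytes for a given section number. This is a sketch under the assumption that the pattern in the table continues beyond section 768 (one more leading zero byte for every further 256 sections); CounterBytes is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns counter_bytes for a section number, following the reference table:
// section 0 has no counter, sections 1..256 use one byte (0..255),
// sections 257..512 use a zero byte followed by one byte (0..255), and so on.
std::string CounterBytes(std::size_t section_number) {
  if (section_number == 0) return std::string();
  const std::size_t n = section_number - 1;
  std::string counter(n / 256, '\0');             // Leading zero bytes.
  counter.push_back(static_cast<char>(n % 256));  // Incrementing last byte.
  return counter;
}
```

For example, CounterBytes(257) yields the two bytes 0x00 0x00, so the pad for that section is computed over the initialization vector followed by two zero bytes.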

Encryption scheme

The encryption scheme is based on the same scheme used for decrypting the hyperlocal targeting signal.

Serialization: An instance of the ExtraTagData object as defined in the protocol buffer is first serialized through SerializeAsString() to a byte array.
Encryption: The byte array is then encrypted using a custom encryption scheme designed to minimize size overhead while ensuring adequate security. The encryption scheme uses a keyed HMAC algorithm to generate a secret pad based on the initialization_vector, which is unique to the impression event.

Encryption pseudocode

byte_array = SerializeAsString(ExtraTagData object)
pad = hmac(encryption_key, initialization_vector ||
      counter_bytes )  // for each 20-byte section of byte_array
ciphertext = pad <xor> byte_array // for each 20-byte section of byte_array
integrity_signature = hmac(integrity_key, byte_array ||
                      initialization_vector)  // first 4 bytes
final_message = initialization_vector || ciphertext || integrity_signature

Decryption scheme

Your decryption code must 1) decrypt the protocol buffer using the encryption key, and 2) verify the integrity bits with the integrity key. The keys will be provided to you during account setup. There aren't any restrictions on how you structure your implementation. For the most part, you should be able to take the sample code and adapt it according to your needs.

Generate your pad: HMAC(encryption_key, initialization_vector || counter_bytes)
XOR: Take this result and <xor> with the ciphertext to reverse the encryption.
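Verify: The integrity signature is the first 4 bytes of HMAC(integrity_key, byte_array || initialization_vector). Recompute it from the decrypted byte_array and check that it matches the signature in the message.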

Decryption pseudocode

// split up according to length rules
(initialization_vector, ciphertext, integrity_signature) = final_message

// for each 20-byte section of ciphertext
pad = hmac(encryption_key, initialization_vector || counter_bytes)

// for each 20-byte section of ciphertext
byte_array = ciphertext <xor> pad

confirmation_signature = hmac(integrity_key, byte_array ||
                         initialization_vector)  // first 4 bytes
success = (confirmation_signature == integrity_signature)
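
The pseudocode can be turned into a compact C++ sketch in which the HMAC is passed in as a callback, which keeps the control flow separate from the crypto library. HmacFn, XorWithPad, and Decrypt are hypothetical names; the callback is assumed to return the 20-byte SHA-1 HMAC of its data argument (for instance by wrapping OpenSSL's HMAC() with EVP_sha1()).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Assumed callback: returns the 20-byte SHA-1 HMAC of `data` under `key`.
using HmacFn =
    std::function<std::string(const std::string& key, const std::string& data)>;

constexpr std::size_t kSectionSize = 20;   // HMAC-SHA1 output size.
constexpr std::size_t kSignatureSize = 4;  // Truncated integrity signature.

// XORs `input` with the pad stream. XOR is symmetric, so the same routine
// turns ciphertext into byte_array and byte_array into ciphertext.
std::string XorWithPad(const std::string& input, const std::string& encryption_key,
                       const std::string& initialization_vector,
                       const HmacFn& hmac) {
  std::string output(input.size(), '\0');
  std::string counter_bytes;  // Empty for section 0.
  for (std::size_t offset = 0; offset < input.size(); offset += kSectionSize) {
    const std::string pad =
        hmac(encryption_key, initialization_vector + counter_bytes);
    assert(pad.size() >= kSectionSize);
    for (std::size_t i = 0; i < kSectionSize && offset + i < input.size(); ++i) {
      output[offset + i] = static_cast<char>(input[offset + i] ^ pad[i]);
    }
    // Advance the counter: append a zero byte at the start and whenever the
    // last byte would wrap past 255; otherwise increment the last byte.
    if (counter_bytes.empty()) {
      counter_bytes.push_back('\0');
    } else if (static_cast<unsigned char>(counter_bytes.back()) == 0xFF) {
      counter_bytes.back() = '\0';
      counter_bytes.push_back('\0');
    } else {
      ++counter_bytes.back();
    }
  }
  return output;
}

// Decrypts the ciphertext and checks the truncated integrity signature.
bool Decrypt(const std::string& initialization_vector,
             const std::string& ciphertext,
             const std::string& integrity_signature,
             const std::string& encryption_key,
             const std::string& integrity_key, const HmacFn& hmac,
             std::string* byte_array) {
  *byte_array =
      XorWithPad(ciphertext, encryption_key, initialization_vector, hmac);
  const std::string confirmation_signature =
      hmac(integrity_key, *byte_array + initialization_vector)
          .substr(0, kSignatureSize);
  return confirmation_signature == integrity_signature;
}
```

Passing the HMAC as a parameter also makes it easy to exercise the sectioning and counter logic in unit tests with a stand-in function before wiring in the real SHA-1 HMAC.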

Sample C++ code

Included here is a key function from our complete decryption example code.

bool DecryptByteArray(
    const string& ciphertext, const string& encryption_key,
    const string& integrity_key, string* cleartext) {
  // Step 1. find the length of initialization vector and clear text.
  const int cleartext_length =
     ciphertext.size() - kInitializationVectorSize - kSignatureSize;
  if (cleartext_length < 0) {
    // The length cannot be correct.
    return false;
  }

  string iv(ciphertext, 0, kInitializationVectorSize);

  // Step 2. recover clear text
  cleartext->resize(cleartext_length, '\0');
  const char* ciphertext_begin = string_as_array(ciphertext) + iv.size();
  const char* const ciphertext_end = ciphertext_begin + cleartext->size();
  string::iterator cleartext_begin = cleartext->begin();

  bool add_iv_counter_byte = true;
  while (ciphertext_begin < ciphertext_end) {
    uint32 pad_size = kHashOutputSize;
    uchar encryption_pad[kHashOutputSize];

    if (!HMAC(EVP_sha1(), string_as_array(encryption_key),
              encryption_key.length(), (uchar*)string_as_array(iv),
              iv.size(), encryption_pad, &pad_size)) {
      printf("Error: encryption HMAC failed.\n");
      return false;
    }

    for (int i = 0;
         i < kBlockSize && ciphertext_begin < ciphertext_end;
         ++i, ++cleartext_begin, ++ciphertext_begin) {
      *cleartext_begin = *ciphertext_begin ^ encryption_pad[i];
    }

    if (!add_iv_counter_byte) {
      char& last_byte = *iv.rbegin();
      ++last_byte;
      if (last_byte == '\0') {
        add_iv_counter_byte = true;
      }
    }

    if (add_iv_counter_byte) {
      add_iv_counter_byte = false;
      iv.push_back('\0');
    }
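  }

  // Step 3 (sketch following the decryption pseudocode above): verify the
  // integrity signature, i.e. the first kSignatureSize bytes of
  // HMAC(integrity_key, cleartext || initialization_vector). The local iv may
  // have had counter bytes appended in step 2, so the original
  // initialization vector is re-read from the input.
  const string signed_message =
      *cleartext + string(ciphertext, 0, kInitializationVectorSize);
  uint32 signature_size = kHashOutputSize;
  uchar signature[kHashOutputSize];
  if (!HMAC(EVP_sha1(), string_as_array(integrity_key),
            integrity_key.length(),
            (uchar*)string_as_array(signed_message),
            signed_message.size(), signature, &signature_size)) {
    printf("Error: signature HMAC failed.\n");
    return false;
  }
  // ciphertext_end points at the 4-byte signature that follows the ciphertext.
  return memcmp(ciphertext_end, signature, kSignatureSize) == 0;
}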

Get data from Ad Network protocol buffer

Once you have decoded and decrypted the data passed in %%EXTRA_TAG_DATA%%, you're ready to deserialize the protocol buffer and get the advertiser identifier for targeting.

If you're unfamiliar with protocol buffers, start with our documentation.

Definition

Our Ad Network protocol buffer is defined like this:

message ExtraTagData {
  // advertising_id can be Apple's identifier for advertising (IDFA)
  // or Android's advertising identifier. When the advertising_id is an IDFA,
  // it is the plaintext returned by iOS's [ASIdentifierManager
  // advertisingIdentifier]. For hashed_idfa, the plaintext is the MD5 hash of
  // the IDFA.  Only one of the two fields will be available, depending on the
  // version of the SDK making the request.  Later SDKs provide unhashed values.
  optional bytes advertising_id = 1;
  optional bytes hashed_idfa = 2;
}

You'll need to deserialize it using ParseFromString() as described in the C++ protocol buffer documentation.

For details on the Android advertising_id and iOS hashed_idfa fields, see Decrypt Advertising ID and Targeting mobile app inventory with IDFA.
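
To show what the decrypted bytes look like on the wire, the sketch below reads the two ExtraTagData fields by hand. It is an illustration of the protocol buffer encoding only, not a replacement for the generated class and ParseFromString(): ExtraTagDataFields, ReadVarint, and ParseExtraTagData are hypothetical names, and the parser deliberately rejects any field that is not length-delimited.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct ExtraTagDataFields {
  bool has_advertising_id = false;
  bool has_hashed_idfa = false;
  std::string advertising_id;
  std::string hashed_idfa;
};

// Reads a base-128 varint starting at *pos and advances *pos past it.
bool ReadVarint(const std::string& buf, std::size_t* pos, uint64_t* value) {
  *value = 0;
  for (int shift = 0; shift < 64 && *pos < buf.size(); shift += 7) {
    const uint8_t byte = static_cast<uint8_t>(buf[(*pos)++]);
    *value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;  // High bit clear: last byte.
  }
  return false;  // Truncated or overlong varint.
}

// Extracts field 1 (advertising_id) and field 2 (hashed_idfa) from a
// serialized ExtraTagData message.
bool ParseExtraTagData(const std::string& buf, ExtraTagDataFields* out) {
  std::size_t pos = 0;
  while (pos < buf.size()) {
    uint64_t tag = 0;
    uint64_t length = 0;
    if (!ReadVarint(buf, &pos, &tag)) return false;
    // Both fields are `bytes`, which use wire type 2 (length-delimited).
    // Other wire types are not handled in this sketch.
    if ((tag & 0x7) != 2) return false;
    if (!ReadVarint(buf, &pos, &length)) return false;
    if (length > buf.size() - pos) return false;  // Payload is truncated.
    const std::string payload = buf.substr(pos, static_cast<std::size_t>(length));
    pos += static_cast<std::size_t>(length);
    switch (tag >> 3) {  // Upper bits of the tag hold the field number.
      case 1:
        out->advertising_id = payload;
        out->has_advertising_id = true;
        break;
      case 2:
        out->hashed_idfa = payload;
        out->has_hashed_idfa = true;
        break;
      default:
        break;  // Skip unknown length-delimited fields.
    }
  }
  return true;
}
```

Because only one of the two fields is expected to be present, callers would check has_advertising_id and has_hashed_idfa before using either value.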

Java library

Instead of implementing the crypto algorithms to encode and decode the Advertiser Identifiers for ad networks, you can use DoubleClickCrypto.java. For more information, see Cryptography.