Larger-scale Link Checker - Single Account

This script is for a single account. For operating on multiple accounts in a Manager Account, use the Manager Account version of the script.

As a website evolves, new pages get added, old pages are taken down, and links get broken and fixed. Keeping an AdWords campaign in sync with the website is an ongoing battle for many advertisers. Live advertisements may be pointing to non-existent pages, and the advertiser ends up paying for clicks that yield 404 errors.

The link checker solution is an AdWords script for validating the existence of landing pages. However, for large accounts the solution has the following issues:

  • Large numbers of URLs can exceed the daily quotas for fetching URLs.
  • Only one URL can be checked at a time, and the script has an execution time limit. This limits the rate at which URLs can be checked.

The Larger-scale link checker addresses these issues by using Google Cloud infrastructure to assist with the link checking. The task is divided as follows:

  • The AdWords script collates the list of URLs, passes them to App Engine, running on Google Cloud infrastructure, then periodically checks on progress by means of script scheduling.
  • The App Engine instance performs the link checking, and is able to check links in parallel, as well as having considerably higher quotas than AdWords scripts for performing URL fetches.

Tip: Free App Engine instances are able to fetch a much greater number of URLs per day than AdWords Scripts, but are still subject to a quota. However, this quota can be increased much further still by enabling billing.

Configuration options

The script's main options can be set in the spreadsheet.

  • Scope: Select whether the script will check ads, keywords, and/or sitelinks, and whether the script will check them even if they are paused. Most users should include all three to ensure all of their URLs are checked, but it is typically not necessary to check paused entities.

    Tip: When working with large accounts, set this to a minimal settings, e.g. just sitelinks. Once you have confirmed your setup is working you can then adjust to include keywords and ads as necessary.

  • Failure strings: An optional list of strings which should also be considered as a "failure". For example, a page may exist yet contain the text "Out of stock". You may wish to highlight landing pages that point to products that are out of stock, for example.

    Note: When failure strings are specified, the link checker must request the full text of a page (GET request), as opposed to merely checking its existence (HEAD request). This will use up more of the App Engine quota, as well as taking longer.

  • Email at start of analysis: If you enable this option, the script will email you when each new analysis starts.
  • Email after finishing entire analysis: If you enable this option, the script will email you a consolidated summary after it finishes checking every URL.
  • Email even if no errors are found: If you enable this option, the script will email you (based on the two options above) even if it did not find any errors. Most users prefer to be emailed only when there are errors, but receiving an email even if there are no errors can be useful way to ensure the script is running as scheduled.
  • Days between analyses: Use this option to control how often the script starts a fresh analysis of all of your URLs.

Scheduling

Each time the script runs, it automatically detects whether it should resume an analysis already in progress or whether the last analysis finished and it is time to start a fresh one (based on the Days between analyses option). As a result, regardless of how often you want to launch a fresh analysis, schedule the script to run Hourly.

How it works

The script uses AdWords Reports to build up a list of URLs to check. These are then passed to the App Engine application, which performs the hard work of checking each URL.

The script polls the App Engine application each time it runs to determine whether there are any results to collect. Once all URLs have been checked by the App Engine application, the script writes the results back to the Results tab.

Frequently asked questions

Q: Will this script work in an AdWords manager account?
A: No, this version will not. Instead, use the dedicated manager account version.
Q: Can the script write successful URLs to the results too?
A: No, as this solution is designed to work with a large number of URLs, if a row is created for each URL, regardless of success or failure, this may hit the limit for Spreadsheet rows. Working on the assumption that the majority of landing pages will not return an error, the solution only creates rows for those that result in failure. However, a count of the total URLs checked is included.

Configuration

Configuration requires two steps:

  1. Setting up the Google Cloud project for the App Engine instance.
  2. Configuring the link checker options.
A Google Sheets document is used to perform both steps. To create a copy:
  • Open the template Sheets document.
  • Make a copy by clicking File > Make a copy.
    • Don't forget to update CONFIG_SPREADSHEET_URL in code.
    • Schedule the script Hourly.

Cloud setup

Setting up the Cloud aspects of the solution takes a few minutes, but only needs to be done once:

  1. Select the Cloud setup tab.
  2. Follow the steps in each box, in turn. It make take a minute or two for each step to take effect, so if an error is returned from one, wait a minute or so, before trying again.
Once all steps on the tab are complete, the Cloud setup is complete and you are ready to configure the link checker options as described above.

Cloud performance settings

The larger-scale link checker is able to check URLs both:

  • In parallel, using multiple tasks at the same time
  • At a configurable rate, e.g. the number of URLs per minute checked by each task
In some cases, it may be necessary to constrain the level of parallelization and/or the rate of URLs. For example, if all the landing pages from an account reside on one server, using a large number of parallel tasks and a high rate could result in triggering detection of denial of service attack.

As a consequence, the checker may then be blocked temporarily from checking further pages, and the results may be inaccurate. To configure parallelization and checking rate, use the App Engine Performance tab in your configuration spreadsheet:

  1. Set Max number of concurrent URL checking tasks to the degree of parallelization desired.
  2. Set Max number of URLs to check per minute, in each task to the maximum number of URLs each parallel task can check per minute.
  3. Set User Agent to a custom user agent, if you wish to use this in identifying traffic on your server that originates from the link checker.
  4. Click update to apply these new values.

Source code

  // Copyright 2017, Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

/**
 * @name App Engine-based link checker - Single Account version
 *
 * @overview The App Engine-based link checker delegates link checking to an
 *     App Engine application in the Cloud. This enables links to be checked at
 *     a greater rate and volume than the purely AdWords scripts-based solution.
 *
 * @author AdWords Scripts Team [adwords-scripts@googlegroups.com]
 *
 * @version 1.0.1
 *
 * @changelog
 * - version 1.0.1
 *   - Fix handling of accounts with no serving campaigns.
 * - version 1.0
 *   - Released initial version.
 */

// This configuration spreadsheet should be a copy of
// https://docs.google.com/spreadsheets/d/1LIzzu5-pTt4Rkei5U41HjFz1EwV9f0q3uv2prpAvb5I/edit
var CONFIG_SPREADSHEET_URL = 'ENTER_SPREADSHEET_URL';

// The values below here are not intended to be user-changeable.
var FIFTEEN_SECS_MILLIS = 15000;
var MILLIS_PER_DAY = 24 * 60 * 60 * 1000;
var URL_BATCH_SIZE = 15000;

var LINK_CHECKER_DRIVE_FOLDER = 'App Engine LinkChecker';

var SITELINK_FINAL_URL_FIELD_ID = 5;
var SITELINK_FINAL_MOBILE_URL_FIELD_ID = 6;

var LIB_URL =
    'https://developers.google.com/adwords/scripts/docs/examples/oauth20-library';
var ROWS_CHUNK_SIZE = 4000;
var URL_PREFIX_RE = /^https?:\/\//;

var CONFIG_NAMED_RANGES = [
  'checkAdUrls',    'checkKeywordUrls',    'checkSitelinkUrls',
  'checkPausedAds', 'checkPausedKeywords', 'checkPausedSitelinks',
  'failureStrings', 'emailEachRun',        'emailOnCompletion',
  'emailNonErrors', 'emailRecipients',     'frequency',
  'dateStarted',    'dateCompleted',       'dateEmailed',
  'ProjectId',      'ClientSecret',        'ClientId',
  'RefreshToken',   'ReadyStatusFlag'
];

var authUrlFetch;
var sharedKeyUrlFetch;
var baseUrl;
var config;

/**
 * Main entry point.
 */
function main() {
  if (typeof OAuth2 === 'undefined') {
    throw Error(
        'OAuth2 library not found. Please take a copy of the OAuth2 ' +
        'library from ' + LIB_URL + ' and append to the end of this script.');
  }

  loadConfig();
  if (!config.ReadyStatusFlag) {
    throw Error(
        'The necessary Cloud setup steps have not been completed. ' +
        'Please open the "Cloud Setup" tab of your configuration Sheets ' +
        'document, and complete all the steps.');
  }
  initializeOAuth();
  initializeSharedKeyFetch();

  var operations = listOperations();
  var complete = isComplete(operations);
  if (complete && operations.length) {
    var numErrors = reportResults(operations);
    deleteOperations(operations);
    setDateAsNow('dateCompleted');
    sendCompletedEmail(numErrors);
  } else if (isDateForUrlCheck()) {
    var urlMap = loadUrlMap();
    createOperations(urlMap);
    if (Object.keys(urlMap).length) {
      setDateAsNow('dateStarted');
      sendStartEmail();
    }
  } else if (!complete) {
    sendProcessingEmail();
  }
}

/**
 * Loads configuration values from the configuration spreadsheet based on the
 * list of Named Ranges. Configuration is assigned to a global variable.
 */
function loadConfig() {
  config = {};
  var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL);
  for (var i = 0, namedRange; namedRange = CONFIG_NAMED_RANGES[i]; i++) {
    var range = spreadsheet.getRangeByName(namedRange);
    if (!range) {
      throw Error(
          'Named range in configuration spreadsheet not found. The ' +
          'spreadsheet must be created using "File > Make a copy". Simply ' +
          'copying and pasting cells into a new document is insufficient to ' +
          'copy across the required named ranges.');
    }
    if (range.getNumRows() === 1 && range.getNumColumns() === 1) {
      config[namedRange] = range.getValue();
      if (config[namedRange] === 'Yes') {
        config[namedRange] = true;
      } else if (config[namedRange] === 'No') {
        config[namedRange] = false;
      }
    } else {
      var values = range.getValues();
      config[namedRange] = values.reduce(function(a, b) {
        var c = b.join('');
        return c.length ? a.concat(c) : a;
      }, []);
    }
  }
}

/**
 * Initializes the OAuth object for facilitating requests to Cloud APIs.
 */
function initializeOAuth() {
  var tokenUrl = 'https://accounts.google.com/o/oauth2/token';
  var scope = 'https://www.googleapis.com/auth/appengine.admin ' +
      'https://www.googleapis.com/auth/cloud-platform ' +
      'https://www.googleapis.com/auth/userinfo.email';
  authUrlFetch = OAuth2.withRefreshToken(
      tokenUrl, config.ClientId, config.ClientSecret, config.RefreshToken,
      scope);
}

/**
 * Initializes the SharedKeyUrlFetch object for facilitating requests to GAE.
 */
function initializeSharedKeyFetch() {
  sharedKeyUrlFetch =
      SharedKeyUrlFetchApp.withProjectId(authUrlFetch, config.ProjectId);

  var accountId = AdWordsApp.currentAccount().getCustomerId();
  baseUrl = 'https://' + config.ProjectId +
      '.appspot.com/_ah/api/batchLinkChecker/v1/account/' + accountId +
      '/batchoperation';
}

/**
 * Loads a map of URLs to campaign/ad group/entities based on the preferences
 * expressed in the configuration. The aim is to create a mapping such that if
 * a URL is used many times across multiple ads or keywords, it need only be
 * uploaded to App Engine once, and results can be reconstituted once retrieved
 * from App Engine.
 *
 * @return {!Object} The created map.
 */
function loadUrlMap() {
  var urlMap = {};
  var servingCampaigns = getServingCampaignList();
  if (config.checkAdUrls &&
      (config.checkPausedAds || servingCampaigns.length)) {
    addAdsToUrlMap(urlMap, servingCampaigns);
  }
  if (config.checkKeywordUrls &&
      (config.checkPausedKeywords || servingCampaigns.length)) {
    addKeywordsToUrlMap(urlMap, servingCampaigns);
  }
  if (config.checkSitelinkUrls &&
      (config.checkPausedSitelinks || servingCampaigns.length)) {
    addSitelinksToUrlMap(urlMap, servingCampaigns);
  }
  return urlMap;
}

/**
 * Adds a URL to the map for a given combination of entity attributes. The aim
 * is that a record can be kept for each URL of the campaign, ad group, ad,
 * keyword etc. This allows only the URLs to be sent off to the App Engine
 * instance for checking, and then when returned, if multiple entities used
 * that URL, multiple records produced from the result.
 *
 * @param {!Object} urlMap The map to add to.
 * @param {string} campaignName The name of the containing campaign.
 * @param {string} adGroupName The name of the containing ad group.
 * @param {string} type The type of entity: Ad, Keyword or Sitelink.
 * @param {string|number} id The ID (For ad, sitelink) or text (for keyword).
 * @param {string} url The URL.
 */
function addUrlToMap(urlMap, campaignName, adGroupName, type, id, url) {
  if (url && URL_PREFIX_RE.test(url)) {
    var urls = expandUrlModifiers(url);
    urls.forEach(function(expandedUrl) {
      if (!urlMap[expandedUrl]) {
        urlMap[expandedUrl] = {};
      }
      var values = urlMap[expandedUrl];
      if (!values[campaignName]) {
        values[campaignName] = {};
      }
      var campaign = values[campaignName];
      if (!campaign[adGroupName]) {
        campaign[adGroupName] = {};
      }
      var adGroup = campaign[adGroupName];
      if (!adGroup[type]) {
        adGroup[type] = {};
      }
      adGroup[type][id] = true;
    });
  }
}

/**
 * Retrieve a list of Campaigns that are currently enabled and serving.
 *
 * @return {!Object} Map of Ids
 */
function getServingCampaignList() {
  var campaigns = [];
  var report = AdWordsApp.report(
      'SELECT CampaignId FROM ' +
      'CAMPAIGN_PERFORMANCE_REPORT WHERE CampaignStatus = "ENABLED" AND ' +
      'ServingStatus="SERVING"');
  var rows = report.rows();
  while (rows.hasNext()) {
    campaigns.push(rows.next().CampaignId);
  }
  return campaigns;
}

/**
 * Adds URLs to a map from ads in the account.
 *
 * @param {!Object} urlMap The map to add to.
 * @param {!Array.<string>} servingCampaigns A list of serving campaigns.
 */
function addAdsToUrlMap(urlMap, servingCampaigns) {
  var query = 'SELECT CampaignName, AdGroupName, Id, CreativeFinalUrls,' +
      'CreativeFinalMobileUrls, CreativeDestinationUrl ' +
      'FROM AD_PERFORMANCE_REPORT';
  if (!config.checkPausedAds) {
    query += ' WHERE CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' +
        ' AND Status = "ENABLED" AND CampaignId IN [' +
        servingCampaigns.join(',') + ']';
  } else {
    query += ' WHERE CampaignStatus != "REMOVED" AND ' +
        'AdGroupStatus != "REMOVED" AND Status != "DISABLED"';
  }
  var report = AdWordsApp.report(query);
  var rows = report.rows();
  while (rows.hasNext()) {
    var row = rows.next();
    if (row.CreativeFinalUrls !== '--') {
      var urls = JSON.parse(row.CreativeFinalUrls);
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Ad', row.Id, url);
      });
    }
    if (row.CreativeFinalMobileUrls !== '--') {
      var urls = JSON.parse(row.CreativeFinalMobileUrls);
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Ad', row.Id, url);
      });
    }
    if (row.CreativeDestinationUrl !== '--') {
      addUrlToMap(
          urlMap, row.CampaignName, row.AdGroupName, 'Ad', row.Id,
          row.CreativeDestinationUrl);
    }
  }
}

/**
 * Adds URLs to a map from keywords in the account.
 *
 * @param {!Object} urlMap The map to add to.
 * @param {!Array.<string>} servingCampaigns A list of serving campaigns.
 */
function addKeywordsToUrlMap(urlMap, servingCampaigns) {
  var query = 'SELECT CampaignName, AdGroupName, Criteria, FinalUrls, ' +
      'FinalMobileUrls FROM KEYWORDS_PERFORMANCE_REPORT WHERE ' +
      'ApprovalStatus = "APPROVED"';
  if (!config.checkPausedKeywords) {
    query += ' AND CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' +
        ' AND Status = "ENABLED" AND CampaignId IN [' +
        servingCampaigns.join(',') + ']';
  } else {
    query += ' AND CampaignStatus != "REMOVED" AND ' +
        'AdGroupStatus != "REMOVED" AND Status != "REMOVED"';
  }
  var report = AdWordsApp.report(query);
  var rows = report.rows();
  while (rows.hasNext()) {
    var row = rows.next();
    if (!config.checkPausedKeywords && row.AdGroupStatus !== 'enabled') {
      continue;
    }
    if (row.FinalUrls !== '--') {
      var urls = JSON.parse(row.FinalUrls);
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Keyword', row.Criteria,
            url);
      });
    }
    if (row.FinalMobileUrls !== '--') {
      var urls = JSON.parse(row.FinalMobileUrls);
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Keyword', row.Criteria,
            url);
      });
    }
  }
}

/**
 * Adds URLs to a map from sitelinks in the account.
 *
 * @param {!Object} urlMap The map to add to.
 * @param {!Array.<string>} servingCampaigns A list of serving campaigns.
 */
function addSitelinksToUrlMap(urlMap, servingCampaigns) {
  var query = 'SELECT CampaignName, AdGroupName, AttributeValues, FeedItemId ' +
      'FROM PLACEHOLDER_FEED_ITEM_REPORT WHERE PlaceholderType = 1 AND ' +
      'DisapprovalShortNames=""';
  if (!config.checkPausedSitelinks) {
    query += ' AND CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' +
        ' AND Status = "ENABLED" AND CampaignId IN [' +
        servingCampaigns.join(',') + ']';
  } else {
    query += ' AND CampaignStatus != "REMOVED" AND ' +
        'AdGroupStatus != "REMOVED" AND Status != "REMOVED"';
  }
  var report = AdWordsApp.report(query);
  var rows = report.rows();
  while (rows.hasNext()) {
    var row = rows.next();
    var attributeValues = JSON.parse(row.AttributeValues);
    if (attributeValues[SITELINK_FINAL_URL_FIELD_ID]) {
      var urls = attributeValues[SITELINK_FINAL_URL_FIELD_ID];
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Sitelink',
            row.FeedItemId, url);
      });
    }
    if (attributeValues[SITELINK_FINAL_MOBILE_URL_FIELD_ID]) {
      var urls = attributeValues[SITELINK_FINAL_MOBILE_URL_FIELD_ID];
      urls.forEach(function(url) {
        addUrlToMap(
            urlMap, row.CampaignName, row.AdGroupName, 'Sitelink',
            row.FeedItemId, url);
      });
    }
  }
}

/**
 * Expands a URL that contains ValueTrack parameters such as {ifmobile:mobile}
 * to all the combinations, and returns as an array.
 *
 * @param {string} url The URL which may contain ValueTrack parameters.
 * @return {!Array.<string>} An array of one or more expanded URLs.
 */
function expandUrlModifiers(url) {
  var ifRegex = /({(if\w+):([^}]+)})/gi;
  var modifiers = {};
  var matches;
  while (matches = ifRegex.exec(url)) {
    // Tags are case-insensitive, e.g. IfMobile is valid.
    modifiers[matches[2].toLowerCase()] = {
      substitute: matches[0],
      replacement: matches[3]
    };
  }
  if (Object.keys(modifiers).length) {
    if (modifiers.ifmobile || modifiers.ifnotmobile) {
      var mobileCombinations =
          pairedUrlModifierReplace(modifiers, 'ifmobile', 'ifnotmobile', url);
    } else {
      var mobileCombinations = [url];
    }

    // Store in a map on the offchance that there are duplicates.
    var combinations = {};
    mobileCombinations.forEach(function(url) {
      if (modifiers.ifsearch || modifiers.ifcontent) {
        pairedUrlModifierReplace(modifiers, 'ifsearch', 'ifcontent', url)
            .forEach(function(modifiedUrl) {
              combinations[modifiedUrl] = true;
            });
      } else {
        combinations[url] = true;
      }
    });
    var modifiedUrls = Object.keys(combinations);
  } else {
    var modifiedUrls = [url];
  }
  // Remove any custom parameters
  return modifiedUrls.map(function(url) {
    return url.replace(/{[0-9a-zA-Z\_\+\:]+}/g, '');
  });
}

/**
 * Takes a pair of URL modifiers (e.g. ifmobile, ifnotmobile) and returns two
 * URLs: One considering the first case, the other considering the second.
 *
 * @param {Object} modifiers The modifier objects found in the URL.
 * @param {string} modifier1
 * @param {string} modifier2
 * @param {string} url The templated URL.
 * @return {!Array.<string>} The pair of formed URLs.
 */
function pairedUrlModifierReplace(modifiers, modifier1, modifier2, url) {
  return [
    urlModifierReplace(modifiers, modifier1, modifier2, url),
    urlModifierReplace(modifiers, modifier2, modifier1, url)
  ];
}

/**
 * Creates a URL by substituting in the required value for one modifier, and
 * blanking out the other.
 *
 * @param {Object} mods The modifier objects found in the URL.
 * @param {string} mod1 The modifier to substitute in value for.
 * @param {string} mod2 The modifier to blank.
 * @param {string} url The templated URL.
 * @return {string} The formed URL.
 */
function urlModifierReplace(mods, mod1, mod2, url) {
  var modUrl = mods[mod1] ?
      url.replace(mods[mod1].substitute, mods[mod1].replacement) :
      url;
  return mods[mod2] ? modUrl.replace(mods[mod2].substitute, '') : modUrl;
}

/**
 * Determines whether a list of operations retrieved from the App Engine
 * instance is complete or whether some operations are still ongoing.
 *
 * @param {!Array.<!Object>} operations A list of operations.
 * @return {boolean} true if there are no processing operations.
 */
function isComplete(operations) {
  return !operations.some(function(op) {
    return op.status === 'PROCESSING';
  });
}

/**
 * Determines whether sufficient time has passed since the last check such that
 * it is time to launch a new URL check.
 *
 * @return {boolean}
 */
function isDateForUrlCheck() {
  var now = new Date();
  if (!config.dateStarted && !config.dateCompleted) {
    return true;
  }
  if (config.dateCompleted &&
      config.dateCompleted.getTime() >= config.dateStarted.getTime() &&
      now.getTime() - config.dateCompleted.getTime() >
          MILLIS_PER_DAY * config.frequency) {
    return true;
  }
  return false;
}

/**
 * Sets the value of a named range in the configuration to the current Date.
 *
 * @param {string} namedRange The named range in the config spreadsheet.
 */
function setDateAsNow(namedRange) {
  var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL);
  spreadsheet.getRangeByName(namedRange).setValue(new Date());
}

/**
 * Produces a report from a list of operations as retrieved from the App Engine
 * instance.
 *
 * @param {!Array.<!Object>} operations The list of operations.
 * @return {number} The total number of errors listed in the report.
 */
function reportResults(operations) {
  var errors = [];
  var totalCheckedResults = 0;
  var totalCheckedErrors = 0;
  for (var i = 0, op; op = operations[i]; i++) {
    var detail = getOperation(op.batchId);
    var urlMap = loadLookup(op.batchId);

    totalCheckedResults += detail.checkedUrlCount;
    if (detail.errors) {
      totalCheckedErrors += detail.errors.length;
      detail.errors.forEach(function(error) {
        var errorTable = errorUrlMapToArray(error, urlMap);
        Array.prototype.push.apply(errors, errorTable);
      });
    }
  }

  writeResultsSpreadsheet(totalCheckedResults, totalCheckedErrors, errors);
  return totalCheckedErrors;
}

/**
 * Creates a table for the spreadsheet report from a single error result for a
 * URL, and the map of campaigns/ad groups/entities that have use that URL.
 *
 * @param {!Object} error The error returned from App Engine, with url and
 *     message properties.
 * @param {!Object} urlMap A map from URLs to entities structure.
 * @return {!Array.<!Array.<string>>} The table of results.
 */
function errorUrlMapToArray(error, urlMap) {
  var customerId = AdWordsApp.currentAccount().getCustomerId();
  var results = [];
  var mapEntry = urlMap[error.url];
  var campaigns = Object.keys(mapEntry);
  for (var i = 0, campaign; campaign = campaigns[i]; i++) {
    var campaignMap = mapEntry[campaign];
    var adGroups = Object.keys(campaignMap);
    for (var j = 0, adGroup; adGroup = adGroups[j]; j++) {
      var adGroupMap = campaignMap[adGroup];
      var types = Object.keys(adGroupMap);
      for (var k = 0, type; type = types[k]; k++) {
        var ids = Object.keys(adGroupMap[type]);
        for (var m = 0, id; id = ids[m]; m++) {
          results.push([
            customerId, campaign, adGroup, type, id, error.url, error.message
          ]);
        }
      }
    }
  }
  return results;
}

/**
 * Writes the results from the URL checking to the Results sheet.
 *
 * @param {number} totalCheckedResults The number of URLs checked.
 * @param {number} totalCheckedErrors The number of URLs that errored.
 * @param {!Array.<!Array.<string>>} errors The table of errors details.
 */
function writeResultsSpreadsheet(
    totalCheckedResults, totalCheckedErrors, errors) {
  var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL);
  spreadsheet.getRangeByName('numErrors').setValue(totalCheckedErrors);
  spreadsheet.getRangeByName('numChecked').setValue(totalCheckedResults);
  var topLeft = spreadsheet.getRangeByName('resultsTopLeft');
  var sheet = spreadsheet.getSheetByName('Results');
  var dataRange = sheet.getDataRange();
  var firstRow = topLeft.getRow();
  var firstCol = topLeft.getColumn();
  var firstDataRow = Math.max(firstRow, dataRange.getNumRows());
  sheet
      .getRange(
          firstRow, firstCol, firstDataRow - firstRow + 1,
          dataRange.getNumColumns() - firstCol + 1)
      .clear();
  for (var s = 0; s < errors.length; s += ROWS_CHUNK_SIZE) {
    var chunk = errors.slice(s, s + ROWS_CHUNK_SIZE);
    sheet.getRange(firstRow + s, firstCol, chunk.length, chunk[0].length)
        .setValues(chunk);
  }
}

/**
 * Retrieve a list of operations from the App Engine instance.
 *
 * @return {!Array.<!Object>} A list of operations.
 */
function listOperations() {
  var response = sharedKeyUrlFetch.fetch(baseUrl);
  var data = JSON.parse(response.getContentText());
  if (data.items) {
    return data.items;
  }
  return [];
}

/**
 * Creates operations to check URLs. URLs are supplied as the keys of a map
 * from URLs to a structure of the associated AdWords entities. The map is
 * split up into batches for submission to the App Engine instance, and only
 * URLs are submitted. The mapping to entities is saved to Drive to allow the
 * results to be reconstituted for each entity when retrieved from the App
 * Engine instance.
 *
 * @param {!Object} urlMap A map from URLs to associated entities.
 */
function createOperations(urlMap) {
  var urlKeys = Object.keys(urlMap);
  var urls = [];
  var urlSubMap = {};
  for (var i = 0, j = urlKeys.length; i < j; i++) {
    if (!(urls.length % URL_BATCH_SIZE)) {
      if (urls.length) {
        var batchId = createOperation(urls);
        saveLookup(batchId, urlSubMap);
      }
      urls = [];
      urlSubMap = {};
    }
    urls.push(urlKeys[i]);
    urlSubMap[urlKeys[i]] = urlMap[urlKeys[i]];
  }
  if (urls.length) {
    var batchId = createOperation(urls);
    saveLookup(batchId, urlSubMap);
  }
}

/**
 * Creates an operation on the App Engine instance.
 *
 * @param {!Array.<string>} urlList A list of URLs to check.
 * @return {string} The ID of the newly-created operation.
 */
function createOperation(urlList) {
  var body = {urls: urlList};
  if (config.failureStrings && config.failureStrings.length) {
    body.failureMatchTexts = config.failureStrings;
  }
  var options = {
    method: 'POST',
    contentType: 'application/json',
    payload: JSON.stringify(body)
  };
  var response = sharedKeyUrlFetch.fetch(baseUrl, options);
  var data = JSON.parse(response.getContentText());
  if (data.items) {
    return data.items[0];
  }
  throw Error('Failed to create new operation');
}

/**
 * Retrieves the details of an operation.
 *
 * @param {string} operationId The ID of the operation.
 * @return {!Object} The results object.
 */
function getOperation(operationId) {
  var url = baseUrl + '/' + operationId;
  var response = sharedKeyUrlFetch.fetch(url);
  return JSON.parse(response.getContentText());
}

/**
 * Deletes all operations in a list and the associated lookup file on Drive.
 *
 * @param {!Array.<!Object>} operations The operations to delete.
 */
function deleteOperations(operations) {
  operations.forEach(function(op) {
    deleteOperation(op.batchId);
    deleteLookup(op.batchId);
  });
}

/**
 * Deletes an operation.
 *
 * @param {string} operationId The ID of the operation.
 */
function deleteOperation(operationId) {
  var url = baseUrl + '/' + operationId;
  var options = {method: 'DELETE'};
  sharedKeyUrlFetch.fetch(url, options);
}

/**
 * Retrieves the folder used for temporary files, creating it first if needed.
 *
 * @return {Folder}
 */
function getLinkCheckerFolder() {
  var folders = DriveApp.getFoldersByName(LINK_CHECKER_DRIVE_FOLDER);
  if (!folders.hasNext()) {
    return DriveApp.createFolder(LINK_CHECKER_DRIVE_FOLDER);
  }
  return folders.next();
}

/**
 * Retrieves a URL mapping from drive for a given operation batch ID.
 *
 * @param {string} operationId The ID of the operation.
 * @return {!Object} The URL mapping.
 */
function loadLookup(operationId) {
  var folder = getLinkCheckerFolder();
  var files = folder.getFilesByName(operationId);
  if (files.hasNext()) {
    var file = files.next();
    var contents = Utilities.unzip(file);
    return JSON.parse(contents[0].getDataAsString());
  }
  throw Error(
      'Could not find any link checker files on Drive in the folder ' +
      'named ' + folder.getName() + '. Please check that this folder exists. ' +
      'If it cannot be found, start a new analysis by clearing the values in ' +
      '"Started analysis" and "Finished analysis" in the configuration ' +
      'spreadsheet.');
}

/**
 * Saves a URL mapping to Drive, for a specified operation ID.
 *
 * @param {string} operationId The ID of the operation.
 * @param {!Object} urlSubMap The URL mapping.
 */
function saveLookup(operationId, urlSubMap) {
  var folder = getLinkCheckerFolder();
  var contents = JSON.stringify(urlSubMap);
  var compressed = Utilities.zip([Utilities.newBlob(contents)]);
  compressed.setName(operationId);
  folder.createFile(compressed);
}

/**
 * Deletes a URL mapping from Drive.
 *
 * @param {string} operationId The ID of the URL mapping to delete.
 */
function deleteLookup(operationId) {
  var folder = getLinkCheckerFolder();
  var files = folder.getFilesByName(operationId);
  var forDeletion = [];
  while (files.hasNext()) {
    forDeletion.push(files.next());
  }
  forDeletion.forEach(function(file) {
    folder.removeFile(file);
  });
}

/**
 * Sends an email when the link checking has started.
 */
function sendStartEmail() {
  var customerId = AdWordsApp.currentAccount().getCustomerId();
  if (config.emailEachRun && config.emailRecipients &&
      config.emailRecipients.length) {
    MailApp.sendEmail(
        config.emailRecipients.join(','),
        customerId + ': Link Checker: Checking started',
        'Link checking has started on account: ' + customerId +
            '. For more details see the spreadsheet: ' +
            CONFIG_SPREADSHEET_URL);
    setDateAsNow('dateEmailed');
  }
}

/**
 * Sends an email when the link checking has completed.
 *
 * @param {number} numErrors The number of errors in the report.
 */
function sendCompletedEmail(numErrors) {
  var customerId = AdWordsApp.currentAccount().getCustomerId();
  if (config.emailOnCompletion && (config.emailNonErrors || numErrors > 0) &&
      config.emailRecipients && config.emailRecipients.length) {
    MailApp.sendEmail(
        config.emailRecipients.join(','),
        customerId + ': Link Checker: Checking completed',
        'Link checking has completed on account: ' + customerId +
            '. For more details see the spreadsheet: ' +
            CONFIG_SPREADSHEET_URL);
    setDateAsNow('dateEmailed');
  }
}

/**
 * Sends an email when the link checking is ongoing.
 */
function sendProcessingEmail() {
  var customerId = AdWordsApp.currentAccount().getCustomerId();
  if (config.emailEachRun && config.emailRecipients &&
      config.emailRecipients.length) {
    MailApp.sendEmail(
        config.emailRecipients.join(','),
        customerId + ': Link Checker: Checking is ongoing',
        'Link checking is still running on account: ' + customerId +
            '. For more details see the spreadsheet: ' +
            CONFIG_SPREADSHEET_URL);
    setDateAsNow('dateEmailed');
  }
}

/**
 * Adds a SharedKeyUrlFetchApp object, for creating requests to the App Engine
 * instance using a shared key.
 *
 * @param {!Object} scope The object to add SharedKeyUrlFetchApp to.
 */
(function(scope) {
  /**
   * Creates an object for making authenticated URL fetch requests with a
   * given sharedKey
   *
   * @param {string} sharedKey The sharedKey to store and use
   * @constructor
   */
  function SharedKeyUrlFetchApp_(sharedKey) {
    this.sharedKey_ = sharedKey;
  }

  /**
   * Performs an HTTP request for the given URL.
   * Has retries, as the service may not have initialised or spun up.
   *
   * @param {string} url The URL to fetch
   * @param {?Object=} opt_options Options as per UrlFetchApp.fetch
   * @return {!HTTPResponse} The HTTP Response object.
   */
  SharedKeyUrlFetchApp_.prototype.fetch = function(url, opt_options) {
    var attempts = 5;
    var fetchOptions = opt_options || {};
    fetchOptions.muteHttpExceptions = true;
    fetchOptions.headers = {Authorization: this.sharedKey_};
    var response = UrlFetchApp.fetch(url, fetchOptions);
    while (response.getResponseCode() >= 400 && attempts) {
      Utilities.sleep(FIFTEEN_SECS_MILLIS);
      attempts--;
      response = UrlFetchApp.fetch(url, fetchOptions);
    }
    if (response.getResponseCode() >= 400) {
      throw Error(
          'Error talking to Google App Engine UrlFetch service: ' +
          response.getResponseCode() + ' : ' + response.getContentText());
    }
    return response;
  };

  function getSharedKey_(oauthUrlFetch, projectId) {
    var attempt = 1;
    var url =
        'https://datastore.googleapis.com/v1/projects/' + projectId + ':lookup';
    var body = {keys: [{path: [{name: 'key', kind: 'SharedKey'}]}]};
    var options = {
      method: 'POST',
      contentType: 'application/json',
      payload: JSON.stringify(body)
    };
    var response = oauthUrlFetch.fetch(url, options);
    var data = JSON.parse(response.getContentText());
    while (!data.found && attempt < 9) {
      Utilities.sleep(attempt * FIFTEEN_SECS_MILLIS);
      response = oauthUrlFetch.fetch(url, options);
      data = JSON.parse(response.getContentText());
      attempt++;
    }
    if (data.found) {
      var firstResult = data.found[0];
      return firstResult.entity.properties.key.stringValue;
    }
    throw Error(
        'Shared key not found. If this is the first time you have run' +
        ' the script, the AppEngine instance may still be initialising. ' +
        'Try again in 5 minutes');
  }

  function withProjectId(oauthUrlFetch, projectId) {
    var sharedKey = getSharedKey_(oauthUrlFetch, projectId);
    return new SharedKeyUrlFetchApp_(sharedKey);
  }

  scope.SharedKeyUrlFetchApp = {withProjectId: withProjectId};
})(this);

  
/**
 * Simple library for sending OAuth2 authenticated requests.
 * See: https://developers.google.com/adwords/scripts/docs/features/third-party-apis#oauth_2
 * for full details.
 */

/**
 * Adds a OAuth object, for creating authenticated requests, to the global
 * object.
 */
(function(scope) {
  /**
   * Creates an object for making authenticated URL fetch requests with a
   * given stored access token.
   * @param {string} accessToken The access token to store and use.
   * @constructor
   */
  function OAuth2UrlFetchApp(accessToken) { this.accessToken_ = accessToken; }

  /**
   * Performs an HTTP request for the given URL.
   * @param {string} url The URL to fetch
   * @param {?Object=} options Options as per UrlFetchApp.fetch
   * @return {!HTTPResponse} The HTTP Response object.
   */
  OAuth2UrlFetchApp.prototype.fetch = function(url, opt_options) {
    var fetchOptions = opt_options || {};
    if (!fetchOptions.headers) {
      fetchOptions.headers = {};
    }
    fetchOptions.headers.Authorization = 'Bearer ' + this.accessToken_;
    return UrlFetchApp.fetch(url, fetchOptions);
  };

  /**
   * Performs the authentication step
   * @param {string} tokenUrl The endpoint for use in obtaining the token.
   * @param {!Object} payload The authentication payload, typically containing
   *     details of the grant type, credentials etc.
   * @param {string=} opt_authHeader Client credential grant also can make use
   *     of an Authorisation header, as specified here
   * @param {string=} opt_scope Optional string of spaced-delimited scopes.
   * @return {string} The access token
   */
  function authenticate_(tokenUrl, payload, opt_authHeader, opt_scope) {
    var options = {muteHttpExceptions: true, method: 'POST', payload: payload};
    if (opt_scope) {
      options.payload.scope = opt_scope;
    }
    if (opt_authHeader) {
      options.headers = {Authorization: opt_authHeader};
    }
    var response = UrlFetchApp.fetch(tokenUrl, options);
    var responseData = JSON.parse(response.getContentText());
    if (responseData && responseData.access_token) {
      var accessToken = responseData.access_token;
    } else {
      throw Error('No access token received: ' + response.getContentText());
    }
    return accessToken;
  }

  /**
   * Creates a OAuth2UrlFetchApp object having authenticated with a refresh
   * token.
   * @param {string} tokenUrl The endpoint for use in obtaining the token.
   * @param {string} clientId The client ID representing the application.
   * @param {string} clientSecret The client secret.
   * @param {string} refreshToken The refresh token obtained through previous
   *     (possibly interactive) authentication.
   * @param {string=} opt_scope Space-delimited set of scopes.
   * @return {!OAuth2UrlFetchApp} The object for making authenticated requests.
   */
  function withRefreshToken(
      tokenUrl, clientId, clientSecret, refreshToken, opt_scope) {
    var payload = {
      grant_type: 'refresh_token',
      client_id: clientId,
      client_secret: clientSecret,
      refresh_token: refreshToken
    };
    var accessToken = authenticate_(tokenUrl, payload, null, opt_scope);
    return new OAuth2UrlFetchApp(accessToken);
  }

  /**
   * Creates a OAuth2UrlFetchApp object having authenticated with client
   * credentials.
   * @param {string} tokenUrl The endpoint for use in obtaining the token.
   * @param {string} clientId The client ID representing the application.
   * @param {string} clientSecret The client secret.
   * @param {string=} opt_scope Space-delimited set of scopes.
   * @return {!OAuth2UrlFetchApp} The object for making authenticated requests.
   */
  function withClientCredentials(tokenUrl, clientId, clientSecret, opt_scope) {
    var authHeader =
        'Basic ' + Utilities.base64Encode([clientId, clientSecret].join(':'));
    var payload = {
      grant_type: 'client_credentials',
      client_id: clientId,
      client_secret: clientSecret
    };
    var accessToken = authenticate_(tokenUrl, payload, authHeader, opt_scope);
    return new OAuth2UrlFetchApp(accessToken);
  }

  /**
   * Creates a OAuth2UrlFetchApp object having authenticated with OAuth2 username
   * and password.
   * @param {string} tokenUrl The endpoint for use in obtaining the token.
   * @param {string} clientId The client ID representing the application.
   * @param {string} username OAuth2 Username
   * @param {string} password OAuth2 password
   * @param {string=} opt_scope Space-delimited set of scopes.
   * @return {!OAuth2UrlFetchApp} The object for making authenticated requests.
   */
  function withPassword(tokenUrl, clientId, username, password, opt_scope) {
    var payload = {
      grant_type: 'password',
      client_id: clientId,
      username: username,
      password: password
    };
    var accessToken = authenticate_(tokenUrl, payload, null, opt_scope);
    return new OAuth2UrlFetchApp(accessToken);
  }

  /**
   * Creates a OAuth2UrlFetchApp object having authenticated as a Service
   * Account.
   * Flow details taken from:
   *     https://developers.google.com/identity/protocols/OAuth2ServiceAccount
   * @param {string} tokenUrl The endpoint for use in obtaining the token.
   * @param {string} serviceAccount The email address of the Service Account.
   * @param {string} key The key taken from the downloaded JSON file.
   * @param {string} scope Space-delimited set of scopes.
   * @return {!OAuth2UrlFetchApp} The object for making authenticated requests.
   */
  function withServiceAccount(tokenUrl, serviceAccount, key, scope) {
    var assertionTime = new Date();
    var jwtHeader = '{"alg":"RS256","typ":"JWT"}';
    var jwtClaimSet = {
      iss: serviceAccount,
      scope: scope,
      aud: tokenUrl,
      exp: Math.round(assertionTime.getTime() / 1000 + 3600),
      iat: Math.round(assertionTime.getTime() / 1000)
    };
    var jwtAssertion = Utilities.base64EncodeWebSafe(jwtHeader) + '.' +
        Utilities.base64EncodeWebSafe(JSON.stringify(jwtClaimSet));
    var signature = Utilities.computeRsaSha256Signature(jwtAssertion, key);
    jwtAssertion += '.' + Utilities.base64Encode(signature);
    var payload = {
      grant_type: 'urn:ietf:params:oauth:grant-type:jwt-bearer',
      assertion: jwtAssertion
    };
    var accessToken = authenticate_(tokenUrl, payload, null);
    return new OAuth2UrlFetchApp(accessToken);
  }

  scope.OAuth2 = {
    withRefreshToken: withRefreshToken,
    withClientCredentials: withClientCredentials,
    withServiceAccount: withServiceAccount,
    withPassword: withPassword
  };
})(this);

Looking for the Manager Account (MCC) version? Click here

Send feedback about...

AdWords Scripts
AdWords Scripts
Need help? Visit our support page.