Duplex on the Web user agent

DuplexWeb-Google is the user agent that supports the Duplex on the web service. You can see the user agent token and full user agent strings here.

Crawl frequency and behavior

  • No services that use the DuplexWeb-Google user agent will perform any purchases or perform any other significant actions when crawling your site.
  • The DuplexWeb-Google user agent crawls occur a few times a day to a few times an hour, depending on the feature being trained, but these runs are calculated to not overload your site or disturb your traffic.
  • The DuplexWeb-Google user agent crawls are not used by Google Search for indexing. Because they are not used for indexing, the DuplexWeb-Google user agent does not recognize the noindex directive.
  • Google Analytics does not record page requests made by the DuplexWeb-Google user agent during crawling and analysis.

Use robots.txt rules to control crawling

You must explicitly block the DuplexWeb-Google user agent using the Disallow robots.txt rule to prevent it from crawling your site. Disabling crawling (training) in your Search Console property setting is not enough.

The DuplexWeb-Google user agent obeys robots.txt rules normally, with the following significant exceptions:

  • When Duplex on the web is enabled using Search Console (the default), the DuplexWeb-Google user agent ignores the Disallow rules in the * wildcard user agent groups.
  • When Duplex on the web is disabled using Search Console, the DuplexWeb-Google user agent respects Disallow rules in the * wildcard user agent groups. Examples:
# Example 1: Block DuplexWeb-Google from crawling your site
User-agent: DuplexWeb-Google
Disallow: /

# Example 2:
# * If Duplex on the web is enabled for this property in Search Console,
#   block all user agents except DuplexWeb-Google.
# * If Duplex on the web is disabled for this property in Search Console,
#   block all user agents including DuplexWeb-Google.
User-agent: *
Disallow: /