Overview of Google crawlers (user agents)
"Crawler" (sometimes also called a "robot" or "spider") is a generic term for any program that
is used to automatically discover and scan websites by following links from one web page to
another. Google's main crawler is called
Googlebot. This table lists information
about the common Google crawlers you may see in your referrer logs, and how to specify them in
robots.txt, the
robotsmeta tags, and the
X-Robots-Tag HTTP rules.
The following table shows the crawlers used by various products and services at Google:
The user agent token is used in the User-agent: line in robots.txt
to match a crawler type when writing crawl rules for your site. Some crawlers have more than
one token, as shown in the table; you need to match only one crawler token for a rule to
apply. This list is not complete, but covers most of the crawlers you might see on your
website.
The full user agent string is a complete description of the crawler, and appears in
the HTTP request and your web logs. For example:
Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
Wherever you see the string Chrome/W.X.Y.Z in the user agent
strings in the table, W.X.Y.Z is a placeholder for the version
of the Chrome browser used by that user agent: for example, 41.0.2272.96. This version
number increases over time to
match the latest Chromium release version used by Googlebot.
If you are searching your logs or filtering your server for a user agent with this pattern,
use wildcards for the version number rather than specifying an exact
version number.
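As a sketch of that advice, the following Python snippet matches any Chrome/W.X.Y.Z version with a regular expression instead of pinning an exact version number. The helper function name is our own, and the sample user agent string is the DuplexWeb-Google string listed later on this page:

```python
import re

# Match any four-part Chrome version number (W.X.Y.Z) rather than a
# specific release, since the Chrome version in Google's user agent
# strings changes with each Chromium update.
CHROME_VERSION = re.compile(r"Chrome/\d+\.\d+\.\d+\.\d+")

def mentions_google_crawler(user_agent: str, token: str) -> bool:
    """Check for a crawler token alongside any Chrome/W.X.Y.Z version."""
    return token in user_agent and CHROME_VERSION.search(user_agent) is not None

# Sample full user agent string taken from the table on this page.
ua = ("Mozilla/5.0 (Linux; Android 11; Pixel 2; DuplexWeb-Google/1.0) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 "
      "Mobile Safari/537.36")
```

Note that matching a user agent string only tells you what a request claims to be; the string itself can be spoofed by anyone.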
User agents in robots.txt
Where several user agents are recognized in the robots.txt file, Google follows the most
specific one. If you want all of Google to be able to crawl your pages, you don't need a
robots.txt file at all. If you want to block all of Google's crawlers from accessing some of
your content, or allow them to access it, you can do so by specifying Googlebot as the user
agent. For example, if you want all your pages to appear in Google Search, and if you want
AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want
to block some pages from Google altogether, blocking the Googlebot user agent also blocks all
of Google's other user agents.
But if you want more fine-grained control, you can get more specific. For example, you might
want all your pages to appear in Google Search, but you don't want images in your personal
directory to be crawled. In this case, use robots.txt to disallow the
Googlebot-Image user agent from crawling the files in your personal directory
(while allowing Googlebot to crawl all files), like this:
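A sketch of such a robots.txt file follows; the /personal directory name is an illustrative stand-in for your own personal directory:

```
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal
```

An empty Disallow line means "allow everything", so Googlebot can crawl all files while Googlebot-Image is kept out of /personal.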
To take another example, say that you want ads on all your pages, but you don't want those
pages to appear in Google Search. Here, you'd block Googlebot, but allow the
Mediapartners-Google user agent, like this:
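A sketch of that robots.txt file:

```
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
```

Here Googlebot is blocked from the whole site, while Mediapartners-Google, the AdSense crawler, may still fetch pages to determine which ads to show on them.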
Each Google crawler accesses sites for a specific purpose and at different rates. Google uses
algorithms to determine the optimal crawl rate for each site. If a Google crawler is crawling
your site too often, you can
reduce the crawl rate.
Retired Google crawlers
The following Google crawlers are no longer in use, and are only noted here for historical reference.
Duplex on the web
Supported the Duplex on the web service.
User agent token: DuplexWeb-Google
Full user agent string: Mozilla/5.0 (Linux; Android 11; Pixel 2; DuplexWeb-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Mobile Safari/537.36
Web Light
Checked for the presence of the no-transform header whenever a user clicked
your page in search under appropriate conditions. The Web Light user agent was used only
for explicit browse requests of a human visitor, and so it ignored robots.txt rules,
which are used to block automated crawling requests.
User agent token: googleweblight
Full user agent string: Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19