Verifying Googlebot and other Google crawlers
You can verify whether a web crawler accessing your server really is a Google crawler, such as Googlebot. This is useful if you're concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot.
Google's crawlers fall into three categories:
|Type||Description||Reverse DNS mask||IP ranges|
|Googlebot||The main crawler for Google's search products. Always respects robots.txt rules.||
|Special-case crawlers||Crawlers that perform specific functions (such as AdsBot), which may or may not respect robots.txt rules.||
|User-triggered fetchers||Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on the request of a user. Because the fetch was requested by a user, these fetchers ignore robots.txt rules.||
There are two methods for verifying Google's crawlers:
- Manually: For one-off lookups, use command line tools. This method is sufficient for most use cases.
- Automatically: For large-scale lookups, use an automatic solution to match a crawler's IP address against the list of published Googlebot IP addresses.
Use command line tools
1. Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.
2. Verify that the domain name is either googlebot.com, google.com, or googleusercontent.com.
3. Run a forward DNS lookup on the domain name retrieved in step 1, using the host command.
4. Verify that the result is the same as the original accessing IP address from your logs.
host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

host 35.247.243.240
240.243.247.35.in-addr.arpa domain name pointer geo-crawl-35-247-243-240.geo.googlebot.com.
host geo-crawl-35-247-243-240.geo.googlebot.com
geo-crawl-35-247-243-240.geo.googlebot.com has address 35.247.243.240

host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.
host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
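The manual lookup can also be scripted. The following is a minimal Python sketch of the same reverse-then-forward check, not an official implementation. The function names and the ALLOWED_SUFFIXES tuple are assumptions made for illustration; adjust the suffixes to the domains you accept. The resolver functions are passed in as parameters so the logic can be exercised without network access.

```python
import ipaddress
import socket

# Assumption: hostname suffixes treated as belonging to Google crawlers.
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of addresses hostname resolves to (empty on failure)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    # sockaddr is the last tuple element; its first item is the address.
    # Drop any IPv6 scope suffix such as "%eth0".
    return {info[4][0].split("%")[0] for info in infos}


def is_google_crawler(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Reverse lookup, suffix check, forward lookup, then compare addresses."""
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(ALLOWED_SUFFIXES):
        return False
    target = ipaddress.ip_address(ip)
    # Compare parsed addresses so equivalent IPv6 spellings still match.
    return any(ipaddress.ip_address(a) == target for a in forward(hostname))
```

Calling is_google_crawler("66.249.66.1") with the default resolvers performs live DNS queries, so the result depends on your resolver and on current DNS records. Requiring the leading dot in each suffix keeps look-alike names such as fakegooglebot.com from passing the check.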
Use automatic solutions
Alternatively, you can identify Googlebot by IP address by matching the crawler's IP address against the lists of Google crawlers' and fetchers' IP ranges, which are published as JSON files.
For other Google IP addresses from which your site may be accessed (for example, Apps Script), match the accessing IP address against the general list of Google IP addresses. Note that the IP addresses in the JSON files are represented in CIDR format.
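Because the ranges are given in CIDR format, the match is a network-membership test, not a string comparison. The sketch below shows one way to do it with Python's standard ipaddress module. It assumes a JSON layout with a top-level "prefixes" array whose entries carry an "ipv4Prefix" or "ipv6Prefix" key; check this against the file you actually download. The sample data uses documentation-reserved ranges, not Google's real ranges, and the function names are hypothetical.

```python
import ipaddress
import json

# Illustrative sample only: documentation-reserved ranges, NOT Google's ranges.
# Assumed layout: {"prefixes": [{"ipv4Prefix": ...}, {"ipv6Prefix": ...}]}
SAMPLE_JSON = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_networks(json_text):
    """Parse the JSON text and return a list of ip_network objects."""
    data = json.loads(json_text)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_ranges(ip, networks):
    """Return True if ip falls inside any of the given CIDR networks."""
    address = ipaddress.ip_address(ip)
    return any(
        address.version == net.version and address in net
        for net in networks
    )


networks = load_networks(SAMPLE_JSON)
print(ip_in_ranges("192.0.2.77", networks))
print(ip_in_ranges("198.51.100.1", networks))
```

In practice you would replace SAMPLE_JSON with the contents of the published file, refresh it periodically, and parse the networks once at startup instead of on every request.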