You can verify whether a web crawler accessing your server really is Googlebot (or another Google user agent). This is useful if you're concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot. Google doesn't publish a list of IP addresses for website owners to whitelist, because these IP address ranges can change, causing problems for any website owners who have hard-coded them. Instead, run a DNS lookup as described next.
To verify Googlebot as the caller:
- Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.
- Verify that the domain name is either googlebot.com or google.com.
- Run a forward DNS lookup on the domain name retrieved in step 1, using the host command. Verify that the result is the same as the original accessing IP address from your logs.
Example 1:
    > host 66.249.66.1
    1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
    > host crawl-66-249-66-1.googlebot.com
    crawl-66-249-66-1.googlebot.com has address 66.249.66.1
Example 2:
    > host 66.249.90.77
    77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.
    > host rate-limited-proxy-66-249-90-77.google.com
    rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
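If you need to run this check on many log entries, the three steps above can be scripted. The following is a minimal sketch in Python using only the standard socket module, not an official Google tool. The names is_googlebot, reverse_lookup, forward_lookup, and GOOGLE_SUFFIXES are hypothetical, chosen for illustration. It assumes IPv4 addresses, and it assumes that "the domain name is googlebot.com or google.com" means the host name ends in one of those two domains.

```python
import socket

# Domains named in the steps above. The leading dot is an assumption of this
# sketch: it keeps look-alike names such as "notgooglebot.com" from matching.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Step 1: return the PTR host name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Step 3: return the set of IPv4 addresses that hostname resolves to."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def is_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Return True only if ip passes the reverse and forward DNS checks.

    The reverse and forward parameters exist so the resolvers can be
    replaced (for example in tests); they default to real DNS lookups.
    """
    hostname = reverse(ip)
    if not hostname:
        return False
    # Normalize: drop the trailing dot that DNS tools may print, ignore case.
    hostname = hostname.rstrip(".").lower()
    # Step 2: the host name must belong to googlebot.com or google.com.
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False
    # Step 3: the forward lookup must lead back to the original address.
    return ip in forward(hostname)
```

Calling is_googlebot("66.249.66.1") performs live DNS queries, so the result depends on your resolver and on the DNS records at the time of the call. A passing reverse lookup alone is not enough, because whoever controls an IP address can set its PTR record to any name; the forward lookup is what confirms the match.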