Better details about when Googlebot last visited a page
Tuesday, September 05, 2006
Most people know that Googlebot downloads pages from web servers to crawl the web. Not as many
people know that if Googlebot accesses a page and gets a 304 (Not-Modified) response to an
If-Modified-Since qualified request, Googlebot doesn't download the contents of that page. This
reduces the bandwidth consumed on your web server.
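To make the exchange concrete, here is a minimal sketch of how a web server might decide between a full response and a 304 for an If-Modified-Since qualified request. The function name `respond` and its parameters are hypothetical, chosen for illustration only; this is not how any particular server or Googlebot is implemented.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond(if_modified_since, last_modified, body):
    """Return (status, body) for a GET, honoring If-Modified-Since.

    Hypothetical helper: `last_modified` is a timezone-aware datetime for the
    page's last change; `if_modified_since` is the raw header value or None.
    """
    since = None
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through and send the page
    if since is not None:
        if since.tzinfo is None:
            # Treat a date without a usable zone as UTC (an assumption).
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304, b""  # unchanged: no body is sent
    return 200, body
```

With a page last changed on April 12, 2006, a request carrying a later If-Modified-Since date gets a 304 with an empty body, while a request without the header gets the full page.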
When you look at Google's cache of a page (for instance, by using the cache: operator or clicking
the Cached link under a URL in the search results), you can see the date that
Googlebot retrieved that page. Previously, the date we listed for the page's cache was the date
that we last successfully fetched the content of the page. This meant that even if we visited a
page very recently, the cache date might be quite a bit older if the page hadn't changed since the
previous visit. This made it difficult for webmasters to use the cache date we display to
determine Googlebot's most recent visit. Consider the following example:
1. Googlebot crawls a page on April 12, 2006.
2. Our cached version of that page notes that "This is Google's cache of
   https://www.example.com/ as retrieved on April 12, 2006 20:02:06 GMT."
3. Periodically, Googlebot checks to see if that page has changed, and each time, receives a
   Not-Modified response. For instance, on August 27, 2006, Googlebot checks the page,
   receives a Not-Modified response, and therefore, doesn't download the contents of
   the page.
4. On August 28, 2006, our cached version of the page still shows the April 12, 2006 date (the
   date we last downloaded the page's contents), even though Googlebot last visited the day
   before.
We've recently changed the date we show for the cached page to reflect when Googlebot last
accessed it (whether the page had changed or not).
This should make it easier for you to determine the most recent date Googlebot visited the
page. For instance, in the above example, the cached version of the page would now say "This is
Google's cache of https://www.example.com/ as retrieved on August 27, 2006 13:13:37
GMT."
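The difference between the old and new cache dates can be sketched with a toy model. The class `CacheRecord` and its fields are hypothetical names for illustration, and the handling of status codes other than 200 and 304 is an assumption; none of this describes Google's actual systems.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CacheRecord:
    """Toy model of the two dates discussed above (hypothetical)."""
    last_fetched: Optional[datetime] = None  # old cache date: content downloaded
    last_visited: Optional[datetime] = None  # new cache date: page accessed

    def record_visit(self, when: datetime, status: int) -> None:
        if status == 200:
            # Content was downloaded, so both dates move forward.
            self.last_fetched = when
            self.last_visited = when
        elif status == 304:
            # Not-Modified: the page was visited but nothing was downloaded.
            self.last_visited = when
        # Other statuses leave both dates untouched (an assumption).


# Replay the example from the post.
record = CacheRecord()
record.record_visit(datetime(2006, 4, 12, 20, 2, 6, tzinfo=timezone.utc), 200)
record.record_visit(datetime(2006, 8, 27, 13, 13, 37, tzinfo=timezone.utc), 304)
```

After the replay, `record.last_fetched` still holds the April 12 date that the cache used to show, while `record.last_visited` holds the August 27 date that the cache shows after the change.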
Note that this change will be reflected for individual pages as we update those pages in our
index.