General robots questions
Does my website need a robots.txt file?
No. When Googlebot visits a website, we first ask for permission to crawl by
attempting to retrieve the robots.txt file. A website without a robots.txt file,
robots meta tags, or
X-Robots-Tag HTTP headers will
generally be crawled and indexed normally.
Which method should I use to block crawlers?
It depends. In short, there are good reasons to use each of these methods:
robots.txt: Use it if crawling of your content is causing issues on
your server. For example, you may want to disallow crawling of infinite calendar
scripts. Don't use the robots.txt to block private content (use server-side
authentication instead), or
To make sure that a URL is not indexed, use the robots meta tag or
X-Robots-TagHTTP header instead.
- robots meta tag: Use it if you need to control how an individual HTML page is shown in search results or to make sure that it's not shown.
X-Robots-TagHTTP header: Use it if you need to control how content is shown in search results or to make sure that it's not shown.
Can I use robots.txt, robots meta tag, or the
X-Robots-Tag HTTP header to
remove someone else's site from search results?
No. These methods are only applicable to sites where you can modify the code or add files. Learn more about how to remove information from Google.
How can I slow down Google's crawling of my website?
I use the same robots.txt for multiple websites. Can I use a full URL instead of a relative path?
No. The directives in the robots.txt file (with exception of
are only valid for relative paths.
Can I place the robots.txt file in a subdirectory?
No. The file must be placed in the topmost directory of the website.
I want to block a private folder. Can I prevent other people from reading my robots.txt file?
No. The robots.txt file may be read by various users. If folders or filenames of content aren't meant for the public, don't list them in the robots.txt file. It is not recommended to serve different robots.txt files based on the user agent or other attributes.
Do I have to include an
directive to allow crawling?
No, you do not need to include an
allow directive. All URLs are
implicitly allowed and the
allow directive is used to override
disallow directives in the same robots.txt file.
What happens if I have a mistake in my robots.txt file or use an unsupported directive?
Web crawlers are generally very flexible and typically will not be swayed by minor mistakes in the robots.txt file. In general, the worst that can happen is that incorrect / unsupported directives will be ignored. Bear in mind though that Google can't read minds when interpreting a robots.txt file; we have to interpret the robots.txt file we fetched. That said, if you are aware of problems in your robots.txt file, they're usually easy to fix.
What program should I use to create a robots.txt file?
If I block Google from crawling a page using a robots.txt
disallow directive, will it disappear from search results?
Blocking Google from crawling a page is likely to remove the page from Google's index.
disallow does not guarantee that a page
will not appear in results: Google may still decide, based on external information
such as incoming links, that it is relevant and show the URL in the results. If you
wish to explicitly block a page from being indexed, use the
noindex robots meta tag or
X-Robots-Tag HTTP header. In this
case, don't disallow the page in robots.txt, because the page must be crawled
in order for the tag to be seen and obeyed. Learn how to
control what you share with Google
How long will it take for changes in my robots.txt file to affect my search results?
First, the cache of the robots.txt file must be refreshed (we generally cache the contents for up to one day). You can speed up this process by submitting your updated robots.txt to Google. Even after finding the change, crawling and indexing is a complicated process that can sometimes take quite some time for individual URLs, so it's impossible to give an exact timeline. Also, keep in mind that even if your robots.txt file is disallowing access to a URL, that URL may remain visible in search results despite that fact that we can't crawl it. If you wish to expedite removal of the pages you've blocked from Google, submit a removal request.
How can I temporarily suspend all crawling of my website?
You can temporarily suspend all crawling by returning a
503 (service unavailable) HTTP status code for all URLs, including the
robots.txt file. The robots.txt file will be retried periodically until it can be
accessed again. We do not recommend changing your robots.txt file to disallow crawling.
My server is not case-sensitive. How can I disallow crawling of some folders completely?
Directives in the robots.txt file are case-sensitive. In this case, it is recommended
to make sure that only one version of the URL is indexed using
Doing this allows you to have fewer lines in your robots.txt file, so it's easier for
you to manage it. If this isn't possible, we recommended that you list the common
combinations of the folder name, or to shorten it as much as possible, using only the
first few characters instead of the full name. For instance, instead of listing all
upper and lower-case permutations of
could list the permutations of "/MyP" (if you are certain that no other, crawlable
URLs exist with those first characters). Alternately, it may make sense to use a
robots meta tag or
X-Robots-Tag HTTP header instead, if
crawling is not an issue.
403 Forbidden for all URLs, including the robots.txt
file. Why is the site still being crawled?
403 Forbidden HTTP status code, as well as other
HTTP status codes, is interpreted as the robots.txt file doesn't exist. This means
that crawlers will generally assume that they can crawl all URLs of the website. In
order to block crawling of the website, the robots.txt must be returned with a
200 OK HTTP status code, and must contain an appropriate
Robots meta tag questions
Is the robots meta tag a replacement for the robots.txt file?
No. The robots.txt file controls which pages are accessed. The robots meta tag controls whether a page is indexed, but to see this tag the page needs to be crawled. If crawling a page is problematic (for example, if the page causes a high load on the server), use the robots.txt file. If it is only a matter of whether or not a page is shown in search results, you can use the robots meta tag.
Can the robots meta tag be used to block a part of a page from being indexed?
No, the robots meta tag is a page-level setting.
Can I use the robots meta tag outside of a
No, the robots meta tag needs to be in the
section of a page.
Does the robots meta tag disallow crawling?
No. Even if the robots meta tag currently says
we'll need to recrawl that URL occasionally to check if the meta tag has changed.
How does the
nofollow robots meta tag compare to the
rel="nofollow" link attribute?
X-Robots-Tag HTTP header questions
How can I check the
X-Robots-Tag for a URL?
A simple way to view the server headers is to use the URL Inspection Tool feature in Google Search Console. To check the response headers of any URL, try searching for "server header checker".