# All about robots

Tuesday, March 06, 2007

| It's been a while since we published this blog post. Some of the information may be outdated (for example, some images may be missing and some links may not work anymore). Read our [up-to-date documentation about robots.txt](/search/docs/crawling-indexing/robots/intro).
Search engine robots, including our very own Googlebot, are incredibly polite. They work
hard to respect your every wish regarding what pages they should and should not crawl. How can
they tell the difference? You have to tell them, and you have to speak their language, which is
an industry standard called the
[Robots Exclusion Protocol](https://www.rfc-editor.org/rfc/rfc9309.html).
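
In practice, that language takes the form of a `robots.txt` file at the root of your site. Here's a minimal sketch (the directory and file names are hypothetical): `User-agent` and `Disallow` are the protocol's core directives, and Googlebot additionally understands `Allow`:

```
# Hypothetical example: let all robots crawl everything
# except one private directory
User-agent: *
Disallow: /private/

# Googlebot also supports Allow, so you can carve out an
# exception inside an otherwise blocked directory
User-agent: Googlebot
Disallow: /private/
Allow: /private/public-page.html
```

Well-behaved robots fetch this file (for example, from `http://www.example.com/robots.txt`) before crawling anything else on the site.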
Dan Crow has written about this on the Google Blog recently, including an introduction to setting
up your own rules for robots and a description of some of the more advanced options. His first
two posts in the series are:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],[],[[["\u003cp\u003eSearch engines, like Googlebot, utilize the Robots Exclusion Protocol to understand which parts of your website they should and should not crawl.\u003c/p\u003e\n"],["\u003cp\u003eYou can control search engine access to your website by creating a robots.txt file that uses this protocol.\u003c/p\u003e\n"],["\u003cp\u003eGoogle provides resources and documentation to help you understand and implement the Robots Exclusion Protocol, including blog posts and help center articles.\u003c/p\u003e\n"],["\u003cp\u003eDan Crow's blog posts offer insights into setting up robots.txt rules and using advanced options for controlling search engine behavior.\u003c/p\u003e\n"],["\u003cp\u003eGoogle has previously published articles on topics like debugging blocked URLs, Googlebot's functionality, and using robots.txt files.\u003c/p\u003e\n"]]],["Search engine robots respect website owners' wishes regarding crawling. The Robots Exclusion Protocol is the industry standard language used to communicate these preferences. Resources are provided for setting up rules, including blog posts by Dan Crow on controlling search engine access and the protocol itself. Additional help center content and past posts about debugging blocked URLs, Googlebot, and robots.txt files are also linked. The provided resources are not updated and a link to up-to-date information is also provided.\n"],null,["# All about robots\n\nTuesday, March 06, 2007\n| It's been a while since we published this blog post. Some of the information may be outdated (for example, some images may be missing, and some links may not work anymore). Read our [up-to-date documentation about robots.txt/a\\\u003e.](/search/docs/crawling-indexing/robots/intro)\n[](/search/docs/crawling-indexing/robots/intro)\n\n[Search engine robots, including our very own Googlebot, are incredibly polite. They work\nhard to respect your every wish regarding what pages they should and should not crawl. How can\nthey tell the difference? You have to tell them, and you have to speak their language, which is\nan industry standard called the](/search/docs/crawling-indexing/robots/intro)[Robots Exclusion Protocol](https://www.rfc-editor.org/rfc/rfc9309.html).\n\n\nDan Crow has written about this on the Google Blog recently, including an introduction to setting\nup your own rules for robots and a description of some of the more advanced options. 
His first\ntwo posts in the series are:\n\n- [Controlling how search engines access and index your website](https://googleblog.blogspot.com/2007/01/controlling-how-search-engines-access.html)\n- [The Robots Exclusion Protocol](https://googleblog.blogspot.com/2007/02/robots-exclusion-protocol.html)\n\nStay tuned for the next installment.\n\n\nStay tuned for the next installment.\n\n\nWhile we're on the topic, I'd also like to point you to the\n[robots section of our help center](/search/docs/crawling-indexing/robots/intro)\nand our earlier posts on this topic:\n\n- [Debugging Blocked URLs](/search/blog/2006/09/debugging-blocked-urls_19)\n- [All About Googlebot](/search/blog/2006/08/all-about-googlebot)\n- [Using a robots.txt File](/search/blog/2006/02/using-robotstxt-file)"]]