[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["缺少我需要的資訊","missingTheInformationINeed","thumb-down"],["過於複雜/步驟過多","tooComplicatedTooManySteps","thumb-down"],["過時","outOfDate","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["示例/程式碼問題","samplesCodeIssue","thumb-down"],["其他","otherDown","thumb-down"]],[],[[["\u003cp\u003eSome websites and CDNs are incorrectly using \u003ccode\u003e4xx\u003c/code\u003e client errors (except \u003ccode\u003e429\u003c/code\u003e) to limit Googlebot's crawl rate, which is detrimental.\u003c/p\u003e\n"],["\u003cp\u003eUsing \u003ccode\u003e4xx\u003c/code\u003e errors for rate limiting can lead to content removal from Google Search and unintended exposure of disallowed content.\u003c/p\u003e\n"],["\u003cp\u003eGoogle provides clear documentation and tools to manage Googlebot's crawl rate effectively through Search Console or by returning appropriate HTTP status codes like \u003ccode\u003e500\u003c/code\u003e, \u003ccode\u003e503\u003c/code\u003e, or \u003ccode\u003e429\u003c/code\u003e.\u003c/p\u003e\n"],["\u003cp\u003eThe correct way to manage crawl rate involves understanding HTTP status codes and using Google's recommended methods to avoid negative impacts on search visibility.\u003c/p\u003e\n"],["\u003cp\u003eFor further assistance or clarification, website owners can reach out through Google's support channels such as Twitter or the help forums.\u003c/p\u003e\n"]]],["Website owners should avoid using `4xx` client errors (except `429`) to manage Googlebot's crawl rate. These errors indicate client-side issues, not server overload. Using `4xx` codes (excluding `429`) can lead to content removal from Google Search, and if applied to `robots.txt`, it will be ignored. Instead, employ Search Console for rate adjustments or utilize `500`, `503`, or `429` status codes to signal server overload and manage crawl rates effectively.\n"],null,["# Don't use 403s or 404s for rate limiting\n\nFriday, February 17, 2023\n\n\nOver the last few months we noticed an uptick in website owners and some content delivery networks\n(CDNs) attempting to use `404` and other `4xx` client errors (but not\n`429`) to attempt to reduce Googlebot's crawl rate.\n\n\nThe short version of this blog post is: please don't do that; we have documentation about\n[how to reduce Googlebot's crawl rate](/search/docs/crawling-indexing/reduce-crawl-rate).\nRead that instead and learn how to effectively manage Googlebot's crawl rate.\n\nBack to basics: `4xx` errors are for client errors\n--------------------------------------------------\n\n\nThe `4xx` errors servers return to clients are a signal from the server that the\nclient's request was wrong in some sense. Most of the errors in this category are pretty benign:\n\"not found\" errors, \"forbidden\", \"I'm a teapot\" (yes, that's a thing). They don't suggest anything\nwrong going on with the server itself.\n\n\nThe one exception is `429`, which stands for \"too many requests\". This error is a clear\nsignal to any well-behaved robot, including our beloved Googlebot, that it needs to slow down\nbecause it's overloading the server.\n\nWhy `4xx` errors are bad for rate limiting Googlebot (except `429`)\n-------------------------------------------------------------------\n\n\nClient errors are just that: client errors. They generally don't suggest an error with the server:\nnot that it's overloaded, not that it's encountered a critical error and is unable to respond\nto the request. They simply mean that the client's request was bad in some way. There's no\nsensible way to equate for example a `404` error to the server being overloaded.\nImagine if that was the case: you get an influx of `404` errors from your friend accidentally\nlinking to the wrong pages on your site, and in turn Googlebot slows down with crawling. That\nwould be pretty bad. Same goes for `403`, `410`, `418`.\n\n\nAnd again, the big exception is the `429` status code, which translates to \"too many\nrequests\".\n\nWhat rate limiting with `4xx` does to Googlebot\n-----------------------------------------------\n\n\nAll `4xx` HTTP status codes (again, except `429`) will cause your content\nto be removed from Google Search. What's worse, if you also serve your robots.txt file with a\n`4xx` HTTP status code, it will be treated as if it didn't exist. If you had a rule\nthere that disallowed crawling your dirty laundry, now Googlebot also knows about it; not great\nfor either party involved.\n\nHow to reduce Googlebot's crawl rate, the right way\n---------------------------------------------------\n\n\nWe have extensive documentation about\n[how to reduce Googlebot's crawl rate](/search/docs/crawling-indexing/reduce-crawl-rate)\nand also about\n[how Googlebot (and Search indexing) handles the different HTTP status codes](/search/docs/crawling-indexing/http-network-errors);\nbe sure to check them out. In short, you want to do either of these things:\n\n- [Use Search Console to temporarily reduce crawl rate](/search/docs/crawling-indexing/reduce-crawl-rate#reduce-with-search-console).\n- Return a `500`, `503`, or `429` HTTP status code to Googlebot when it's crawling too fast.\n\n\nIf you need more tips or clarifications, catch us on\n[Twitter](https://twitter.com/googlesearchc) or post in\n[our help forums](https://support.google.com/webmasters/community).\n\n\nPosted by [Gary Illyes](https://garyillyes.com/+)"]]