Google Transport price accuracy crawlers
This document describes the traffic generated by Google Transport price
accuracy crawlers.
Note on the number of queries
For example, if we agreed to send 5,000 queries per day, then 5,000 times per
day (evenly distributed across the day, or approximately once every 17
seconds) our crawler performs all of the following actions that a regular user
would perform:
start from Google Search, and click the partner link
select the intended travel itinerary (if not already selected)
click 'continue' until it reaches the page where the user would have to enter
personal / payment details
read final price details from the page
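The "one every 17 seconds" figure follows from dividing the length of a day by the agreed number of queries. A minimal sketch of that arithmetic is shown below; the function name `seconds_between_queries` is illustrative only and is not part of any Google API.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_queries(queries_per_day: int) -> float:
    """Average gap between queries when they are spread evenly over a day."""
    if queries_per_day <= 0:
        raise ValueError("queries_per_day must be positive")
    return SECONDS_PER_DAY / queries_per_day


interval = seconds_between_queries(5000)
print(f"{interval:.2f} seconds between queries")  # 17.28 seconds between queries
```

Note that each query is a full sequence of page loads, so the number of HTTP requests reaching the partner website will typically be higher than the number of queries.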
The crawler filters fetched resources
The crawler fetches only the resources required to obtain the information we
are interested in: price and availability details. In practice, this means
that it usually fetches resources only from the partner website (i.e. we only
authorize URLs from the same domain). Additionally, we avoid fetching any
resources that are not required to read the correct price data, such as
images.
As a consequence, the crawler doesn't load or execute scripts from third
parties (Google Analytics, Facebook, Criteo...), so crawler traffic should
not appear in those analytics.
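To make the filtering rule concrete, here is a rough sketch of a same-domain, no-images check. It is not the crawler's actual implementation: the function name `would_fetch`, the list of image extensions, and the choice to treat subdomains as part of the same domain are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical set of file extensions treated as images in this sketch.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg", ".ico")


def would_fetch(resource_url: str, partner_host: str) -> bool:
    """Return True if a resource passes the illustrative filter.

    Assumption: a subdomain of the partner host counts as the same domain.
    """
    parsed = urlparse(resource_url)
    host = (parsed.hostname or "").lower()
    partner_host = partner_host.lower()
    same_domain = host == partner_host or host.endswith("." + partner_host)
    is_image = parsed.path.lower().endswith(IMAGE_EXTENSIONS)
    return same_domain and not is_image


# Example with placeholder URLs:
for url in (
    "https://www.example.com/booking/price.js",
    "https://www.example.com/img/logo.png",
    "https://www.google-analytics.com/analytics.js",
):
    print(url, would_fetch(url, "example.com"))
```

Under these assumptions only the first URL passes: the second is an image, and the third is hosted on a third-party domain.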
Caching
To reduce load on the partner website, our crawlers are generally configured
to respect all standard HTTP caching headers present in the response. This
means that, for correctly configured websites, we avoid repeatedly fetching
content that rarely changes (e.g. JavaScript libraries).
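As an illustration of what respecting caching headers can mean, the sketch below checks whether a cached response is still fresh based on the `max-age` directive of a `Cache-Control` header. This is a deliberately simplified model and not the crawler's actual logic: real HTTP caching also involves `Expires`, validators such as `ETag`, and other directives. The function names are hypothetical, and `no-cache` is treated here simply as "do not reuse without contacting the server".

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if the response
    cannot be reused from cache under this simplified model."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return None
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """True if a cached copy of the given age may be reused without refetching."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


# A JavaScript library served with a one-day lifetime, cached an hour ago:
print(is_fresh("public, max-age=86400", 3600))  # True
```

A partner website that serves rarely changing assets with a long `max-age` would, under this model, see those assets fetched once and then reused until the lifetime expires.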
Troubleshooting
The quality checks run by our crawler network depend on having access to the
partner website. Information on how to provide that access can be found in
this help center article: https://support.google.com/itasoftware/answer/11009417
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-07-03 UTC.