This document describes the traffic from Google Transport price accuracy crawlers.
Note on the number of queries
For example, if we have agreed to send 5000 queries per day, then 5000 times per day (evenly distributed across the day, which is approximately once every 17 seconds) our crawler performs all of the following actions a regular user would perform:
start from Google Search, and click the partner link
select the intended travel itinerary (if not already selected)
click 'continue' until it reaches the page where the user would have to enter personal / payment details
read final price details from the page
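The "one every 17 seconds" figure follows directly from spreading the agreed daily volume evenly over 24 hours. The short sketch below shows that arithmetic; the function name is illustrative and is not part of any Google tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds


def seconds_between_queries(queries_per_day: int) -> float:
    """Average gap between crawls when queries are spread evenly over a day."""
    if queries_per_day <= 0:
        raise ValueError("queries_per_day must be positive")
    return SECONDS_PER_DAY / queries_per_day


print(seconds_between_queries(5000))  # 17.28
```

Each such query is a full user-like session (the steps listed above), so the number of individual HTTP requests reaching the partner website is higher than the query count.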
The crawler filters fetched resources
The crawler only fetches the resources that are required to get the information we are interested in: price and availability details. In practice, this means that it usually fetches resources only from the partner website (i.e. we only authorize URLs from the same domain). Additionally, we avoid fetching any resources that are not required to read the correct price data, such as images.
It also means that the crawler doesn't load or execute scripts from third parties (Google Analytics, Facebook, Criteo...), so the crawler traffic should be excluded from those analytics.
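The filtering rule described above could be sketched roughly as follows. This is an illustration only, not the crawler's actual implementation: the function name, the extension list, and the decision to treat subdomains of the partner host as "same domain" are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical list of file extensions treated as not needed for price data.
SKIPPED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg", ".ico")


def should_fetch(resource_url: str, partner_host: str) -> bool:
    """Return True if a resource is on the partner's domain and is not an image."""
    parsed = urlparse(resource_url)
    host = (parsed.hostname or "").lower()
    partner_host = partner_host.lower()
    # Assumption: subdomains of the partner host count as the same domain.
    same_site = host == partner_host or host.endswith("." + partner_host)
    is_image = parsed.path.lower().endswith(SKIPPED_EXTENSIONS)
    return same_site and not is_image
```

With this sketch, a script served from `www.example.com` would be fetched for a partner at `example.com`, while a third-party analytics script or a logo image on the partner's own domain would be skipped.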
Caching
To reduce load on the partner website, our crawlers are generally configured to respect all standard HTTP caching headers present in the response. For correctly configured websites, this means we avoid repeatedly fetching content that rarely changes (e.g. JavaScript libraries).
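To illustrate how a caching header lets a client skip a refetch, here is a deliberately simplified freshness check based on the `Cache-Control` response header. It is a sketch under stated assumptions, not a description of how the crawler is implemented: it only looks at `max-age`, `no-store` and `no-cache`, whereas full HTTP caching also involves headers such as `Expires` and `ETag`.

```python
import re


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """Return True if a cached response may be reused without refetching."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # These directives forbid reuse without contacting the server again.
    if "no-store" in directives or "no-cache" in directives:
        return False
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return age_seconds < int(match.group(1))
    # Assumption for this sketch: no explicit lifetime means "refetch".
    return False
```

For example, a JavaScript library served with `Cache-Control: public, max-age=86400` and cached one hour ago would be reused, while a response marked `no-store` would be fetched again each time.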
Troubleshooting
The quality checks run by our crawler network can only operate correctly if the crawlers have access to the partner website. Information on how to ensure this access can be found in this help center article.