Google App Engine

Dealing with DeadlineExceededErrors

João Martins
June 2012

To prevent resource overuse and to provide better isolation between apps running in the same cluster, Google App Engine enforces limits on resource usage. The DeadlineExceededError in Python (or DeadlineExceededException in Java) is part of this resource control mechanism: it is a general indication that a request has run longer than the platform allows, often because of overly aggressive performance settings.

Note: For the sake of brevity, both DeadlineExceededError and DeadlineExceededException are addressed as DeadlineExceededError in the remainder of this document.

However, the exact cause of a DeadlineExceededError is not always obvious. This article explains the possible causes of the error and how to avoid them.

Which DeadlineExceededError? (Python)

Currently, there are several errors named DeadlineExceededError for the Python runtime:

  • google.appengine.runtime.DeadlineExceededError: raised if the overall request times out, typically after 60 seconds, or 10 minutes for task queue requests.
  • google.appengine.runtime.apiproxy_errors.DeadlineExceededError: raised if an RPC exceeded its deadline. This is typically 5 seconds, but it is settable for some APIs using the 'deadline' option.
  • google.appengine.api.urlfetch_errors.DeadlineExceededError: raised if the URLFetch times out.

The focus of this article is solely on the google.appengine.runtime.DeadlineExceededError. There are strategies to deal with the other kinds of errors, but this article does not cover them.

The Request Timer

The Google App Engine request timer (Java/Python/Go) ensures that requests have a finite lifespan and do not get caught in an infinite loop. Currently, the deadline for requests to frontend instances is 60 seconds. (Backend instances have no corresponding limit.) Every request, including warmup (request to /_ah/warmup) and loading requests ("loading_request=1" log header), is subject to this restriction.

If a request does not return within 60 seconds, a DeadlineExceededError is thrown. If it is not caught, the request is aborted and a 500 internal server error is returned. If it is caught but a response is not produced quickly enough (you have less than a second left), the request is likewise aborted and a 500 internal server error is returned.
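The catch-and-respond pattern can be sketched as follows. This is a self-contained simulation: SimulatedDeadlineExceededError stands in for google.appengine.runtime.DeadlineExceededError, and the handler function is hypothetical, not a real App Engine entry point.

```python
# Stand-in for google.appengine.runtime.DeadlineExceededError,
# so this sketch runs outside App Engine.
class SimulatedDeadlineExceededError(Exception):
    pass

def slow_operation():
    # Pretend this computation overran the 60-second request deadline.
    raise SimulatedDeadlineExceededError()

def handle_request():
    try:
        return 200, slow_operation()
    except SimulatedDeadlineExceededError:
        # Less than a second remains: return a minimal response
        # immediately instead of letting the runtime abort with a 500.
        return 503, "Request timed out; please retry."

status, body = handle_request()
print(status, body)  # prints: 503 Request timed out; please retry.
```

The point is that the except block must do almost nothing: any non-trivial work here risks the hard abort described above.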

In the Java runtime, if the DeadlineExceededException is not caught, an uncatchable HardDeadlineExceededError is thrown. The instance is terminated in both cases, but the HardDeadlineExceededError gives no time margin to return a custom response. To make sure your request returns within the allowed time frame, you can use the ApiProxy.getCurrentEnvironment().getRemainingMillis() method to checkpoint your code and return if no time is left. The Java runtime page contains an explanation of how to use this method. If concurrent requests are enabled through the "threadsafe" flag, every other running concurrent request is killed with error code 104:

A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may be throwing exceptions during the initialization of your application. (Error code 104)
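In Java the checkpoint call is ApiProxy.getCurrentEnvironment().getRemainingMillis(); the same idea can be sketched in Python, with time.monotonic() standing in for the runtime's remaining-time API. The 60-second budget, safety margin, and per-item work below are all illustrative assumptions.

```python
import time

REQUEST_DEADLINE_SECONDS = 60.0  # frontend request deadline
SAFETY_MARGIN_SECONDS = 1.0      # leave time to build a response

def process_items(items, deadline=REQUEST_DEADLINE_SECONDS):
    """Process as many items as possible, checking the clock between
    items so the request can return before the deadline."""
    start = time.monotonic()
    done = []
    for item in items:
        remaining = deadline - (time.monotonic() - start)
        if remaining < SAFETY_MARGIN_SECONDS:
            break  # out of time: return partial progress
        done.append(item * 2)  # stand-in for real per-item work
    return done

print(process_items([1, 2, 3]))  # prints: [2, 4, 6]
```

Checkpointing between units of work is what makes a graceful early return possible; a single monolithic computation has no safe place to bail out.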

Root causes and how to avoid errors

The following sections list root causes of the error along with suggestions on how to avoid them.

Slow-Loading Apps Can Cause Deadline Errors

Slow-loading apps that take close to 60 seconds to start in normal operation are likely to suffer from DeadlineExceededErrors. Any latency disturbance, such as high application load or scheduled maintenance procedures, can push them over the 60-second deadline. We strive to minimize these external factors and have a dedicated team of engineers monitoring and preventing these issues, but sometimes they are unavoidable. Extensive frameworks can slow down an instance's code loading because of the high volume of libraries and dependencies they require.

Reducing the number of unused third-party libraries loaded by instances is an effective way to shorten instance loading times and reduce the number of DeadlineExceededErrors. Another option is to lazy-load framework components whenever possible. Some frameworks load a high volume of classes at instance loading time that then sit unused in memory for the instance's lifetime. For example, the Java Persistence API loads all the classes from the main jar file at boot time. Since Google App Engine is tuned for shorter requests and aborts requests that take more than 60 seconds to complete, you benefit from spreading the cost of loading classes across the instance's lifetime instead of paying it all at boot. One way to do this with the Java Persistence API is to set the exclude-unlisted-classes parameter to true and explicitly list the classes the app will use in the framework's configuration file.

<?xml version="1.0" encoding="UTF-8" ?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
        http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd" version="1.0">

    <persistence-unit name="transactions-optional">
        <provider>org.datanucleus.store.appengine.jpa.DatastorePersistenceProvider</provider>
        <!-- Example class; list your app's own persistent classes here. -->
        <class>com.example.Customer</class>
        <!-- Load only the classes listed above instead of scanning the jar. -->
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
        <properties>
            <property name="datanucleus.NontransactionalRead" value="true"/>
            <property name="datanucleus.NontransactionalWrite" value="true"/>
            <property name="datanucleus.ConnectionURL" value="appengine"/>
            <property name="datanucleus.singletonEMFForName" value="true"/>
        </properties>
    </persistence-unit>
</persistence>
Spring also supports class auto-discovery to automatically import beans. Auto-discovery is enabled if the Spring configuration file contains the following expression:

<context:component-scan base-package="org.example"/>

A high volume of classes in the designated packages can delay an app's loading time. If Spring is used and instances are aborted at loading time due to DeadlineExceededErrors, consider disabling class auto-discovery and specifying the used classes manually. To do this, remove the context:component-scan element from the configuration file and declare the beans the app will use with the bean element instead.
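Replacing auto-discovery with explicit declarations can look like the following sketch; the bean ids and class names are illustrative, not from a real app.

```xml
<!-- Instead of: <context:component-scan base-package="org.example"/> -->
<!-- declare only the beans the app actually uses: -->
<bean id="orderService" class="org.example.service.OrderService"/>
<bean id="orderRepository" class="org.example.data.OrderRepository"/>
```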

The article Optimizing Spring Framework for App Engine Applications deals in more depth with how to optimize a Spring App Engine app.

Performance Settings Can Cause Deadline Errors

Aggressive idle instance settings that cause a shortage of resources for overflowing traffic can also result in DeadlineExceededErrors. For example, suppose you have an app with minimum idle instances set to 0 and maximum idle instances set to 1, where your app is handling requests at a rate of 100 requests per second. This configuration might work well with its currently allocated dynamic resources and be very cost-efficient, but if the traffic increases even a little bit, the app will require more instances to handle the extra traffic. Depending on the sharpness of the traffic rise, more instances might need to be spun up on the fly, during which time requests wait in the pending queue. As a consequence, pending time plus computation time can add up to more than 60 seconds, which is likely to translate into a cascade of DeadlineExceededErrors.
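The arithmetic behind that cascade can be made concrete. The numbers below are illustrative assumptions, not measured values: a request that waits in the pending queue while a new instance loads can blow through the deadline even though its own handler is fast.

```python
# Illustrative latency budget for a request that arrives during a
# traffic spike, when no idle instance is available.
INSTANCE_LOAD_SECONDS = 45.0   # a slow-loading app starting a new instance
PENDING_QUEUE_SECONDS = 12.0   # time the request waits for that instance
HANDLER_SECONDS = 5.0          # the request's own computation
REQUEST_DEADLINE = 60.0

total = INSTANCE_LOAD_SECONDS + PENDING_QUEUE_SECONDS + HANDLER_SECONDS
print(total, total > REQUEST_DEADLINE)  # prints: 62.0 True
```

Only seconds of the total belong to the handler itself; the rest is loading and queueing, which is why idle instances and faster startup are the levers that matter.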

If an app is starved for computing resources, it can generate a high number of DeadlineExceededErrors. The suggested action is to increase the number of idle instances, especially resident instances, since these are loaded once and remain alive for long periods. If no custom idle instance settings are effective against DeadlineExceededErrors, consider setting these to 'Automatic' to let the Google App Engine scheduler handle idle instance management. Bear in mind that setting these parameters to higher values might increase your instance hour consumption. The Google App Engine scheduler manages idle instances using algorithms that take into account an app's traffic pattern and prioritize factors such as application availability, reliability, and stability over instance costs.

When opting for manual performance settings, it is essential to consider an app's incoming request rate, the time it takes an instance to load, and the time it takes to respond to a request. These factors determine whether the number of idle instances available at any instant is sufficient to handle excess traffic and ensure that there is no sudden demand for new instances to be loaded.

Warmup Requests

Turning on warmup requests is strongly advised to avoid the latency induced by instance loading. Based on an app's usage patterns, the Google App Engine scheduler predicts when new instances are needed and generates them in advance. However, in certain cases it cannot predict when more instances are required and, therefore, warmup requests do not eliminate loading requests.

Delays Associated with Logging

When an app is struggling with DeadlineExceededErrors from normal requests (that is, non-loading requests), decreasing the amount of logging will decrease request latency and can help prevent requests from reaching the 60 second deadline.
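A hedged sketch of the idea, using Python's standard logging module (the logger name is illustrative): raising the log level discards low-value records before they incur any formatting or emission cost.

```python
import logging

logger = logging.getLogger("myapp")  # illustrative logger name
logger.addHandler(logging.NullHandler())

# During normal operation, suppress DEBUG/INFO records entirely so
# their formatting and emission no longer add to request latency.
logger.setLevel(logging.WARNING)

logger.debug("per-entity detail")       # dropped: below WARNING
logger.info("request summary")          # dropped: below WARNING
logger.warning("unexpected condition")  # still emitted

print(logger.isEnabledFor(logging.INFO))  # prints: False
```

For hot code paths, also guard expensive message construction with logger.isEnabledFor() so the arguments are never even built.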

Delays Associated with UrlFetch

Making requests to external URLs using URLFetch can also produce DeadlineExceededErrors if the target website is having performance issues or normally takes more than 60 seconds to reply. The logged stack trace of the DeadlineExceededErrors should contain calls to the URLFetch libraries in these cases.
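A defensive pattern is to cap the per-call URLFetch deadline well below the 60-second request deadline and fall back when the target site is too slow. The sketch below is a simulation: the fetch function and SimulatedDownstreamTimeout are stand-ins for urlfetch.fetch and the URLFetch timeout error, so it runs anywhere.

```python
class SimulatedDownstreamTimeout(Exception):
    """Stand-in for the URLFetch DeadlineExceededError."""

def fetch(url, deadline=5):
    # Stand-in for an HTTP fetch with a per-call deadline: pretend the
    # target site is too slow for any deadline under 30 seconds.
    if deadline < 30:
        raise SimulatedDownstreamTimeout(url)
    return "200 OK"

def fetch_with_fallback(url):
    try:
        # Keep the per-call deadline well under the 60-second request
        # deadline so a slow site cannot consume the whole request.
        return fetch(url, deadline=10)
    except SimulatedDownstreamTimeout:
        return "fallback response"

print(fetch_with_fallback("http://example.com/slow"))  # prints: fallback response
```

This turns a slow downstream site into a fast, handleable error instead of a DeadlineExceededError in your own request.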

Using Asynchronous API Methods to Reduce Deadline Errors

Some of our APIs provide asynchronous versions of their most popular methods (e.g., fetch_data_async is the asynchronous version of fetch_data for fetching a blob in the Blobstore API). Synchronous I/O RPCs block while waiting for a response. The asynchronous versions let other work that does not depend on the result run while the RPC waits in the background, reducing the overall response time.
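The benefit of the asynchronous variants is overlap: the RPC waits in the background while independent work proceeds. A sketch of the principle with Python's standard concurrent.futures (the simulated RPC and its timing are illustrative, not App Engine APIs):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_rpc():
    # Stand-in for an App Engine RPC such as fetch_data_async: the
    # caller should not sit idle while this waits on the network.
    time.sleep(0.2)
    return "blob bytes"

def independent_work():
    return "rendered header"

start = time.monotonic()
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(simulated_rpc)  # RPC runs in the background
    header = independent_work()          # meanwhile, do unrelated work
    blob = future.result()               # block only when the result is needed
elapsed = time.monotonic() - start

print(header, blob)  # prints: rendered header blob bytes
# Total time is about one RPC wait, not RPC wait + independent work.
```

The same overlap is what the App Engine async methods provide without needing explicit threads.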

Revising a Data Model to Reduce Deadline Errors

Datastore contention occurs when many simultaneous updates target the same entity group. If your datastore writes collide frequently, you might be spending more time than necessary writing to the datastore. Consider reducing entity groups to the smallest size possible; this prevents operations from waiting on other operations that touch the same large entity group. Avoiding datastore contention contains some tips on how to do this.
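A common application of this advice is the sharded counter: instead of funneling every increment into a single entity (one entity group), the count is split across N shards so concurrent writers rarely collide. A minimal in-memory sketch of the idea follows; a plain dict stands in for the datastore, and the shard count is illustrative.

```python
import random

NUM_SHARDS = 20  # illustrative; more shards mean fewer write collisions

# Each shard would be its own datastore entity in its own entity group;
# here a dict stands in for the datastore.
shards = {i: 0 for i in range(NUM_SHARDS)}

def increment():
    # Writers pick a random shard, so simultaneous updates usually
    # touch different entity groups instead of contending on one.
    shards[random.randrange(NUM_SHARDS)] += 1

def total():
    # Reads aggregate across all shards.
    return sum(shards.values())

for _ in range(100):
    increment()
print(total())  # prints: 100
```

The trade-off is that reads must aggregate over all shards, which is cheap compared to serialized, contended writes.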
