Google App Engine

Using Django 1.0 on App Engine with Zipimport

Dan Sanderson
September 2008

July 2009: This article discusses bundling large Python libraries using the zipimport module, using the Django 1.0 web application framework as an example. As of release 1.2.3 of the Python runtime environment, Django 1.0 is included in the runtime environment, and no longer needs to be bundled with your app. Using the version of Django included with the runtime environment provides faster start-up times for your application, and is the recommended way to use Django 1.0.

The maximum file size is 10 megabytes, and the maximum file count (including application files and static files) is 10,000, with a limit of 1,000 files in a single directory.

Introduction

Using a Python web application framework with your App Engine application is usually as simple as including the files for the framework with your application's code. However, there is a limit to the number of files that can be uploaded for an application, and the standard distributions for some frameworks exceed this limit or leave little room for application code. You can work around the file limit using Python's "zipimport" feature, which is supported by App Engine as of the 1.1.3 release (September 2008).

This article describes how to use Django 1.0 with Google App Engine using the "zipimport" feature. You can use similar techniques with other frameworks, libraries or large applications.

Introducing zipimport

When your application imports a module, Python looks for the module's code in one of several directories. You can access and change the list of directories Python checks from Python code using sys.path. In App Engine, your handler is called with a path that includes the App Engine API and your application root directory.

If any of the items in sys.path refers to a ZIP-format archive, Python will treat the archive as a directory. The archive contains the .py source files for one or more modules. This feature is supported by a module in the standard library called zipimport, though this module is part of the default import process and you do not need to import this module directly to use it. For more information about zipimport, see the zipimport documentation.
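
As a minimal sketch of this behavior outside App Engine, the following script builds a small archive with the standard zipfile module and imports a module from it. The archive name demo_modules.zip and the module greeting are made up for illustration; nothing here is specific to App Engine or Django.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one hypothetical module, greeting.py.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, 'demo_modules.zip')
with zipfile.ZipFile(archive_path, 'w') as archive:
    archive.writestr(
        'greeting.py',
        'def hello():\n    return "hello from the archive"\n')

# With the archive on sys.path, the ordinary import statement finds the
# module. zipimport itself is never imported explicitly.
sys.path.insert(0, archive_path)
import greeting

print(greeting.hello())
print(greeting.__file__)  # a path that points inside demo_modules.zip
```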

To use module archives with your App Engine application:

  1. Create a ZIP-format archive of the modules you want to bundle.
  2. Put the archive in your application directory.
  3. If necessary, in your handler scripts, add the archive file to sys.path.

For example, if you have a ZIP archive named django.zip with the following files in it:

django/forms/__init__.py
django/forms/fields.py
django/forms/forms.py
django/forms/formsets.py
django/forms/models.py
...

A handler script can import a module from the archive as follows:

import sys
sys.path.insert(0, 'django.zip')

import django.forms.fields

This example illustrates zipimport, but is not sufficient for loading Django 1.0 in App Engine. A more complete example follows.

zipimport and App Engine

App Engine uses a custom version of the zipimport feature instead of the standard implementation. It generally works the usual way: add the ZIP archive to sys.path, then import as usual.

Because it is a custom implementation, several features do not work with App Engine. For instance, App Engine can load .py files from the archive, but it can't load .pyc files like the standard version can. The SDK uses the standard version, so if you'd like to use features of zipimport beyond those discussed here, be sure to test them on App Engine.
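
Because the SDK will happily load compiled files that App Engine ignores, it may help to check an archive before uploading it. The helper below is a sketch, and find_compiled_only_modules is a hypothetical name, not part of the SDK. It lists compiled entries that have no matching .py source in the same archive.

```python
import zipfile

def find_compiled_only_modules(archive_path):
    """Return .pyc/.pyo entries that have no matching .py source in the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        names = set(archive.namelist())
    # 'pkg/mod.pyc'[:-1] is 'pkg/mod.py', the source entry we expect to find.
    return sorted(
        name for name in names
        if name.endswith(('.pyc', '.pyo')) and name[:-1] not in names
    )
```

An empty result suggests that every compiled module in the archive also ships its source, which is the only form App Engine loads.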

Archiving Django 1.0

When App Engine launched in April 2008, it included the Django application framework as part of the environment to make it easy to get started. At the time, the latest release of Django was 0.96, so this is the version that is part of version "1" of the Python runtime environment. Since then, the Django project has released version 1.0. For compatibility reasons, App Engine can't update its version of Django without also releasing a new version of the Python runtime environment. To use Django 1.0 with version "1" of the runtime environment, an application must include the 1.0 distribution in its application directory.

The Django 1.0 distribution contains 1,582 files. An App Engine application is limited to 1,000 files, so the Django distribution can't be included directly. Of course, not every file in the distribution needs to be included with the application. You can prune the distribution to remove documentation files, unused locales, database interfaces and other components that don't work with App Engine (such as the Admin application) to get the file count below the limit.

Using zipimport, you can include Django 1.0 with your application using just 1 file, leaving plenty of room for your own application files in the 1,000 file limit. A single ZIP archive of Django 1.0 is about 3 MB. This fits within the 10 MB file size limit. You may wish to prune unused libraries from the Django distribution anyway to further reduce the size of the archive.
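
The file count and archive size can be checked with a few lines of Python before uploading. This is a sketch only: the function names are hypothetical, and the constant assumes "10 MB" means 10 * 1024 * 1024 bytes, which the limits above do not spell out.

```python
import os

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024

def count_files(root):
    """Count regular files under root, including all subdirectories."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in os.walk(root))

def fits_file_size_limit(path, limit_bytes=MAX_FILE_BYTES):
    """Return True if the file at path is no larger than limit_bytes."""
    return os.path.getsize(path) <= limit_bytes
```

For example, count_files('Django-1.0/django') reports how many files the unpacked distribution would add to the application, and fits_file_size_limit('django.zip') checks the finished archive.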

Update: Prior to the 1.1.9 release of the Python SDK in February 2009, the file size limit was 1 MB. With 1.1.9, the limit has been increased to 10 MB. These instructions produce a Django archive smaller than 1 MB.

To make an archive containing all of Django, replace steps 2, 3 and 4 below with the following command:
    zip -r django.zip django

To download and re-package Django 1.0 as a ZIP archive:

  1. Download the Django 1.0 distribution from the Django website. Unpack this archive using an appropriate tool for your operating system (a tool that can unpack a .tar.gz file). For example, on the Linux or Mac OS X command line:
    tar -xzvf Django-1.0.tar.gz
  2. Create a ZIP archive that contains everything in the django/ directory except for the .../conf/ and .../contrib/ sub-directories. (You can also omit bin/ and test/.) The path inside the ZIP must start with django/.
    cd Django-1.0
    zip -r django.zip django/__init__.py django/bin django/core \
                      django/db django/dispatch django/forms \
                      django/http django/middleware django/shortcuts \
                      django/template django/templatetags \
                      django/test django/utils django/views
  3. The conf package contains a large number of localization files. Adding all of these files to the archive would increase the size of the archive beyond the 1 MB limit. However, there's room for a few files, and many Django packages need some parts of conf. Add everything in conf except the locale directory to the archive. If necessary, you can also add the specific locales you need, but be sure to check that the file size of the archive is below 1 MB.

    The following command adds everything in conf except conf/locale to the archive:

    zip -r django.zip django/conf -x 'django/conf/locale/*'
  4. Similarly, if you need anything in .../contrib/, add it to the archive. The largest component in contrib is the Django Admin application, which doesn't work with App Engine, so you can safely omit the admin and admindocs directories. For example, to add formtools:
    zip -r django.zip django/contrib/__init__.py \
                      django/contrib/formtools
  5. Put the archive file in your application directory.
    mv django.zip your-app-dir/
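
If the zip command is not available, steps 2 through 4 can be approximated with the standard zipfile module. The function below is a sketch under that assumption. build_module_archive and its parameters are hypothetical names, and the exclusion rule is a simple prefix match rather than the glob syntax that zip -x accepts.

```python
import os
import zipfile

def build_module_archive(source_root, archive_path, include, exclude_prefixes=()):
    """Zip the paths in include (relative to source_root), skipping excluded prefixes.

    Returns the list of archive names that were written.
    """
    written = []
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for relative in include:
            top = os.path.join(source_root, relative)
            if os.path.isfile(top):
                candidates = [top]
            else:
                candidates = [
                    os.path.join(dirpath, filename)
                    for dirpath, _dirnames, filenames in os.walk(top)
                    for filename in filenames
                ]
            for full_path in sorted(candidates):
                # Names inside the archive are relative to source_root and use
                # forward slashes, so they start with the package name.
                arcname = os.path.relpath(full_path, source_root).replace(os.sep, '/')
                if any(arcname.startswith(prefix) for prefix in exclude_prefixes):
                    continue
                archive.write(full_path, arcname)
                written.append(arcname)
    return written
```

Called from the Django-1.0 directory, build_module_archive('.', 'django.zip', ['django/__init__.py', 'django/core', 'django/conf'], exclude_prefixes=('django/conf/locale/',)) would add the listed packages while leaving out the locale files, much like the commands in steps 2 and 3.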

Using the Module Archive

Tip: The latest version of the Google App Engine Helper for Django (starting with version "r64") supports Django 1.0 with zipimport out of the box. Make sure your archive is named django.zip and is in your application root directory. All new projects created using the Helper will automatically use django.zip if present. If you are upgrading an existing project, you will need to copy the appengine_django, manage.py and main.py files from the Helper into your existing project. See Using the Google App Engine Helper for Django.

The following instructions only apply if you are using Django without the Helper, or if you are preparing another module archive.

To use a module archive, the .zip file must be on the Python module load path. The easiest way to do this is to modify the load path at the top of each handler script, and in each handler's main() routine. All other files that use modules in the archives will work without changes.

Because App Engine pre-loads Django 0.96 for all Python applications, using Django 1.0 requires one more step to make sure the django package refers to 1.0 and not the preloaded version. As described in the article Running Django on App Engine, the handler script must remove Django 0.96 from sys.modules before importing Django 1.0.

The following code uses the techniques described here to run Django 1.0 from an archive named django.zip:

import sys
from google.appengine.ext.webapp import util

# Uninstall Django 0.96.
for k in [k for k in sys.modules if k.startswith('django')]:
    del sys.modules[k]

# Add Django 1.0 archive to the path.
django_path = 'django.zip'
sys.path.insert(0, django_path)

# Django imports and other code go here...
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
import django.core.handlers.wsgi

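def main():
    # Make sure the archive is still on the path for this request.
    if django_path not in sys.path:
        sys.path.insert(0, django_path)

    # Run Django via WSGI.
    application = django.core.handlers.wsgi.WSGIHandler()
    util.run_wsgi_app(application)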

if __name__ == '__main__':
    main()

With appropriate app.yaml, settings.py and urls.py files, this handler displays the Django "It worked!" page. See Running Django on App Engine for more information on using Django.
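
The effect of clearing sys.modules can be seen in isolation with a small experiment that runs anywhere, not just on App Engine. In this sketch, two temporary directories each hold a different copy of a hypothetical package named demo_pkg. Plain directories stand in for the preloaded Django 0.96 and the django.zip archive.

```python
import os
import sys
import tempfile

def write_package(root, version):
    """Create a hypothetical package demo_pkg under root with a VERSION constant."""
    package_dir = os.path.join(root, 'demo_pkg')
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, '__init__.py'), 'w') as f:
        f.write('VERSION = %r\n' % version)

old_root = tempfile.mkdtemp()
new_root = tempfile.mkdtemp()
write_package(old_root, '0.96')
write_package(new_root, '1.0')

# Simulate the preloaded copy: import the package from the old location.
sys.path.insert(0, old_root)
import demo_pkg
first_version = demo_pkg.VERSION

# Drop the package (and any submodules) from the module cache, then put the
# new location first on the path so the next import loads the new copy.
for name in [n for n in sys.modules if n == 'demo_pkg' or n.startswith('demo_pkg.')]:
    del sys.modules[name]
sys.path.insert(0, new_root)
import demo_pkg
second_version = demo_pkg.VERSION

print(first_version, second_version)
```

Without the loop that deletes entries from sys.modules, the second import statement would return the cached package and the new path entry would have no effect.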

Using Multiple Archive Files for a Single Package

Under the original 1 MB file size limit, all of Django 1.0 was too large to fit into a single archive. Can a package that large be split into multiple archives, each on sys.path? Yes, with some bootstrapping code to help Python navigate the different locations.

When Python imports a module, it checks each location mentioned in sys.path for the package that contains the module. If a location does not contain the first package in the module's path, Python checks the next sys.path entry, and so on until it finds the first package or runs out of locations to check.

When Python finds the first package in the module's path, it assumes that wherever it found it is the definitive location for that package, and it won't bother looking for it elsewhere. If Python cannot find the rest of the module path in the package, it raises an import error and stops. Python does not check subsequent sys.path entries after the first package in the path has been found.

You can work around this by importing the package that is split across multiple archives from the first archive, then telling Python that the contents of the package can actually be found in multiple places. The __path__ member of a package (module) object is a list of locations for the package's contents. For example, if the django package is split between two archives called django1.zip and django2.zip, the following code tells Python to look in both archives for the contents of the package:

sys.path.insert(0, 'django1.zip')
import django
django.__path__.append('django2.zip/django')

This imports the django package from django1.zip, so make sure that archive contains django/__init__.py.

With the second archive on the package's __path__, subsequent imports of modules inside django will search both archives.
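
The same technique can be tried with two small archives built on the fly. This is a sketch using the standard import machinery, not App Engine's custom implementation. The package name demo_split and the archive names part1.zip and part2.zip are made up for illustration.

```python
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
part1 = os.path.join(workdir, 'part1.zip')
part2 = os.path.join(workdir, 'part2.zip')

# The first archive holds the package's __init__.py and one module.
with zipfile.ZipFile(part1, 'w') as archive:
    archive.writestr('demo_split/__init__.py', '')
    archive.writestr('demo_split/alpha.py', 'NAME = "alpha"\n')

# The second archive holds another module of the same package.
with zipfile.ZipFile(part2, 'w') as archive:
    archive.writestr('demo_split/beta.py', 'NAME = "beta"\n')

# Import the package from the first archive, then extend its __path__ so
# that submodule lookups also search inside the second archive.
sys.path.insert(0, part1)
import demo_split
demo_split.__path__.append(os.path.join(part2, 'demo_split'))

import demo_split.alpha
import demo_split.beta

print(demo_split.alpha.NAME, demo_split.beta.NAME)
```

Only part1.zip needs to be on sys.path. The second archive is reached through the package's __path__, which is why it must be appended after the package has been imported.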

Additional Notes

Some additional things to note about using zipimport with App Engine:

  • Module archives use additional CPU time the first time a module is imported. Imports are cached in memory for future requests to the same application instance, and modules from archives are cached uncompressed and compiled, so subsequent imports on the same instance will not incur CPU overhead for decompression or compilation.
  • The App Engine implementation of zipimport only supports .py files, not precompiled .pyc files.
  • Because handler scripts are responsible for adding module archives to the path, handler scripts themselves cannot be stored in module archives. Any other Python code can be stored in module archives.
