MapReduce for App Engine

Important: Google has transitioned support and further development of the Java and Python MapReduce libraries to the open source community. The source code and documentation are available on GitHub.

MapReduce is a programming model for processing large amounts of data in a parallel and distributed fashion. It is useful for large, long-running jobs that cannot be handled within the scope of a single request, such as:

  • Analyzing application logs
  • Aggregating related data from external sources
  • Transforming data from one format to another
  • Exporting data for external analysis
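
To make the programming model concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce phases applied to the log-analysis case from the list above. It uses only the Python standard library and does not call the App Engine MapReduce library; the function names, the sample data, and the assumption that each log line starts with its severity level are all hypothetical. A real job would distribute these phases across many workers rather than run them in one process.

```python
from collections import defaultdict

# Illustrative only: plain Python, not the App Engine MapReduce API.
# All names and the log format below are assumptions for this sketch.


def map_log_line(line):
    """Map phase: emit a (severity, 1) pair for one log line."""
    parts = line.split(maxsplit=1)
    if parts:  # skip blank lines
        yield parts[0], 1


def shuffle(pairs):
    """Shuffle phase: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: collapse the values for one key into a total."""
    return key, sum(values)


def run_job(lines):
    """Run the three phases in sequence over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_log_line(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


SAMPLE_LOGS = [
    "ERROR disk full",
    "INFO request served",
    "INFO request served",
    "WARNING slow response",
    "ERROR timeout",
]

if __name__ == "__main__":
    print(run_job(SAMPLE_LOGS))
```

Because each call to the map function sees a single line and each call to the reduce function sees a single key, both phases can in principle be split across independent workers, which is what lets a job of this shape outgrow a single request.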

App Engine MapReduce is a community-maintained, open source library that is built on top of App Engine services, including Datastore and Task Queues. The library is available on GitHub at these locations:

Where to find documentation

The documentation for MapReduce is available by clicking the wiki icon for the GitHub projects linked above. For convenience, you can also access the documentation for both Java and Python MapReduce at the following link: