Batch Processing in the Google Data Protocol

Page Summary

Batch processing allows executing multiple operations like insert, update, delete, and query in a single request using a GData batch feed.
To use batch operations, a recent version of your Google Data API client library is required, except for the JavaScript library which is not supported.
A batch request is sent as an HTTP POST to a batch URL, which can be discovered by checking for a "batch" link relation in the feed.
Status information for each operation is returned in the response feed, allowing you to track success or failure and retry only the failed operations.
When using the GData Java client library, you add entries with specified batch operation types to a feed and use the Service.batch method to send the request and process the results.

Batch processing gives you the ability to execute multiple operations in one request, rather than having to submit each operation individually.

Note: To perform batch operations, you need to be using a recent version of your Google Data API client library. Batch operations are not supported by the JavaScript client library.

Audience

This document is intended for programmers who want to submit multiple operations in a single request using batch processing.

This document assumes that you're familiar with using the GData Java Client Library. The examples in this document show how to use the Java client library to run batch operations.

The examples in this document are specific to the Google Base Data API. However, other services may also provide batch capabilities.

Note: The protocol and general procedures will be the same for other client libraries but the specific methods to perform batch requests may differ. Please refer to the client library specific documentation.

Introduction

Using a GData batch feed, you can collect multiple insert, update, delete, and query operations, and then submit and execute them all at once.

For example, the following feed includes four operations:

<feed>
  <entry>
    <batch:operation type="insert"/>
    ... what to insert ...
  </entry> 
  <entry>
    <batch:operation type="update"/>
    ... what to update ...
  </entry>
  <entry>
    <batch:operation type="delete"/>
    ... what to delete ...
  </entry>
  <entry>
    <batch:operation type="query"/>
    ... what to query ...
  </entry>
</feed>

The service will perform as many of the requested changes as possible and return status information that you can use to evaluate the success or failure of each operation.

The service attempts to execute each of the operations within a batch, even if some of the operations included in the batch do not succeed.

Submitting a batch request

A batch request should be sent as an HTTP POST to a batch URL. Different feeds support different batch operations. Read-only feeds only support queries.

To discover whether a given feed supports batch operations, you can query the feed. If the feed contains a "batch" link relation at the feed level, this indicates that the feed supports batch operations.

A "batch" link relation is a <link> element with rel="http://schemas.google.com/g/2005#batch". The href attribute of the link relation defines the URL where feed documents for batch operations may be posted.

For example, if you execute: GET http://www.google.com/base/feeds/items (the regular Google Base "items" feed), you might get the following response:

<feed xmlns=...
  <id>http://www.google.com/base/feeds/items</id>
  <link rel="http://schemas.google.com/g/2005#feed"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items"/>
  <link rel="http://schemas.google.com/g/2005#post"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items"/>
  <link rel="http://schemas.google.com/g/2005#batch"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items/batch"/>
  ...
</feed>

In this example, the batch URL is http://www.google.com/base/feeds/items/batch.
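
The discovery step can also be done without a client library. The following is a minimal sketch that uses only the JDK's built-in XML parser; the class name BatchLinkFinder and the method findBatchUrl are hypothetical names chosen for illustration, not part of any Google library. It looks only at direct children of the <feed> element, because the batch link must appear at the feed level.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the GData client library.
class BatchLinkFinder {
  static final String ATOM_NS = "http://www.w3.org/2005/Atom";
  static final String BATCH_REL = "http://schemas.google.com/g/2005#batch";

  /** Returns the href of the feed-level "batch" link, or null if the feed has none. */
  static String findBatchUrl(String feedXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(feedXml.getBytes(StandardCharsets.UTF_8)));
      // Only direct children of <feed> count; links inside entries are ignored.
      NodeList children = doc.getDocumentElement().getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        Node node = children.item(i);
        if (node instanceof Element
            && ATOM_NS.equals(node.getNamespaceURI())
            && "link".equals(node.getLocalName())
            && BATCH_REL.equals(((Element) node).getAttribute("rel"))) {
          return ((Element) node).getAttribute("href");
        }
      }
      return null; // no batch link: the feed does not support batch operations
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse feed XML", e);
    }
  }

  public static void main(String[] args) {
    String feed =
        "<feed xmlns='http://www.w3.org/2005/Atom'>"
        + "<id>http://www.google.com/base/feeds/items</id>"
        + "<link rel='http://schemas.google.com/g/2005#batch'"
        + " type='application/atom+xml'"
        + " href='http://www.google.com/base/feeds/items/batch'/>"
        + "</feed>";
    System.out.println(findBatchUrl(feed));
  }
}
```

A null return value corresponds to a feed that does not advertise batch support.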

Writing a batch operations feed

An operations feed contains a list of entries to insert, update, delete, or query. Each operation is defined by a <batch:operation type="insert|update|delete|query"/> element.

This element may be a direct child of a <feed> element, a direct child of any of the entries in the feed, or both. When included in an entry, it specifies the operation to execute for that specific entry. When included in the feed, this element specifies the default operation to execute on any entries that do not have a <batch:operation/> element.

When neither the entry nor the feed specifies an operation, the default operation is insert.
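
The defaulting rules above can be sketched as a small resolver: use the entry's own operation if present, otherwise the feed-level operation, otherwise insert. This is an illustrative sketch that uses only the JDK's XML parser; EffectiveOperations and its methods are hypothetical names, and the server's actual implementation is not described in this document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper that mirrors the defaulting rules described above.
class EffectiveOperations {
  static final String ATOM_NS = "http://www.w3.org/2005/Atom";
  static final String BATCH_NS = "http://schemas.google.com/gdata/batch";

  /** Returns the type of a <batch:operation> that is a direct child of parent, or null. */
  static String operationOf(Element parent) {
    NodeList children = parent.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node n = children.item(i);
      if (n instanceof Element
          && BATCH_NS.equals(n.getNamespaceURI())
          && "operation".equals(n.getLocalName())) {
        return ((Element) n).getAttribute("type");
      }
    }
    return null;
  }

  /** Returns the effective operation of each entry, in document order. */
  static List<String> resolve(String feedXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Element feed = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(feedXml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
      String feedDefault = operationOf(feed);
      List<String> result = new ArrayList<>();
      NodeList children = feed.getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        Node n = children.item(i);
        if (n instanceof Element
            && ATOM_NS.equals(n.getNamespaceURI())
            && "entry".equals(n.getLocalName())) {
          String op = operationOf((Element) n);   // 1. entry-level operation
          if (op == null) op = feedDefault;        // 2. feed-level default
          if (op == null) op = "insert";           // 3. protocol default
          result.add(op);
        }
      }
      return result;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse feed XML", e);
    }
  }

  public static void main(String[] args) {
    String feed =
        "<feed xmlns='http://www.w3.org/2005/Atom'"
        + " xmlns:batch='http://schemas.google.com/gdata/batch'>"
        + "<batch:operation type='delete'/>"
        + "<entry><batch:operation type='query'/></entry>"
        + "<entry><id>x</id></entry>"
        + "</feed>";
    System.out.println(resolve(feed)); // prints [query, delete]
  }
}
```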

Applications should not apply multiple operations to the same entry in a single batch feed. The results are indeterminate if you specify multiple operations for the same entry.

To improve performance, operations may not be processed in the order in which they were requested. However, the final result is always the same as if the entries had been processed in order.

The number of bytes in the XML that you send to the server may not exceed 1 MB (1,048,576 bytes). In general, there are no limits on the number of operations you may request as long as the total byte size does not exceed 1 MB. However, some services may place additional constraints.
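
Because the limit applies to bytes rather than characters, a client may want to check the encoded size before posting. The following sketch assumes the feed is sent as UTF-8; BatchSizeCheck and fitsInOneBatch are hypothetical names, and services with additional constraints would need their own checks.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-flight check; assumes the request body is encoded as UTF-8.
class BatchSizeCheck {
  static final int MAX_BATCH_BYTES = 1_048_576; // 1 MB, as stated above

  /** Returns true if the serialized feed does not exceed the 1 MB limit. */
  static boolean fitsInOneBatch(String feedXml) {
    // Count encoded bytes: non-ASCII characters take more than one byte in UTF-8.
    return feedXml.getBytes(StandardCharsets.UTF_8).length <= MAX_BATCH_BYTES;
  }

  public static void main(String[] args) {
    String feed = "<feed xmlns=\"http://www.w3.org/2005/Atom\"></feed>";
    System.out.println(fitsInOneBatch(feed)); // prints true
  }
}
```

If the check fails, one option is to split the entries across several batch feeds and submit each one separately.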

To use the batch operations, you must add the batch namespace declaration as an attribute to the <feed> element:

<feed 
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
  ...
  xmlns:batch="http://schemas.google.com/gdata/batch">

Insert operations

An insert operation is denoted as follows:

<batch:operation type="insert"/>

An insert operation is equivalent to POSTing the entry. When the operation succeeds, the entire entry content is returned, with an updated document <id> element and a <batch:status code="201"/> element.

Here is an example of an insert request:

<entry>
  <title type="text">...</title>
  <content type="html">...</content>
  <batch:id>itemA</batch:id>
  <batch:operation type="insert"/>
  <g:item_type>recipes</g:item_type>
  ... 
</entry>

Here is an example of a response to a successful insert request:

<entry>
  <batch:status code="201"/>
  <batch:id>itemA</batch:id>
  <batch:operation type="insert"/>
  <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
  <link rel="self" type="application/atom+xml"
    href="http://www.google.com/base/feeds/items/17437536661927313949"/>
  <title type="text">...</title>
  <content type="html">...</content>
  <g:item_type>recipes</g:item_type>
  ... 
</entry>

Update operations

<batch:operation type="update"/>

An update operation is equivalent to executing a PUT on the URL referenced by the entry's <id> element. When the operation succeeds, the entire entry content is returned with a <batch:status code="200"/> element.

Note: With certain feeds, you also need to specify the entry's rel="edit" link with batch update requests. This includes those feeds that support Google Data Protocol's v1-style of optimistic concurrency, and those feeds that don't have IDs that are URLs.

Here is an example of an update request:

<entry>
  <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
  <batch:operation type="update"/>
  ...
</entry>

Here is an example of a successful response:

<entry>
  <batch:status code="200"/>
  <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
  <batch:operation type="update"/>
  ... 
</entry>

Note: Some feeds use strong ETags to prevent you from unintentionally modifying another person's changes. When making a batch update request for an entry in one of these feeds, you must provide the ETag value in the entry's gd:etag attribute. For example, <entry gd:etag="'F08NQAxFdip7IWA6WhVR'">...<batch:operation type="update"/>...

Partial update operations

For feeds that support partial updates, you can also use them in batch requests. A partial update operation is equivalent to executing a PATCH on the URL referenced by the entry's <id> element. When the operation succeeds, the entire entry content is returned with a <batch:status code="200"/> element.

<batch:operation type="patch"/>

Here is an example of a partial update request:

<entry gd:fields="content" gd:etag="FE8LQQJJeSp7IWA6WhVa">
  <id>http://www.google.com/calendar/feeds/jo@gmail.com/private/full/entryID</id>
  <batch:operation type="patch"/>
  <title>New title</title>
</entry>

Here is an example of a successful response:

<entry gd:etag="FE8LQQJJeSp7IWA6WhVa">
  <batch:status code="200"/>
  <id>http://www.google.com/calendar/feeds/jo@gmail.com/private/full/entryID</id>
  <batch:operation type="patch"/>
  <title>New title</title>
  <content></content>
  ...rest of the entry...
</entry>

Delete operations

<batch:operation type="delete"/>

A delete operation is equivalent to executing a DELETE on the URL referenced by the entry's <id> element. For a delete operation, you only need to send the <id> element to delete the entry. Any other information you provide in elements that aren't in the batch: namespace will be ignored. When the operation succeeds, an entry with the same ID will be returned with a <batch:status code="200"/> element.

Note: With certain feeds, you also need to specify the entry's rel="edit" link with batch delete requests. This includes those feeds that support Google Data Protocol's v1-style of optimistic concurrency, and those feeds that don't have IDs that are URLs.

Here is an example of a delete request:

<entry>
  <batch:operation type="delete"/>
  <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
</entry>

Here is an example of a successful response:

<entry>
  <batch:operation type="delete"/>
  <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
  <batch:status code="200" reason="Success"/>
</entry>

Note: Some feeds use strong ETags to prevent you from unintentionally modifying another person's changes. When making a batch delete request for an entry in one of these feeds, you must provide the ETag value in the entry's gd:etag attribute. For example, <entry gd:etag="'F08NQAxFdip7IWA6WhVR'">...<batch:operation type="delete"/>...
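
As an illustration, a delete-only batch feed can be assembled from a list of entry IDs. This sketch assumes a feed that needs neither rel="edit" links nor ETags, and it builds the XML by hand with basic escaping; DeleteBatchFeed and its methods are hypothetical names, not part of any client library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical builder for a delete-only batch feed (no ETags, no edit links).
class DeleteBatchFeed {
  /** Builds a batch feed with one delete operation per entry ID. */
  static String build(List<String> entryIds) {
    StringBuilder sb = new StringBuilder();
    sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    // The batch namespace must be declared on the <feed> element.
    sb.append("<feed xmlns=\"http://www.w3.org/2005/Atom\"\n");
    sb.append("  xmlns:batch=\"http://schemas.google.com/gdata/batch\">\n");
    for (String id : entryIds) {
      sb.append("  <entry>\n");
      sb.append("    <id>").append(escape(id)).append("</id>\n");
      sb.append("    <batch:operation type=\"delete\"/>\n");
      sb.append("  </entry>\n");
    }
    sb.append("</feed>\n");
    return sb.toString();
  }

  /** Escapes the characters that are not allowed in XML text content. */
  static String escape(String text) {
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  public static void main(String[] args) {
    System.out.print(build(Arrays.asList(
        "http://www.google.com/base/feeds/items/13308004346459454600",
        "http://www.google.com/base/feeds/items/17437536661927313949")));
  }
}
```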

Query operations

<batch:operation type="query"/>

A query operation is equivalent to executing a GET on the URL referenced by the entry's <id> element. When the operation succeeds, the entire entry content is returned.

Note: With certain feeds, you also need to specify the entry's rel="self" link with batch query requests. This includes those feeds that don't have IDs that are URLs.

Here is an example of a query request:

<entry>
  <id>http://www.google.com/base/feeds/items/1743753666192313949</id>
  <batch:operation type="query"/>
</entry>

Here is an example of a successful response:

<entry>
  <id>http://www.google.com/base/feeds/items/1743753666192313949</id>
  <batch:operation type="query"/>
  <batch:status code="200" reason="Success"/>
   ...
</entry>

Tracking operations

GData entry results are not necessarily returned in the same order as the request. You can track an operation through its lifetime using an identifier.

For update, delete, and query operations, you can use the id of the entry itself to track the operation.

For insert operations, since no ID yet exists, you can pass in an operation identifier. This identifier can be used to link the result entries to the request entries. The operation identifier is passed in the <batch:id> element.

For each operation, GData returns a response that states whether the operation succeeded or failed. Each response identifies the related entry. For an update, delete, or query operation, or a successful insert operation, the entry ID is always returned. If you have specified a batch ID, this is also returned. Since unsuccessful insert operations have no associated entry ID, only the batch ID is returned.

Using each operation's identifier, you can retry only the operations that failed, rather than having to resubmit the entire batch of operations.

The content of the <batch:id> is a string value that is client-defined and will be echoed back in the corresponding response entry. You can specify any value that will help the client to correlate the response with the entry in the original request. This element will be echoed as-is in the corresponding entry, even if the operation failed. GData never stores nor interprets the content of this batch ID.

The following example shows a batch operations feed. Notice that the <batch:id> element labels this operation as itemB.

<entry>
  <title type="text">...</title>
  <content type="html">...</content>
  <batch:id>itemB</batch:id>
  <batch:operation type="insert"/>
  <g:item_type>recipes</g:item_type>
</entry>

The following example shows the batch status entry returned in response to this operation.

<entry>
  <id>http://www.google.com/base/feeds/items/2173859253842813008</id>
  <published>2006-07-11T14:51:43.560Z</published>
  <updated>2006-07-11T14:51:43.560Z</updated>
  <title type="text">...</title>
  <content type="html">...</content>
  <link rel="self" 
    type="application/atom+xml" 
    href="http://www.google.com/base/feeds/items/2173859253842813008"/>
  <link rel="edit" 
    type="application/atom+xml" 
    href="http://www.google.com/base/feeds/items/2173859253842813008"/>
  <g:item_type>recipes</g:item_type>
  <batch:operation type="insert"/>
  <batch:id>itemB</batch:id>
  <batch:status code="201" reason="Created"/>
</entry>
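
Correlating results with requests and selecting the operations to retry can be sketched as follows, using only the JDK's XML parser. BatchResultTracker and its methods are hypothetical names. The sketch treats any status code of 300 or above as a failure, and it also retries batch IDs that are missing from the response; the second choice is an assumption, not something the protocol mandates.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for matching response entries to request entries.
class BatchResultTracker {
  static final String ATOM_NS = "http://www.w3.org/2005/Atom";
  static final String BATCH_NS = "http://schemas.google.com/gdata/batch";

  /** Maps each <batch:id> in the response feed to its <batch:status> code. */
  static Map<String, Integer> statusByBatchId(String responseXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
      Map<String, Integer> statuses = new LinkedHashMap<>();
      NodeList entries = doc.getElementsByTagNameNS(ATOM_NS, "entry");
      for (int i = 0; i < entries.getLength(); i++) {
        Element entry = (Element) entries.item(i);
        NodeList ids = entry.getElementsByTagNameNS(BATCH_NS, "id");
        NodeList status = entry.getElementsByTagNameNS(BATCH_NS, "status");
        if (ids.getLength() > 0 && status.getLength() > 0) {
          String code = ((Element) status.item(0)).getAttribute("code");
          statuses.put(ids.item(0).getTextContent().trim(), Integer.parseInt(code));
        }
      }
      return statuses;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse response feed", e);
    }
  }

  /** Returns the batch IDs that failed or that the response does not mention. */
  static List<String> idsToRetry(List<String> sentIds, Map<String, Integer> statuses) {
    List<String> retry = new ArrayList<>();
    for (String id : sentIds) {
      Integer code = statuses.get(id);
      if (code == null || code >= 300) {
        retry.add(id);
      }
    }
    return retry;
  }

  public static void main(String[] args) {
    String response =
        "<feed xmlns='http://www.w3.org/2005/Atom'"
        + " xmlns:batch='http://schemas.google.com/gdata/batch'>"
        + "<entry><batch:id>itemA</batch:id>"
        + "<batch:status code='201' reason='Created'/></entry>"
        + "<entry><batch:id>itemB</batch:id>"
        + "<batch:status code='400' reason='Bad request'/></entry>"
        + "</feed>";
    Map<String, Integer> statuses = statusByBatchId(response);
    // prints [itemB, itemC]
    System.out.println(idsToRetry(Arrays.asList("itemA", "itemB", "itemC"), statuses));
  }
}
```

The map is keyed by batch ID, so it covers only operations for which the client supplied a <batch:id>. Update, delete, and query operations without one would have to be matched on the entry <id> instead.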

Handling status codes

Status codes are expressed by the following element:

<batch:status code="200|201|404|500|..." reason="reason" [content-type="type"]/>

Each entry in the response feed contains one <batch:status> element. This element describes what happened while executing the operation. It mimics the HTTP response that would have been sent if the operation had been sent individually, rather than as part of a batch feed.

You need to check the <batch:status> element of each entry in the response to find out whether the associated operation was successfully processed. The code="n" attribute contains a GData status code.

Status descriptions

The reason="reason" attribute of the <batch:status> element contains a more verbose explanation of the operation's status.

Content type

The content-type="type" attribute of the <batch:status> element contains the MIME type of the data contained in the <batch:status> element. This corresponds to the Content-Type header of an HTTP status response. This attribute is optional.

When the content type is set, the body of the <batch:status> element describes what went wrong while processing the entry.

Identifying interrupted operations

The following element is included in the response for an interrupted operation:

<batch:interrupted reason="reason" success="N" failures="N" parsed="N">

This element means that batch processing was interrupted and all attempts to recover from the cause of the interruption failed. Some entries might have already been processed successfully. All entries that were not reported as having succeeded before this point were abandoned.

This element is rarely returned. When it is, it usually signals that the feed sent in the body of the request was not correctly formatted XML.

As with the <batch:status> element, a short explanation can be found in the reason attribute. A longer response might also be found inside the element.
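
A client can check for an interruption before it trusts the per-entry results. The following sketch searches the whole response for a <batch:interrupted> element, so it makes no assumption about where in the feed the element appears; InterruptedCheck and describeInterruption are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper that reports whether batch processing was interrupted.
class InterruptedCheck {
  static final String BATCH_NS = "http://schemas.google.com/gdata/batch";

  /** Returns a summary of the interruption, or null if the response has none. */
  static String describeInterruption(String responseXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
      NodeList hits = doc.getElementsByTagNameNS(BATCH_NS, "interrupted");
      if (hits.getLength() == 0) {
        return null; // the batch ran to completion
      }
      Element e = (Element) hits.item(0);
      return String.format("interrupted: %s (parsed=%s, success=%s, failures=%s)",
          e.getAttribute("reason"), e.getAttribute("parsed"),
          e.getAttribute("success"), e.getAttribute("failures"));
    } catch (Exception ex) {
      throw new IllegalArgumentException("Could not parse response feed", ex);
    }
  }

  public static void main(String[] args) {
    String response =
        "<feed xmlns='http://www.w3.org/2005/Atom'"
        + " xmlns:batch='http://schemas.google.com/gdata/batch'>"
        + "<entry><batch:interrupted reason='Malformed XML'"
        + " success='1' failures='0' parsed='3'/></entry>"
        + "</feed>";
    // prints interrupted: Malformed XML (parsed=3, success=1, failures=0)
    System.out.println(describeInterruption(response));
  }
}
```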

Example batch operations and status feeds

Here is a batch operations feed that could be sent to the server. This feed requests that the server delete two entries and add two new entries. Note that the <feed> element must include a namespace declaration for batch, as shown in the example below.

POST : http://www.google.com/base/feeds/items/batch
<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
  xmlns:g="http://base.google.com/ns/1.0"
  xmlns:batch="http://schemas.google.com/gdata/batch">
  <title type="text">My Batch Feed</title>
  <entry>
    <id>http://www.google.com/base/feeds/items/13308004346459454600</id>
    <batch:operation type="delete"/>
  </entry>
  <entry>
    <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
    <batch:operation type="delete"/>
  </entry>
  <entry>
    <title type="text">...</title>
    <content type="html">...</content>
    <batch:id>itemA</batch:id>
    <batch:operation type="insert"/>
    <g:item_type>recipes</g:item_type>
  </entry>
  <entry>
    <title type="text">...</title>
    <content type="html">...</content>
    <batch:id>itemB</batch:id>
    <batch:operation type="insert"/>
    <g:item_type>recipes</g:item_type>
  </entry>
</feed>

Let's assume that the two insertions worked, but one of the two deletions failed. In this case, the batch status feed might look like the following. Notice that the entries have been reordered compared to the batch operations feed.

<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
  xmlns:g="http://base.google.com/ns/1.0"
  xmlns:batch="http://schemas.google.com/gdata/batch">
  <id>http://www.google.com/base/feeds/items</id>
  <updated>2006-07-11T14:51:42.894Z</updated>
  <title type="text">My Batch</title>
  <link rel="http://schemas.google.com/g/2005#feed"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items"/>
  <link rel="http://schemas.google.com/g/2005#post"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items"/>
  <link rel="http://schemas.google.com/g/2005#batch"
    type="application/atom+xml"
    href="http://www.google.com/base/feeds/items/batch"/>
  <entry>
    <id>http://www.google.com/base/feeds/items/2173859253842813008</id>
    <published>2006-07-11T14:51:43.560Z</published>
    <updated>2006-07-11T14:51:43.560Z</updated>
    <title type="text">...</title>
    <content type="html">...</content>
    <link rel="self"
      type="application/atom+xml"
      href="http://www.google.com/base/feeds/items/2173859253842813008"/>
    <link rel="edit"
      type="application/atom+xml"
      href="http://www.google.com/base/feeds/items/2173859253842813008"/>
    <g:item_type>recipes</g:item_type>
    <batch:operation type="insert"/>
    <batch:id>itemB</batch:id>
    <batch:status code="201" reason="Created"/>
  </entry>
  <entry>
    <id>http://www.google.com/base/feeds/items/11974645606383737963</id>
    <published>2006-07-11T14:51:43.247Z</published>
    <updated>2006-07-11T14:51:43.247Z</updated>
    <title type="text">...</title>
    <content type="html">...</content>
    <link rel="self"
      type="application/atom+xml"
      href="http://www.google.com/base/feeds/items/11974645606383737963"/>
    <link rel="edit"
      type="application/atom+xml"
      href="http://www.google.com/base/feeds/items/11974645606383737963"/>
    <g:item_type>recipes</g:item_type>
    <batch:operation type="insert"/>
    <batch:id>itemA</batch:id>
    <batch:status code="201" reason="Created"/>
  </entry>
  <entry>
    <id>http://www.google.com/base/feeds/items/13308004346459454600</id>
    <updated>2006-07-11T14:51:42.894Z</updated>
    <title type="text">Error</title>
    <content type="text">Bad request</content>
    <batch:status code="404"
      reason="Bad request"
      content-type="application/xml">
      <errors>
        <error type="request" reason="Cannot find item"/>
      </errors>
    </batch:status>
  </entry>
  <entry>
    <id>http://www.google.com/base/feeds/items/17437536661927313949</id>
    <updated>2006-07-11T14:51:43.246Z</updated>
    <content type="text">Deleted</content>
    <batch:operation type="delete"/>
    <batch:status code="200" reason="Success"/>
  </entry>
</feed>

Using the batch functionality of the GData Java client library

This section explains how to use the batch functionality of the GData Java client library to submit a group of insert, update, and/or delete requests.

The examples given in this section use the Google Base APIs.

First import the classes you'll need, in addition to the standard GData and Google Base classes:

import com.google.gdata.data.batch.*;
import com.google.api.gbase.client.*;

To submit a batch request, you need to get the Batch URL from a feed. The following code snippet illustrates how to do this, assuming that feed is a GoogleBaseFeed object containing information about a feed:

Link batchLink = feed.getLink(Link.Rel.FEED_BATCH, Link.Type.ATOM);
if (batchLink != null) {
  URL batchUrl = new URL(batchLink.getHref());
  ... // batch handling
} else {
  // batching is not supported for this feed
}

The following code snippet prepares a feed that will insert two entries in one operation:

GoogleBaseEntry entry1 = new GoogleBaseEntry();
...   // initialize entry 1 content
BatchUtils.setBatchId(entry1, "A"); // A is the local batch ID for this entry
feed.addEntry(entry1);
GoogleBaseEntry entry2 = new GoogleBaseEntry();
... // initialize entry 2 content
BatchUtils.setBatchId(entry2, "B"); // B is the local batch ID for this entry
feed.addEntry(entry2);

The code in this example never explicitly states that the operation to be performed for these entries is insert. You don't need to explicitly specify that, because insertion is the default operation.

To send the batch feed and receive the results, call the Service.batch method.

Like Service.insert, Service.batch returns the inserted entries with new <atom:id> values set. The returned entries are contained in a GoogleBaseFeed object.

If you want to delete a third entry (which you've already fetched and stored in entry3) at the same time as you insert the other two entries, you could use the following code:

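// A delete operation only needs the ID of the entry to remove.
GoogleBaseEntry toDelete = new GoogleBaseEntry();
toDelete.setId(entry3.getId());
BatchUtils.setBatchOperationType(toDelete, BatchOperationType.DELETE);
feed.addEntry(toDelete);

// Send the two inserts and the delete in a single request.
GoogleBaseFeed result = service.batch(batchUrl, feed);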

Here, service is an instance of com.google.gdata.client.Service.

If you want to update an entry, specify BatchOperationType.UPDATE, and initialize the entry with the desired changes instead of leaving it mostly blank.

These examples use the Google Base Data API. If you are using service.batch with another type of GData service, replace the classes GoogleBaseFeed, GoogleBaseEntry, and GoogleBaseService with the appropriate feed, entry, and service classes.

The results of a batch operation are not necessarily returned in the order in which they were requested. In the example above, the result feed might very well contain entry2 followed by entry1. You should never assume entries are returned in any particular order.

Your batch operations feed should assign a unique batch ID to each insert operation, as explained in Tracking operations. In the above examples, the batch IDs are A and B. Therefore, to find the status of the requested operations, you should iterate over the entries in the returned batch feed and compare their batch ID or entry ID, as follows:

for (GoogleBaseEntry entry : result.getEntries()) {
  String batchId = BatchUtils.getBatchId(entry);
  if (BatchUtils.isSuccess(entry)) {
    if ("A".equals(batchId)) {
      // Result of the first insert
      entry1 = entry;
    } else if ("B".equals(batchId)) {
      // Result of the second insert
      entry2 = entry;
    } else if (BatchUtils.getBatchOperationType(entry)
        == BatchOperationType.DELETE) {
      System.out.println("Entry " + entry.getId()
          + " has been deleted successfully.");
    }
  } else {
    // The operation failed; report the reason and any error details.
    BatchStatus status = BatchUtils.getBatchStatus(entry);
    System.err.println(batchId + " failed ("
        + status.getReason() + ") " + status.getContent());
  }
}

Each entry you'll find in the returned feed will have an associated BatchStatus object. The BatchStatus object contains an HTTP return code and a response that describes what went wrong while the entry was processed. You need to check the HTTP return code of each entry to tell whether the operation succeeded.

The check is done in the example above by the convenience method BatchUtils.isSuccess. In this case, it is equivalent to: BatchUtils.getBatchStatus(entry).getCode() < 300.

The status codes and responses are further explained in Handling status codes.
