Google App Engine

Structuring Data for Strong Consistency

The Google App Engine High Replication Datastore (HRD) provides high availability for your reads and writes by storing data synchronously in multiple datacenters. However, the delay from the time a write is committed until it becomes visible in all datacenters means that queries across multiple entity groups (non-ancestor queries) can only guarantee eventually consistent results. Consequently, the results of such queries may sometimes fail to reflect recent changes to the underlying data.

To obtain strongly consistent query results, you need to use an ancestor query limiting the results to a single entity group. This works because entity groups are a unit of consistency as well as transactionality. All data operations are applied to the entire group; an ancestor query won't return its results until the entire entity group is up to date. If your application relies on strongly consistent results for certain queries, you may need to take this into consideration when designing your data model. This page discusses best practices for structuring your data to support strong consistency.

To understand how to structure your data for strong consistency, compare two different approaches for the guestbook example application from the App Engine Getting Started exercise. The first approach creates a new root entity for each greeting:

import com.google.appengine.api.datastore.Entity;

Entity greeting = new Entity("Greeting");
// No parent key specified, so Greeting is a root entity.

greeting.setProperty("user", user);
greeting.setProperty("date", date);
greeting.setProperty("content", content);

It then queries on the entity kind Greeting for the ten most recent greetings.

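import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import java.util.List;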

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

Query query = new Query("Greeting")
                    .addSort("date", Query.SortDirection.DESCENDING);

List<Entity> greetings = datastore.prepare(query)
                                  .asList(FetchOptions.Builder.withLimit(10));

However, because non-ancestor queries only guarantee eventually consistent results, the datacenter that serves the query in this scheme may not yet have seen the new greeting when the query executes. With eventual consistency, nearly all of your writes become available to queries within a few seconds, so a solution that shows the current user their own recent posts alongside the query results is usually enough to make this delay completely acceptable.

If strong consistency is important to your application, an alternate approach is to define a parent key for the kind and save each new entity in the entity group identified by that parent key:

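import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import java.util.Date;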

String guestbookName = req.getParameter("guestbookName");
Key guestbookKey = KeyFactory.createKey("Guestbook", guestbookName);
String content = req.getParameter("content");
Date date = new Date();

// Place greeting in same entity group as guestbook
Entity greeting = new Entity("Greeting", guestbookKey);
greeting.setProperty("user", user);
greeting.setProperty("date", date);
greeting.setProperty("content", content);

Queries for these entities can then use the parent key to perform an ancestor query, which will find only those entities:

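import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.datastore.Query;
import java.util.List;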

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

Key guestbookKey = KeyFactory.createKey("Guestbook", guestbookName);
// Restricting the query to a single ancestor (one entity group) makes it strongly consistent.
Query query = new Query("Greeting")
                    .setAncestor(guestbookKey)
                    .addSort("date", Query.SortDirection.DESCENDING);

List<Entity> greetings = datastore.prepare(query)
                                  .asList(FetchOptions.Builder.withLimit(10));

This approach achieves strong consistency by writing to a single entity group per guestbook, but it also limits changes to the guestbook to no more than 1 write per second (the supported limit for entity groups). If your application is likely to encounter heavier write usage, you may need to consider other means. For example, you might put recent posts in memcache with an expiration and display a mix of recent posts from memcache and the Datastore, or you might cache them in a cookie, put some state in the URL, or use something else entirely. The goal is to find a caching solution that supplies the current user's data for the period of time in which the user is posting to your application. Remember, if you do a get, an ancestor query, or any operation within a transaction, you will always see the most recently written data.
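The idea of displaying a mix of cached recent posts and Datastore query results can be sketched in plain Java. This is only an illustration under stated assumptions, not App Engine API code: the RecentGreetings class, its Greeting stand-in type, the id field, and the merge method are all hypothetical names, and the two input lists stand for whatever your application reads from its cache and from an eventually consistent query.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the App Engine SDK.
final class RecentGreetings {

    // Plain stand-in for a Greeting entity. The "id" field is an assumed
    // unique identifier (for example, a string form of the entity key).
    static final class Greeting {
        final String id;
        final String content;
        final Date date;

        Greeting(String id, String content, Date date) {
            this.id = id;
            this.content = content;
            this.date = date;
        }
    }

    // Combines greetings read from a cache with greetings returned by a
    // query, drops duplicates by id, and returns the newest `limit` items.
    static List<Greeting> merge(List<Greeting> cached, List<Greeting> queried, int limit) {
        Map<String, Greeting> byId = new LinkedHashMap<>();
        for (Greeting g : queried) {
            byId.put(g.id, g);
        }
        // Cached posts fill in whatever the query has not returned yet.
        for (Greeting g : cached) {
            byId.putIfAbsent(g.id, g);
        }
        List<Greeting> merged = new ArrayList<>(byId.values());
        // Newest first, matching the descending date sort used by the queries.
        merged.sort(Comparator.comparing((Greeting g) -> g.date).reversed());
        int end = Math.max(0, Math.min(limit, merged.size()));
        return new ArrayList<>(merged.subList(0, end));
    }

    public static void main(String[] args) {
        List<Greeting> queried = new ArrayList<>();
        queried.add(new Greeting("a", "first", new Date(1000L)));

        // The user's newest post is cached but not yet visible to the query.
        List<Greeting> cached = new ArrayList<>();
        cached.add(new Greeting("a", "first", new Date(1000L)));
        cached.add(new Greeting("b", "second", new Date(2000L)));

        for (Greeting g : merge(cached, queried, 10)) {
            System.out.println(g.content);
        }
    }
}
```

Deduplicating by id matters because, once the write becomes visible everywhere, the same greeting appears in both the cache and the query results until the cache entry expires.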
