Google Cloud Datastore

GQL Reference

GQL maps roughly to SQL: you can think of GQL kind as a SQL table, a GQL entity as a SQL row, and a GQL property as a SQL column. However, a SQL row-column lookup is limited to a single value, whereas in GQL a property can be a multiple value property.

Grammar

The GQL grammar is summarized as follows:

<query> :=
SELECT ( * | [ DISTINCT ] <aggregated-property>+, )
[ FROM <kind> ]
[ WHERE <compound-condition> ]
[ GROUP BY <property-name>+,  ]
[ ORDER BY ( <property-name> [ ASC | DESC ] )+,  ]
[ OFFSET <result-position> [ “+” <result-position> ]
[ LIMIT ( <result-position> |
          FIRST “(” <result-position> ,
                    <result-position> “)” ) ]

<aggregated-property> :=
  <property-name>
| aggregator “(” <property-name> “)”

<aggregator> := FIRST

<compound-condition> := <condition>+AND

<condition> :=
  <property-name> IS NULL
| <property-name> <comparator> <value>
| <value> <comparator> <property-name>

<comparator> :=
  =
| <
| <=
| >
| >=
| IN
| CONTAINS
| HAS ANCESTOR
| HAS DESCENDANT

<result-position> := <binding-site> | <integer-literal>

<value> :=
  <binding-site>
| <synthetic-literal>
| <string-literal>
| <integer-literal>
| <double-literal>
| <boolean-literal>
| <null-literal>

<synthetic-literal> :=
  KEY “(”
    [ “DATASET” “(“ <string-literal> “)” “,” ]
    [ “NAMESPACE” “(“ <string-literal> “)” “,” ]
    <key-path-element>+, “)”
| BLOB “(” <string-literal> “)”
| BLOBKEY “(” <string-literal> “)”
| DATETIME “(” <string-literal> “)”

<key-path-element> :=
  <kind> "," ( <integer-literal> | <string-literal> )

<kind> := <name>

<property-name> := <name>

In the above grammar list, note that:

  • The symbol +, after an expression indicates that it can be repeated, with repeated expressions separated by a comma.
  • The use of +AND above means an AND-separated set of conditions.

The following are some examples that return entire entities:

    SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200
    SELECT * FROM myKind OFFSET @startCursor LIMIT 50
    SELECT * FROM myKind ORDER BY myProp DESC

Every GQL query string always begins with SELECT <something>, where <something> is one of the following:

  • *
  • <property-list>, a comma delimited list of properties to be returned from the query.
  • __key__, which returns keys only.

Similar to SQL, GQL keywords are case insensitive. Kind, property, and binding site names are case sensitive.

A GQL query returns zero or more entity results of the requested kind, with each result consisting of an entire entity or some subset of the properties of the entity, or even just the entity key. For example,

  • SELECT * FROM myKind
  • SELECT __key__ FROM myKind
  • SELECT title, year FROM Song WHERE composer = 'Lennon, John'

A result list produced by SELECT * or SELECT __key__ never contains duplicates. A result list produced by SELECT <property-list> may contain multiple results from one entity, typically when any of those properties are multiple value properties.

Tip: SELECT __key__ or SELECT <property-list> queries are faster than SELECT * queries, and are cheaper, because they are charged at the less expensive small operations rates, rather than at the read rates. See also projection queries.

Clauses

The following optional SELECT clauses are recognized:

Clause Description
DISTINCT Specifies that only completely unique results will be returned. Normally used only with projection queries because non-projection queries return unique results. If you use DISTINCT in a projection query where more than one entity has the same values in the properties being projected, only the first entity is returned. Note that a query string can use either DISTINCT or GROUP BY, but not both.
GROUP BY Groups data from a selected set of entities into a set of summary rows by the value of the specfied property or properties.
FROM Limits the result set to those entities of the given kind. A query string without a FROM clause is called a kindless query and the only filtering allowed in the WHERE clause is filtering by __key__.
WHERE Limits the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are combined with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator.
ORDER BY Causes results to be sorted by the specified properties. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right. Specify ASC for ascending or DESC for descending order. Note that the order is applied to each property. If the direction is not specified, it defaults to ASC. If no ORDER BY clause is specified, the order of the results is undefined and may change over time.
OFFSET Specifies offsets into the result set: either a cursor, or a count, or both. If OFFSET has two <result-position>s, the left one must be a cursor and the right one must be an integer. Note that an OFFSET with an integer starts at the beginning or at the cursor, then discards the specified number of entities, and so you still incur the cost of reading those entities. (Note also that OFFSET and LIMIT are independent.)
LIMIT Limits query results to a count, to results preceding a cursor, or both. Often used to page through results of a query. If LIMIT has two <result-position>s, one must be a cursor and the other must be an integer. (Note that OFFSET and LIMIT are independent.)

How to form entity and property names

Kind, property, and binding site names are formed as follows:

  • With any sequence of letters, digits, underscores, dollar signs, or unicode characters in the range from U+0080 to U+FFFF (inclusive), so long as the name does not begin with a digit. For example, foo, bar17, x_y, big$bux, __qux__.

  • You can also use a non-empty backquoted string: `fig-bash` or `x.y`. A backquote character can be represented in a backquoted name by doubling it, for example, `silly``putty`. A backquoted name can contain escaped characters.

  • An unquoted name can match a predefined name, but must not match a keyword. A backquoted name can contain any character except a newline. (It can contain a newline via \n, but not as a raw newline.)

  • Names are case-sensitive.

How to form literals

You can use the following literals in a comparison:

Literal Type Description
string Formed following these rules:
  • Can be a single-quoted or double-quoted string, such as 'foo' or ”foo”.
  • A ' character may be represented in a single-quoted string and a character may be represented in a double-quoted string by doubling it: 'Joe''s Diner' or ”Expected ””.”.
  • Can contain \-escaped characters.
  • Can contain any character except a newline.
integer A sequence of decimal digits with the following options or characteristics:
  • An optional initial plus or minus character.
  • Must be representable as a signed 64 bit integer.
  • Examples: 0, 11, +5831, -37, or 3827438927.
double A sequence of decimal digits with the following options or characteristics:
  • An optional initial plus or minus character.
  • Contains either a decimal point or an exponent consisting of the letter E (or e) with an optional plus or minus character.
  • Must be representable as an IEEE 64 bit floating point number.
  • Examples: 0.0, +58.31, -37.0, 3827438927.0, -3., +.1, 314159e-5, or 6.022E23.
boolean Can be the values TRUE or FALSE, case-insensitive.
null Represents NULL. Case insensitive.

Synthetic literals

A synthetic literal is a value that is constructed by a function. The following table lists the supported synthetic literals:

Literal Name Description
KEY KEY([DATASET(<dataset>),] [NAMESPACE(<namespace>),] <key-path-element>*,) represents a key.

If <dataset> and <namespace> are not supplied, defaults from the current query context are used. (See partition ID.) The <Key-path-element> is an entity path, which is an even-length comma-separated list of kinds alternating with either integer ids or string names. The integers must be greater than 0 and the strings must not be empty.
BLOB BLOB(<string>) represents a blob encoded as <string> via base-64 encoding with character set [A-Za-z0-9-_] and no padding.
BLOBKEY BLOBKEY(<string>) represents a blobkey with value <string>, which must not be empty.
DATETIME DATETIME(<string>) represents a timestamp. <string> must be in the time format specified in RFC 3339 section 5.6. (However, the second precision is limited to microseconds and leap seconds are omitted.) This standard format is: YYYY-MM-DDThh:mm:ss.SSSSSS+zz:ZZ where:
  • Year YYYY is a four digit value between 1969 and 9999, inclusive. For example, 2013-09-29T09:30:20:00002-08:00
  • Month MM must be two digits consisting of values between 01 and 12, inclusive. For example, 09.
  • Day DD must be two digits consisting of values between 01 and 31, inclusive. For example, 29.
  • Date YYYY-MM-DD must be a valid date in the Gregorian calendar. For example, February 29, 2013 would be invalid.
  • Delimiter T must be T or t.
  • Hour hh must be two digits consisting of values between 0and 23, inclusive. For example, 09.
  • Minute mm and second ss both must be two digits consisting of values between 0 and 59, inclusive.
  • SSSSSS represents a fraction of a second:
    • It may be omitted, in which case . must also be omitted. Otherwise...
    • The value must consist of one to six digits.
  • +zz:ZZ represents an offset (or time zone). For example, the Pacific time zone has an offset of -08:00.
    • It may be entirely replaced by Z or z to represent offset 0. Otherwise…
    • Offset sign + must be + or -.
    • Offset hour zz has the same limitations as hour hh.
    • Offset minute ZZ has the same limitations as minute mm.
    • Neither +00:00 nor -00:00 are valid offsets.
  • The represented timestamp must be greater than or equal to January, 1, 1970--1970-01-01T00:00:00Z (See Unix time.)

How to escape characters

You can escape certain characters in string literals and backquoted names. The escaped characters are case sensitive: for example \r is valid while \R is not.

The following is a list of all the characters that can be escaped in GQL:

Character Escaped
backslash character \\
null character \0
backspace character \b
newline character \n
return character \r
tab character \t
the character with decimal code 26 \Z
single quotation mark \’
double quotation mark \”
backquote character \`
\% (2 characters, retaining the backslash, per MySQL) \%
\_ (2 characters, retaining the backslash, per MySQL) \_

Operators and comparisons

Comparators are either equivalence comparators: =, IN, CONTAINS, = NULL, HAS ANCESTOR, and HAS DESCENDANT, or inequality comparators: <, <=, >, and >=.

Notice that the operator = is another name for the IN and CONTAINS operators. For example, <value> = <property-name> is the same as <value> IN <property-name>, and <property-name> = <value> is the same as <property-name> CONTAINS <value>. Also <property-name> IS NULL is the same as <property-name> = NULL.

A condition can also test whether one entity has another entity as an ancestor, using the HAS ANCESTOR or HAS DESCENDANT operators. These operators test ancestor relationships between keys. They can operate on __key__, but they can also operate on a key-valued property. For HAS ANCESTOR, the right operand cannot be a property. For HAS DESCENDANT, the left operand cannot be a property. For more information on ancestor relationships, see ancestor paths.

Only one property may be compared with inequality operators. When a query with an ORDER BY clause applies an inequality operator to a property, that property must be the first property in the ORDER BY clause.

A typical property name consists of alphanumeric characters optionally mixed with underscore (_) and dollar sign ($). In other words, they match the regular expression [a-zA-Z0-9_$]+(.[0-9_$]+)*. Property names containing other printable characters must be quoted with backquotes, for example: `first-name`.

Restrictions

Comparisons must be between a property name and a literal, but these can be on either side of the operator. For example, A < 7 or 7 > A. Note, however, that there is no inverse operator for IS NULL, so while you can have <property-name> IS NULL, you cannot have NULL IS <property-name>.

There is no way to determine whether an entity lacks a value for a property (that is, whether the property has no value). If you use a condition of the form property = NULL, what will occur is a check whether a null value is explicitly stored for that property. Datastore queries that refer to a property will never return entities that don't have a value for that property.

Examples

To find all of the entities of kind Person whose ages are between 18 and 35, use this query string:

SELECT * FROM Person WHERE age >= 18 AND age <= 35

To find the three entities of kind Person whose ages are the greatest, use this query string:

SELECT * FROM Person ORDER BY age DESC LIMIT 3

To return only the name property for each Person, use this query string:

SELECT name FROM Person

To return only the name property for each Person, ordered by age, use this query string:

SELECT name FROM Person ORDER BY age

To find the keys of the entities of kind Person that have an age of NULL, use this query string:

SELECT __key__ FROM Person WHERE age = NULL

To find all the entities, regardless of kind, that are in Amy's entity group (i.e. Amy and Fred), use this query:

SELECT * WHERE __key__ HAS ANCESTOR KEY(Person, 'Amy')

Argument binding

GQL supports argument binding. For information about these, see Using GQL.

Unsupported features and behavior differences from MySQL/Python GQL

If you are familiar with MySQL or the Python GQL provided by Google App Engine, you might want to take a look at the following list, which highlights the main differences between those products and the Datastore GQL behavior.

MySQL Differences

  • MySQL binding sites are represented by ?. Datastore GQL binding sites are represented by @<name> or @<number>.
  • MySQL supports only OFFSET/LIMIT counts. Datastore GQL also supports OFFSET/LIMIT cursors.
  • Datastore GQL supports an OFFSET without a LIMIT, MySQL does not.
  • MySQL supports an offset count via keyword LIMIT, Datastore GQL does not.
  • A MySQL literal string can contain a raw newline. A Datastore GQL literal string cannot.
  • A MySQL literal string can \-escape any character. A Datastore GQL literal string can only \-escape a specified list of characters.
  • A MySQL name can begin with a digit. A Datastore GQL name cannot.
  • A Datastore GQL name can contain null characters. A MySQL name cannot.
  • A quoted Datastore GQL name can contain \-escaped characters. A quoted MySQL name interprets a \ as an ordinary character.
  • MySQL has different operators, keywords, and predefined names than Datastore GQL.

Python GQL for App Engine Differences

  • Python GQL binding sites begin with :. Datastore GQL binding sites begin with @.
  • Python GQL binding argument names may match the regular expression __.*__. Datastore GQL reserves those names for future predefined binding arguments. Note that although binding sites can use those names, argument bindings cannot.
  • Python GQL supports operators != and OR. Those operators are not yet supported by Datastore GQL.
  • Datastore GQL supports the IN operator differently.
  • Python GQL supports functions datetime, date, time, key, user, and geopt. Datastore GQL supports function datetime and key, but with different arguments. Datastore GQL supports functions blob and blobkey. Datastore GQL support for function geopoint (note spelling) is expected to be added in some future release.
  • Python GQL expression ANCESTOR IS <entity-or-key-value> is represented in Datastore GQL as the more general expression __key__ HAS ANCESTOR <key-value>.
  • Datastore GQL supports the expression <property-name> IS NULL. Python GQL does not.
  • Python GQL literal strings are quoted with . Datastore GQL literal strings are quoted with either or .
  • Python GQL names are quoted with . Datastore GQL names are quoted with `.
  • Datastore GQL literal strings and quoted names can contain spaces, return characters, backslashed characters, and the enclosing quote character (doubled). Python GQL literal strings and quoted names cannot have these.
  • Python GQL names may contain . without quoting. Datastore GQL reserves . for future use.
  • Datastore GQL names may contain $ and U+0080 to U+FFFF without quoting. Python GQL names may not.
  • Python GQL has different keywords and operators than Datastore GQL.

Keywords and predefined names

Keywords and predefined names are case-insensitive.

The following keywords are recognized, although not all of these are currently used in GQL. The unused keywords are marked with an asterisk.

Keyword
ALL* DIV* NULL
ANCESTOR EXISTS* OFFSET
AND FALSE OR*
ANY* FROM ORDER
AS* GROUP REGEXP*
ASC HAS RLIKE*
BETWEEN* HAVING* SELECT
BINARY* IN. SUBSET*
BY IS SUPERSET
CONTAINS JOIN* TRUE
CURSOR* LIKE* WHERE
DESC LIMIT XOR*
DESCENDANT MOD*
DISTINCT NOT*

The following predefined names are recognized:

Predefined Name
BLOB
BLOBKEY
DATETIME
FIRST
KEY

Authentication required

You need to be signed in with Google+ to do that.

Signing you in...

Google Developers needs your permission to do that.