April 16, 2012
Datastore Administration is an experimental, innovative, and rapidly changing new feature for Google App Engine. Unfortunately, being on the bleeding edge means that we may make backwards-incompatible changes to Datastore Administration. We will inform the community when this feature is no longer experimental.
Note: Use of this feature is limited to backups started from the application's cron or task queue.
You can run scheduled backups for your application using the App Engine Cron service. To do this for Python or Go apps, specify backup cron jobs in cron.yaml. For Java apps, specify the backup cron job in cron.xml. Currently there is no way to specify a scheduled backup programmatically.
Setting Up a Scheduled Backup
To set up a scheduled backup for your app:
- If you haven't already done so, enable Datastore Admin for your app.
- If you are using Google Cloud Storage for your backups, and have not yet done so, properly configure the bucket you are using for backups.
- In your application directory, if you don't already have one, create a cron.yaml file for a Python or Go app or a cron.xml file for a Java app.
- Add the backup cron entries. These specify the backup schedule, the set of
entities to back up, and the storage to be used for the backups, as described in
Specifying Backups in a Cron File.
Here are some examples:
cron.yaml
    cron:
    - description: My Daily Backup
      url: /_ah/datastore_admin/backup.create?name=BackupToCloud&kind=LogTitle&kind=EventLog&filesystem=gs&gs_bucket_name=whitsend
      schedule: every 12 hours
      target: ah-builtin-python-bundle
cron.xml (note the use of "&amp;", because a bare "&" is interpreted by XML)
    <?xml version="1.0" encoding="UTF-8"?>
    <cronentries>
      <cron>
        <description>My Daily Backup</description>
        <url>/_ah/datastore_admin/backup.create?name=BackupToCloud&amp;kind=LogTitle&amp;kind=EventLog&amp;filesystem=gs&amp;gs_bucket_name=whitsend</url>
        <schedule>every 12 hours</schedule>
        <target>ah-builtin-python-bundle</target>
      </cron>
    </cronentries>
- Deploy this file with your app. (You can verify the Cron job you just deployed by clicking Cron Jobs in the left nav pane.)
Backups run on the schedule you specified. While a backup is running, it appears in the Pending Backups list. After the backup completes, you can view it and use it from the list of available backups in the Datastore Admin tab.
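The escaping requirement for cron.xml is easy to get wrong by hand. As a minimal sketch, the Python standard library can generate the file so that each "&" in the url is written as "&amp;" automatically. The helper name build_cron_xml is hypothetical and not part of App Engine; the element names and values are taken from the example above. The output is a single unindented line after the XML declaration.

```python
import xml.etree.ElementTree as ET

# URL copied from the example above; note the plain "&" separators.
BACKUP_URL = (
    "/_ah/datastore_admin/backup.create?name=BackupToCloud"
    "&kind=LogTitle&kind=EventLog&filesystem=gs&gs_bucket_name=whitsend"
)


def build_cron_xml(description, url, schedule,
                   target="ah-builtin-python-bundle"):
    """Return a cron.xml document as a string (hypothetical helper).

    ElementTree escapes "&" in element text, so the url is emitted
    with "&amp;" separators.
    """
    root = ET.Element("cronentries")
    cron = ET.SubElement(root, "cron")
    for tag, text in (
        ("description", description),
        ("url", url),
        ("schedule", schedule),
        ("target", target),
    ):
        ET.SubElement(cron, tag).text = text
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


cron_xml = build_cron_xml("My Daily Backup", BACKUP_URL, "every 12 hours")
print(cron_xml)
```

Parsing the generated file back with an XML parser should return the original url with plain "&" separators, which is a quick way to confirm the escaping is correct before deploying.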
Specifying Backups in a Cron File
These are the fields to include in your cron file to perform scheduled backups:
- description: This is the title that appears in the Cron Job list. It can be anything you wish.
- url: This field is required. As in the examples above, it must begin with /_ah/datastore_admin/backup.create, followed by a query string.
These fields can appear in the url query string:
  - name is an optional prefix that is prepended to the backup name. It helps you identify your backups. If not supplied, the default "cron-" is used.
  - kind can appear one or more times. Each value specifies an entity kind that you wish to back up. You must specify at least one entity kind. In the Datastore Admin Console, the default is that all entity kinds are backed up. With a cron backup, there is no such default: if you don't specify a kind, it doesn't get backed up.
  - queue is optional. It specifies the task queue to be used. If not supplied, the default task queue is used.
  - filesystem specifies the kind of storage to be used. The value "blobstore" means that Blobstore is used to store the backups; the value "gs" means that Google Cloud Storage is used. If no value is supplied, blobstore is used by default.
  - gs_bucket_name is required if you use Google Cloud Storage for backups. It specifies the bucket name used for storage.
  - namespace is optional. When provided, only entities from the selected namespace are included in the backup.
Note: The url cannot be longer than 2000 characters. As shown in the cron.xml Java example above, in cron.xml you must use the entity "&amp;" to separate fields rather than the bare ampersand character ("&"), because a bare ampersand is interpreted by XML.
- schedule: This field is required. It defines the recurring schedule on which the backup runs. For complete details, see the Schedule Format documentation for Python or Java.
- target: This field is required. It identifies the app version on which the cron backup job runs. You must use the value ah-builtin-python-bundle because that is the version of your app that contains the Datastore Admin features that the cron job needs to execute. Keep in mind that the cron backup job runs against this version of your app, so you incur costs while the job is running. (The ah-builtin-python-bundle version of your app is enabled when you enable Datastore Admin for your app.)
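The rules for the url query string described above can be sketched as a small Python helper that assembles the url and checks the stated constraints: at least one kind, gs_bucket_name when filesystem is "gs", and a maximum length of 2000 characters. The function build_backup_url and its constants are hypothetical names used for illustration only; they are not an App Engine API. The result uses plain "&" separators, as in cron.yaml, and would still need "&amp;" escaping for cron.xml.

```python
from urllib.parse import urlencode

BACKUP_PATH = "/_ah/datastore_admin/backup.create"
MAX_URL_LENGTH = 2000  # limit stated in the note on the url field


def build_backup_url(kinds, name=None, queue=None, filesystem=None,
                     gs_bucket_name=None, namespace=None):
    """Assemble a backup url for a cron entry (hypothetical helper)."""
    if not kinds:
        # Cron backups have no "all kinds" default.
        raise ValueError("at least one entity kind is required")
    if filesystem == "gs" and not gs_bucket_name:
        raise ValueError("gs_bucket_name is required when filesystem is 'gs'")

    params = []
    if name:
        params.append(("name", name))
    # The kind field is repeated once per entity kind.
    params.extend(("kind", kind) for kind in kinds)
    for key, value in (
        ("queue", queue),
        ("filesystem", filesystem),
        ("gs_bucket_name", gs_bucket_name),
        ("namespace", namespace),
    ):
        if value:
            params.append((key, value))

    url = BACKUP_PATH + "?" + urlencode(params)
    if len(url) > MAX_URL_LENGTH:
        raise ValueError("url is longer than %d characters" % MAX_URL_LENGTH)
    return url


example_url = build_backup_url(
    ["LogTitle", "EventLog"],
    name="BackupToCloud",
    filesystem="gs",
    gs_bucket_name="whitsend",
)
print(example_url)
```

Checking the constraints before deployment surfaces mistakes such as a missing kind early, rather than as a failed backup request in the logs.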
Warning! Backup, restore, copy, and delete operations are executed within your application, and thus count against your quota.
Very frequent backups often lead to higher costs. When you run a Datastore Admin job, you are actually running underlying MapReduce jobs. MapReduce jobs increase frontend instance hours, in addition to Storage operations and Storage usage. To keep an eye on your resource usage, click the Dashboard link under Main in the left navigation. At the top of the page, select ah-builtin-python-bundle from the Version drop-down menu.
When the scheduled backup runs, App Engine performs a GET using the backup
url. If the GET succeeds, it results in HTTP status 200. If it
fails, it results in HTTP status 400. You can look at the logs to determine
whether a backup succeeded or failed by doing the following:
- In the Admin Console for your application, click Logs in the left navigation pane, under Main.
- Locate the version pulldown menu, which is immediately to the right of the application pulldown. The app pulldown should show the name of your app; the version pulldown most likely shows the number 1.
- In the version pulldown, select ah-builtin-python-bundle to display the logs.
- Locate your backup job in the log to determine whether it succeeded or failed. If there was a failure, in addition to the status code 400, there will be an error message to help you determine the cause of the error.
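The status check in the steps above can be sketched in Python. This sketch assumes the relevant log entries have already been copied into (path, status) pairs by some means; that record shape and the sample data are hypothetical and do not reflect an App Engine log format or API. Only the meaning of status 200 and 400 comes from the text above.

```python
BACKUP_PATH = "/_ah/datastore_admin/backup.create"


def backup_outcomes(entries):
    """Classify backup requests from hypothetical (path, status) pairs."""
    results = []
    for path, status in entries:
        if not path.startswith(BACKUP_PATH):
            continue  # not a scheduled backup request
        if status == 200:
            outcome = "succeeded"
        elif status == 400:
            outcome = "failed"
        else:
            outcome = "unknown"  # status not described in this document
        results.append((path, outcome))
    return results


# Made-up sample data for illustration only.
sample_entries = [
    (BACKUP_PATH + "?name=BackupToCloud&kind=LogTitle", 200),
    ("/some/other/handler", 200),
    (BACKUP_PATH + "?name=BackupToCloud", 400),
]
outcomes = backup_outcomes(sample_entries)
for path, outcome in outcomes:
    print(outcome, path)
```

For a failed request, the error message logged alongside the 400 status remains the place to look for the cause.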