Google App Engine

Backup/Restore, Copy, and Delete Data

Experimental!

Datastore Administration is an experimental, innovative, and rapidly changing new feature for Google App Engine. Unfortunately, being on the bleeding edge means that we may make backwards-incompatible changes to Datastore Administration. We will inform the community when this feature is no longer experimental.
 


Enabling Datastore Admin for an application

To use the features of the Datastore Admin tab, you must enable Datastore Admin for your application using the Application Settings page of the Administration Console. Under the "Data" heading in the left-hand navigation menu, click Datastore Admin, then click Enable Datastore Admin in the page that appears.

Caveats on using data admin features

  • Copy, delete, and backup operations may not include recent updates.
  • All Datastore Admin operations occur within your application, and thus count against your quota.
  • We strongly recommend that you set your application to read-only mode during a backup or restore.
  • When copying entities to a remote target or restoring from a backup, all entities with the same keys are overwritten. Operations can be performed multiple times without the risk of creating duplicates. Be aware that the copy/restore operations do not delete extra data.
  • If a non-default queue is chosen for backup/restore, it must not have any target other than ah-builtin-python-bundle specified in queue.yaml.
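The queue caveat above can be checked mechanically before a job is started. The following is a minimal sketch, assuming the queue definitions have already been parsed from queue.yaml into dictionaries; the helper name `usable_for_backup` and the sample queues are hypothetical and not part of App Engine.

```python
# Target that Datastore Admin jobs run on, as named in the caveat above.
DATASTORE_ADMIN_TARGET = "ah-builtin-python-bundle"


def usable_for_backup(queue):
    """Return True if a queue entry has no target or only the Datastore Admin target.

    `queue` is a dict shaped like one entry under `queue:` in queue.yaml.
    """
    target = queue.get("target")
    return target is None or target == DATASTORE_ADMIN_TARGET


# Hypothetical queue definitions, as they might look after parsing queue.yaml.
queues = [
    {"name": "default", "rate": "5/s"},
    {"name": "backups", "rate": "10/s", "target": DATASTORE_ADMIN_TARGET},
    {"name": "reports", "rate": "1/s", "target": "reporting-version"},
]

# Keep only the queues that satisfy the caveat.
usable = [q["name"] for q in queues if usable_for_backup(q)]
print(usable)
```

In this sample, the "reports" queue is excluded because it names a different target.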

Very frequent backups often lead to higher costs. When you run a Datastore Admin job, you are actually running underlying MapReduce jobs. MapReduce jobs increase frontend instance hours in addition to Storage operations and Storage usage. To keep an eye on your resource usage, click the Dashboard link under Main in the left navigation. At the top of the page, select ah-builtin-python-bundle from the Version drop-down menu.

Backup and restore

You can use the Datastore Admin tab of the Admin Console to back up entities of selected kinds and, when needed, restore from a selected backup. Backup and restore affect all namespaces.

The backup and restore feature is intended to help you recover from accidental deletes of data. When you restore from a backup, any new entities added since the backup are retained, and entities that existed at backup-time and that were modified after the backup are overwritten with values from the backup. You can restore all data from a backup or you can restore specific entity kinds from the backup. In addition, you can also use this feature to restore a backup of one app's data to some other app, provided that you use Google Cloud Storage for your backups.
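The restore semantics described above can be modelled with plain dictionaries. This is only an illustration of the behavior, not the Datastore Admin implementation; the `restore` function, the (kind, id) tuples standing in for entity keys, and the sample values are all hypothetical.

```python
def restore(current, backup, kinds=None):
    """Simulate a restore: backup values overwrite matching keys, nothing is deleted.

    Keys are (kind, id) tuples standing in for entity keys. If `kinds` is
    given, only entities of those kinds are restored.
    """
    result = dict(current)
    for key, value in backup.items():
        kind = key[0]
        if kinds is None or kind in kinds:
            result[key] = value
    return result


backup = {
    ("Greeting", 1): "hello",
    ("Greeting", 2): "world",
    ("Author", 1): "ann",
}
current = {
    ("Greeting", 1): "hello (edited)",  # modified after the backup
    # ("Greeting", 2) was deleted by accident after the backup
    ("Author", 1): "ann",
    ("Greeting", 3): "added later",     # created after the backup
}

restored = restore(current, backup)
# Restoring a second time changes nothing, so no duplicates appear.
restored_twice = restore(restored, backup)
```

After the call, the edited entity is back to its backed-up value, the deleted entity exists again, and the entity created after the backup is still present.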

You can use either Blobstore or Google Cloud Storage as the location where backups are stored, as described in the instructions below. If you choose the Google Cloud Storage (GCS) option, you must specify a bucket. Applications created with App Engine 1.9.0 or greater have a GCS default bucket that you can use with no further configuration or permissions necessary. Or, if you wish, you can use another GCS bucket that you have set up with write permissions for your app.

Backing up data

To back up your data:

  1. Optionally, disable datastore writes for your app.
  2. Go to the Datastore Admin screen in the Data section of the Administration Console.
  3. Select the entity kind(s) that you wish to back up.
  4. Click Backup Entities.
  5. In the advisory page that is displayed, notice that the default queue is used for the backup job. Change this to another queue if desired, making sure the queue chosen does not have any target specified in queue.yaml other than ah-builtin-python-bundle.
  6. Also, in the advisory page, notice that a backup name is supplied and that it includes a datestamp. You can change this to anything you want. (If you make more than one backup per day, you'll have to change this value.)
  7. In the advisory page, select the backup storage location by making the appropriate choice in the dropdown menu, either Blobstore or Google Cloud Storage.
  8. If you choose Google Cloud Storage (GCS), you are prompted for the bucket name where the backups are to be stored, in the format /gs/my_bucket. Note that you can optionally specify the bucket name suffixed with a directory structure (e.g., bucket_name/backups/backup1). If those folders you specify don't already exist, they will be created.

    If you don't use the default GCS bucket, you must create a new GCS bucket and assign permissions. Be aware that the backup cannot complete unless your app has write permissions for that bucket, as described in the Google Cloud Storage documents.

  9. Start the backup jobs by clicking Backup Entities. Notice that a job status page is displayed.

  10. If you disabled writes, re-enable datastore writes for your application.
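The bucket location entered in step 8 has two parts: the bucket name and an optional directory suffix. The sketch below shows one way to split such a value. It assumes the /gs/ prefix may or may not be present, since both forms appear in the step; the function name `split_gcs_location` is hypothetical.

```python
def split_gcs_location(location):
    """Split '/gs/bucket[/dir/...]' or 'bucket[/dir/...]' into (bucket, directory)."""
    path = location
    if path.startswith("/gs/"):
        path = path[len("/gs/"):]
    # Ignore leading and trailing slashes so 'bucket/' and 'bucket' are equivalent.
    path = path.strip("/")
    bucket, _, directory = path.partition("/")
    if not bucket:
        raise ValueError("missing bucket name in %r" % location)
    return bucket, directory


plain = split_gcs_location("/gs/my_bucket")
nested = split_gcs_location("bucket_name/backups/backup1")
print(plain, nested)
```

An empty directory part means that the backup files go to the top level of the bucket.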

Aborting a backup

If backup jobs are currently running, they appear in a Pending Backups list in the Datastore Admin screen. You can stop these running backups by selecting the backup in the list and clicking Abort. When you abort a backup job, App Engine attempts to delete backup data that has been saved up to that point. However, in some cases, some files may remain after the abort. You can locate these files in the location you chose for your backups (Blobstore or Google Cloud Storage) and safely delete them after the abort completes. The names of such files will start with the following pattern: datastore_backup__your_backup_name_.
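Given a listing of file names from the backup location, leftover files can be picked out by their prefix. The sketch below reads the pattern above as the literal text datastore_backup_ followed by your backup name; that reading is an assumption, as are the function name and the sample file names.

```python
def leftover_backup_files(file_names, backup_name):
    """Return the names that start with the assumed leftover-file prefix."""
    prefix = "datastore_backup_" + backup_name
    return [name for name in file_names if name.startswith(prefix)]


# Hypothetical listing of the backup location after an aborted job.
listing = [
    "datastore_backup_nightly_Greeting-0",
    "datastore_backup_nightly_Author-0",
    "datastore_backup_weekly_Greeting-0",
    "unrelated_upload.png",
]

leftovers = leftover_backup_files(listing, "nightly")
print(leftovers)
```

Because this is a plain prefix match, a backup name that is itself a prefix of another backup name would match both, so review the result before deleting anything.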

Finding information about a backup

You may want to find out details about a backup, such as which entity kinds it contains, where it was saved (e.g., Blobstore or Google Cloud Storage), and its starting and ending time. To display this backup information:

  1. Select one or more backups in the Backups or Pending Backups list.
  2. Click Info to display information for those backups.
  3. Click Back to return to the main Datastore Admin screen.

Scheduled backups

You can run scheduled backups using the App Engine Cron service. For details, see Scheduled Backups.
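A cron entry for a scheduled backup points at a backup URL served by the Datastore Admin version of the app. The sketch below builds such a URL. The path /_ah/datastore_admin/backup.create and the parameter names (name, kind, filesystem, gs_bucket_name) are written here from memory of the Scheduled Backups page and should be treated as assumptions to verify there; the function name and sample values are hypothetical. The helper is a standalone Python 3 script, not application code.

```python
from urllib.parse import urlencode


def backup_create_url(name, kinds, gs_bucket_name=None):
    """Build the relative URL a cron job would request to start a backup.

    Path and parameter names are assumptions; check the Scheduled Backups page.
    """
    params = [("name", name)]
    # One `kind` parameter per entity kind to back up.
    params.extend(("kind", kind) for kind in kinds)
    if gs_bucket_name is not None:
        # Store the backup in Google Cloud Storage instead of Blobstore.
        params.append(("filesystem", "gs"))
        params.append(("gs_bucket_name", gs_bucket_name))
    return "/_ah/datastore_admin/backup.create?" + urlencode(params)


url = backup_create_url("BackupToCloud", ["LogTitle", "EventLog"], "my_bucket")
print(url)
```

The cron entry that uses this URL would also need ah-builtin-python-bundle as its target, in line with the queue caveat earlier on this page.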

Restoring data

To restore from a backup:

  1. Optionally, disable datastore writes for your app. (It's normally a good idea to do this to avoid conflicts between the restore and any new data written to the datastore.)
  2. Go to the Datastore Admin screen in the Data section of the Administration Console.
  3. In the list of available backups, select the backup that you want to restore from.
  4. Click Restore.
  5. In the advisory page that is displayed, notice the list of entities with checkboxes. By default, all of the entities will be restored. Uncheck the checkbox next to each entity that you don't want to restore.
  6. Also in the advisory page, notice that the default queue, with its pre-configured performance settings, is used for the restore job. Change this to another queue that you have configured differently if you need different queue performance characteristics, making sure the queue chosen does not have any target specified in queue.yaml other than ah-builtin-python-bundle.
  7. Start the restore by clicking Restore. Notice that a job status page is displayed.
  8. If you disabled writes, re-enable datastore writes for your application.

Restoring data to another app

If you back up your data using Google Cloud Storage, you can restore backups to apps other than the one used to create the backup.

To restore backup data from one app to a different app:

  1. Using the Google API Console, locate the project that has the bucket used for your backups and add the target app (the app you are restoring to) to the project team with Edit permissions. Alternatively, modify the ACLs for both the bucket and the objects that are saved to the bucket. (Make sure you specify the FULL_CONTROL permission for the default object ACL.)
  2. Make a new backup in the application whose data is to be copied. The permissions set in the previous step are not retroactive to existing backups, so the target app will not be able to access those earlier backups. The target app can access only backups made after it was given permissions.
  3. Optionally, disable datastore writes for your target app. (It's normally a good idea to do this to avoid conflicts between the restore and any new data written to the datastore.)
  4. For the target app, go to the Datastore Admin screen in the Data section of the Administration Console.
  5. In the textbox next to the button labelled Import Backup Information specify the bucket containing the backup, in the format /gs/my_bucket: this will result in a displayed list of all the backups in that bucket. Alternatively, supply the file handle for a specific backup; the handle can be obtained from the source application by selecting the backup and clicking Info; the file handle appears next to the label Handle.
  6. Click Import Backup Information.
  7. The resulting selection page shows the available backups for the bucket you specified, unless you specified a backup by its handle. Select the desired backup and click one of the following:
    • Add to Backup List if you want this backup to be retained in the list of available backups for your app.
    • Restore From Backup if you want to restore from this backup but do not want the backup displayed in the list of available backups for your app.
  8. In the advisory page that is displayed, notice the list of entities with checkboxes. By default, all of the entities will be restored. Uncheck the checkbox next to each entity that you don't want to restore.
  9. Also in the advisory page, notice that the default queue, with its pre-configured performance settings, is used for the restore job. Change this to another queue that you have configured differently if you need different queue performance characteristics.
  10. Start the restore by clicking Restore. Notice that a job status page is displayed.
  11. If you disabled writes, re-enable datastore writes for your application.

Copying entities to another application

You can use the Datastore Admin tab of the Admin Console to copy all entities of a kind, or all entities of all kinds, to another application. The datastore copy functionality uses the Datastore Admin screen in the Data section of the source application's Admin Console. The default queue, with its pre-configured performance settings, is used for the copying job. Change this to another queue that you have configured differently if you need different queue performance characteristics.

From the Datastore Admin screen, you can select and copy entity kind(s) with the click of a button.

A note for Java developers

The datastore copy feature is currently available only for Python applications. If your target app is built in Java, you'll need to create a non-default Python runtime for it and use that as the target application. The following steps describe how to create a non-default Python runtime for your target application using a sample application called datastore_admin:

  1. Download Python 2.7.
  2. Download the datastore_admin app.
  3. Download the Python SDK.
  4. Grant permission for the source application to write to the target application, as described in step 3 of Procedure for copying a datastore.
  5. Run appcfg.py -A <your_app_id> update <directory-of-demo-app>

Procedure for copying a datastore

To copy a datastore:

  1. Make sure Datastore Admin is enabled for your application.
  2. Enable the remote_api builtin for the target application so it can receive data from the source. To do this, add the following to app.yaml:

    builtins:
    - remote_api: on
    
  3. Grant permission for the source app to write to the target app by doing the following:

    1. Add the following to the appengine_config.py file in the root directory of your application:

      # Allow requests whose inbound app ID header matches one of the listed source apps.
      remoteapi_CUSTOM_ENVIRONMENT_AUTHENTICATION = ('HTTP_X_APPENGINE_INBOUND_APPID', ['source appid here'])
      

      If you do not have an appengine_config.py file, you can create a new one or copy the sample located in google/appengine/ext/appstats/sample_appengine_config.py.

    2. Upload the modified version of your application using appcfg.py update.

  4. Set the source application to read-only mode.

    Although this step is not required, it is strongly recommended. Copying entities into a new datastore takes time and writes done during the copy might not be transferred during the copy. If the source application is not in read-only mode, you can still copy data, but you'll see a notreadonly warning in the Admin Console. The destination datastore is not guaranteed to receive a complete copy of the new data unless writes are disabled.

  5. Select the entity kind(s) to copy individually or in bulk, and copy them using Copy To Other App. On the confirmation screen, enter the remote endpoint of the target app:

    • The typical remote endpoint is http://<your_target_app_id>.appspot.com/_ah/remote_api.
    • If you used the sample app to copy your data, the remote endpoint is http://datastore-admin.<your_target_app_id>.appspot.com/_ah/remote_api.
    • If you used an alternate major version, the remote endpoint of your target application is http://<app_version>.<your_target_app_id>.appspot.com/_ah/remote_api.

    After you confirm, the system validates the request. If the remote_api connection can be established, one or more mapreduce operations begin to copy the data. You can follow the link to see the status of the initial set of mapreduce operations. If the application uses the namespace feature, a mapreduce runs for each namespace for each entity kind.

    You can view a summary of the copy status from the Datastore Admin page. To view the individual mapreduce status, you can visit http://<your_target_app_id>.appspot.com/_ah/mapreduce/.

  6. If you disabled datastore writes as recommended prior to the copy, re-enable writes.
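The three remote endpoint forms listed in step 5 differ only in an optional version prefix on the host name. The following sketch builds the endpoint from the target app ID and an optional version name; the function name `remote_api_endpoint` and the sample app ID are hypothetical.

```python
def remote_api_endpoint(target_app_id, version=None):
    """Build the remote_api endpoint to enter on the copy confirmation screen."""
    host = "%s.appspot.com" % target_app_id
    if version:
        # A non-default version is addressed as <version>.<app_id>.appspot.com.
        host = "%s.%s" % (version, host)
    return "http://%s/_ah/remote_api" % host


# Hypothetical target app ID, used for illustration only.
typical = remote_api_endpoint("my-target-app")
sample_app = remote_api_endpoint("my-target-app", version="datastore-admin")
print(typical)
print(sample_app)
```

Passing version="datastore-admin" gives the form used when the sample app receives the copy, and any other version name gives the alternate major version form.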

Deleting entities in bulk

You can use the Datastore Admin tab of the Admin Console to delete all entities of a kind, or all entities of all kinds, in all namespaces. To enable this feature, simply enable Datastore Admin for your application in the Administration Console.

Once Datastore Admin is enabled, the Datastore Admin screen is available in the Data section of the Administration Console. From this screen, you can select the entity kind(s) to delete individually or in bulk, and delete them using the Delete Entities button.
