Google App Engine

Google Cloud Storage Python API Overview

Deprecation Notice: The Files API will be removed at some point in the future and is replaced by the Google Cloud Storage Client Library. The documentation for the deprecated API is retained for the convenience of developers still using the Files API.

The Google Cloud Storage API allows your application to store and serve your data as opaque blobs known as "objects". While Cloud Storage offers a RESTful API, this documentation describes how to use Cloud Storage on App Engine.

  1. Introducing Google Cloud Storage
  2. Prerequisites
  3. Using Google Cloud Storage
  4. Complete sample app
  5. Quotas and limits

Introducing Google Cloud Storage

Google App Engine provides an easier way to read from and write to Google Cloud Storage objects, which allows applications to create and serve data objects. These objects are stored within buckets in Cloud Storage and can also be accessed by Google App Engine applications through the Google Cloud Storage API. You can interact with Google Cloud Storage using the RESTful interface or, from Google App Engine applications, through the Google Cloud Storage Python API, which is the subject of this document.

Cloud Storage is useful for storing and serving large files because it does not have a size limit for your objects. Additionally, Cloud Storage offers the use of access control lists (ACLs), the ability to resume upload operations if they're interrupted, use of the POST policy document, and many other features.

Once you have created an object, it cannot be modified. To modify an existing object, you need to overwrite the object with a new object that contains your desired changes.

For pricing information, see the Google Cloud Storage pricing page.

Prerequisites

To use the Cloud Storage API, you must complete the following prerequisites:

  1. Use App Engine SDK 1.5.5 or later.

    The Google Cloud Storage API for Python is available in App Engine SDK 1.5.5 and later. If you are using an older SDK, download the latest SDK.

    For instructions to update or download the App Engine SDK, see the downloads page.

  2. Activate Cloud Storage.

    Activate the Cloud Storage service in the Google Developers Console.

  3. Set up billing.

    Google Cloud Storage requires that you provide billing information. For details about charges, see the Google Cloud Storage pricing page.

  4. Create a bucket.

    Cloud Storage is made up of buckets, basic containers that hold your data. Buckets hold all your data in the form of objects, individual pieces of data that you upload to Google Cloud Storage. Before you can use the Google Cloud Storage API, you need to create one or more buckets where you would like to store your data. The easiest way to do so is to use the gsutil tool or the online Google Storage browser, which is accessible through the Google Developers Console.

  5. Give permissions to your bucket or objects.

    To enable your app to create new objects in a bucket, you need to do the following:

    1. Log into the App Engine Admin Console.
    2. Click on the application you want to authorize for your Cloud Storage bucket.
    3. Click on Application Settings under the Administration section on the left-hand side.
    4. Copy the value under Service Account Name. This is the service account name of your application, in the format application-id@appspot.gserviceaccount.com. If you are using an App Engine Premier Account, the service account name for your application is in the format application-id.example.com@appspot.gserviceaccount.com.
    5. Grant access permissions using one of the following methods:
      • The easiest way to grant app access to a bucket is to use the Google Developers Console to add the service account name of the app as a team member to the project that contains the bucket. You can do this under Permissions in the left sidebar of the Google Developers Console. The app should have edit permissions if it needs to write to the bucket. For information about permissions in Cloud Storage, see Scopes and Permissions. Add more apps to the project team if desired.

        Note: In some circumstances, you might not be able to add the service account as a team member. If you cannot add the service account, use the alternative method, bucket ACLs, as described next.

      • An alternate way to grant app access to a bucket is to manually edit and set the bucket ACL and the default object ACL using the gsutil utility:
        1. Get the ACL for the bucket and save it to a file for editing: gsutil acl get gs://mybucket > myAcl.txt
        2. Add the following Entry to the ACL file you just retrieved:
          <Entry>
            <Scope type="UserByEmail">
               <EmailAddress>
                  your-application-id@appspot.gserviceaccount.com
               </EmailAddress>
            </Scope>
            <Permission>
               WRITE
            </Permission>
          </Entry>
        3. If you are adding multiple apps to the ACL, repeat the above entry for each app, changing only the email address to reflect each app's service account name.
        4. Set the modified ACL on your bucket: gsutil acl set myAcl.txt gs://mybucket
        5. You also need to set the default object ACL on the bucket. Many App Engine features require access to objects in the bucket, for example, the Datastore Admin backup and restore feature. To set the default object ACL on the bucket:
          1. Get the default object ACL for the bucket and save it to a file for editing: gsutil defacl get gs://mybucket > myDefAcl.txt
          2. Add the same entries that you added above to the bucket ACL, but replacing WRITE with FULL_CONTROL.
          3. Set the modified default object ACL on your bucket: gsutil defacl set myDefAcl.txt gs://mybucket
      • If you need to prevent non-authorized applications from reading certain objects, you can:
        1. Set the predefined project-private ACL on your objects manually, or
        2. Store your objects in a bucket with the default project-private object ACL.
        3. Then, add the service account as a project viewer in the Google Developers Console, giving your App Engine application read-only access to your objects. Alternatively, you can set the predefined public-read ACL, which allows all users, whether or not they are authenticated, to read your object. In that case you do not need to add your application's service account name as a project viewer. If you have data you do not want to be publicly accessible, do not set the public-read ACL on your object.

          Note: Setting READ permission only on your buckets does not provide READ permission to any objects inside those buckets. You must set READ permission on objects individually or by setting the default ACL before creating any objects in the bucket.
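The gsutil method above asks you to add one Entry per app to the ACL file. As a rough illustration only, the following sketch generates that Entry text with Python's standard xml.etree.ElementTree module. The function names and the application IDs are hypothetical, and the sketch does not read or modify the file that gsutil produces; you would still paste the output into myAcl.txt or myDefAcl.txt yourself.

```python
import xml.etree.ElementTree as ET


def service_account_name(app_id):
    """Return the service account name in the format described in step 4."""
    return '%s@appspot.gserviceaccount.com' % app_id


def acl_entry(app_id, permission):
    """Build one <Entry> element granting `permission` to an app."""
    entry = ET.Element('Entry')
    scope = ET.SubElement(entry, 'Scope', {'type': 'UserByEmail'})
    email = ET.SubElement(scope, 'EmailAddress')
    email.text = service_account_name(app_id)
    perm = ET.SubElement(entry, 'Permission')
    perm.text = permission
    return entry


# Hypothetical application IDs. WRITE is the permission used for the bucket
# ACL above; FULL_CONTROL is the one used for the default object ACL.
for app_id in ['my-first-app', 'my-second-app']:
    print(ET.tostring(acl_entry(app_id, 'WRITE')).decode('utf-8'))
```

Calling acl_entry(app_id, 'FULL_CONTROL') instead yields the matching entries for the default object ACL.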

Using Google Cloud Storage

Applications can use Cloud Storage to read from and write to large files of any type, or to upload, store, and serve large files from users. Files are called "objects" in Google Cloud Storage once they're uploaded.

Before you begin

You can use the Cloud Storage API from your App Engine application to read and write data to Cloud Storage. In order to use the Cloud Storage Python API, you must include the following import statement at the beginning of your code:

from google.appengine.api import files

Creating an object

To create an object, call the files.gs.create() function. You need to pass a file name, which must be of the form /gs/bucket_name/desired_object_name, and a MIME type:

# Create a file
filename = '/gs/my_bucket/my_file'
writable_file_name = files.gs.create(filename, mime_type='application/octet-stream', acl='public-read')

In this example, we also pass in the acl='public-read' parameter, as described below. If you do not set this parameter, Cloud Storage sets it to null and uses the default object ACL for that bucket (by default, this is project-private).
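Because the file name must follow the /gs/bucket_name/desired_object_name form, it can help to build it in one place. The helper below is a hypothetical convenience, not part of the Files API, and its validation rules are assumptions chosen for illustration (a non-empty bucket name without slashes and a non-empty object name).

```python
def gs_path(bucket_name, object_name):
    """Build a '/gs/bucket_name/object_name' path for files.gs.create()."""
    if not bucket_name or '/' in bucket_name:
        raise ValueError('bucket name must be non-empty and contain no "/"')
    if not object_name:
        raise ValueError('object name must be non-empty')
    return '/gs/%s/%s' % (bucket_name, object_name)


filename = gs_path('my_bucket', 'my_file')
print(filename)  # /gs/my_bucket/my_file
```

The resulting string can be passed as the first argument to files.gs.create() exactly as in the example above.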

You can use the following optional parameters to set ACLs and HTTP headers on your object:

  • acl

    Set a predefined ACL on your object. If you do not set this parameter, Cloud Storage sets it to null and uses the default object ACL for that bucket (by default, this is project-private).
    Usage: acl='public-read'
    Default: None

  • cache_control

    Set a Cache-Control header on your object.
    Usage: cache_control='no-cache'
    Default: 'max-age=3600'

  • content_encoding

    If your object is compressed, specify the compression method using the Content-Encoding header.
    Usage: content_encoding='gzip'
    Default: None

  • content_disposition

    Set the Content-Disposition header for your object.
    Usage: content_disposition='attachment;filename=filename.ext'
    Default: None

  • user_metadata

    Set a dictionary of custom headers and values. These custom headers are added to the object in the format x-goog-meta-custom_header: custom_value
    Usage:
      params = {'header1':'value1', 'header2':'value2'}
      files.gs.create(..., user_metadata=params)
    Default: None

The following code snippet provides an example using some of these optional parameters:

params = {'date-created':'092011', 'owner':'Jon'}
files.gs.create(filename='/gs/my_bucket/my_object',
          acl='public-read',
          mime_type='text/html',
          cache_control='no-cache',
          user_metadata=params)
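To make the x-goog-meta- naming concrete, the following sketch shows how the user_metadata dictionary from the snippet above would map to header names under the format described for that parameter. The function metadata_headers is hypothetical and only mirrors the documented naming; the actual headers are set by Cloud Storage, not by this code.

```python
def metadata_headers(user_metadata):
    """Map user_metadata keys to the x-goog-meta-* header names."""
    return dict(('x-goog-meta-%s' % key, value)
                for key, value in user_metadata.items())


params = {'date-created': '092011', 'owner': 'Jon'}
for name, value in sorted(metadata_headers(params).items()):
    print('%s: %s' % (name, value))
# x-goog-meta-date-created: 092011
# x-goog-meta-owner: Jon
```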

Opening and writing to the object

Before you can write to the object, you must open it for writing by calling files.open(), passing in the writable file name returned by files.gs.create() and 'a' as parameters. Then, write to the object by calling write() on the file object that files.open() returns:

# Open and write the file.
with files.open(writable_file_name, 'a') as f:
    f.write('Hello World!')
    f.write('This is my first Google Cloud Storage object!')
    f.write('How exciting!')

The parameter 'a' opens the file for writing. To open a file for reading, pass in the parameter 'r', as described below.

Finalizing the object

Once you are done writing to the object, close it by calling files.finalize():

# Finalize the file.
files.finalize(writable_file_name)

Once you finalize an object, you can no longer write to it. If you want to modify your file, you will need to create a new object with the same name to overwrite it.
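The ordering rules in these sections (write only before finalizing, read only after finalizing) can be pictured with a small in-memory model. This is a toy stand-in written for illustration, not the Files API: the class name ToyObject is hypothetical, and the IOError it raises is an arbitrary choice that does not reflect the errors the real API reports.

```python
class ToyObject(object):
    """In-memory stand-in that mimics the write/finalize/read ordering."""

    def __init__(self):
        self._chunks = []
        self.finalized = False

    def write(self, data):
        # Mirrors the rule that a finalized object can no longer be written.
        if self.finalized:
            raise IOError('object is finalized; create a new object instead')
        self._chunks.append(data)

    def finalize(self):
        self.finalized = True

    def read(self):
        # Mirrors the rule that an object must be finalized before reading.
        if not self.finalized:
            raise IOError('object must be finalized before it can be read')
        return ''.join(self._chunks)


obj = ToyObject()
obj.write('Hello World!')
obj.finalize()
print(obj.read())  # Hello World!
try:
    obj.write('more')
except IOError as err:
    print('write refused: %s' % err)
```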

Reading the object

Before you can read an object, you must finalize the object.

To read an object, call files.open(), passing in the full file name in the format '/gs/bucket/object' and 'r' as parameters. Passing 'r' opens the file for reading, as opposed to 'a', which opens it for writing. f.read() takes the number of bytes to read as a parameter:

# Open and read the file.
print 'Opening file', filename
with files.open(filename, 'r') as f:
    data = f.read(1000)
    while data:
        print data
        data = f.read(1000)

An application can read at most 32 megabytes with one API call.
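The read loop above can be wrapped in a generator that also guards the chunk size against the per-call limit. This is a minimal sketch under two assumptions: the 32 megabyte limit is interpreted as 32 * 1024 * 1024 bytes, and io.BytesIO stands in for the file object returned by files.open(filename, 'r') so that the example runs without App Engine. The name read_chunks is hypothetical.

```python
import io

# Assumption: "32 megabytes" is taken to mean 32 * 1024 * 1024 bytes.
MAX_READ_BYTES = 32 * 1024 * 1024


def read_chunks(fp, chunk_size=1000):
    """Yield successive chunks from a file-like object until it is exhausted."""
    if not 0 < chunk_size <= MAX_READ_BYTES:
        raise ValueError('chunk_size must be between 1 and %d' % MAX_READ_BYTES)
    while True:
        data = fp.read(chunk_size)
        if not data:
            break
        yield data


# io.BytesIO stands in for the object returned by files.open(filename, 'r').
fake_object = io.BytesIO(b'x' * 2500)
print([len(chunk) for chunk in read_chunks(fake_object)])  # [1000, 1000, 500]
```

Because read_chunks is a generator, the size check runs when iteration starts rather than when the function is called.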

Complete sample app

The following is a sample application that demonstrates one use of the Cloud Storage API within an App Engine application. In order to successfully run the application, you must be an owner or have WRITE access to a bucket.

The sample application demonstrates two different use cases of the Cloud Storage API:

  • Making one-time requests

    The EchoPage handler lets you make a one-time request that creates, writes to, and reads an object using the request format http://your_app_id.appspot.com/echo?message=Your+Message.

  • Making multiple requests over time

    The MainPage, CreatePage, AppendPage, and ReadPage handlers create and maintain a list of event logs, storing each event log as an object. This demonstrates how to use the Cloud Storage API to write to an object across multiple requests.

    The main page of this application shows a form field labeled Title for the title of the event log (and the object). After you provide a title and click the New Event Log button, the next page provides another form field to enter a message to write to the object. Enter your message and click Append message. If you want to finalize the object, which means you cannot append to it later, check the Finalize log box. Your new object will appear on the main page.

To set up and run this application:

  1. Follow the instructions to create a simple Hello World app on the first Python Hello World page.
  2. Edit your app.yaml file and replace the value of the application: setting with the registered ID of your App Engine application.
  3. Replace the contents of helloworld.py with the following sample code, replacing the parts where it asks for a bucket name.

    # Copyright 2011 Google Inc. All Rights Reserved.
    """Create, Write, Read and Finalize Google Cloud Storage objects.
    
    EchoPage will create, write and read the Cloud Storage object in one request:
      http://your_app_id.appspot.com/echo?message=Leeroy+Jenkins
    
    MainPage, CreatePage, AppendPage and ReadPage will do the same in multiple
    requests:
      http://your_app_id.appspot.com/
    """
    
    import cgi
    import urllib
    import webapp2
    from google.appengine.api import files
    from google.appengine.ext import db
    
    try:
        files.gs
    except AttributeError:
        import gs
        files.gs = gs
    
    
    class EchoPage(webapp2.RequestHandler):
        """A simple echo page that writes and reads the message parameter."""
    
        # TODO: Change to a bucket your app can write to.
        READ_PATH = '/gs/bucket/obj'
    
        def get(self):
            # Create a file that writes to Cloud Storage and, because of the
            # public-read ACL, is readable by everyone.
            write_path = files.gs.create(self.READ_PATH, mime_type='text/plain',
                                         acl='public-read')
            # Write to the file.
            with files.open(write_path, 'a') as fp:
                fp.write(self.request.get('message'))
    
            # Finalize the file so it is readable in Google Cloud Storage.
            files.finalize(write_path)
    
            # Read the file from Cloud Storage.
            with files.open(self.READ_PATH, 'r') as fp:
                buf = fp.read(1000000)
                while buf:
                    self.response.out.write(buf)
                    buf = fp.read(1000000)
    
    
    class EventLog(db.Model):
        """Stores information used between requests."""
        title = db.StringProperty(required=True)
        read_path = db.StringProperty(required=True)
        write_path = db.TextProperty(required=True)  # Too long for StringProperty
        finalized = db.BooleanProperty(default=False)
    
    
    class MainPage(webapp2.RequestHandler):
        """Prints a list of event logs and a link to create a new one."""
    
        def get(self):
            """Page to list event logs or create a new one.
    
            Web page looks like the following:
              Event Logs
                * Dog barking
                * Annoying Squirrels (ongoing)
                * Buried Bones
              [New Event Log]
            """
            self.response.out.write(
                """
                <html> <body>
                <h1>Event Logs</h1>
                <ul>
                """)
            # List all the event logs in the datastore.
            for event_log in db.Query(EventLog):
                # Each EventLog has a unique key.
                key_id = event_log.key().id()
                if event_log.finalized:
                    # Finalized events must be read
                    url = '/read/%d' % key_id
                    title = '%s' % cgi.escape(event_log.title)
                else:
                    # Can only append to writable events.
                    url = '/append/%d' % key_id
                    title = '%s (ongoing)' % cgi.escape(event_log.title)
                self.response.out.write(
                    """
                    <li><a href="%s">%s</a></li>
                    """ % (url, title))
            # A form to allow the user to create a new Cloud Storage object.
            self.response.out.write(
                """
                </ul>
                <form action="create" method="post">
                  Title: <input type="text" name="title" />
                  <input type="submit" value="New Event Log" />
                </form>
                </body> </html>
                """)
    
    
    class CreatePage(webapp2.RequestHandler):
        """Creates a Cloud Storage object that multiple requests can write to."""
    
        BUCKET = 'my-bucket'  # TODO: Change to a bucket your app can write to.
    
        def post(self):
            """Create an event log that multiple requests can build.
    
            This creates an appendable Cloud Storage object and redirects the user
            to the append page.
            """
            # Choose an interesting title for the event log.
            title = self.request.get('title') or 'Unnamed'
    
            # We will store the event log in a Google Cloud Storage object.
            # The Google Cloud Storage object must be in a bucket the app has access
            # to, and use the title for the key.
            read_path = '/gs/%s/%s' % (self.BUCKET, title)
            # Create a writable file that eventually becomes our Google Cloud
            # Storage object after we finalize it.
            write_path = files.gs.create(read_path, mime_type='text/plain')
            # Save these paths as well as the title in the datastore so we can find
            # this during later requests.
            event_log = EventLog(
                read_path=read_path, write_path=write_path, title=title)
            event_log.put()
            # Redirect the user to the append page, where they can start creating
            # the file.
            self.redirect('/append/%d?info=%s' % (
                event_log.key().id(), urllib.quote('Created %s' % title)))
    
    
    class AppendPage(webapp2.RequestHandler):
        """Appends data to a Cloud Storage object between multiple requests."""
    
        @property
        def key_id(self):
            """Extract 123 from /append/123."""
            return int(self.request.path[len('/append/'):])
    
        def get(self):
            """Display a form the user can use to build the event log.
    
            Web page looks like:
              Append to Event Title
    
              /--------------\
              |Log detail... |
              |              |
              |              |
              \--------------/
    
              [ ] Finalize log
              [ Append message ]
            """
    
            # Grab the title, which we saved to an EventLog object in the datastore.
            event_log = db.get(db.Key.from_path('EventLog', self.key_id))
            title = event_log.title
            # Display a form that allows the user to append a message to the log.
            self.response.out.write(
                """
                <html> <body>
                <h1>Append to %s</h1>
                <div>%s</div>
                <form method="post">
                <div><textarea name="message" rows="5" cols="80"></textarea></div>
                <label><input type="checkbox" name="finalize" value="1" /> Finalize log</label>
                <input type="submit" value="Append message" />
                </form>
                </body> </html>
                """ % (
                    cgi.escape(title),
                    cgi.escape(self.request.get('info')),
                    ))
    
        def post(self):
            """Append the message to the event log.
    
            Find the writable Cloud Storage path from the specified EventLog.
            Append the form's message to this path.
            Optionally finalize the object if the user selected the checkbox.
            Redirect the user to a page to append more or read the finalized object.
            """
            # Use the id in the post path to find the EventLog object we saved in the
            # datastore.
            event_log = db.get(db.Key.from_path('EventLog', self.key_id))
            # Get writable Google Cloud Storage path, which we saved to an EventLog
            # object in the datastore.
            write_path = event_log.write_path
            # Get the posted message from the form.
            message = self.request.get('message')
            # Append the message to the Google Cloud Storage object.
            with files.open(write_path, 'a') as fp:
                fp.write(message)
            # Check to see if the user is finished writing.
            if self.request.get('finalize'):
                # Finished writing.  Finalize the object so it becomes readable.
                files.finalize(write_path)
                # Tell the datastore that we finalized this object. This makes the
                # main page display a link that reads the object.
                event_log.finalized = True
                event_log.put()
                self.redirect('/read/%d' % self.key_id)
            else:
                # User is not finished writing.  Redirect to the append form.
                self.redirect('/append/%d?info=%s' % (
                    self.key_id, urllib.quote('Appended %d bytes' % len(message))))
    
    
    class ReadPage(webapp2.RequestHandler):
        """Reads a Cloud Storage object and prints it to the page."""
    
        @property
        def key_id(self):
            """Extract 123 from /read/123."""
            return int(self.request.path[len('/read/'):])
    
        def get(self):
            """Display the EventLog to the user.
    
            Web page looks like the following:
              Event Log Title
    
              Event log description.
              [ Download from Cloud Storage ]
            """
            self.response.out.write(
                """
                <html> <body>
                """)
            # Use the get request path to find the event log in the datastore.
            event_log = db.get(db.Key.from_path('EventLog', self.key_id))
            read_path = event_log.read_path
            title = event_log.title
            # Print the title
            self.response.out.write(
                """
                <h1>%s</h1>
                """ % cgi.escape(title))
            # Read the object from Cloud Storage and write it out to the web page.
            self.response.out.write(
                """
                <pre>
                """)
            with files.open(read_path, 'r') as fp:
                buf = fp.read(1000000)
                while buf:
                    self.response.out.write(cgi.escape(buf))
                    buf = fp.read(1000000)
            self.response.out.write(
                """
                </pre>
                """)
    
            self.response.out.write(
                """
                <div><a href="/">Event Logs</a></div>
                </body> </html>
                """)
    
    app = webapp2.WSGIApplication(
        [
            ('/create', CreatePage),
            ('/append/.*', AppendPage),
            ('/read/.*', ReadPage),
            ('/echo', EchoPage),
            ('/.*', MainPage),
        ], debug=True)
    
  4. Upload your application using the following command:

    appcfg.py update helloworld/
    
  5. View your application at http://your_app_id.appspot.com
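The AppendPage and ReadPage handlers in the sample extract the numeric ID with int(self.request.path[len(prefix):]), which raises ValueError for a path such as /append/abc. A slightly more defensive variant could look like the following sketch. The helper parse_key_id is hypothetical and not part of the sample; a handler using it would still need to decide how to respond when it returns None.

```python
def parse_key_id(path, prefix):
    """Return the integer ID after `prefix`, or None if the path is malformed."""
    if not path.startswith(prefix):
        return None
    suffix = path[len(prefix):]
    # Accept only plain ASCII digits so that int() cannot fail.
    if not suffix or any(ch not in '0123456789' for ch in suffix):
        return None
    return int(suffix)


print(parse_key_id('/append/123', '/append/'))  # 123
print(parse_key_id('/read/abc', '/read/'))      # None
```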

Quotas and limits

Cloud Storage is a pay-to-use service; you will be charged according to the Cloud Storage price sheet.
