Google Cloud Storage Client Library Overview
- About Google Cloud Storage (GCS)
- About the Google Cloud Storage (GCS) client library
- Where to download the GCS client library
- What you need to do to use the GCS client library
- Key concepts of Google Cloud Storage
- Using GCS client library with the development app server
- Pricing, quotas, and limits
- What to do next
About Google Cloud Storage (GCS)
Google Cloud Storage is useful for storing and serving large files. It also offers access control lists (ACLs), the ability to resume upload operations that are interrupted, and many other features. (The GCS client library uses this resume capability automatically for your app, giving you a robust way to stream data into GCS.)
About the Google Cloud Storage (GCS) client library
The GCS client library lets your application read files from and write files to buckets in Google Cloud Storage (GCS). This library supports reading and writing large amounts of data to GCS, with internal error handling and retries, so you don't have to write your own code to do this. Moreover, it provides read buffering with prefetch so your app can be more efficient.
The GCS client library provides the following functionality:
- open method that returns a file-like buffer on which you can invoke standard Python file operations for reading and writing.
- listbucket method for listing the contents of a GCS bucket.
- stat method for obtaining metadata about a specific file.
- delete method for deleting files from GCS.
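To make the shape of these four operations concrete, here is a minimal in-memory sketch. It is not the GCS client library: the class FakeBucketStore, its return values, and the /bucket/object path format are hypothetical stand-ins chosen for illustration, and only the four operation names come from the list above.

```python
import io


class _WriteBuffer(io.BytesIO):
    """Hypothetical helper: stores its contents as an object when closed."""

    def __init__(self, objects, filename):
        super().__init__()
        self._objects = objects
        self._filename = filename

    def close(self):
        # Save the written bytes under the object name before closing.
        if not self.closed:
            self._objects[self._filename] = self.getvalue()
        super().close()


class FakeBucketStore:
    """Hypothetical in-memory stand-in; NOT the GCS client library."""

    def __init__(self):
        self._objects = {}  # full object name -> bytes

    def open(self, filename, mode="r"):
        """Return a file-like buffer for reading ('r') or writing ('w')."""
        if mode == "r":
            return io.BytesIO(self._objects[filename])
        if mode == "w":
            return _WriteBuffer(self._objects, filename)
        raise ValueError("mode must be 'r' or 'w'")

    def listbucket(self, bucket):
        """Return the sorted names of all objects under the bucket path."""
        prefix = bucket.rstrip("/") + "/"
        return sorted(n for n in self._objects if n.startswith(prefix))

    def stat(self, filename):
        """Return minimal metadata (name and size) for one object."""
        return {"filename": filename, "size": len(self._objects[filename])}

    def delete(self, filename):
        """Remove one object."""
        del self._objects[filename]


store = FakeBucketStore()
with store.open("/my_bucket/notes.txt", "w") as f:
    f.write(b"hello")
with store.open("/my_bucket/notes.txt") as f:
    data = f.read()
```

The real library's return types and arguments differ in detail; consult its function reference before relying on any signature shown here.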
Where to download the GCS client library
For download instructions and distribution contents, see the downloads page.
What you need to do to use the GCS client library
There are two different options to choose from:
- An application can use the default GCS bucket, which provides an already configured bucket with free quota.
- If you cannot use or don't want to use the default GCS bucket, you must first activate a Cloud project for GCS as described on the activation page.
Alternative methods for accessing Google Cloud Storage
The GCS client library provides a way to read from and write to Google Cloud Storage that is closely integrated with Google App Engine, enabling App Engine apps to create objects in GCS and serve them from GCS.
However, there are other ways to access GCS from App Engine besides using the GCS client library. You can use any of these methods as well:
Blobstore API
You can use the Blobstore API to upload objects to and serve objects from GCS. You'll need to use the create_gs_key() function to create a blob key representing the GCS object. This approach is useful for uploading files from a web page. When the Blobstore API is used together with the Images API, you get a powerful way to serve images, because you can serve them directly from GCS, bypassing the App Engine app, which saves on instance hour costs.
GCS REST API
You can use the Cloud Storage REST API directly to read and write data to GCS. The GCS client library actually uses the Cloud Storage REST API. However, the GCS REST API lacks the App Engine optimizations already done for you by the GCS client library, so you may be doing unnecessary work if you use the GCS REST API directly. If the GCS client library lacks some feature you need, and the REST API supplies that feature, using the REST API may be a good option.
Cloud Storage Viewer
If you need to upload objects quickly and don't mind a manual process, you can use the Cloud Storage Viewer on your project, which is accessible by clicking on Cloud Storage once your project is open in the Google Developers Console.
Key concepts of Google Cloud Storage
For complete details on GCS, including a complete description of its concepts, refer to the GCS documentation. The following brief synopsis of the GCS features that affect the GCS client library is provided as a convenience.
Buckets, objects, and ACLs
The storage location you read files from and write files to is a GCS bucket. GCS client library calls always specify the bucket being used. Your project can access multiple buckets. Note that the client library currently has no calls for creating GCS buckets, so you need to create them up front by clicking on Cloud Storage once your project is open in the Google Developers Console. Alternatively, you can use the gsutil tool provided by GCS.
Access to the buckets and to the objects contained in them is controlled by an access control list (ACL). Your Google Cloud project and your App Engine app are added to the ACL permitting bucket access during activation. The ACL governing bucket access is distinct from the potentially many ACLs governing the objects in that bucket. Thus, your app has read and write privileges to the bucket(s) it is activated for, but it only has full rights to the objects it creates in the bucket. Your app's access to objects created by other apps or persons is limited to the rights granted to your app by the objects' creator.
If an object is created in the bucket without an ACL explicitly defined for it, it uses whatever default object ACL has been assigned to the bucket by the bucket owner. If the bucket owner has not specified a default object ACL, the object default is public-read, which means that anyone allowed bucket access can read the object.
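The fallback order described above can be sketched as a small function. This only restates the rule from the paragraph above in code; the function name effective_object_acl and its parameters are hypothetical and are not part of the GCS client library.

```python
def effective_object_acl(explicit_acl=None, bucket_default_acl=None):
    """Pick the ACL that applies to a newly created object.

    Hypothetical helper that mirrors the fallback order described in
    the text: explicit object ACL, then the bucket's default object
    ACL, then public-read.
    """
    if explicit_acl is not None:
        return explicit_acl
    if bucket_default_acl is not None:
        return bucket_default_acl
    return "public-read"
```

For example, effective_object_acl(bucket_default_acl="project-private") yields the bucket default, while calling it with no arguments falls through to public-read.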
ACLs and the GCS Client Library
An app using the GCS client library cannot change the bucket ACL, but it can specify an ACL that controls access to the objects it creates. The available ACL settings are described under documentation for the open method.
Modifying GCS objects
Once you have created an object in a bucket, it cannot be modified (no appending). To modify an object in a bucket, you need to overwrite the object with a new object of the same name that contains your desired changes.
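A minimal sketch of the overwrite pattern follows, using a plain dictionary as a stand-in for a bucket. The dictionary model and the function name append_by_overwrite are assumptions made for illustration; with the client library you would read the existing object and then write a new object under the same name.

```python
def append_by_overwrite(bucket, name, extra):
    """Emulate an append by replacing the whole object.

    `bucket` is a plain dict mapping object names to bytes, used here
    as a hypothetical stand-in for a GCS bucket.
    """
    current = bucket.get(name, b"")   # read the existing contents, if any
    bucket[name] = current + extra    # write a complete new object, same name
    return bucket[name]


bucket = {"log.txt": b"first line\n"}
append_by_overwrite(bucket, "log.txt", b"second line\n")
```

Because each change rewrites the entire object, frequent small updates to a large object can be costly; batching changes before writing is usually preferable.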
GCS and "subdirectories"
Google Cloud Storage documentation refers to "subdirectories" and the GCS client library allows you to supply subdirectory delimiters when you create an object. However, GCS does not actually store objects in any real subdirectory. Instead, the subdirectories are simply part of the object filename. For example, if you have a bucket my_bucket and store the file somewhere/over/the/rainbow.mp3, the file rainbow.mp3 is not really stored in the subdirectory somewhere/over/the/. It is actually a file named somewhere/over/the/rainbow.mp3. Understanding this is important for using listbucket, which has an optional directory emulation mode. See listbucket for more information.
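To illustrate how a directory view can be emulated over flat object names, here is a small sketch. It is not how listbucket is implemented; the function list_directory and the sample object names other than rainbow.mp3 are hypothetical.

```python
def list_directory(object_names, prefix, delimiter="/"):
    """Emulate one directory level over flat object names.

    Names that contain the delimiter after the prefix are collapsed
    into a single "subdirectory" entry ending in the delimiter.
    """
    entries = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # sep is empty for a direct child, so the full name is kept as-is.
        entries.add(prefix + head + sep)
    return sorted(entries)


names = [
    "somewhere/over/the/rainbow.mp3",
    "somewhere/over/there.txt",
    "somewhere/else.txt",
]
top_level = list_directory(names, "somewhere/")
```

Here top_level contains one file entry and one collapsed "subdirectory" entry, even though all three objects are stored under flat names.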
Retries and exponential backoff
The GCS client library provides a configurable mechanism for automatic request retries in the event of timeout failures when accessing GCS. The same mechanism also provides exponential backoff to determine an optimal processing rate. (For a description of exponential backoff in GCS, see the Google Cloud Storage documentation on backoff.)
To change the default values for retries and backoff, you use the RetryParams class.
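The general idea of retrying with exponential backoff can be sketched in plain Python. This is an illustration of the technique only, not the library's retry code: the function call_with_backoff and its parameter names and default values are hypothetical and should not be read as the fields of RetryParams.

```python
import time


def call_with_backoff(operation, initial_delay=0.2, backoff_factor=2.0,
                      max_delay=5.0, max_retries=5, sleep=time.sleep):
    """Call operation(), retrying on TimeoutError with growing delays.

    All parameter names and defaults are assumptions for this sketch.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the failure
            sleep(delay)
            # Grow the wait geometrically, but never beyond max_delay.
            delay = min(delay * backoff_factor, max_delay)
```

Passing the sleep function as a parameter keeps the sketch easy to test, since a test can record the requested delays instead of actually waiting.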
Using GCS client library with the development app server
You can use the client library with the development server in SDK version 1.8.1 and later. The development server emulates GCS using the local disk.
Pricing, quotas, and limits
There are no bandwidth charges associated with making GCS client library calls to Google Cloud Storage. There are operations and storage charges, however.
If you don't use the default GCS bucket, you'll need to sign up for GCS as described in the activation page. With this option, there is no free quota, and any data stored in GCS is charged the usual GCS data storage fees. Cloud Storage is a pay-to-use service; you will be charged according to the Cloud Storage price sheet.
What to do next
To create, deploy, and run your app:
- Download the client library.
- Create an App Engine project and activate it for GCS.
- Optionally, if you have an existing app that uses the older Google Cloud Storage API, migrate your app.
- Go through the brief Getting Started instructions for a quick orientation in using the library.
- Upload and deploy your app to production App Engine.
- Test the app for expected behavior with GCS.