Getting image data from Earth Engine
To get image data from Earth Engine to Google Drive, Cloud Storage, or an Earth Engine asset,
you can use Export, and the job is handled entirely by Earth Engine. If your export jobs have
scaling issues (e.g., they take longer than a day, or return memory or timeout errors), or
you're already familiar with a framework like Apache Beam, Spark, or Dask, you may prefer the
data extraction methods described here. Workflows implemented in these frameworks can be
scaled using Google Cloud tools such as Dataflow.
Specifically, this guide describes methods for manually making requests
for image data using getPixels and computePixels.
Here, "image data" means multi-dimensional arrays of pixel values with consistent
scale and projection. The region, scale, projection, and/or dimensions are specified
in the request. The
ImageFileFormat page lists the
possible output formats. Output destinations include Cloud Storage or any locally mounted
directory. Manual requests add complexity, but they can scale to larger workloads.
Getting image data from existing assets
Use getPixels to get image data from existing Earth Engine assets. You
pass the asset ID directly to the request, so you can't do any computation on the pixels
prior to extracting them. A block of pixels in the specified region, scale, projection,
and format is returned. The following example demonstrates getting a time series of NDVI
from a MODIS image collection using getPixels.
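A minimal sketch of such a request, assuming an illustrative asset ID, band name, and coordinates (the request shape follows ee.data.getPixels; the actual network call is shown only in comments because it requires an authenticated session):

```python
# Sketch of a getPixels request for an existing asset. The asset ID, band
# name, and coordinates below are illustrative assumptions; getPixels itself
# takes a request dict shaped like the one built here.

def make_ndvi_request(asset_id, lon, lat, scale_deg=0.0025, size=64):
    """Build a getPixels request for a size x size NDVI patch.

    scale_deg is the pixel size in degrees (EPSG:4326); 0.0025 degrees is
    roughly 250 m, matching the MODIS NDVI resolution.
    """
    return {
        'assetId': asset_id,
        'fileFormat': 'NUMPY_NDARRAY',  # returned as a structured NumPy array
        'bandIds': ['NDVI'],
        'grid': {
            'dimensions': {'width': size, 'height': size},
            'affineTransform': {
                'scaleX': scale_deg, 'shearX': 0, 'translateX': lon,
                'shearY': 0, 'scaleY': -scale_deg, 'translateY': lat,
            },
            'crsCode': 'EPSG:4326',
        },
    }

# Usage (requires an authenticated Earth Engine session):
#   import ee
#   ee.Initialize()
#   request = make_ndvi_request('MODIS/061/MOD13Q1/2021_01_01', -121.9, 39.1)
#   pixels = ee.data.getPixels(request)
# Repeating the call for each asset in the collection yields the NDVI time series.
```

Because the asset ID goes straight into the request, swapping in a different image is just a matter of changing `assetId` and `bandIds`.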
Getting image data from computed images
Use computePixels to get image data from a computed image, for example a composite. With
computePixels, you pass a computed
ee.Image object through the expression
parameter. A block of computed pixels in the specified region, scale, projection, and
format is returned. The following example shows getting patches of multispectral data
from a cloud-free Sentinel-2 composite.
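A minimal sketch of such a request, assuming illustrative band names and coordinates (the computed image travels in the request's `expression` field; the live call is shown only in comments because it requires an authenticated session):

```python
# Sketch of a computePixels request for a computed image. The band names,
# coordinates, and scale below are illustrative assumptions; the computed
# ee.Image is passed through the request's 'expression' parameter.

def make_patch_request(image, lon, lat, scale_deg=0.0001, size=128):
    """Build a computePixels request for a size x size multispectral patch.

    `image` is a computed ee.Image (e.g. a cloud-free Sentinel-2 composite);
    scale_deg of about 0.0001 degrees is roughly 10 m at the equator.
    """
    return {
        'expression': image,
        'fileFormat': 'NUMPY_NDARRAY',
        'bandIds': ['B4', 'B3', 'B2'],  # assumed RGB band names
        'grid': {
            'dimensions': {'width': size, 'height': size},
            'affineTransform': {
                'scaleX': scale_deg, 'shearX': 0, 'translateX': lon,
                'shearY': 0, 'scaleY': -scale_deg, 'translateY': lat,
            },
            'crsCode': 'EPSG:4326',
        },
    }

# Usage (requires an authenticated Earth Engine session):
#   import ee
#   ee.Initialize()
#   composite = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
#                .filterDate('2021-06-01', '2021-09-01')
#                .median())
#   patch = ee.data.computePixels(make_patch_request(composite, -121.9, 39.1))
```

The only structural difference from the getPixels request is that `expression` replaces `assetId`, which is what allows arbitrary computation (here, a median composite) before extraction.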
Manual parallelization of requests
Though you can make requests for any purpose in any volume, you may want to parallelize requests for larger workflows. To make many such requests in parallel, you should use the Earth Engine high volume endpoint. The number of parallel requests you can have is set by your concurrent interactive request quota. See the Earth Engine high volume page for details on when to use the high volume endpoint.
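Opting in to the high volume endpoint is a client-initialization choice; a minimal sketch (the URL below is the documented high volume endpoint, and the call itself requires an authenticated session):

```python
# The high volume endpoint; passing it to ee.Initialize routes all subsequent
# requests through the endpoint designed for many small concurrent requests.
HIGH_VOLUME_URL = 'https://earthengine-highvolume.googleapis.com'

# Usage (requires an authenticated Earth Engine session):
#   import ee
#   ee.Initialize(opt_url=HIGH_VOLUME_URL)
```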
You can use threads to make concurrent requests. This approach is demonstrated in the
computePixels example notebooks.
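A minimal sketch of the threaded approach, using Python's standard-library thread pool. The `fetch` callable is injected so the sketch stays independent of an authenticated session; in practice it would wrap a call such as ee.data.computePixels, and the request-building helper named in the comments is hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(points, fetch, max_workers=20):
    """Fetch one pixel block per point, using a pool of worker threads.

    `fetch` is any callable making one request (e.g. a wrapper around
    ee.data.computePixels). Keep max_workers at or below your concurrent
    interactive request quota.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with points.
        return list(pool.map(fetch, points))

# Usage (requires an authenticated session, typically initialized against
# the high volume endpoint; make_request is a hypothetical helper that
# builds a computePixels request for one point):
#   patches = fetch_all(coords, lambda p: ee.data.computePixels(make_request(p)))
```

Because each request is independent, ordinary threads are enough here; the worker count, not CPU, is the lever that matters since the work is network-bound.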
You can use Apache Beam pipelines to parallelize requests. These pipelines can be run locally or as Google Dataflow jobs. For examples, see this Geo for Good training or this People, Planet and AI demonstration. Other parallelization libraries include Dask and Apache Spark.