Create a dataset

Creating a dataset is a two-step process:

  1. Make a request to create the dataset.

  2. Make a request to upload data to the dataset.

Prerequisites

When creating a dataset:

  • Display names must be unique within your Google Cloud project.
  • Display names must be less than 64 bytes. Because display names are encoded in UTF-8, a single character can take up multiple bytes in some languages.
  • Descriptions must be less than 1000 bytes.
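
Because both limits are expressed in bytes rather than characters, it can help to measure the UTF-8 encoded length before sending a request. The following is a minimal sketch; the helper names utf8_len and check_dataset_fields are illustrative and not part of the API, and the uniqueness requirement is not checked here because it can only be verified by the service.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def check_dataset_fields(display_name: str, description: str = "") -> list:
    """Return a list of problems; an empty list means both byte limits are met."""
    problems = []
    # "Less than 64 bytes" means 63 bytes is the largest accepted size.
    if utf8_len(display_name) >= 64:
        problems.append("displayName must be less than 64 bytes")
    if utf8_len(description) >= 1000:
        problems.append("description must be less than 1000 bytes")
    return problems
```

For example, a display name made of 22 three-byte characters is only 22 characters long but occupies 66 bytes, so it fails the check.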

When uploading data:

  • The supported file types are CSV, GeoJSON, and KML.
  • The maximum supported file size is 350 MB.
  • Attribute column names cannot begin with the string "?_".
  • Three-dimensional geometries are not supported. This includes the "Z" suffix in the WKT format, and the altitude coordinate in the GeoJSON format.
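
Some of the upload prerequisites can be checked locally before you send any data. The sketch below is one possible pre-flight check and rests on two assumptions that the text above does not confirm: that "350 MB" means 350 * 1000 * 1000 bytes (the more conservative reading), and that the file type can be guessed from the file extension. The function name precheck_upload is hypothetical.

```python
import os

# Assumption: "350 MB" is read as 350 * 1000 * 1000 bytes.
MAX_UPLOAD_BYTES = 350 * 1000 * 1000
# Assumption: the file type is guessed from these extensions.
SUPPORTED_EXTENSIONS = {".csv", ".geojson", ".kml"}
# Attribute column names cannot begin with this string.
RESERVED_PREFIX = "?_"


def precheck_upload(path, size_bytes, column_names=()):
    """Return a list of problems found before upload; empty means none found."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 350 MB")
    for name in column_names:
        if name.startswith(RESERVED_PREFIX):
            problems.append(f"column name starts with '?_': {name}")
    return problems
```

In practice, size_bytes could come from os.path.getsize(path). The check for three-dimensional geometries is not covered here because it depends on the file format.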

GeoJSON requirements

Maps Datasets API supports the current GeoJSON specification. It also supports GeoJSON files that contain any of the following object types:

  • Geometry objects. A geometry object is a spatial shape, described as a union of points, lines, and polygons with optional holes.
  • Feature objects. A feature object contains a geometry plus additional name/value pairs, whose meaning is application-specific.
  • Feature collections. A feature collection is a set of feature objects.

Maps Datasets API does not support GeoJSON files that have data in a coordinate reference system (CRS) other than WGS84.
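
Because three-dimensional geometries are not supported, you might want to scan a GeoJSON object for altitude coordinates before uploading it. The following sketch walks a geometry, feature, or feature collection that has already been parsed into Python dictionaries (for example with json.load) and reports whether any position has a third element. The function names are illustrative, and the sketch assumes well-formed input.

```python
def iter_geometries(obj):
    """Yield every geometry inside a Geometry, Feature, or FeatureCollection."""
    kind = obj.get("type")
    if kind == "FeatureCollection":
        for feature in obj.get("features", []):
            yield from iter_geometries(feature)
    elif kind == "Feature":
        # A feature may carry a null geometry, which has nothing to check.
        if obj.get("geometry") is not None:
            yield from iter_geometries(obj["geometry"])
    elif kind == "GeometryCollection":
        for geometry in obj.get("geometries", []):
            yield from iter_geometries(geometry)
    else:
        yield obj


def has_altitude(coordinates):
    """Return True if any position in a coordinates array has a third element."""
    if not coordinates:
        return False
    if isinstance(coordinates[0], (int, float)):
        # A single position: [longitude, latitude] or [longitude, latitude, altitude].
        return len(coordinates) > 2
    return any(has_altitude(child) for child in coordinates)


def contains_altitude(obj):
    """Return True if any geometry in `obj` uses an altitude coordinate."""
    return any(has_altitude(g.get("coordinates", [])) for g in iter_geometries(obj))
```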

For more information on GeoJSON, see RFC 7946.

KML requirements

Maps Datasets API has the following requirements for KML files:

  • All URLs must be local (or relative) to the file itself.
  • Point, line, and polygon geometries are supported.
  • All data attributes are considered strings.

The following KML features are not supported:

  • Icons or <styleUrl> defined outside of the file.
  • Network links, such as <NetworkLink>.
  • Ground overlays, such as <GroundOverlay>.
  • 3D geometries or any altitude-related tags, such as <altitudeMode>.
  • Camera specifications, such as <LookAt>.
  • Styles defined inside the KML file.

CSV requirements

For CSV files, the supported column names are listed below in order of priority:

  • latitude, longitude
  • lat, long
  • x, y
  • wkt (Well-Known Text)
  • address, city, state, zip
  • address
  • A single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043

For example, suppose your file contains columns named x, y, and wkt. Because x and y appear earlier in the list above, they have higher priority: the values in the x and y columns are used and the wkt column is ignored.

In addition:

  • Each column name must belong to a single column. That is, you cannot have a column named xy that contains both x and y coordinate data. The x and y coordinates must be in separate columns.
  • Column names are case-insensitive.
  • The order of the column names does not matter. For example, if your CSV file contains lat and long columns, they can occur in any order.
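
The priority rules above can be sketched as a small lookup over the CSV header. This is an illustration of the rules as described, not the service's actual implementation: it assumes a column set counts only when every column in the set is present, and the name pick_geometry_columns is hypothetical.

```python
# Supported geometry column sets, highest priority first, in the order listed above.
GEOMETRY_COLUMN_SETS = [
    ("latitude", "longitude"),
    ("lat", "long"),
    ("x", "y"),
    ("wkt",),
    ("address", "city", "state", "zip"),
    ("address",),
]


def pick_geometry_columns(header):
    """Return the highest-priority column set fully present in `header`, or None."""
    # Column names are case-insensitive, and their order in the file does not matter.
    present = {name.lower() for name in header}
    for column_set in GEOMETRY_COLUMN_SETS:
        if all(name in present for name in column_set):
            return column_set
    return None
```

With a header of WKT, y, name, X, the function returns the x and y pair, which matches the example above in which the wkt column is ignored.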

Handling data upload errors

When uploading data to a dataset, you might experience one of the common errors described in this section.

GeoJSON errors

Common GeoJSON errors include:

  • Missing type field, or the type is not a string. The uploaded GeoJSON data file must contain a string field named type as part of each Feature object and Geometry object definition.
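
To locate the objects that trigger this error, you can walk the parsed GeoJSON and record every Feature or Geometry object whose type field is missing or is not a string. The following is a minimal sketch that assumes the file has already been parsed into dictionaries (for example with json.load); the function name and the path notation in the results are illustrative.

```python
def find_type_errors(obj, path="$"):
    """Return the paths of objects whose "type" field is missing or not a string."""
    errors = []
    if not isinstance(obj.get("type"), str):
        errors.append(path)
    # Descend into the members that can hold nested Feature or Geometry objects.
    for index, feature in enumerate(obj.get("features", [])):
        errors.extend(find_type_errors(feature, f"{path}.features[{index}]"))
    if isinstance(obj.get("geometry"), dict):
        errors.extend(find_type_errors(obj["geometry"], f"{path}.geometry"))
    for index, geometry in enumerate(obj.get("geometries", [])):
        errors.extend(find_type_errors(geometry, f"{path}.geometries[{index}]"))
    return errors
```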

KML errors

Common KML errors include:

  • The data file contains an unsupported KML feature. The data file must not contain any of the unsupported KML features listed above; otherwise, the data import might fail.

CSV errors

Common CSV errors include:

  • Some rows are missing values for a geometry column. All rows in a CSV file must contain non-empty values for the geometry columns. The geometry columns include:
    • latitude, longitude
    • lat, long
    • x, y
    • wkt
    • address, city, state, zip
    • address
    • A single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043
  • If x and y are your geometry columns, ensure that the units are longitude and latitude. Some public datasets use different coordinate systems under the headers x and y. If the wrong units are used, the dataset might import successfully, but the rendered data can show the dataset points in unexpected locations.
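
Both CSV errors can be caught with a quick local scan. The sketch below reports rows that have an empty geometry value, and includes a plausibility test for x and y values. It assumes that x holds longitude and y holds latitude, both in degrees, and the function names are hypothetical.

```python
import csv
import io


def find_bad_rows(csv_text, geometry_columns):
    """Return 1-based data row numbers with an empty value in any geometry column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Map lowercased header names back to the spelling used in the file.
    lookup = {name.lower(): name for name in reader.fieldnames or []}
    bad_rows = []
    for row_number, row in enumerate(reader, start=1):
        for column in geometry_columns:
            value = row.get(lookup.get(column, column))
            if value is None or not value.strip():
                bad_rows.append(row_number)
                break
    return bad_rows


def looks_like_lon_lat(x, y):
    """Return True if x is a plausible longitude and y a plausible latitude."""
    return -180.0 <= float(x) <= 180.0 and -90.0 <= float(y) <= 90.0
```

Values such as x = 551000 and y = 4180000 fail the second check, which usually indicates a projected coordinate system rather than longitude and latitude.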

Make a request to create the dataset

Create a dataset by sending a POST request to the datasets endpoint:

https://mapsplatformdatasets.googleapis.com/v1/projects/PROJECT_NUMBER_OR_ID/datasets

Pass a JSON body to the request defining the dataset. You must:

  • Specify the displayName of the dataset. The value of displayName must be unique among the datasets in your Google Cloud project.

  • Set usage to USAGE_DATA_DRIVEN_STYLING.

For example:

curl -X POST -d '{
    "displayName": "My Test Dataset", 
    "usage": "USAGE_DATA_DRIVEN_STYLING"
  }' \
  -H 'X-Goog-User-Project: PROJECT_NUMBER_OR_ID' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  https://mapsplatformdatasets.googleapis.com/v1/projects/PROJECT_NUMBER_OR_ID/datasets

The response contains the ID of the dataset, in the form projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID, along with additional information. Use the dataset ID when making requests to update or modify the dataset.

{
  "name": "projects/PROJECT_NUMBER_OR_ID/datasets/f57074a0-a8b6-403e-9df1-e9fc46",
  "displayName": "My Test Dataset",
  "usage": [
    "USAGE_DATA_DRIVEN_STYLING"
  ],
  "createTime": "2022-08-15T17:50:00.189682Z",
  "updateTime": "2022-08-15T17:50:00.189682Z" 
}
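
Later requests need only the DATASET_ID portion of the name field. One way to extract it from the response body is sketched below; the function name is illustrative, and the sketch assumes the name always has the four-part form shown above.

```python
import json


def dataset_id_from_response(response_text):
    """Return the DATASET_ID portion of the "name" field in a create response."""
    name = json.loads(response_text)["name"]
    parts = name.split("/")
    # Expected shape: projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID
    if len(parts) != 4 or parts[0] != "projects" or parts[2] != "datasets":
        raise ValueError(f"unexpected dataset name: {name}")
    return parts[3]
```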

Make a request to upload data to the dataset

After you create the dataset, upload the data from Google Cloud Storage or from a local file to the dataset.

Upload data from Cloud Storage

To upload data from Cloud Storage to your dataset, send a POST request to the datasets endpoint that also includes the ID of the dataset:

https://mapsplatformdatasets.googleapis.com/v1/projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID:import

In the JSON request body:

  • Use inputUri to specify the file path to the resource containing the data in Cloud Storage. This path is in the form gs://GCS_BUCKET/FILE.

    The user making the request requires the Storage Object Viewer role, or any other role that includes the storage.objects.get permission. For more information about managing access to Cloud Storage, see Overview of access control.

  • Use fileFormat to specify the file format of the data as one of: FILE_FORMAT_GEOJSON (GeoJSON file), FILE_FORMAT_KML (KML file), or FILE_FORMAT_CSV (CSV file).

For example:

curl -X POST -d '{
    "gcs_source":{
      "inputUri": "gs://my_bucket/my_csv_file",
      "fileFormat": "FILE_FORMAT_CSV"
    }
  }' \
  -H 'X-Goog-User-Project: PROJECT_NUMBER_OR_ID' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  https://mapsplatformdatasets.googleapis.com/v1/projects/PROJECT_NUMBER_OR_ID/datasets/f57074a0-a8b6-403e-9df1-e9fc46:import

The response is in the form:

{
  "name": "projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID@VERSION_NUMBER"
}
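
If you prefer to issue the Cloud Storage import from a script, the same request can be assembled with the Python standard library. The sketch below only builds the request and does not send it. It mirrors the URL, headers, and body of the curl example above; the function name and its parameters are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://mapsplatformdatasets.googleapis.com/v1"
FILE_FORMATS = {"FILE_FORMAT_GEOJSON", "FILE_FORMAT_KML", "FILE_FORMAT_CSV"}


def build_gcs_import_request(project, dataset_id, input_uri, file_format, token):
    """Build (but do not send) the Cloud Storage import request."""
    if file_format not in FILE_FORMATS:
        raise ValueError(f"unsupported file format: {file_format}")
    # Same body shape as the curl example: a gcs_source object with two fields.
    body = {"gcs_source": {"inputUri": input_uri, "fileFormat": file_format}}
    return urllib.request.Request(
        url=f"{BASE_URL}/projects/{project}/datasets/{dataset_id}:import",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Goog-User-Project": project,
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request; the response body then has the form shown above.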

Upload data from a file

To upload data from a file, send an HTTP POST request to the datasets endpoint that also includes the ID of the dataset:

https://mapsplatformdatasets.googleapis.com/upload/v1/projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID:import

The request contains:

  • The X-Goog-Upload-Protocol header, set to multipart.

  • The metadata property, which specifies the path to a file that identifies the type of data to upload as one of: FILE_FORMAT_GEOJSON (GeoJSON file), FILE_FORMAT_KML (KML file), or FILE_FORMAT_CSV (CSV file).

    The contents of this file have the following format:

    {"local_file_source": {"file_format": "FILE_FORMAT_GEOJSON"}}

  • The rawdata property, which specifies the path to the GeoJSON, KML, or CSV file containing the data to upload.
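
The metadata file is small enough to generate on the fly. The following sketch writes it in the format shown above for a given file format; the function name and the output path are hypothetical.

```python
import json

FILE_FORMATS = {"FILE_FORMAT_GEOJSON", "FILE_FORMAT_KML", "FILE_FORMAT_CSV"}


def write_metadata_file(path, file_format):
    """Write the upload metadata file for a local file source to `path`."""
    if file_format not in FILE_FORMATS:
        raise ValueError(f"unsupported file format: {file_format}")
    metadata = {"local_file_source": {"file_format": file_format}}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle)
```

For example, write_metadata_file("csv_metadata_file", "FILE_FORMAT_CSV") produces a file that can be passed as the metadata property.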

The following request uses the curl -F option to specify the paths to the two files:

curl -X POST \
  -H 'X-Goog-User-Project: PROJECT_NUMBER_OR_ID' \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Goog-Upload-Protocol: multipart" \
  -F "metadata=@csv_metadata_file" \
  -F "rawdata=@csv_data_file" \
  https://mapsplatformdatasets.googleapis.com/upload/v1/projects/PROJECT_NUMBER_OR_ID/datasets/f57074a0-a8b6-403e-9df1-e9fc46:import

The response is in the form:

{
  "name": "projects/PROJECT_NUMBER_OR_ID/datasets/DATASET_ID@VERSION_NUMBER"
}
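
The name in an import response carries a version suffix after the @ character. If you need the dataset ID and the version separately, they can be split as in the sketch below. The function name is illustrative, and the version is treated as an opaque string because its exact format is not described here.

```python
def split_versioned_name(name):
    """Split an import response name into (dataset_id, version)."""
    resource, separator, version = name.rpartition("@")
    if not separator:
        raise ValueError(f"no version suffix in: {name}")
    # The dataset ID is the last path segment before the "@".
    dataset_id = resource.rsplit("/", 1)[-1]
    return dataset_id, version
```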