Google BigQuery

Jobs

Jobs are objects that manage asynchronous tasks such as loading data, exporting data, and running queries.

You can run multiple jobs concurrently in BigQuery, and completed jobs will be listed in the Jobs collection indefinitely.

For a list of methods for this resource, see the end of this page.

Resource representations

{
  "kind": "bigquery#job",
  "etag": etag,
  "id": string,
  "selfLink": string,
  "jobReference": {
    "projectId": string,
    "jobId": string
  },
  "configuration": {
    "query": {
      "query": string,
      "destinationTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "createDisposition": string,
      "writeDisposition": string,
      "defaultDataset": {
        "datasetId": string,
        "projectId": string
      },
      "priority": string,
      "preserveNulls": boolean,
      "allowLargeResults": boolean,
      "useQueryCache": boolean
    },
    "load": {
      "sourceUris": [
        string
      ],
      "schema": {
        "fields": [
          {
            "name": string,
            "type": string,
            "mode": string,
            "fields": [
              (TableFieldSchema)
            ],
            "description": string
          }
        ]
      },
      "destinationTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "createDisposition": string,
      "writeDisposition": string,
      "fieldDelimiter": string,
      "skipLeadingRows": integer,
      "encoding": string,
      "quote": string,
      "maxBadRecords": integer,
      "schemaInlineFormat": string,
      "schemaInline": string,
      "allowQuotedNewlines": boolean,
      "sourceFormat": string,
      "allowJaggedRows": boolean,
      "ignoreUnknownValues": boolean
    },
    "link": {
      "sourceUri": [
        string
      ],
      "destinationTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "createDisposition": string,
      "writeDisposition": string
    },
    "copy": {
      "sourceTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "destinationTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "createDisposition": string,
      "writeDisposition": string
    },
    "extract": {
      "sourceTable": {
        "projectId": string,
        "datasetId": string,
        "tableId": string
      },
      "destinationUri": string,
      "destinationUris": [
        string
      ],
      "printHeader": boolean,
      "fieldDelimiter": string,
      "destinationFormat": string
    },
    "dryRun": boolean
  },
  "status": {
    "state": string,
    "errorResult": {
      "reason": string,
      "location": string,
      "debugInfo": string,
      "message": string
    },
    "errors": [
      {
        "reason": string,
        "location": string,
        "debugInfo": string,
        "message": string
      }
    ]
  },
  "statistics": {
    "creationTime": long,
    "startTime": long,
    "endTime": long,
    "totalBytesProcessed": long,
    "query": {
      "totalBytesProcessed": long,
      "cacheHit": boolean
    },
    "load": {
      "inputFiles": long,
      "inputFileBytes": long,
      "outputRows": long,
      "outputBytes": long
    }
  }
}
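
As an illustration of the representation above, the following is a minimal sketch (not an official client) that assembles the request body for a query job as a plain Python dictionary. It uses only field names from the representation. The helper names `table_ref` and `build_query_job`, and the project, dataset, and table IDs in the usage example, are hypothetical.

```python
import json

# Allowed values for the query job properties documented on this page.
CREATE_DISPOSITIONS = {"CREATE_IF_NEEDED", "CREATE_NEVER"}
WRITE_DISPOSITIONS = {"WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"}
PRIORITIES = {"INTERACTIVE", "BATCH"}


def table_ref(project_id, dataset_id, table_id):
    """Return a table reference with the three ID fields the resource uses."""
    return {"projectId": project_id, "datasetId": dataset_id, "tableId": table_id}


def build_query_job(sql, destination=None, project_id=None, job_id=None,
                    create_disposition="CREATE_IF_NEEDED",
                    write_disposition="WRITE_EMPTY",
                    priority="INTERACTIVE", dry_run=False):
    """Assemble a job resource whose configuration holds a single query section."""
    checks = (
        ("createDisposition", create_disposition, CREATE_DISPOSITIONS),
        ("writeDisposition", write_disposition, WRITE_DISPOSITIONS),
        ("priority", priority, PRIORITIES),
    )
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")

    # jobReference is optional, but both of its fields are required when present.
    if (project_id is None) != (job_id is None):
        raise ValueError("project_id and job_id must be given together")

    query = {
        "query": sql,
        "createDisposition": create_disposition,
        "writeDisposition": write_disposition,
        "priority": priority,
    }
    if destination is not None:
        query["destinationTable"] = destination

    job = {"configuration": {"query": query, "dryRun": dry_run}}
    if job_id is not None:
        job["jobReference"] = {"projectId": project_id, "jobId": job_id}
    return job


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    body = build_query_job(
        "SELECT 1",
        destination=table_ref("my-project", "my_dataset", "results"),
        write_disposition="WRITE_TRUNCATE",
    )
    print(json.dumps(body, indent=2))
```

Validating the enumerated values locally is a design choice of this sketch; the service performs its own validation and reports problems in the job result.
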
Property name Value Description Notes
configuration nested object [Required] Describes the job configuration.
configuration.copy nested object An object that must be present when copying an existing table to another table.
configuration.copy.createDisposition string [Optional] Specifies whether the job is allowed to create new tables.

The following values are supported:
  • CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. 
  • CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. 
The default value is CREATE_IF_NEEDED.

Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.copy.destinationTable nested object [Required] The destination table.
configuration.copy.destinationTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.copy.destinationTable.projectId string [Required] The ID of the project that contains the table.
configuration.copy.destinationTable.tableId string [Required] The table ID.
configuration.copy.sourceTable nested object An object describing the BigQuery table to be copied.
configuration.copy.sourceTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.copy.sourceTable.projectId string [Required] The ID of the project that contains the table.
configuration.copy.sourceTable.tableId string [Required] The table ID.
configuration.copy.writeDisposition string [Optional] Specifies the action that occurs if the destination table already exists.

The following values are supported:
  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data. 
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. 
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. 
The default value is WRITE_EMPTY.

Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.dryRun boolean [Optional] If set, don't actually run this job. A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run. Behavior of non-query jobs is undefined.
configuration.extract nested object [Pick one] Configures an extract job.
configuration.extract.destinationFormat string [Experimental] Optional and defaults to CSV. Format with which files should be exported. To export to CSV, specify "CSV". Tables with nested or repeated fields cannot be exported as CSV. To export to newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON".
configuration.extract.destinationUri string [Pick one] DEPRECATED: Use destinationUris instead, passing only one URI as necessary. The fully-qualified Google Cloud Storage URI where the extracted table should be written.
configuration.extract.destinationUris[] list [Pick one] A list of fully-qualified Google Cloud Storage URIs where the extracted table should be written.
configuration.extract.fieldDelimiter string [Optional] Delimiter to use between fields in the exported data. The default value is a comma (',').
configuration.extract.printHeader boolean [Optional] Whether to print out a header row in the results. Default is true.
configuration.extract.sourceTable nested object [Required] A reference to the table being exported.
configuration.extract.sourceTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.extract.sourceTable.projectId string [Required] The ID of the project that contains the table.
configuration.extract.sourceTable.tableId string [Required] The table ID.
configuration.link.createDisposition string [Optional] Specifies whether the job is allowed to create new tables.

The following values are supported:
  • CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. 
  • CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. 
The default value is CREATE_IF_NEEDED.

Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.link.destinationTable nested object [Required] The destination table of the link job.
configuration.link.destinationTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.link.destinationTable.projectId string [Required] The ID of the project that contains the table.
configuration.link.destinationTable.tableId string [Required] The table ID.
configuration.link.sourceUri[] list [Required] URI of source table to link.
configuration.link.writeDisposition string [Optional] Specifies the action that occurs if the destination table already exists.

The following values are supported: 
  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data. 
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. 
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. 
The default value is WRITE_EMPTY.

Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.load nested object [Pick one] Configures a load job.
configuration.load.allowJaggedRows boolean [Optional] Accept rows that are missing trailing optional columns. The missing values are treated as nulls. Default is false which treats short rows as errors. Only applicable to CSV, ignored for other formats.
configuration.load.allowQuotedNewlines boolean Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
configuration.load.createDisposition string [Optional] Specifies whether the job is allowed to create new tables. 

The following values are supported: 
  • CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. 
  • CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. 
The default value is CREATE_IF_NEEDED.

Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.load.destinationTable nested object [Required] The destination table to load the data into.
configuration.load.destinationTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.load.destinationTable.projectId string [Required] The ID of the project that contains the table.
configuration.load.destinationTable.tableId string [Required] The table ID.
configuration.load.encoding string [Optional] The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.

BigQuery decodes the data after the raw, binary data has been split using the values of the quote and fieldDelimiter properties.
configuration.load.fieldDelimiter string [Optional] The separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator. The default value is a comma (',').
configuration.load.ignoreUnknownValues boolean [Optional] Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false which treats unknown values as errors. For CSV this ignores extra values at the end of a line. For JSON this ignores named values that do not match any column name.
configuration.load.maxBadRecords integer [Optional] The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an 'invalid' error is returned in the job result and the job fails. The default value is 0, which requires that all records are valid.
configuration.load.quote string [Optional] The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ('"'). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
configuration.load.schema nested object [Optional] The schema for the destination table. The schema can be omitted if the destination table already exists or if the schema can be inferred from the loaded data.
configuration.load.schema.fields[] list Describes the fields in a table.
configuration.load.schema.fields[].description string [Optional] The field description.
configuration.load.schema.fields[].fields[] list [Optional] Describes the nested schema fields if the type property is set to RECORD.
configuration.load.schema.fields[].mode string [Optional] The field mode. Possible values include NULLABLE, REQUIRED and REPEATED. The default value is NULLABLE.
configuration.load.schema.fields[].name string [Required] The field name.
configuration.load.schema.fields[].type string [Required] The field data type. Possible values include STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP or RECORD (where RECORD indicates that the field contains a nested schema).
configuration.load.schemaInline string [Deprecated] The inline schema. For CSV schemas, specify as "Field1:Type1[,Field2:Type2]*". For example, "foo:STRING, bar:INTEGER, baz:FLOAT".
configuration.load.schemaInlineFormat string [Deprecated] The format of the schemaInline property.
configuration.load.skipLeadingRows integer [Optional] The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.
configuration.load.sourceFormat string [Optional] The format of the data files. For CSV files, specify "CSV". For datastore backups, specify "DATASTORE_BACKUP". For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON". The default value is CSV.
configuration.load.sourceUris[] list [Required] The fully-qualified URIs that point to your data on Google Cloud Storage.
configuration.load.writeDisposition string [Optional] Specifies the action that occurs if the destination table already exists.

The following values are supported: 
  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data. 
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. 
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. 
The default value is WRITE_EMPTY.

Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.query nested object [Pick one] Configures a query job.
configuration.query.allowLargeResults boolean If true, allows the query to produce arbitrarily large result tables at a slight cost in performance. Requires destinationTable to be set.
configuration.query.createDisposition string [Optional] Specifies whether the job is allowed to create new tables. 

The following values are supported: 
  • CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. 
  • CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. 
The default value is CREATE_IF_NEEDED.

Creation, truncation and append actions occur as one atomic update upon job completion.
configuration.query.defaultDataset nested object [Optional] Specifies the default dataset to use for unqualified table names in the query.
configuration.query.defaultDataset.datasetId string [Required] A unique ID for this dataset, without the project name.
configuration.query.defaultDataset.projectId string [Optional] The ID of the container project.
configuration.query.destinationTable nested object Describes the table where the query results should be stored. If not present, a new table will be created to store the results.
configuration.query.destinationTable.datasetId string [Required] The ID of the dataset that contains the table.
configuration.query.destinationTable.projectId string [Required] The ID of the project that contains the table.
configuration.query.destinationTable.tableId string [Required] The table ID.
configuration.query.preserveNulls boolean [Deprecated] This property is deprecated.
configuration.query.priority string [Optional] Specifies a priority for the query. Possible values include INTERACTIVE and BATCH. The default value is INTERACTIVE.
configuration.query.query string [Required] BigQuery SQL query to execute.
configuration.query.useQueryCache boolean [Optional] Whether to look for the result in the query cache. The query cache is a best-effort cache that will be flushed whenever tables in the query are modified. Moreover, the query cache is only available when a query does not have a destination table specified.
configuration.query.writeDisposition string [Optional] Specifies the action that occurs if the destination table already exists.

The following values are supported: 
  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data. 
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. 
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. 
The default value is WRITE_EMPTY.

Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.
etag etag [Output-only] A hash of this resource.
id string [Output-only] Opaque ID field of the job.
jobReference nested object [Optional] An object that contains structured parts of the job ID. Reference describing the unique-per-user name of the job.
jobReference.jobId string [Required] ID of the job.
jobReference.projectId string [Required] Project ID being billed for the job.
kind bigquery#job [Output-only] The resource type. This property always returns the value bigquery#job.
statistics nested object [Output-only] Information about the job, including starting time and ending time of the job.
statistics.creationTime long [Output-only] Creation time of this job, in milliseconds since the epoch. This field will be present on all jobs.
statistics.endTime long [Output-only] End time of this job, in milliseconds since the epoch. This field will be present whenever a job is in the DONE state.
statistics.load nested object [Output-only] Statistics for a load job.
statistics.load.inputFileBytes long [Output-only] Number of bytes of source data in a load job.
statistics.load.inputFiles long [Output-only] Number of source files in a load job.
statistics.load.outputBytes long [Output-only] Size of the loaded data in bytes. Note that while a load job is in the running state, this value may change.
statistics.load.outputRows long [Output-only] Number of rows imported in a load job. Note that while a load job is in the running state, this value may change.
statistics.query nested object [Output-only] Statistics for a query job.
statistics.query.cacheHit boolean [Output-only] Whether the query result was fetched from the query cache.
statistics.query.totalBytesProcessed long [Output-only] Total bytes processed for this job.
statistics.startTime long [Output-only] Start time of this job, in milliseconds since the epoch. This field will be present when the job transitions from the PENDING state to either RUNNING or DONE.
statistics.totalBytesProcessed long [Output-only] [Deprecated] Use the bytes processed in the query statistics instead.
status nested object [Output-only] The status of this job. Examine this value when polling an asynchronous job to see if the job is complete.
status.errorResult nested object An object that will only be present if the job has failed.
status.errorResult.debugInfo string Debugging information. This property is internal to Google and should not be used.
status.errorResult.location string Specifies where the error occurred, if present.
status.errorResult.message string A human-readable description of the error.
status.errorResult.reason string A short error code that summarizes the error.
status.errors[] list [Output-only] All errors encountered during the running of the job. Errors here do not necessarily mean that the job has completed or was unsuccessful.
status.errors[].debugInfo string Debugging information. This property is internal to Google and should not be used.
status.errors[].location string Specifies where the error occurred, if present.
status.errors[].message string A human-readable description of the error.
status.errors[].reason string A short error code that summarizes the error.
status.state string [Output-only] Running state of the job.
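
The status and statistics properties above are what a client examines when polling an asynchronous job. The following is a minimal sketch of that polling loop, assuming the state values PENDING, RUNNING, and DONE mentioned in the property descriptions. The callable `fetch_job` is a hypothetical stand-in for whatever code retrieves the job resource (for example, through the get method); it is not part of any library.

```python
import time


def job_outcome(job):
    """Classify a job resource as 'succeeded', 'failed', or its lowercased state."""
    status = job.get("status", {})
    state = status.get("state")
    if state != "DONE":
        return (state or "UNKNOWN").lower()
    # errorResult is only present if the job has failed.
    return "failed" if "errorResult" in status else "succeeded"


def elapsed_ms(job):
    """Return endTime minus startTime in milliseconds, or None if either is missing."""
    stats = job.get("statistics", {})
    if "startTime" not in stats or "endTime" not in stats:
        return None
    # int() accepts both numbers and numeric strings.
    return int(stats["endTime"]) - int(stats["startTime"])


def wait_for_job(fetch_job, interval_s=1.0, max_polls=60, sleep=time.sleep):
    """Call fetch_job() until the job reports state DONE, then return the job."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status", {}).get("state") == "DONE":
            return job
        sleep(interval_s)
    raise TimeoutError("job did not reach DONE within the polling budget")


if __name__ == "__main__":
    # Simulated responses standing in for successive retrievals of one job.
    responses = iter([
        {"status": {"state": "PENDING"}},
        {"status": {"state": "RUNNING"}},
        {"status": {"state": "DONE"},
         "statistics": {"startTime": 1000, "endTime": 4500}},
    ])
    done = wait_for_job(lambda: next(responses), interval_s=0)
    print(job_outcome(done), elapsed_ms(done))
```

Note that entries in status.errors[] alone do not mark a job as failed, so the sketch checks only status.errorResult once the state is DONE.
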

Methods

The following methods are supported:

get
Retrieves the specified job by ID.
getQueryResults
Retrieves the results of a query job.
insert
Starts a new asynchronous job.
list
Lists all the Jobs in the specified project that were started by the user.
query
Runs a BigQuery SQL query synchronously and returns query results if the query completes within a specified timeout.
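
To show how the methods above might map onto HTTP requests, here is a small sketch that builds the verb and URL for each one. The base URL and path templates are assumptions drawn from the BigQuery v2 REST API and are not stated on this page, so verify them against the individual method pages before relying on them. The names `ROUTES` and `request_line` are hypothetical.

```python
from urllib.parse import quote

# Assumption: base URL and paths follow the BigQuery v2 REST API.
BASE_URL = "https://www.googleapis.com/bigquery/v2"

ROUTES = {
    "get": ("GET", "/projects/{projectId}/jobs/{jobId}"),
    "getQueryResults": ("GET", "/projects/{projectId}/queries/{jobId}"),
    "insert": ("POST", "/projects/{projectId}/jobs"),
    "list": ("GET", "/projects/{projectId}/jobs"),
    "query": ("POST", "/projects/{projectId}/queries"),
}


def request_line(method, **params):
    """Return (HTTP verb, URL) for a Jobs method, percent-encoding path parameters."""
    verb, template = ROUTES[method]
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return verb, BASE_URL + template.format(**encoded)


if __name__ == "__main__":
    # Hypothetical project and job IDs.
    print(request_line("get", projectId="my-project", jobId="job_123"))
    print(request_line("insert", projectId="my-project"))
```
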
