Machine Learning Glossary: Google Cloud

This page contains Google Cloud glossary terms.

A

accelerator chip

#GoogleCloud

A category of specialized hardware components designed to perform key computations needed for deep learning algorithms.

Accelerator chips (or just accelerators, for short) can significantly increase the speed and efficiency of training and inference tasks compared to a general-purpose CPU. They are ideal for training neural networks and similar computationally intensive tasks.

Examples of accelerator chips include:

  • Google's Tensor Processing Units (TPUs) with dedicated hardware for deep learning.
  • NVIDIA's GPUs, which were initially designed for graphics processing but support the parallel processing that can significantly increase processing speed.

B

batch inference

#TensorFlow
#GoogleCloud

The process of inferring predictions on multiple unlabeled examples divided into smaller subsets ("batches").

Batch inference can leverage the parallelization features of accelerator chips. That is, multiple accelerators can simultaneously infer predictions on different batches of unlabeled examples, dramatically increasing the number of inferences per second.
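As a minimal sketch of this idea, the following Python splits unlabeled examples into batches and hands each batch to a separate worker. The worker threads only stand in for accelerators, and predict_batch is a hypothetical placeholder for a real model, not part of any library.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(examples, batch_size):
    """Split a list of examples into consecutive batches."""
    return [examples[i:i + batch_size]
            for i in range(0, len(examples), batch_size)]


def predict_batch(batch):
    """Hypothetical stand-in for a model: predicts 1 when x > 0.5, else 0."""
    return [1 if x > 0.5 else 0 for x in batch]


def batch_inference(examples, batch_size, num_workers=2):
    """Run predict_batch on every batch, with batches processed concurrently."""
    batches = make_batches(examples, batch_size)
    # Each worker thread plays the role of one accelerator handling one batch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(predict_batch, batches))  # keeps batch order
    # Flatten the per-batch predictions back into one list.
    return [p for batch_preds in results for p in batch_preds]


unlabeled = [0.1, 0.9, 0.4, 0.7, 0.6, 0.2, 0.8]
predictions = batch_inference(unlabeled, batch_size=3)
```

Because the batches are independent of one another, adding workers raises throughput without changing the predictions.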

C

Cloud TPU

#TensorFlow
#GoogleCloud

A specialized hardware accelerator designed to speed up machine learning workloads on Google Cloud.

D

device

#TensorFlow
#GoogleCloud

An overloaded term with the following two possible definitions:

  1. A category of hardware that can run a TensorFlow session, including CPUs, GPUs, and TPUs.
  2. When training an ML model on accelerator chips (GPUs or TPUs), the part of the system that actually manipulates tensors and embeddings. The device runs on accelerator chips. In contrast, the host typically runs on a CPU.

H

host

#TensorFlow
#GoogleCloud

When training an ML model on accelerator chips (GPUs or TPUs), the part of the system that controls both of the following:

  • The overall flow of the code.
  • The extraction and transformation of the input pipeline.

The host typically runs on a CPU, not on an accelerator chip; the device manipulates tensors on the accelerator chips.
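The division of labor between host and device can be illustrated with a small Python sketch. Everything here runs on a CPU; the function names are hypothetical and serve only to mark which responsibilities belong to which side.

```python
def host_input_pipeline(raw_records):
    """Host side: extract and transform raw records into numeric features."""
    for record in raw_records:
        yield [float(field) for field in record.split(",")]


def device_step(weights, features):
    """Device side (stand-in): the tensor math, here a plain dot product."""
    return sum(w * x for w, x in zip(weights, features))


def run(raw_records, weights):
    """Host side: controls the overall flow and feeds each example to the device."""
    return [device_step(weights, features)
            for features in host_input_pipeline(raw_records)]


outputs = run(["1,2", "3,4"], weights=[0.5, 0.25])
```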

M

mesh

#TensorFlow
#GoogleCloud

In ML parallel programming, a term associated with assigning the data and model to TPU chips, and defining how these values will be sharded or replicated.

Mesh is an overloaded term that can mean either of the following:

  • A physical layout of TPU chips.
  • An abstract logical construct for mapping the data and model to the TPU chips.

In either case, a mesh is specified as a shape.
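To make "specified as a shape" concrete, here is a minimal Python sketch that arranges eight hypothetical chip IDs into a 2x4 logical mesh. The build_mesh helper and the choice of a two-dimensional shape are assumptions for illustration, not an API of any TPU library.

```python
def build_mesh(device_ids, shape):
    """Arrange a flat list of device IDs into a 2D grid of shape (rows, cols)."""
    rows, cols = shape
    if rows * cols != len(device_ids):
        raise ValueError("mesh shape must account for every device")
    return [device_ids[r * cols:(r + 1) * cols] for r in range(rows)]


devices = list(range(8))            # eight hypothetical TPU chips, numbered 0..7
mesh = build_mesh(devices, (2, 4))  # 2 rows x 4 columns
```

Under one possible convention (again an assumption), data could be sharded across the rows while the model is sharded across the columns of each row.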

S

shard

#TensorFlow
#GoogleCloud

A logical division of the training set or the model. Typically, some process creates shards by dividing the examples or parameters into (usually) equal-sized chunks. Each shard is then assigned to a different machine.

Sharding a model is called model parallelism; sharding data is called data parallelism.
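A minimal sketch of sharding in plain Python follows. The shard helper is hypothetical; it divides a list into contiguous chunks whose sizes differ by at most one, which is one simple way to get the "(usually) equal-sized" chunks described above.

```python
def shard(items, num_shards):
    """Divide items into num_shards contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        # The first `extra` shards each take one additional item.
        size = base + (1 if i < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards


# Data parallelism: divide the training examples across 4 machines.
data_shards = shard(list(range(10)), 4)

# Model parallelism: divide (hypothetical) parameter names across 2 machines.
param_shards = shard(["w1", "w2", "w3", "w4"], 2)
```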

T

Tensor Processing Unit (TPU)

#TensorFlow
#GoogleCloud

An application-specific integrated circuit (ASIC) that optimizes the performance of machine learning workloads. These ASICs are deployed as multiple TPU chips on a TPU device.

TPU

#TensorFlow
#GoogleCloud

Abbreviation for Tensor Processing Unit.

TPU chip

#TensorFlow
#GoogleCloud

A programmable linear algebra accelerator with on-chip high bandwidth memory that is optimized for machine learning workloads. Multiple TPU chips are deployed on a TPU device.

TPU device

#TensorFlow
#GoogleCloud

A printed circuit board (PCB) with multiple TPU chips, high bandwidth network interfaces, and system cooling hardware.

TPU master

#TensorFlow
#GoogleCloud

The central coordination process running on a host machine that exchanges data, results, programs, performance, and system health information with the TPU workers. The TPU master also manages the setup and shutdown of TPU devices.

TPU node

#TensorFlow
#GoogleCloud

A TPU resource on Google Cloud with a specific TPU type. The TPU node connects to your VPC Network from a peer VPC network. TPU nodes are a resource defined in the Cloud TPU API.

TPU Pod

#TensorFlow
#GoogleCloud

A specific configuration of TPU devices in a Google data center. All of the devices in a TPU Pod are connected to one another over a dedicated high-speed network. A TPU Pod is the largest configuration of TPU devices available for a specific TPU version.

TPU resource

#TensorFlow
#GoogleCloud

A TPU entity on Google Cloud that you create, manage, or consume. For example, TPU nodes and TPU types are TPU resources.

TPU slice

#TensorFlow
#GoogleCloud

A TPU slice is a fractional portion of the TPU devices in a TPU Pod. All of the devices in a TPU slice are connected to one another over a dedicated high-speed network.

TPU type

#TensorFlow
#GoogleCloud

A configuration of one or more TPU devices with a specific TPU hardware version. You select a TPU type when you create a TPU node on Google Cloud. For example, a v2-8 TPU type is a single TPU v2 device with 8 cores. A v3-2048 TPU type has 256 networked TPU v3 devices and a total of 2048 cores. TPU types are a resource defined in the Cloud TPU API.
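The arithmetic behind the two examples above can be sketched in Python. The parse_tpu_type helper is hypothetical, and the figure of 8 cores per device is an assumption inferred only from the v2-8 and v3-2048 examples; it may not hold for other TPU versions.

```python
CORES_PER_DEVICE = 8  # assumption inferred from the v2-8 and v3-2048 examples


def parse_tpu_type(tpu_type):
    """Split a name like 'v3-2048' into its version, core count, and device count."""
    version, cores_text = tpu_type.split("-")
    cores = int(cores_text)
    devices, remainder = divmod(cores, CORES_PER_DEVICE)
    if remainder:
        raise ValueError("core count is not a multiple of CORES_PER_DEVICE")
    return {"version": version, "cores": cores, "devices": devices}


small = parse_tpu_type("v2-8")
large = parse_tpu_type("v3-2048")
```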

TPU worker

#TensorFlow
#GoogleCloud

A process that runs on a host machine and executes machine learning programs on TPU devices.