Machine Learning Glossary: TensorFlow

This page contains TensorFlow glossary terms.

B

batch inference

#TensorFlow
#GoogleCloud

The process of inferring predictions on multiple unlabeled examples divided into smaller subsets ("batches").

Batch inference can leverage the parallelization features of accelerator chips. That is, multiple accelerators can simultaneously infer predictions on different batches of unlabeled examples, dramatically increasing the number of inferences per second.
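A minimal sketch of the idea, in plain Python rather than TensorFlow. The names make_batches, predict_batch, and batch_inference are hypothetical, the "model" is a trivial stand-in, and worker threads stand in for accelerators. The sketch shows the structure of batch inference, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(examples, batch_size):
    """Split a list of examples into consecutive batches of at most batch_size."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


def predict_batch(batch):
    """Hypothetical stand-in for a model: predicts 1 for positive inputs, else 0."""
    return [1 if x > 0 else 0 for x in batch]


def batch_inference(examples, batch_size, num_workers=2):
    """Run predict_batch on every batch, with several batches in flight at once."""
    batches = make_batches(examples, batch_size)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map preserves batch order, so predictions line up with the inputs.
        results = list(pool.map(predict_batch, batches))
    return [prediction for batch in results for prediction in batch]


unlabeled = [0.5, -1.2, 3.0, -0.1, 2.2]
predictions = batch_inference(unlabeled, batch_size=2)
print(predictions)
```

Each worker handles a different batch, which mirrors how multiple accelerators can work on different batches at the same time.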

C

Cloud TPU

#TensorFlow
#GoogleCloud

A specialized hardware accelerator designed to speed up machine learning workloads on Google Cloud Platform.

D

Dataset API (tf.data)

#TensorFlow

A high-level TensorFlow API for reading data and transforming it into a form that a machine learning algorithm requires. A tf.data.Dataset object represents a sequence of elements, in which each element contains one or more Tensors. A tf.data.Iterator object provides access to the elements of a Dataset.

For details about the Dataset API, see tf.data: Build TensorFlow input pipelines in the TensorFlow Programmer's Guide.

device

#TensorFlow
#GoogleCloud

An overloaded term with the following two possible definitions:

  1. A category of hardware that can run a TensorFlow session, including CPUs, GPUs, and TPUs.
  2. When training an ML model on accelerator chips (GPUs or TPUs), the part of the system that actually manipulates tensors and embeddings. The device runs on accelerator chips. In contrast, the host typically runs on a CPU.

E

eager execution

#TensorFlow

A TensorFlow programming environment in which operations run immediately. In contrast, operations called in graph execution don't run until they are explicitly evaluated. Eager execution is an imperative interface, much like the code in most programming languages. Eager execution programs are generally far easier to debug than graph execution programs.

Estimator

#TensorFlow

A deprecated TensorFlow API. Use tf.keras instead of Estimators.

F

feature engineering

#fundamentals
#TensorFlow

A process that involves the following steps:

  1. Determining which features might be useful in training a model.
  2. Converting raw data from the dataset into efficient versions of those features.

For example, you might determine that temperature is a useful feature. Then, you might experiment with bucketing to optimize what the model can learn from different temperature ranges.

Feature engineering is sometimes called feature extraction.
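The bucketing step in the temperature example can be sketched in plain Python as follows. The boundary values and the names bucketize and one_hot are assumptions chosen for illustration. In practice, you would choose boundaries by experimenting with your data.

```python
import bisect

# Hypothetical bucket boundaries in degrees Celsius.
BOUNDARIES = [0.0, 15.0, 30.0]


def bucketize(temperature, boundaries=BOUNDARIES):
    """Return the index of the bucket that contains temperature.

    Buckets are (-inf, 0), [0, 15), [15, 30), and [30, +inf).
    """
    return bisect.bisect_right(boundaries, temperature)


def one_hot(index, num_buckets):
    """Encode a bucket index as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_buckets)]


raw_temperatures = [-5.0, 12.5, 30.0, 41.0]
features = [one_hot(bucketize(t), len(BOUNDARIES) + 1) for t in raw_temperatures]
print(features)
```

The model then sees one categorical feature per temperature range instead of a single raw number.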

feature spec

#TensorFlow

Describes the information required to extract feature data from the tf.Example protocol buffer. Because the tf.Example protocol buffer is just a container for data, you must specify the following:

  • The data to extract (that is, the keys for the features)
  • The data type (for example, float or int)
  • The length (fixed or variable)
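The three pieces of information above can be illustrated with a plain-Python stand-in. This is not the TensorFlow API: the FeatureSpec class, the extract function, and the dictionary record are all hypothetical. In TensorFlow itself, fixed-length and variable-length features are described with tf.io.FixedLenFeature and tf.io.VarLenFeature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureSpec:
    """Hypothetical description of one feature to extract from a record."""
    dtype: type                   # the data type, for example float or int
    length: Optional[int] = None  # a fixed length, or None for variable length


def extract(record, spec):
    """Pull the features named in spec out of record, checking type and length."""
    features = {}
    for key, feature in spec.items():
        values = [feature.dtype(v) for v in record[key]]
        if feature.length is not None and len(values) != feature.length:
            raise ValueError(f"{key}: expected {feature.length} values, got {len(values)}")
        features[key] = values
    return features


spec = {
    "temperature": FeatureSpec(dtype=float, length=1),  # fixed length
    "visit_ids": FeatureSpec(dtype=int),                # variable length
}
record = {"temperature": [21], "visit_ids": [3, 7, 8], "unused": ["x"]}
features = extract(record, spec)
print(features)
```

Keys that the spec does not name, such as "unused" here, are ignored, because the record is only a container and the spec decides what to read from it.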

G

graph

#TensorFlow

In TensorFlow, a computation specification. Nodes in the graph represent operations. Edges are directed and represent passing the result of an operation (a Tensor) as an operand to another operation. Use TensorBoard to visualize a graph.

graph execution

#TensorFlow

A TensorFlow programming environment in which the program first constructs a graph and then executes all or part of that graph. Graph execution is the default execution mode in TensorFlow 1.x.

Contrast with eager execution.
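The difference between building a graph and executing it can be sketched without TensorFlow. The Node class and the helper functions below are hypothetical and only mimic the idea: nodes are operations, the inputs of a node are its incoming edges, and nothing runs until a node is evaluated.

```python
class Node:
    """An operation in the graph; its inputs are the edges that feed it."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def evaluate(self):
        # Evaluate upstream nodes first, then apply this node's operation.
        args = [node.evaluate() for node in self.inputs]
        return self.op(*args)


def constant(value):
    """A node with no inputs that always produces value."""
    return Node(lambda: value)


calls = []  # records when operations actually run


def add(x, y):
    calls.append("add")
    return x + y


def multiply(x, y):
    calls.append("multiply")
    return x * y


# Graph construction: nothing is computed yet.
total = Node(add, constant(2), constant(3))
product = Node(multiply, total, constant(4))
ran_during_construction = list(calls)

# Graph execution: evaluating a node runs the part of the graph it depends on.
result = product.evaluate()
print(ran_during_construction, result, calls)
```

Under eager execution, by contrast, the calls to add and multiply would run as soon as they were written.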

H

host

#TensorFlow
#GoogleCloud

When training an ML model on accelerator chips (GPUs or TPUs), the part of the system that controls both of the following:

  • The overall flow of the code.
  • The extraction and transformation of the input pipeline.

The host typically runs on a CPU, not on an accelerator chip; the device manipulates tensors on the accelerator chips.

L

Layers API (tf.layers)

#TensorFlow

A TensorFlow API for constructing a deep neural network as a composition of layers. The Layers API enables you to build different types of layers, such as fully connected layers and convolutional layers.

The Layers API follows the Keras layers API conventions. That is, aside from a different prefix, all functions in the Layers API have the same names and signatures as their counterparts in the Keras layers API.

M

mesh

#TensorFlow
#GoogleCloud

In ML parallel programming, a term associated with assigning the data and model to TPU chips, and defining how these values will be sharded or replicated.

Mesh is an overloaded term that can mean either of the following:

  • A physical layout of TPU chips.
  • An abstract logical construct for mapping the data and model to the TPU chips.

In either case, a mesh is specified as a shape.
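As a rough illustration of a logical mesh specified as a shape, the following plain-Python sketch assigns one chip to each coordinate of the mesh. The logical_mesh function and the chip names are hypothetical and do not correspond to any particular library.

```python
import itertools


def logical_mesh(shape, chips):
    """Map each mesh coordinate to one chip; shape is a tuple such as (2, 4)."""
    coordinates = list(itertools.product(*(range(dim) for dim in shape)))
    if len(coordinates) != len(chips):
        raise ValueError("mesh shape does not match the number of chips")
    return dict(zip(coordinates, chips))


chips = [f"chip{i}" for i in range(8)]  # hypothetical chip names
mesh = logical_mesh((2, 4), chips)
print(mesh[(1, 0)])
```

With a mapping like this, one mesh axis could be used to shard the data and the other to shard or replicate the model.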

metric

#TensorFlow

A statistic that you care about.

An objective is a metric that a machine learning system tries to optimize.

N

node (TensorFlow graph)

#TensorFlow

An operation in a TensorFlow graph.

O

operation (op)

#TensorFlow

In TensorFlow, any procedure that creates, manipulates, or destroys a Tensor. For example, a matrix multiply is an operation that takes two Tensors as input and generates one Tensor as output.

P

Parameter Server (PS)

#TensorFlow

A job that keeps track of a model's parameters in a distributed setting.

Q

queue

#TensorFlow

A TensorFlow Operation that implements a queue data structure. Typically used in I/O.

R

rank (Tensor)

#TensorFlow

The number of dimensions in a Tensor. For instance, a scalar has rank 0, a vector has rank 1, and a matrix has rank 2.

Not to be confused with rank (ordinality).
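A small sketch of rank, using nested Python lists as a stand-in for Tensors. The rank function is hypothetical and assumes the nesting is regular, so it inspects only the first element at each level.

```python
def rank(value):
    """Return the number of dimensions of a value stored as nested lists."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        if not value:  # an empty list still counts as one dimension
            break
        value = value[0]
    return depth


scalar = 7
vector = [1, 2, 3]
matrix = [[1, 2], [3, 4]]
print(rank(scalar), rank(vector), rank(matrix))
```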

root directory

#TensorFlow

The directory you specify for hosting subdirectories of the TensorFlow checkpoint and events files of multiple models.

S

SavedModel

#TensorFlow

The recommended format for saving and recovering TensorFlow models. SavedModel is a language-neutral, recoverable serialization format, which enables higher-level systems and tools to produce, consume, and transform TensorFlow models.

See the Saving and Restoring chapter in the TensorFlow Programmer's Guide for complete details.

Saver

#TensorFlow

A TensorFlow object responsible for saving model checkpoints.

shard

#TensorFlow
#GoogleCloud

A logical division of the training set or the model. Typically, some process creates shards by dividing the examples or parameters into (usually) equal-sized chunks. Each shard is then assigned to a different machine.

Sharding a model is called model parallelism; sharding data is called data parallelism.
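The division into (usually) equal-sized chunks can be sketched as follows. The shard function is a hypothetical helper; here it splits a list of examples, which corresponds to data parallelism.

```python
def shard(items, num_shards):
    """Divide items into num_shards contiguous chunks of nearly equal size."""
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for index in range(num_shards):
        # The first `extra` shards take one additional item.
        end = start + base + (1 if index < extra else 0)
        shards.append(items[start:end])
        start = end
    return shards


examples = list(range(10))
shards = shard(examples, 3)
print(shards)
```

Each resulting chunk would then be assigned to a different machine.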

summary

#TensorFlow

In TensorFlow, a value or set of values calculated at a particular step, usually used for tracking model metrics during training.

T

Tensor

#TensorFlow

The primary data structure in TensorFlow programs. Tensors are N-dimensional (where N could be very large) data structures, most commonly scalars, vectors, or matrices. The elements of a Tensor can hold integer, floating-point, or string values.

TensorBoard

#TensorFlow

The dashboard that displays the summaries saved during the execution of one or more TensorFlow programs.

TensorFlow

#TensorFlow

A large-scale, distributed, machine learning platform. The term also refers to the base API layer in the TensorFlow stack, which supports general computation on dataflow graphs.

Although TensorFlow is primarily used for machine learning, you may also use TensorFlow for non-ML tasks that require numerical computation using dataflow graphs.

TensorFlow Playground

#TensorFlow

A program that visualizes how different hyperparameters influence model (primarily neural network) training. Go to http://playground.tensorflow.org to experiment with TensorFlow Playground.

TensorFlow Serving

#TensorFlow

A platform to deploy trained models in production.

Tensor Processing Unit (TPU)

#TensorFlow
#GoogleCloud

An application-specific integrated circuit (ASIC) that optimizes the performance of machine learning workloads. These ASICs are deployed as multiple TPU chips on a TPU device.

Tensor rank

#TensorFlow

See rank (Tensor).

Tensor shape

#TensorFlow

The number of elements a Tensor contains in various dimensions. For example, a [5, 10] Tensor has a shape of 5 in one dimension and 10 in another.

Tensor size

#TensorFlow

The total number of scalars a Tensor contains. For example, a [5, 10] Tensor has a size of 50.
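Shape and size can be computed for the [5, 10] example with nested Python lists standing in for a Tensor. The shape and size functions are hypothetical helpers that assume regular nesting.

```python
import math


def shape(value):
    """Return the length of each dimension of a value stored as nested lists."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        if not value:
            break
        value = value[0]
    return dims


def size(value):
    """Return the total number of scalars: the product of the shape."""
    return math.prod(shape(value))


tensor = [[0] * 10 for _ in range(5)]
print(shape(tensor), size(tensor))
```

A scalar has an empty shape, and the product of an empty shape is 1, so its size is 1.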

tf.Example

#TensorFlow

A standard protocol buffer for describing input data for machine learning model training or inference.

tf.keras

#TensorFlow

An implementation of Keras integrated into TensorFlow.

TPU

#TensorFlow
#GoogleCloud

Abbreviation for Tensor Processing Unit.

TPU chip

#TensorFlow
#GoogleCloud

A programmable linear algebra accelerator with on-chip high bandwidth memory that is optimized for machine learning workloads. Multiple TPU chips are deployed on a TPU device.

TPU device

#TensorFlow
#GoogleCloud

A printed circuit board (PCB) with multiple TPU chips, high bandwidth network interfaces, and system cooling hardware.

TPU master

#TensorFlow
#GoogleCloud

The central coordination process running on a host machine that exchanges data, results, programs, performance, and system health information with the TPU workers. The TPU master also manages the setup and shutdown of TPU devices.

TPU node

#TensorFlow
#GoogleCloud

A TPU resource on Google Cloud Platform with a specific TPU type. The TPU node connects to your VPC Network from a peer VPC network. TPU nodes are a resource defined in the Cloud TPU API.

TPU Pod

#TensorFlow
#GoogleCloud

A specific configuration of TPU devices in a Google data center. All of the devices in a TPU Pod are connected to one another over a dedicated high-speed network. A TPU Pod is the largest configuration of TPU devices available for a specific TPU version.

TPU resource

#TensorFlow
#GoogleCloud

A TPU entity on Google Cloud Platform that you create, manage, or consume. For example, TPU nodes and TPU types are TPU resources.

TPU slice

#TensorFlow
#GoogleCloud

A TPU slice is a fractional portion of the TPU devices in a TPU Pod. All of the devices in a TPU slice are connected to one another over a dedicated high-speed network.

TPU type

#TensorFlow
#GoogleCloud

A configuration of one or more TPU devices with a specific TPU hardware version. You select a TPU type when you create a TPU node on Google Cloud Platform. For example, a v2-8 TPU type is a single TPU v2 device with 8 cores. A v3-2048 TPU type has 256 networked TPU v3 devices and a total of 2048 cores. TPU types are a resource defined in the Cloud TPU API.
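The arithmetic behind the two examples can be checked with a short sketch. The parse_tpu_type function is hypothetical, and the figure of 8 cores per device is an assumption inferred from the v2-8 and v3-2048 examples; it might not hold for other TPU versions.

```python
CORES_PER_DEVICE = 8  # assumption inferred from the v2-8 and v3-2048 examples


def parse_tpu_type(tpu_type):
    """Split a TPU type such as 'v3-2048' into version, cores, and devices."""
    version, cores = tpu_type.split("-")
    cores = int(cores)
    return {
        "version": version,
        "cores": cores,
        "devices": cores // CORES_PER_DEVICE,
    }


print(parse_tpu_type("v2-8"))
print(parse_tpu_type("v3-2048"))
```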

TPU worker

#TensorFlow
#GoogleCloud

A process that runs on a host machine and executes machine learning programs on TPU devices.