Attention: This MediaPipe Solutions Preview is an early release.

tasks-vision package

Classes

Class	Description
DrawingUtils	Helper class to visualize the result of a MediaPipe Vision task.
FaceDetector	Performs face detection on images.
FaceLandmarker	Performs face landmarks detection on images. This API expects a pre-trained face landmarker model asset bundle.
FaceStylizer	Performs face stylization on images.
FilesetResolver	Resolves the files required for the MediaPipe Task APIs. This class verifies whether SIMD is supported in the current environment and loads the SIMD files only if support is detected. The returned filesets require that the Wasm files are published without renaming. If this is not possible, you can invoke the MediaPipe Tasks APIs using a manually created `WasmFileset`.
GestureRecognizer	Performs hand gesture recognition on images.
HandLandmarker	Performs hand landmarks detection on images.
HolisticLandmarker	Performs holistic landmarks detection on images.
ImageClassifier	Performs classification on images.
ImageEmbedder	Performs embedding extraction on images.
ImageSegmenter	Performs image segmentation on images.
ImageSegmenterResult	The output result of ImageSegmenter.
InteractiveSegmenter	Performs interactive segmentation on images. Users can represent user interaction through `RegionOfInterest`, which gives a hint to InteractiveSegmenter to perform segmentation focusing on the given region of interest. The API expects a TFLite model with mandatory TFLite Model Metadata. Input tensor (kTfLiteUInt8/kTfLiteFloat32): image input of size `[batch x height x width x channels]`; batch inference is not supported (`batch` is required to be 1); RGB inputs are supported (`channels` is required to be 3); if the type is kTfLiteFloat32, NormalizationOptions are required to be attached to the metadata for input normalization. Output tensors (kTfLiteUInt8/kTfLiteFloat32): list of segmented masks; if `output_type` is CATEGORY_MASK, a uint8 Image, Image vector of size 1; if `output_type` is CONFIDENCE_MASK, a float32 Image list of size `channels`; batch is always 1.
InteractiveSegmenterResult	The output result of InteractiveSegmenter.
MPImage	The wrapper class for MediaPipe Image objects. Images are stored as `ImageData`, `ImageBitmap` or `WebGLTexture` objects. You can convert the underlying type to any other type by passing the desired type to `getAs...()`. As type conversions can be expensive, it is recommended to limit these conversions. You can verify what underlying types are already available by invoking `has...()`. Images that are returned from a MediaPipe Task are owned by the underlying C++ Task. If you need to extend the lifetime of these objects, you can invoke the `clone()` method. To free up the resources obtained during any clone or type conversion operation, it is important to invoke `close()` on the `MPImage` instance. Converting to and from ImageBitmap requires that the MediaPipe task is initialized with an `OffscreenCanvas`. As we require WebGL2 support, this places some limitations on browser support as outlined here: https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas/getContext
MPMask	The wrapper class for MediaPipe segmentation masks. Masks are stored as `Uint8Array`, `Float32Array` or `WebGLTexture` objects. You can convert the underlying type to any other type by passing the desired type to `getAs...()`. As type conversions can be expensive, it is recommended to limit these conversions. You can verify what underlying types are already available by invoking `has...()`. Masks that are returned from a MediaPipe Task are owned by the underlying C++ Task. If you need to extend the lifetime of these objects, you can invoke the `clone()` method. To free up the resources obtained during any clone or type conversion operation, it is important to invoke `close()` on the `MPMask` instance.
ObjectDetector	Performs object detection on images.
PoseLandmarker	Performs pose landmarks detection on images.
PoseLandmarkerResult	Represents the pose landmarks detection results generated by `PoseLandmarker`. Each vector element represents a single pose detected in the image.

Interfaces

Interface	Description
BoundingBox	An integer bounding box, axis aligned.
Category	A classification category.
Classifications	Classification results for a given classifier head.
Detection	Represents one detection by a detection task.
DetectionResult	Detection results of a model.
DrawingOptions	Options for customizing the drawing routines
Embedding	List of embeddings with an optional timestamp. One and only one of the two 'floatEmbedding' and 'quantizedEmbedding' will contain data, based on whether or not the embedder was configured to perform scalar quantization.
FaceDetectorOptions	Options to configure the MediaPipe Face Detector Task
FaceLandmarkerOptions	Options to configure the MediaPipe FaceLandmarker Task
FaceLandmarkerResult	Represents the face landmarks detection results generated by `FaceLandmarker`.
FaceStylizerOptions	Options to configure the MediaPipe Face Stylizer Task
GestureRecognizerOptions	Options to configure the MediaPipe Gesture Recognizer Task
GestureRecognizerResult	Represents the gesture recognition results generated by `GestureRecognizer`.
HandLandmarkerOptions	Options to configure the MediaPipe HandLandmarker Task
HandLandmarkerResult	Represents the hand landmarks detection results generated by `HandLandmarker`.
HolisticLandmarkerOptions	Options to configure the MediaPipe HolisticLandmarker Task
HolisticLandmarkerResult	Represents the holistic landmarks detection results generated by `HolisticLandmarker`.
ImageClassifierOptions	Options to configure the MediaPipe Image Classifier Task.
ImageClassifierResult	Classification results of a model.
ImageEmbedderOptions	Options for configuring a MediaPipe Image Embedder task.
ImageEmbedderResult	Embedding results for a given embedder model.
ImageSegmenterOptions	Options to configure the MediaPipe Image Segmenter Task
InteractiveSegmenterOptions	Options to configure the MediaPipe Interactive Segmenter Task
Landmark	Landmark represents a point in 3D space with x, y, z coordinates. The landmark coordinates are in meters. z represents the landmark depth, and the smaller the value the closer the world landmark is to the camera.
LandmarkData	Data that a user can use to specialize drawing options.
NormalizedLandmark	Normalized Landmark represents a point in 3D space with x, y, z coordinates. x and y are normalized to [0.0, 1.0] by the image width and height respectively. z represents the landmark depth, and the smaller the value the closer the landmark is to the camera. The magnitude of z uses roughly the same scale as x.
ObjectDetectorOptions	Options to configure the MediaPipe Object Detector Task
PoseLandmarkerOptions	Options to configure the MediaPipe PoseLandmarker Task
RegionOfInterest	A Region-Of-Interest (ROI) to represent a region within an image.
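
The `NormalizedLandmark` description above says that x and y are normalized to [0.0, 1.0] by the image width and height. A minimal sketch of mapping such a point back to pixel coordinates follows. The `NormalizedPoint` interface and the `toPixelCoordinates` helper are hypothetical names introduced for illustration and are not part of the package.

```typescript
// Hypothetical stand-in that mirrors the documented x, y, z fields of
// NormalizedLandmark; it is not imported from the package.
interface NormalizedPoint {
  x: number;
  y: number;
  z: number;
}

// Hypothetical helper: multiplies normalized x by the image width and
// normalized y by the image height to recover pixel coordinates.
function toPixelCoordinates(
  landmark: NormalizedPoint,
  imageWidth: number,
  imageHeight: number
): { px: number; py: number } {
  return {
    px: landmark.x * imageWidth,
    py: landmark.y * imageHeight,
  };
}

// A point at the horizontal center and a quarter of the way down a 640x480 image.
const center = toPixelCoordinates({ x: 0.5, y: 0.25, z: 0 }, 640, 480);
console.log(center.px, center.py);
```

The z value is left untouched here because the table only states that its magnitude uses roughly the same scale as x.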

Variables

Variable	Description
DEFAULT_CATEGORY_TO_COLOR_MAP	A color map with 22 classes. Used in our demos.

Type Aliases

Type Alias	Description
Callback	A user-defined callback to take input data and map it to a custom output value.
CategoryToColorMap	A category to color mapping that uses either a map or an array to assign category indexes to RGBA colors.
FaceStylizerCallback	A callback that receives an `MPImage` object from the face stylizer, or `null` if no face was detected. The lifetime of the underlying data is limited to the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns (via `image.clone()`).
HolisticLandmarkerCallback	A callback that receives the result from the holistic landmarker detection. The returned results are only valid for the duration of the callback. If asynchronous processing is needed, the masks need to be copied before the callback returns.
ImageSegmenterCallback	A callback that receives the computed masks from the image segmenter. The returned data is only valid for the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns.
ImageSource	Valid types of image sources which we can run our GraphRunner over.
InteractiveSegmenterCallback	A callback that receives the computed masks from the interactive segmenter. The returned data is only valid for the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns.
PoseLandmarkerCallback	A callback that receives the result from the pose detector. The returned masks are only valid for the duration of the callback. If asynchronous processing is needed, the masks need to be copied before the callback returns.
RGBAColor	A four channel color with values for red, green, blue and alpha respectively.

DEFAULT_CATEGORY_TO_COLOR_MAP

A color map with 22 classes. Used in our demos.

Signature:

DEFAULT_CATEGORY_TO_COLOR_MAP: number[][]

Callback

A user-defined callback to take input data and map it to a custom output value.

Signature:

export declare type Callback<I, O> = (input: I) => O;
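
As an illustration of how the two type parameters behave, the sketch below redeclares the alias locally so that it runs without the package. The `applyCallback` helper and the `countItems` callback are hypothetical and exist only for this example.

```typescript
// Local copy of the alias from the signature above, so the sketch is self-contained.
type Callback<I, O> = (input: I) => O;

// Hypothetical helper: passes the input to the callback and returns its output.
function applyCallback<I, O>(input: I, callback: Callback<I, O>): O {
  return callback(input);
}

// I is string[] and O is number: the callback maps a list of labels to a count.
const countItems: Callback<string[], number> = (labels) => labels.length;

const count = applyCallback(["cat", "dog"], countItems);
console.log(count);
```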

CategoryToColorMap

A category to color mapping that uses either a map or an array to assign category indexes to RGBA colors.

Signature:

export declare type CategoryToColorMap = Map<number, RGBAColor> | RGBAColor[];
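
Because the alias is a union of a `Map` and an array, code that reads from it has to handle both shapes. The following is a minimal sketch of one way to do that. The `lookupColor` helper is a hypothetical name, and both aliases are redeclared locally so the block runs on its own.

```typescript
// Local copies of the aliases documented on this page.
type RGBAColor = [number, number, number, number] | number[];
type CategoryToColorMap = Map<number, RGBAColor> | RGBAColor[];

// Hypothetical helper: returns the color for a category index, or undefined
// when the map or array has no entry for that index.
function lookupColor(
  colors: CategoryToColorMap,
  category: number
): RGBAColor | undefined {
  return colors instanceof Map ? colors.get(category) : colors[category];
}

// The same mapping expressed as an array (position is the category index)...
const asArray: CategoryToColorMap = [
  [0, 0, 0, 255],
  [255, 0, 0, 255],
];

// ...and as a sparse Map that only defines category 1.
const asMap: CategoryToColorMap = new Map<number, RGBAColor>([
  [1, [255, 0, 0, 255]],
]);

console.log(lookupColor(asArray, 1), lookupColor(asMap, 1), lookupColor(asMap, 0));
```

The `Map` form is convenient when only a few category indexes need colors, while the array form suits dense, zero-based category indexes.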

FaceStylizerCallback

A callback that receives an MPImage object from the face stylizer, or null if no face was detected. The lifetime of the underlying data is limited to the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns (via image.clone()).

Signature:

export declare type FaceStylizerCallback = (image: MPImage | null) => void;
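
The lifetime rule above can be sketched as follows. To keep the block runnable without the package, it uses a hypothetical `ImageLike` interface that models only the `clone()` and `close()` methods this page documents for `MPImage`, plus a `FakeImage` class that stands in for an image owned by the task. None of these names come from the package.

```typescript
// Hypothetical stand-in for MPImage: only the documented clone() and close().
interface ImageLike {
  clone(): ImageLike;
  close(): void;
}

// Same shape as FaceStylizerCallback, written against the stand-in type.
type StylizerCallback = (image: ImageLike | null) => void;

// Holds the copy that should outlive the callback.
const store: { image: ImageLike | null } = { image: null };

const keepStylizedImage: StylizerCallback = (image) => {
  if (image === null) {
    return; // No face was detected, so there is nothing to keep.
  }
  store.image?.close();        // Release the previously retained copy, if any.
  store.image = image.clone(); // Copy before the callback returns.
};

// Fake task-owned image, used only to exercise the callback in this sketch.
class FakeImage implements ImageLike {
  closed = false;
  clone(): FakeImage {
    return new FakeImage();
  }
  close(): void {
    this.closed = true;
  }
}

const taskOwned = new FakeImage();
keepStylizedImage(taskOwned);
keepStylizedImage(null); // A frame without a face leaves the retained copy in place.
```

The retained copy is closed only when it is replaced, which follows the note in the `MPImage` description that resources obtained through a clone are freed by calling `close()`.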

HolisticLandmarkerCallback

A callback that receives the result from the holistic landmarker detection. The returned results are only valid for the duration of the callback. If asynchronous processing is needed, the masks need to be copied before the callback returns.

Signature:

export declare type HolisticLandmarkerCallback = (result: HolisticLandmarkerResult) => void;

ImageSegmenterCallback

A callback that receives the computed masks from the image segmenter. The returned data is only valid for the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns.

Signature:

export declare type ImageSegmenterCallback = (result: ImageSegmenterResult) => void;
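
One way to copy mask data before the callback returns is to duplicate the underlying typed array. The sketch below assumes a result object that exposes an optional `confidenceMasks` list whose entries offer `getAsFloat32Array()`, following the `getAs...()` pattern described for `MPMask`. The `MaskLike` and `SegmenterResultLike` interfaces are hypothetical stand-ins, so check the actual result type before relying on these field names.

```typescript
// Hypothetical stand-ins; the field and method names are assumptions.
interface MaskLike {
  getAsFloat32Array(): Float32Array;
}
interface SegmenterResultLike {
  confidenceMasks?: MaskLike[];
}

// Copies that remain valid after the callback has returned.
const savedMasks: Float32Array[] = [];

const copyMasks = (result: SegmenterResultLike): void => {
  for (const mask of result.confidenceMasks ?? []) {
    // slice() allocates a new buffer, so the copy no longer shares memory
    // with the array handed out by the task.
    savedMasks.push(mask.getAsFloat32Array().slice());
  }
};

// Simulated task-owned buffer, used only to exercise the callback.
const taskBuffer = new Float32Array([0.25, 0.75]);
copyMasks({ confidenceMasks: [{ getAsFloat32Array: () => taskBuffer }] });

// Simulate the task reusing its memory once the callback has returned.
taskBuffer.fill(0);
console.log(savedMasks[0]);
```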

ImageSource

Valid types of image sources which we can run our GraphRunner over.

Signature:

export declare type ImageSource = HTMLCanvasElement | HTMLVideoElement | HTMLImageElement | ImageData | ImageBitmap | VideoFrame;

InteractiveSegmenterCallback

A callback that receives the computed masks from the interactive segmenter. The returned data is only valid for the duration of the callback. If asynchronous processing is needed, all data needs to be copied before the callback returns.

Signature:

export declare type InteractiveSegmenterCallback = (result: InteractiveSegmenterResult) => void;

PoseLandmarkerCallback

A callback that receives the result from the pose detector. The returned masks are only valid for the duration of the callback. If asynchronous processing is needed, the masks need to be copied before the callback returns.

Signature:

export declare type PoseLandmarkerCallback = (result: PoseLandmarkerResult) => void;

RGBAColor

A four channel color with values for red, green, blue and alpha respectively.

Signature:

export declare type RGBAColor = [
    number,
    number,
    number,
    number
] | number[];
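
Because the alias also accepts a plain `number[]`, the type system alone does not guarantee that a value has exactly four channels. A minimal sketch of a runtime check follows. The `hasFourChannels` type guard is a hypothetical helper, not part of the package.

```typescript
// Local copy of the alias from the signature above.
type RGBAColor = [number, number, number, number] | number[];

// Hypothetical type guard: narrows an RGBAColor to the four-element tuple form.
function hasFourChannels(
  color: RGBAColor
): color is [number, number, number, number] {
  return color.length === 4;
}

const opaqueRed: RGBAColor = [255, 0, 0, 255];
const missingAlpha: RGBAColor = [255, 0, 0];

console.log(hasFourChannels(opaqueRed), hasFourChannels(missingAlpha));
```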