The MediaPipe Interactive Image Segmenter task takes a location in an image, estimates the boundaries of an object at that location, and returns the segmentation for the object as image data. These instructions show you how to use the Interactive Image Segmenter with the Python language. For more information about the capabilities, models, and configuration options of this task, see the Overview.
The example code for Interactive Image Segmenter provides a complete implementation of this task in Python for your reference. This code helps you test this task and get started on building your own interactive image segmentation application. You can view, run, and edit the Interactive Image Segmenter example code using just your web browser.
This section describes key steps for setting up your development environment and code projects specifically to use Interactive Image Segmenter. For general information on setting up your development environment for using MediaPipe tasks, including platform version requirements, see the Setup guide for Python. You can review the source code for this example on GitHub.
The MediaPipe Interactive Image Segmenter task requires the mediapipe package. You can install the required dependencies with the following command:

```shell
$ python -m pip install mediapipe
```
Import the following classes to access the Interactive Image Segmenter task functions:
```python
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
```
The MediaPipe Interactive Image Segmenter task requires a trained model that is compatible with this task. For more information on available trained models for Interactive Image Segmenter, see the task overview Models section.
Select and download the model, and then store it within your project directory:
```python
model_path = '/absolute/path/to/model.tflite'
```
Specify the path of the model within the model_asset_path parameter, as shown below:

```python
base_options = BaseOptions(model_asset_path=model_path)
```
Create the task
The MediaPipe Interactive Image Segmenter task uses the create_from_options function to set up the task. The create_from_options function accepts values for the configuration options to handle. For more information on configuration options, see Configuration options.
The following code demonstrates how to build and configure this task.
```python
BaseOptions = mp.tasks.BaseOptions
InteractiveSegmenter = mp.tasks.vision.InteractiveSegmenter
InteractiveSegmenterOptions = mp.tasks.vision.InteractiveSegmenterOptions
VisionRunningMode = mp.tasks.vision.RunningMode

# Create an image segmenter instance with the image mode:
options = InteractiveSegmenterOptions(
    base_options=BaseOptions(model_asset_path='/path/to/model.task'),
    running_mode=VisionRunningMode.IMAGE,
    output_category_mask=True)
with InteractiveSegmenter.create_from_options(options) as segmenter:
    # segmenter is initialized and ready to use
```
This task has the following configuration options for Python applications:
| Option Name | Description | Value Range | Default Value |
| --- | --- | --- | --- |
| `output_category_mask` | If set to `True`, the output includes a segmentation mask as a uint8 image, where each pixel value indicates whether the pixel is part of the object located at the area of interest. | Boolean | `False` |
| `output_confidence_masks` | If set to `True`, the output includes a segmentation mask as a float value image, where each float value represents the confidence that the pixel is part of the object located at the area of interest. | Boolean | `True` |
| `display_names_locale` | Sets the language of labels to use for display names provided in the metadata of the task's model, if available. Default is `en` for English. | Locale code | `en` |
Prepare your input as an image file or a numpy array, then convert it to a mediapipe.Image object.
```python
# Load the input image from an image file.
mp_image = mp.Image.create_from_file('/path/to/image')

# Load the input image from a numpy array.
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=numpy_image)
```
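For the numpy path, the array is expected to be a height x width x 3 uint8 array in RGB channel order when using mp.ImageFormat.SRGB. A minimal sketch with a synthetic image (the image dimensions here are arbitrary):

```python
import numpy as np

# Build a synthetic 480x640 RGB image as a numpy array.
# mp.ImageFormat.SRGB expects a height x width x 3 uint8 array.
numpy_image = np.zeros((480, 640, 3), dtype=np.uint8)
numpy_image[100:300, 200:400] = (255, 0, 0)  # draw a red rectangle

# The array can then be wrapped for the task, e.g.:
# mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=numpy_image)
```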
For a code example showing preparation of data for Interactive Image Segmenter, see the code example.
Run the task
The Interactive Image Segmenter uses the segment function to trigger inferences. For image segmentation, this includes preprocessing the input data, running the segmentation model, and postprocessing the raw model outputs into the segmented masks.
The following code example shows how to execute processing with the task model.
```python
RegionOfInterest = vision.InteractiveSegmenterRegionOfInterest

# Perform image segmentation on the provided single image.
# The image segmenter must be created with the image mode.
roi = RegionOfInterest(format=RegionOfInterest.Format.KEYPOINT,
                       keypoint=NormalizedKeypoint(x, y))
segmented_masks = segmenter.segment(mp_image, roi)
For a more complete example of running Interactive Image Segmenter inferences, see the code example.
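The keypoint passed in the region of interest uses normalized coordinates in the [0, 1] range rather than pixel coordinates. A minimal, hypothetical helper for converting a pixel location into normalized coordinates might look like this (pixel_to_normalized is not part of the MediaPipe API):

```python
def pixel_to_normalized(px, py, image_width, image_height):
    """Convert a pixel coordinate to the normalized [0, 1] range
    expected by NormalizedKeypoint."""
    if not (0 <= px < image_width and 0 <= py < image_height):
        raise ValueError('pixel coordinate is outside the image bounds')
    return px / image_width, py / image_height

# The pixel (320, 240) in a 640x480 image maps to (0.5, 0.5).
x, y = pixel_to_normalized(320, 240, 640, 480)
```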
Handle and display results
The output of Interactive Image Segmenter is a list of Image data, and may include a category mask, confidence masks, or both, depending on what you set when you configured the task. If you set output_category_mask to True, the output is a list containing a single segmented mask as a uint8 image. Each pixel value indicates whether the pixel is part of the object at the area of interest. If you set output_confidence_masks to True, the output is a list of channels containing pixel values within the range [0, 1], representing the confidence score that the pixel belongs to the object at the area of interest.
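As a sketch of one way to post-process the output, a float confidence mask can be thresholded into a binary category-style mask with numpy. The 0.5 cutoff below is an arbitrary assumption, not a MediaPipe default:

```python
import numpy as np

def confidence_to_binary_mask(confidence_mask, threshold=0.5):
    """Threshold a float confidence mask in [0, 1] into a uint8
    binary mask (255 = part of the object, 0 = background)."""
    return np.where(confidence_mask >= threshold, 255, 0).astype(np.uint8)

# A tiny 2x2 confidence mask for illustration.
conf = np.array([[0.1, 0.6], [0.9, 0.4]], dtype=np.float32)
binary = confidence_to_binary_mask(conf)  # [[0, 255], [255, 0]]
```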
The following sections further explain the output data from this task:
The following images show a visualization of the task output for a category mask with a point area of interest indicated. Each pixel is a value indicating whether the pixel is part of the object located at the area of interest. The black and white circle on the second image indicates the selected area of interest.
Original image and category mask output. Source image from the Pascal VOC 2012 dataset.
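One way to produce a visualization like this is to blend a highlight color over the masked pixels with numpy. The sketch below assumes an RGB uint8 image and a uint8 category mask of the same height and width; the green color and 0.5 blend weight are arbitrary choices:

```python
import numpy as np

def overlay_category_mask(image, category_mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` over the pixels where `category_mask` is nonzero.

    image: HxWx3 uint8 RGB image.
    category_mask: HxW uint8 mask (nonzero = object).
    """
    blended = image.astype(np.float32).copy()
    selected = category_mask > 0
    blended[selected] = ((1 - alpha) * blended[selected]
                         + alpha * np.array(color, dtype=np.float32))
    return blended.astype(np.uint8)

# Highlight the masked pixels of a tiny gray image.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
highlighted = overlay_category_mask(image, mask)
```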
The output for a confidence mask contains float values in the range [0, 1] for each image input channel. Higher values indicate a higher confidence that the image pixel is part of the object located at the area of interest.