Label images with a custom model on iOS

You can use ML Kit to recognize entities in an image and label them. This API supports a wide range of custom image classification models. Refer to Custom models with ML Kit for guidance on model compatibility requirements, where to find pre-trained models, and how to train your own models.

See the ML Kit quickstart sample on GitHub for an example of this API in use.

Before you begin

  1. Include the following ML Kit library in your Podfile (a complete Podfile sketch follows this list):
    pod 'GoogleMLKit/ImageLabelingCustom'
  2. After you install or update your project's Pods, open your Xcode project using its .xcworkspace. ML Kit is supported in Xcode version 11.3.1 or higher.
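
If you are setting up CocoaPods from scratch, a minimal Podfile sketch might look like the following; the target name and iOS deployment target are placeholders, so match them to your project and the ML Kit release you install.

# Minimal Podfile sketch. 'YourApp' and the platform version are placeholders.
platform :ios, '12.0'

target 'YourApp' do
  pod 'GoogleMLKit/ImageLabelingCustom'
end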

1. Bundle model with your app

To bundle your TensorFlow Lite model with your app, copy the model file (usually ending in .tflite or .lite) to your Xcode project, taking care to select Copy bundle resources when you do so. The model file will be included in the app bundle and available to ML Kit.
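
Later, when you create the LocalModel (step 1 of Configure and run the image labeler below), you will need the file-system path of this bundled model file. A minimal sketch for looking it up, assuming a hypothetical file named my_model.tflite:

Swift

// Look up the bundled model file. "my_model" and "tflite" are hypothetical;
// use your model's actual file name and extension.
guard let localModelFilePath = Bundle.main.path(
  forResource: "my_model",
  ofType: "tflite"
) else {
  fatalError("Custom model file is not bundled with the app.")
}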

2. Prepare the input image

Create a VisionImage object using a UIImage or a CMSampleBufferRef.

If you use a UIImage, follow these steps:

  • Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.

    Swift

    let visionImage = VisionImage(image: image)
    visionImage.orientation = image.imageOrientation

    Objective-C

    MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
    visionImage.orientation = image.imageOrientation;

If you use a CMSampleBufferRef, follow these steps:

  • Specify the orientation of the image data contained in the CMSampleBufferRef buffer.

    To get the image orientation:

    Swift

    func imageOrientation(
      deviceOrientation: UIDeviceOrientation,
      cameraPosition: AVCaptureDevice.Position
    ) -> UIImage.Orientation {
      switch deviceOrientation {
      case .portrait:
        return cameraPosition == .front ? .leftMirrored : .right
      case .landscapeLeft:
        return cameraPosition == .front ? .downMirrored : .up
      case .portraitUpsideDown:
        return cameraPosition == .front ? .rightMirrored : .left
      case .landscapeRight:
        return cameraPosition == .front ? .upMirrored : .down
      case .faceDown, .faceUp, .unknown:
        return .up
      @unknown default:
        return .up
      }
    }
          

    Objective-C

    - (UIImageOrientation)
      imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                             cameraPosition:(AVCaptureDevicePosition)cameraPosition {
      switch (deviceOrientation) {
        case UIDeviceOrientationPortrait:
          return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                                : UIImageOrientationRight;
    
        case UIDeviceOrientationLandscapeLeft:
          return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                                : UIImageOrientationUp;
        case UIDeviceOrientationPortraitUpsideDown:
          return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                                : UIImageOrientationLeft;
        case UIDeviceOrientationLandscapeRight:
          return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                                : UIImageOrientationDown;
        case UIDeviceOrientationUnknown:
        case UIDeviceOrientationFaceUp:
        case UIDeviceOrientationFaceDown:
          return UIImageOrientationUp;
      }
    }
          
  • Create a VisionImage object using the CMSampleBufferRef object and orientation:

    Swift

    let image = VisionImage(buffer: sampleBuffer)
    image.orientation = imageOrientation(
      deviceOrientation: UIDevice.current.orientation,
      cameraPosition: cameraPosition)

    Objective-C

     MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
     image.orientation =
       [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                    cameraPosition:cameraPosition];

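In the snippets above, sampleBuffer and cameraPosition are assumed to come from your own camera capture pipeline; they are not provided by ML Kit. As a rough, illustrative Swift sketch, frames typically arrive through an AVCaptureVideoDataOutputSampleBufferDelegate callback, where you can build the VisionImage with the imageOrientation(deviceOrientation:cameraPosition:) helper shown above:

Swift

import AVFoundation
import MLKitVision
import UIKit

class CameraFrameHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
  // Illustrative property; track whichever camera your capture session uses.
  var cameraPosition: AVCaptureDevice.Position = .back

  func captureOutput(_ output: AVCaptureOutput,
                     didOutput sampleBuffer: CMSampleBuffer,
                     from connection: AVCaptureConnection) {
    let image = VisionImage(buffer: sampleBuffer)
    // Reuses the imageOrientation(deviceOrientation:cameraPosition:) helper above.
    image.orientation = imageOrientation(
      deviceOrientation: UIDevice.current.orientation,
      cameraPosition: cameraPosition)
    // Hand `image` to the image labeler (configured in the next section).
  }
}
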
3. Configure and run the image labeler

  1. Create a LocalModel object with the path to the model file you bundled with your app

    Swift

    let localModel = LocalModel(path: localModelFilePath)
    

    Objective-C

    MLKLocalModel *localModel =
        [[MLKLocalModel alloc] initWithPath:localModelFilePath];
    
  2. Set options

    The following options are available:

    confidenceThreshold

    Minimum confidence score of detected labels. If not set, any classifier threshold specified by the model’s metadata will be used. If the model does not contain any metadata or the metadata does not specify a classifier threshold, a default threshold of 0.0 will be used.

    maxResultCount

    Maximum number of labels to return. If not set, the default value of 10 will be used.

    Swift

    let options = CustomImageLabelerOptions(localModel: localModel)
    options.maxResultCount = 3
    

    Objective-C

    MLKCustomImageLabelerOptions *options =
        [[MLKCustomImageLabelerOptions alloc] initWithLocalModel:localModel];
    options.maxResultCount = 3;
    
  3. Create a new ImageLabeler with the options

    Swift

    let imageLabeler = ImageLabeler.imageLabeler(options: options)
    

    Objective-C

    MLKImageLabeler *imageLabeler =
        [MLKImageLabeler imageLabelerWithOptions:options];
    
  4. Then, use the labeler:

    Asynchronously:

    Swift

    imageLabeler.process(image) { labels, error in
        guard error == nil, let labels = labels, !labels.isEmpty else {
            // Handle the error.
            return
        }
        // Show results.
    }
    

    Objective-C

    [imageLabeler
        processImage:image
          completion:^(NSArray<MLKImageLabel *> *_Nullable labels,
                       NSError *_Nullable error) {
            if (error != nil || labels.count == 0) {
                // Handle the error.
                return;
            }
            // Show results.
         }];
    

    Synchronously:

    Swift

    var labels: [ImageLabel]
    do {
        labels = try imageLabeler.results(in: image)
    } catch let error {
        // Handle the error.
        return
    }
    // Show results.
    

    Objective-C

    NSError *error;
    NSArray<MLKImageLabel *> *labels =
        [imageLabeler resultsInImage:image error:&error];
    // Show results or handle the error.
    
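Putting the steps above together, a minimal end-to-end Swift sketch might look like the following; the confidenceThreshold value is only an example, and localModelFilePath and visionImage are assumed to be the model path and VisionImage created in the earlier sections.

Swift

// Illustrative values only; tune the options for your own model.
let localModel = LocalModel(path: localModelFilePath)

let options = CustomImageLabelerOptions(localModel: localModel)
options.confidenceThreshold = 0.5  // Drop labels below this confidence score.
options.maxResultCount = 3         // Return at most three labels.

let imageLabeler = ImageLabeler.imageLabeler(options: options)

imageLabeler.process(visionImage) { labels, error in
  guard error == nil, let labels = labels else {
    // Handle the error.
    return
  }
  for label in labels {
    print("\(label.text): \(label.confidence)")
  }
}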

4. Get information about labeled entities

If the image labeling operation succeeds, it returns an array of ImageLabel. Each ImageLabel represents something that was labeled in the image. You can get each label's text description (if available in the metadata of the TensorFlow Lite model file), confidence score, and index. For example:

Swift

for label in labels {
  let labelText = label.text
  let confidence = label.confidence
  let index = label.index
}

Objective-C

for (MLKImageLabel *label in labels) {
  NSString *labelText = label.text;
  float confidence = label.confidence;
  NSInteger index = label.index;
}

Tips to improve real-time performance

If you want to label images in a real-time application, follow these guidelines to achieve the best frame rates:

Next steps