Recognizing digital ink with ML Kit on iOS

With ML Kit's Digital Ink Recognition, you can recognize text handwritten on a digital surface in hundreds of languages, as well as classify sketches.

Before you begin

  1. Include the following ML Kit libraries in your Podfile:

    pod 'GoogleMLKit/DigitalInkRecognition'
    
    
  2. After you install or update your project's Pods, open your Xcode project using its .xcworkspace. ML Kit is supported in Xcode version 11.3.1 or higher.

You are now ready to start recognizing text in Ink objects.

Build an Ink object

The main way to build an Ink object is to draw it on a touch screen. On iOS, you can use a UIImageView along with touch event handlers that draw the strokes on the screen and store the strokes' points to build the Ink object. This general pattern is demonstrated in the following code snippets; a sketch of the touch-event wiring follows them. See the quickstart app for a more complete example.

Objective-C

NSMutableArray<MLKStroke *> *strokes;
NSMutableArray<MLKStrokePoint *> *points;

/** Begins a new stroke when the user touches the screen. */
- (void)startStrokeAtPoint:(CGPoint)point time:(NSTimeInterval)t {
  points = [NSMutableArray array];
  [points addObject:[[MLKStrokePoint alloc] initWithX:point.x
                                                    y:point.y
                                                    t:t * kMillisecondsPerTimeInterval]];
}

/** Adds an additional point to the stroke when the user moves their finger. */
- (void)continueStrokeAtPoint:(CGPoint)point time:(NSTimeInterval)t {
  [points addObject:[[MLKStrokePoint alloc]
                         initWithX:point.x
                                 y:point.y
                                 t:t * kMillisecondsPerTimeInterval]];
}

/** Completes a stroke when the user lifts their finger. */
- (void)endStrokeAtPoint:(CGPoint)point time:(NSTimeInterval)t {
  [points addObject:[[MLKStrokePoint alloc]
                         initWithX:point.x
                                 y:point.y
                                 t:t * kMillisecondsPerTimeInterval]];
  if (!strokes) {
    strokes = [NSMutableArray array];
  }
  [strokes addObject:[[MLKStroke alloc] initWithPoints:points]];
  points = nil;
}

/* Creating an Ink object with the current strokes. */
MLKInk *ink = [[MLKInk alloc] initWithStrokes:strokes];

Swift

var strokes: [Stroke] = []
var points: [StrokePoint] = []

/** Begins a new stroke when the user touches the screen. */
func startStrokeAtPoint(point: CGPoint, t: TimeInterval) {
  points = [
    StrokePoint.init(
      x: Float(point.x), y: Float(point.y),
      t: Int(t * kMillisecondsPerTimeInterval))
  ]
}

/** Adds an additional point to the stroke when the user moves their finger. */
func continueStrokeAtPoint(point: CGPoint, t: TimeInterval) {
  points.append(
    StrokePoint.init(
      x: Float(point.x), y: Float(point.y),
      t: Int(t * kMillisecondsPerTimeInterval)))
}

/** Completes a stroke when the user lifts their finger. */
func endStrokeAtPoint(point: CGPoint, t: TimeInterval) {
  points.append(
    StrokePoint.init(
      x: Float(point.x), y: Float(point.y),
      t: Int(t * kMillisecondsPerTimeInterval)))
  // Add the completed stroke to the stroke sequence.
  strokes.append(Stroke.init(points: points))
  points = []
}

// Creating an Ink object with the current strokes.
let ink = Ink.init(strokes: strokes)
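
The stroke-capture functions above are typically driven by UIKit touch events. The following is a minimal sketch, assuming the functions are available as methods on a hypothetical UIView subclass named DrawingView; the UITouch timestamps serve as the time values.

Swift

import UIKit

class DrawingView: UIView {
  /** Starts a stroke when a touch lands on the view. */
  override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touch = touches.first else { return }
    startStrokeAtPoint(point: touch.location(in: self), t: touch.timestamp)
  }

  /** Extends the current stroke as the touch moves. */
  override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touch = touches.first else { return }
    continueStrokeAtPoint(point: touch.location(in: self), t: touch.timestamp)
  }

  /** Ends the current stroke when the touch lifts. */
  override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touch = touches.first else { return }
    endStrokeAtPoint(point: touch.location(in: self), t: touch.timestamp)
  }
}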

Get an instance of DigitalInkRecognizer

To perform a recognition, send the Ink object to a DigitalInkRecognizer instance:

Objective-C

// Specify the recognition model for a language
MLKDigitalInkRecognitionModelIdentifier *identifier =
    [MLKDigitalInkRecognitionModelIdentifier modelIdentifierForLanguageTag:@"en-US"];
if (identifier == nil) {
  // No model was found or the language tag couldn't be parsed; handle the error.
}
MLKDigitalInkRecognitionModel *model = [[MLKDigitalInkRecognitionModel alloc]
        initWithModelIdentifier:identifier];

// Get a recognizer for the language
MLKDigitalInkRecognizerOptions *options =
        [[MLKDigitalInkRecognizerOptions alloc] initWithModel:model];
MLKDigitalInkRecognizer *recognizer = [MLKDigitalInkRecognizer digitalInkRecognizerWithOptions:options];

Swift

// Specify the recognition model for a language
let languageTag = "en-US"
let identifier = DigitalInkRecognitionModelIdentifier(forLanguageTag: languageTag)
if identifier == nil {
  // No model was found or the language tag couldn't be parsed; handle the error.
}
let model = DigitalInkRecognitionModel.init(modelIdentifier: identifier!)

// Get a recognizer for the language
let options: DigitalInkRecognizerOptions = DigitalInkRecognizerOptions.init(model: model)
let recognizer = DigitalInkRecognizer.digitalInkRecognizer(options: options)

Process the Ink object

Objective-C

[recognizer recognizeHandwritingFromInk:ink
                             completion:^(MLKDigitalInkRecognitionResult *_Nullable result,
                                          NSError *_Nullable error) {
  if (result.candidates.count > 0) {
    NSLog(@"Recognition result: %@", result.candidates[0].text);
  } else {
    // Handle the error.
  }
}];

Swift

recognizer.recognizeHandwriting(
  from: ink,
  completion: {
    (result: DigitalInkRecognitionResult?, error: Error?) in
    if let result = result, let candidate = result.candidates.first {
      NSLog("Recognized: \(candidate.text)")
    } else {
      NSLog("Recognition error \(error)")
    }
  })

The sample code above assumes that the recognition model has already been downloaded, as described in the next section.
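
If recognition might be triggered before that download has finished, you can guard the call with the model-manager APIs described below. A minimal sketch, assuming the model, ink, and recognizer values from the snippets above:

Swift

// Only recognize once the model is available; otherwise start a download
// and wait for the download notification (see the next section).
let modelManager = ModelManager.modelManager()
if modelManager.isModelDownloaded(model) {
  recognizer.recognizeHandwriting(
    from: ink,
    completion: { result, error in
      // Handle the result as shown above.
    })
} else {
  modelManager.download(
    model,
    conditions: ModelDownloadConditions.init(
      allowsCellularAccess: true, allowsBackgroundDownloading: true))
}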

Managing model downloads

Download a new model

The following example code demonstrates how to download a model.

The quickstart apps include additional code that shows how to handle multiple downloads at the same time, and how to determine which download succeeded when you receive a completion notification.

Objective-C

[NSNotificationCenter.defaultCenter
        addObserverForName:MLKModelDownloadDidSucceedNotification
                    object:nil
                     queue:NSOperationQueue.mainQueue
                usingBlock:^(NSNotification *notification) {
                  NSLog(@"Model download succeeded");
                }];

[NSNotificationCenter.defaultCenter
        addObserverForName:MLKModelDownloadDidFailNotification
                    object:nil
                     queue:NSOperationQueue.mainQueue
                usingBlock:^(NSNotification *notification) {
                  NSLog(@"Model download failed.");
                }];

MLKDigitalInkRecognitionModel *model = ...;
MLKModelManager *modelManager = [MLKModelManager modelManager];
[modelManager
      downloadModel:model
         conditions:[[MLKModelDownloadConditions alloc]
                         initWithAllowsCellularAccess:YES
                         allowsBackgroundDownloading:YES]];

Swift

NotificationCenter.default.addObserver(
  forName: NSNotification.Name.mlkitModelDownloadDidSucceed,
  object: nil,
  queue: OperationQueue.main,
  using: {
    (notification) in
    NSLog("Model download succeeded")
  })

NotificationCenter.default.addObserver(
  forName: NSNotification.Name.mlkitModelDownloadDidFail,
  object: nil,
  queue: OperationQueue.main,
  using: {
    (notification) in
      NSLog("Model download failed")
  })

let model: DigitalInkRecognitionModel = ...
let modelManager = ModelManager.modelManager()
modelManager.download(
  model,
  conditions: ModelDownloadConditions.init(
    allowsCellularAccess: true, allowsBackgroundDownloading: true)
)
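
As noted above, when several downloads are in flight you may need to determine which model a notification refers to. The notification's userInfo can carry the model that finished; this is a sketch assuming the ModelDownloadUserInfoKey.remoteModel key from ML Kit's common module (check the current API reference for the exact key names):

Swift

NotificationCenter.default.addObserver(
  forName: NSNotification.Name.mlkitModelDownloadDidSucceed,
  object: nil,
  queue: OperationQueue.main,
  using: { notification in
    // Pull the model out of the notification payload to see which of
    // several concurrent downloads finished.
    guard let userInfo = notification.userInfo,
      let downloadedModel =
        userInfo[ModelDownloadUserInfoKey.remoteModel.rawValue]
          as? DigitalInkRecognitionModel
    else { return }
    NSLog("Download succeeded for model: \(downloadedModel)")
  })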

Check whether a model has been downloaded already

Objective-C

MLKDigitalInkRecognitionModel *model = ...;
MLKModelManager *modelManager = [MLKModelManager modelManager];
[modelManager isModelDownloaded:model];

Swift

let model : DigitalInkRecognitionModel = ...
let modelManager = ModelManager.modelManager()
modelManager.isModelDownloaded(model)

Delete a downloaded model

Objective-C

MLKDigitalInkRecognitionModel *model = ...;
MLKModelManager *modelManager = [MLKModelManager modelManager];

if ([modelManager isModelDownloaded:model]) {
  [modelManager deleteDownloadedModel:model
                           completion:^(NSError *_Nullable error) {
                             if (error) {
                               // Handle the error.
                               return;
                             }
                             NSLog(@"Model deleted.");
                           }];
}

Swift

let model: DigitalInkRecognitionModel = ...
let modelManager = ModelManager.modelManager()

if modelManager.isModelDownloaded(model) {
  modelManager.deleteDownloadedModel(
    model,
    completion: {
      error in
      if error != nil {
        // Handle the error.
        return
      }
      NSLog("Model deleted.")
    })
}

Tips to improve text recognition accuracy

The accuracy of text recognition can vary across different languages. Accuracy also depends on writing style. While Digital Ink Recognition is trained to handle many kinds of writing styles, results can vary from user to user.

Here are some ways to improve the accuracy of a text recognizer. Note that these techniques do not apply to the drawing classifiers for emojis, autodraw, and shapes.

Writing area

The writing area is the rectangular region of the screen where the user writes by hand. The meaning of a symbol is partially determined by its size relative to the writing area that contains it. For example, relative size is what distinguishes a lowercase "o" or "c" from an uppercase one, and a comma from a forward slash.

It helps to tell the recognizer the size of the writing area. The recognizer assumes that the writing area contains a single line of written text, of arbitrary length. If the user writes small enough that two or more lines would fit in the writing area, or the area is tall enough to hold more than one line, specifying the writing area may not improve accuracy.

When you specify the writing area, specify its width and height in the same units as the stroke coordinates.
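
For example, if you collect strokes in the coordinate system of the drawing view, the writing area can be derived from that view's bounds. A sketch, where drawView is an assumed name for the view that captures the strokes:

Swift

// The writing area uses the same units as the stroke coordinates.
let writingArea = WritingArea.init(
  width: Float(drawView.bounds.width),
  height: Float(drawView.bounds.height))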

Pre-context

Pre-context is the text that immediately precedes the strokes in the Ink that you are trying to recognize. You can help the recognizer by telling it about the pre-context.

For example, the cursive letters "n" and "u" are often mistaken for one another. If the user has already entered the partial word "arg", they might continue with strokes that can be recognized as "ument" or "nment". Specifying the pre-context "arg" resolves the ambiguity, since the word "argument" is more likely than "argnment".

Pre-context can also help the recognizer identify word breaks, the spaces between words. You can type a space character, but you cannot draw one, so how can a recognizer determine when one word ends and the next one starts? If the user has already written "hello" and continues with the written word "world", without pre-context the recognizer returns the string "world". However, if you specify the pre-context "hello", the model returns the string " world", with a leading space, since "hello world" makes more sense than "helloworld".

You should provide the longest possible pre-context string, up to 20 characters, including spaces. If the string is longer, the recognizer only uses the last 20 characters.
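
For example, if you keep a running string of everything the user has entered so far, you can pass its tail as pre-context. A sketch, where previousText is an assumed variable:

Swift

// Use at most the last 20 characters (including spaces) as pre-context.
let preContext = String(previousText.suffix(20))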

The code sample below shows how to define a writing area and use a RecognitionContext object to specify pre-context.

Objective-C

MLKInk *ink = ...;
MLKDigitalInkRecognizer *recognizer = ...;
NSString *preContext = ...;
MLKWritingArea *writingArea = [[MLKWritingArea alloc] initWithWidth:...
                                                             height:...];

MLKDigitalInkRecognitionContext *context =
    [[MLKDigitalInkRecognitionContext alloc] initWithPreContext:preContext
                                                    writingArea:writingArea];

[recognizer recognizeHandwritingFromInk:ink
                                context:context
                             completion:^(MLKDigitalInkRecognitionResult *_Nullable result,
                                          NSError *_Nullable error) {
  NSLog(@"Recognition result: %@", result.candidates[0].text);
}];

Swift

let ink: Ink = ...
let recognizer: DigitalInkRecognizer = ...
let preContext: String = ...
let writingArea = WritingArea.init(width: ..., height: ...)

let context = DigitalInkRecognitionContext.init(
    preContext: preContext,
    writingArea: writingArea)

recognizer.recognizeHandwriting(
  from: ink,
  context: context,
  completion: {
    (result: DigitalInkRecognitionResult?, error: Error?) in
    if let result = result, let candidate = result.candidates.first {
      NSLog("Recognized \(candidate.text)")
    } else {
      NSLog("Recognition error \(error)")
    }
  })

Dealing with ambiguous shapes

There are cases where the meaning of the shape provided to the recognizer is ambiguous. For example, a rectangle with very rounded edges could be seen as either a rectangle or an ellipse.

These unclear cases can be handled by using recognition scores when they are available. Only shape classifiers provide scores. If the model is very confident, the top result's score will be much greater than the second best. If there is uncertainty, the scores for the two top results will be close.
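
For example, you might accept the top shape candidate only when its score clearly beats the runner-up. A sketch, assuming each candidate exposes an optional score; the 0.2 margin is illustrative:

Swift

// result is a DigitalInkRecognitionResult from a shape recognizer.
// Treat the classification as ambiguous when the top two scores are close.
if let top = result.candidates.first,
  let topScore = top.score?.floatValue {
  if result.candidates.count > 1,
    let secondScore = result.candidates[1].score?.floatValue,
    topScore - secondScore < 0.2 {
    // Ambiguous: consider offering both interpretations to the user.
  } else {
    NSLog("Recognized shape: \(top.text)")
  }
}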

Next steps

See the ML Kit quickstart sample on GitHub for an example of this API in use.