Text Recognition API Overview

Text recognition is the process of detecting text in images and video streams and recognizing the characters it contains. Once text is detected, the recognizer determines the actual text in each block and segments it into lines and words. The Text API detects text in Latin-based languages (French, German, English, etc.) in real time, on device.

Try out the ML Kit Android codelab to learn how to integrate the latest Text API into your application.

Recognized Languages

The Text API can recognize text in any Latin-based language. This includes, but is not limited to:

  • Catalan
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Hungarian
  • Italian
  • Latin
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Spanish
  • Swedish
  • Tagalog
  • Turkish

Text Structure

The Text Recognizer segments text into blocks, lines, and words. Roughly speaking:

  • a Block is a contiguous set of text lines, such as a paragraph or column,

  • a Line is a contiguous set of words that share the same position on the vertical axis, and

  • a Word is a contiguous set of alphanumeric characters that share the same position on the vertical axis.

The image below highlights examples of each of these, from largest to smallest. The first highlighted region, in cyan, is a Block of text. The second set of highlighted regions, in blue, are Lines of text. Finally, the third set of highlighted regions, in dark blue, are Words.
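
The Block, Line, and Word hierarchy described above can be sketched as a small data model. The example below is only an illustration under stated assumptions: the class names (TextStructureSketch, Block, Line, Word) and the segment method are hypothetical and are not the Text API's own types, and it works on plain strings rather than images. It assumes that a blank line separates Blocks, that a newline separates Lines, and that whitespace separates Words, which is a simplification of the alphanumeric rule given above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the Block > Line > Word hierarchy; not the Text API's classes. */
final class TextStructureSketch {

    /** A Word: here simplified to a run of non-whitespace characters. */
    static final class Word {
        final String text;
        Word(String text) { this.text = text; }
    }

    /** A Line: the words found on one line of text. */
    static final class Line {
        final List<Word> words = new ArrayList<>();
    }

    /** A Block: a contiguous set of lines, such as a paragraph. */
    static final class Block {
        final List<Line> lines = new ArrayList<>();
    }

    /** Splits plain text into blocks, lines, and words under the assumptions above. */
    static List<Block> segment(String text) {
        List<Block> blocks = new ArrayList<>();
        // Assumption: one or more blank lines separate blocks.
        for (String rawBlock : text.trim().split("\\n\\s*\\n")) {
            Block block = new Block();
            for (String rawLine : rawBlock.split("\\n")) {
                String trimmed = rawLine.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip empty lines inside a block
                }
                Line line = new Line();
                for (String token : trimmed.split("\\s+")) {
                    line.words.add(new Word(token));
                }
                block.lines.add(line);
            }
            if (!block.lines.isEmpty()) {
                blocks.add(block);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sample = "Text Recognition\nAPI Overview\n\nRecognized Languages";
        // Walk the hierarchy from the largest unit to the smallest.
        for (Block block : segment(sample)) {
            System.out.println("Block");
            for (Line line : block.lines) {
                System.out.println("  Line");
                for (Word word : line.words) {
                    System.out.println("    Word: " + word.text);
                }
            }
        }
    }
}
```

Walking the nested lists in this order mirrors the highlighting described above: each Block contains its Lines, and each Line contains its Words.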