Text recognition v2

AI-generated Key Takeaways

The ML Kit Text Recognition v2 API recognizes text in Chinese, Devanagari, Japanese, Korean, and Latin scripts and can automate data entry for documents like credit cards and receipts.
It analyzes text structure by identifying blocks, lines, elements (words), and symbols, returning bounding boxes, corner points, and confidence scores for each.
The API supports real-time text recognition on various devices and can identify the language of the recognized text.

The ML Kit Text Recognition v2 API can recognize text in any Chinese, Devanagari, Japanese, Korean and Latin character set. The API can also be used to automate data-entry tasks such as processing credit cards, receipts, and business cards.


Key capabilities

Recognize text across various scripts and languages	Supports recognizing text in Chinese, Devanagari, Japanese, Korean and Latin scripts
Analyzes structure of text	Supports detection of symbols, elements, lines and paragraphs
Identify language of text	Identifies the language of the recognized text
Real-time recognition	Can recognize text in real-time on a wide range of devices

Text structure

The Text Recognizer segments text into blocks, lines, elements and symbols. Roughly speaking:

a Block is a contiguous set of text lines, such as a paragraph or column,
a Line is a contiguous set of words on the same axis,
an Element is a contiguous set of alphanumeric characters (a "word") on the same axis in most Latin languages, or a word in others, and
a Symbol is a single alphanumeric character on the same axis in most Latin languages, or a character in others.

The image below highlights examples of each of these in descending order. The first highlighted region, in cyan, is a Block of text. The second set of highlighted regions, in blue, are Lines of text. Finally, the third set of highlighted regions, in dark blue, are Elements (words).

For all detected blocks, lines, elements and symbols, the API returns the bounding boxes, corner points, rotation information, confidence score, recognized languages and recognized text.
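The nesting described above can be sketched as a small data model. This is a minimal illustration in plain Python, not the ML Kit API: the class names, field names, and the `walk` helper are assumptions chosen to mirror the Block, Line, Element, and Symbol levels, and the sample values are copied from the example results on this page.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


# Hypothetical stand-ins for the four levels; these are not ML Kit types.
@dataclass
class Symbol:
    text: str
    confidence: Optional[float] = None


@dataclass
class Element:
    text: str
    symbols: List[Symbol] = field(default_factory=list)
    confidence: Optional[float] = None


@dataclass
class Line:
    text: str
    elements: List[Element] = field(default_factory=list)
    confidence: Optional[float] = None


@dataclass
class Block:
    text: str
    lines: List[Line] = field(default_factory=list)
    language: Optional[str] = None


def walk(blocks: List[Block]) -> Iterator[Tuple[str, str]]:
    """Yield (level, text) pairs top-down: block, line, element, symbol."""
    for block in blocks:
        yield ("block", block.text)
        for line in block.lines:
            yield ("line", line.text)
            for element in line.elements:
                yield ("element", element.text)
                for symbol in element.symbols:
                    yield ("symbol", symbol.text)


# Only the first line, element, and symbol of the sample result are filled in.
example = Block(
    text="Wege der parlamentarischen Demokratie",
    language="de",
    lines=[
        Line(
            text="Wege der",
            confidence=0.8766741,
            elements=[
                Element(
                    text="Wege",
                    confidence=0.8964844,
                    symbols=[Symbol(text="W", confidence=0.87109375)],
                )
            ],
        )
    ],
)

for level, text in walk([example]):
    print(f"{level}: {text}")
```

On a real device the same top-down traversal would run over the objects returned by the recognizer rather than over these hand-built records.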

Example results

Recognized Text
Text	Wege der parlamentarischen Demokratie
Blocks	(1 block)

Block 0
Text	Wege der parlamentarischen Demokratie
Frame	(296, 665 - 796, 882)
Corner Points	(296, 719), (778, 665), (796, 828), (314, 882)
Recognized Language Code	de
Lines	(3 lines)

Line 0
Text	Wege der
Frame	(434, 678 - 670, 749)
Corner Points	(434, 705), (665, 678), (670, 722), (439, 749)
Recognized Language Code	de
Confidence Score	0.8766741
Rotation Degree	-6.6116457
Elements	(2 elements)

Element 0
Text	Wege
Frame	(434, 689 - 575, 749)
Corner Points	(434, 705), (570, 689), (575, 733), (439, 749)
Recognized Language Code	de
Confidence Score	0.8964844
Rotation Degree	-6.6116457
Symbols	(4 symbols)

Symbol 0
Text	W
Frame	(434, 698 - 500, 749)
Corner Points	(434, 706), (495, 698), (500, 741), (439, 749)
Confidence Score	0.87109375
Rotation Degree	-6.611646
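In the sample values above, each Frame matches the axis-aligned rectangle that encloses the four Corner Points. The sketch below checks that relationship in plain Python. The function names are hypothetical, and the corner ordering (top-left first, then clockwise) is an assumption inferred from the sample data. The angle helper is only an estimate from the top edge and should not be read as how the API computes Rotation Degree.

```python
import math
from typing import List, Tuple

Point = Tuple[int, int]


def frame_from_corners(corners: List[Point]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the box enclosing the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def top_edge_angle(corners: List[Point]) -> float:
    """Estimate rotation in degrees from the first two corners.

    Assumes corners[0] is top-left and corners[1] is top-right. Image y
    grows downward, so text tilted counter-clockwise gives a negative angle.
    """
    (x0, y0), (x1, y1) = corners[0], corners[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Corner points copied from the example results.
BLOCK_0 = [(296, 719), (778, 665), (796, 828), (314, 882)]
LINE_0 = [(434, 705), (665, 678), (670, 722), (439, 749)]

print(frame_from_corners(BLOCK_0))  # (296, 665, 796, 882)
print(frame_from_corners(LINE_0))   # (434, 678, 670, 749)
print(top_edge_angle(LINE_0))
```

The estimated angle for Line 0 lands close to the reported Rotation Degree of -6.6116457 but does not match it exactly, which is expected because the corner points are rounded to whole pixels.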
