Text

  • Text represents a hierarchical structure of recognized text, containing blocks, lines, and elements.

  • It provides access to the entire recognized text as a single string via getText().

  • It allows you to get a list of TextBlock objects, representing paragraphs of text, using getTextBlocks().

  • Each TextBlock can be further broken down into lines and elements, providing granular access to the text structure.

  • Text is organized in reading order based on the detected language.

public class Text extends Object

A hierarchical representation of recognized text.

A Text contains a list of Text.TextBlock objects. Each Text.TextBlock contains a list of Text.Line objects, and each Text.Line is composed of a list of Text.Element objects.
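The nesting described above can be sketched as three loops, one per level. The classes below are hypothetical stand-ins written only for this illustration (HierarchySketch, its nested types, and collectElements are not part of the library); they model nothing beyond the parent-to-child lists, so treat this as a sketch of the shape rather than of the real API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins that mirror only the nesting described above.
// They are NOT the library classes and omit everything except text and children.
class HierarchySketch {
    static final class Element {
        final String text;
        Element(String text) { this.text = text; }
    }

    static final class Line {
        final List<Element> elements;
        Line(Element... elements) { this.elements = Arrays.asList(elements); }
    }

    static final class TextBlock {
        final List<Line> lines;
        TextBlock(Line... lines) { this.lines = Arrays.asList(lines); }
    }

    static final class Text {
        final List<TextBlock> blocks;
        Text(TextBlock... blocks) { this.blocks = Arrays.asList(blocks); }
    }

    // Visits blocks, then lines, then elements, collecting element text in order.
    static List<String> collectElements(Text text) {
        List<String> words = new ArrayList<>();
        for (TextBlock block : text.blocks) {
            for (Line line : block.lines) {
                for (Element element : line.elements) {
                    words.add(element.text);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        Text text = new Text(
            new TextBlock(
                new Line(new Element("Hello"), new Element("world")),
                new Line(new Element("again"))),
            new TextBlock(
                new Line(new Element("Bye"))));
        System.out.println(collectElements(text));
    }
}
```

Because each level is an ordinary list, an empty result needs no special handling: the outer loop simply does not execute.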

Nested Class Summary

class Text.Element Represents a space-separated segment in a line of text (for example, a word in most Latin languages). 
class Text.Line Represents a line of text. 
class Text.Symbol Represents a single symbol in a Text.Element.
class Text.TextBlock A block of text (think of it as a paragraph) as deemed by the OCR engine. 

Public Method Summary

String getText() Retrieves all the recognized text in the image.
List<Text.TextBlock> getTextBlocks() Gets an unmodifiable list of Text.TextBlock objects, each of which is a block of text that can be further decomposed into a list of Text.Line.

Public Methods

public String getText ()

Retrieves all the recognized text in the image. The result concatenates the text of each underlying Text.TextBlock, separated by '\n'.

Returns an empty string if nothing is found.
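The joining rule described above can be illustrated with plain strings. This is a minimal sketch, assuming each block has already been reduced to its text; JoinSketch and joinBlocks are hypothetical names for this example and do not reproduce the library's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class JoinSketch {
    // Hypothetical helper: joins per-block strings with '\n', as the description
    // of getText() states. An empty input yields an empty string.
    static String joinBlocks(List<String> blockTexts) {
        return String.join("\n", blockTexts);
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList("Invoice 42", "Total: 10.00");
        System.out.println(joinBlocks(blocks));

        // Nothing recognized: the result is empty, not null.
        System.out.println(joinBlocks(Collections.<String>emptyList()).isEmpty());
    }
}
```

Since the separator is a plain newline, splitting the combined string on '\n' is not guaranteed to recover the original blocks if a block's own text contains line breaks. When block boundaries matter, getTextBlocks() is likely the safer route.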

public List<Text.TextBlock> getTextBlocks ()

Gets an unmodifiable list of Text.TextBlock objects, each of which is a block of text that can be further decomposed into a list of Text.Line.

The recognized text is in reading order for the language. For Latin script, this is top-to-bottom within a Text.TextBlock and left-to-right within a Text.Line.

Returns an empty list if nothing is found.
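Two practical consequences of an unmodifiable, never-null list can be sketched as follows. The example uses strings in place of Text.TextBlock and builds its stand-in with Collections.unmodifiableList; which list implementation the library actually returns is an assumption here, and UnmodifiableSketch, rejectsWrites, and mutableCopy are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class UnmodifiableSketch {
    // Returns true if the list rejects an attempted write.
    // Note: on a modifiable list the probe value is actually appended.
    static boolean rejectsWrites(List<String> list) {
        try {
            list.add("probe");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies the contents into a new ArrayList that the caller may reorder or filter.
    static List<String> mutableCopy(List<String> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        // Stand-in for a returned block list; strings replace Text.TextBlock here.
        List<String> blocks = Collections.unmodifiableList(Arrays.asList("second", "first"));
        System.out.println(rejectsWrites(blocks));

        // Sort a copy; the original order is left untouched.
        List<String> copy = mutableCopy(blocks);
        Collections.sort(copy);
        System.out.println(copy);

        // An empty result is an empty list, so iteration needs no null check.
        for (String block : Collections.<String>emptyList()) {
            System.out.println(block);
        }
    }
}
```

Copying before sorting or filtering keeps the reading order of the original list available for later use.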