TextEmbedder

public final class TextEmbedder

Performs embedding extraction on text.

This API expects a TFLite model with (optional) TFLite Model Metadata.

Metadata is required for models with int32 input tensors because it contains the input process unit for the model's Tokenizer. No metadata is required for models with string input tensors.

  • Input tensors
    • Three input tensors (kTfLiteInt32) of shape [batch_size x bert_max_seq_len] representing the input ids, mask ids, and segment ids. This input signature requires a Bert Tokenizer process unit in the model metadata.
    • Or one input tensor (kTfLiteInt32) of shape [batch_size x max_seq_len] representing the input ids. This input signature requires a Regex Tokenizer process unit in the model metadata.
    • Or one input tensor (kTfLiteString) that is shapeless or has shape [1] containing the input string.
  • At least one output tensor (kTfLiteFloat32/kTfLiteUint8) with shape [1 x N] where N is the number of dimensions in the produced embeddings.
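To make the three-tensor input signature concrete, the following is a minimal sketch of how input ids, mask ids, and segment ids of length bert_max_seq_len might be laid out for a single sentence (batch_size of 1). It is illustrative only: TextEmbedder performs tokenization internally using the process unit in the model metadata, so callers never build these arrays. The class name BertInputSketch, the helper toBertInputs, the padding id of 0, the naive truncation, and the token ids in the example are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative sketch only; not part of the TextEmbedder API. */
final class BertInputSketch {

  /**
   * Pads or truncates already-tokenized ids to maxSeqLen and builds the
   * matching mask and segment arrays. Returns {inputIds, maskIds, segmentIds}.
   */
  static int[][] toBertInputs(int[] tokenIds, int maxSeqLen) {
    if (maxSeqLen <= 0) {
      throw new IllegalArgumentException("maxSeqLen must be positive");
    }
    // Assumption: tokens beyond maxSeqLen are simply dropped.
    int used = Math.min(tokenIds.length, maxSeqLen);

    int[] inputIds = new int[maxSeqLen];   // zero-filled; padding id 0 is assumed
    int[] maskIds = new int[maxSeqLen];    // 1 for real tokens, 0 for padding
    int[] segmentIds = new int[maxSeqLen]; // all 0 for a single-sentence input

    System.arraycopy(tokenIds, 0, inputIds, 0, used);
    Arrays.fill(maskIds, 0, used, 1);
    return new int[][] {inputIds, maskIds, segmentIds};
  }

  public static void main(String[] args) {
    // Made-up token ids; a real tokenizer would derive them from its vocabulary.
    int[][] tensors = toBertInputs(new int[] {5, 17, 42}, 6);
    System.out.println(Arrays.toString(tensors[0])); // [5, 17, 42, 0, 0, 0]
    System.out.println(Arrays.toString(tensors[1])); // [1, 1, 1, 0, 0, 0]
    System.out.println(Arrays.toString(tensors[2])); // [0, 0, 0, 0, 0, 0]
  }
}
```

All three arrays share the same fixed length, which is why the signature describes them with a single shape.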

Nested Classes

class TextEmbedder.TextEmbedderOptions
Options for setting up a TextEmbedder.

Public Methods

void close()
Closes and cleans up the TextEmbedder.

static double cosineSimilarity(Embedding u, Embedding v)
Utility function to compute cosine similarity between two Embedding objects.

static TextEmbedder createFromFile(Context context, String modelPath)
Creates a TextEmbedder instance from a model file and the default TextEmbedder.TextEmbedderOptions.

static TextEmbedder createFromFile(Context context, File modelFile)
Creates a TextEmbedder instance from a model file and the default TextEmbedder.TextEmbedderOptions.

static TextEmbedder createFromOptions(Context context, TextEmbedder.TextEmbedderOptions options)
Creates a TextEmbedder instance from TextEmbedder.TextEmbedderOptions.

TextEmbedderResult embed(String inputText)
Performs embedding extraction on the input text.

Public Methods

public void close ()

Closes and cleans up the TextEmbedder.

public static double cosineSimilarity (Embedding u, Embedding v)

Utility function to compute cosine similarity between two Embedding objects.

Parameters
u
v
Throws
IllegalArgumentException if the embeddings are of different types (float vs. quantized), have different sizes, or have an L2-norm of 0.
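For reference, the computation and the size and zero-norm error conditions described above can be sketched in plain Java. This is a minimal illustration over float arrays, not the library's implementation: the class name CosineSimilaritySketch and its float[] signature are assumptions, and the float-versus-quantized type check is omitted because the sketch handles only one representation.

```java
/** Illustrative sketch only; not the TextEmbedder implementation. */
final class CosineSimilaritySketch {

  /** Returns dot(u, v) / (||u|| * ||v||) for two float vectors of equal length. */
  static double cosineSimilarity(float[] u, float[] v) {
    if (u.length != v.length) {
      throw new IllegalArgumentException(
          "Embeddings have different sizes: " + u.length + " vs. " + v.length);
    }
    double dot = 0.0;
    double normU = 0.0; // squared L2-norm of u
    double normV = 0.0; // squared L2-norm of v
    for (int i = 0; i < u.length; i++) {
      dot += (double) u[i] * v[i];
      normU += (double) u[i] * u[i];
      normV += (double) v[i] * v[i];
    }
    if (normU <= 0.0 || normV <= 0.0) {
      throw new IllegalArgumentException("Embedding has an L2-norm of 0.");
    }
    return dot / Math.sqrt(normU * normV);
  }

  public static void main(String[] args) {
    float[] a = {1f, 0f};
    float[] b = {0f, 1f};
    float[] c = {2f, 0f};
    System.out.println(cosineSimilarity(a, c)); // 1.0 (same direction)
    System.out.println(cosineSimilarity(a, b)); // 0.0 (orthogonal)
  }
}
```

The result ranges from -1 (opposite directions) to 1 (same direction) and is independent of vector magnitude, which is why a zero-norm embedding has no defined similarity.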

public static TextEmbedder createFromFile (Context context, String modelPath)

Creates a TextEmbedder instance from a model file and the default TextEmbedder.TextEmbedderOptions.

Parameters
context an Android Context.
modelPath path to the text model with metadata in the assets.
Throws
if there is an error during TextEmbedder creation.

public static TextEmbedder createFromFile (Context context, File modelFile)

Creates a TextEmbedder instance from a model file and the default TextEmbedder.TextEmbedderOptions.

Parameters
context an Android Context.
modelFile the text model File instance.
Throws
IOException if an I/O error occurs when opening the tflite model file.
if there is an error during TextEmbedder creation.

public static TextEmbedder createFromOptions (Context context, TextEmbedder.TextEmbedderOptions options)

Creates a TextEmbedder instance from TextEmbedder.TextEmbedderOptions.

Parameters
context an Android Context.
options a TextEmbedder.TextEmbedderOptions instance.
Throws
if there is an error during TextEmbedder creation.

public TextEmbedderResult embed (String inputText)

Performs embedding extraction on the input text.

Parameters
inputText a String for processing.