GenAI Summarization API

With ML Kit's GenAI Summarization API, you can automatically generate summaries of articles and conversations as a list of bullet points. This helps users understand large bodies of text.

Summarization benefits from on-device generative AI because it addresses concerns about data privacy and cost efficiency. Apps that summarize personal chats, emails, notes, and reminders often handle sensitive information, making on-device processing important for user privacy. In addition, summarization tasks, especially those with long contexts or many items, can require significant processing power. Processing this content on the device reduces server load and lowers serving costs, while keeping user data private.

Key capabilities

The GenAI Summarization API covers the following capabilities:

Summarize text, categorized as article or conversation.
Output summarization in one, two, or three bullets.

Get started

Add the ML Kit summarization API as a dependency in your build.gradle configuration.

implementation("com.google.mlkit:genai-summarization:1.0.0-beta1")

Next, implement the code in your project:

Create a Summarizer object.
Download the feature if it is downloadable.
Create a summarization request.
Run inference and retrieve the result.

Kotlin

val articleToSummarize = "Announcing a set of on-device GenAI APIs..."

// Define task with required input type, output type, and language
val summarizerOptions = SummarizerOptions.builder(context)
    .setInputType(InputType.ARTICLE)
    .setOutputType(OutputType.ONE_BULLET)
    .setLanguage(Language.ENGLISH)
    .build()
val summarizer = Summarization.getClient(summarizerOptions)

suspend fun prepareAndStartSummarization() {
    // Check feature availability. Status will be one of the following:
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    val featureStatus = summarizer.checkFeatureStatus().await()

    if (featureStatus == FeatureStatus.DOWNLOADABLE) {
        // Download feature if necessary. If downloadFeature is not called,
        // the first inference request will also trigger the feature to be
        // downloaded if it's not already downloaded.
        summarizer.downloadFeature(object : DownloadCallback {
            override fun onDownloadStarted(bytesToDownload: Long) { }

            override fun onDownloadFailed(e: GenAiException) { }

            override fun onDownloadProgress(totalBytesDownloaded: Long) {}

            override fun onDownloadCompleted() {
                startSummarizationRequest(articleToSummarize, summarizer)
            }
        })
    } else if (featureStatus == FeatureStatus.DOWNLOADING) {
        // Inference request will automatically run once feature is
        // downloaded. If Gemini Nano is already downloaded on the device,
        // the feature-specific LoRA adapter model will be downloaded
        // quickly. However, if Gemini Nano is not already downloaded, the
        // download process may take longer.
        startSummarizationRequest(articleToSummarize, summarizer)
    } else if (featureStatus == FeatureStatus.AVAILABLE) {
        startSummarizationRequest(articleToSummarize, summarizer)
    }
}

fun startSummarizationRequest(text: String, summarizer: Summarizer) {
    // Create task request
    val summarizationRequest = SummarizationRequest.builder(text).build()

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest) { newText ->
        // Show new text in UI
    }

    // You can also get a non-streaming response from the request
    // val summarizationResult = summarizer.runInference(
    //     summarizationRequest).get().summary
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close()

Java

String articleToSummarize = "Announcing: a set of on-device GenAI AI APIs.";

// Define task with required input type, output type, and language
SummarizerOptions summarizerOptions = 
    SummarizerOptions.builder(context)
        .setInputType(SummarizerOptions.InputType.ARTICLE)
        .setOutputType(SummarizerOptions.OutputType.ONE_BULLET)
        .setLanguage(SummarizerOptions.Language.ENGLISH)
        .build();
Summarizer summarizer = Summarization.getClient(summarizerOptions);

void prepareAndStartSummarization()
        throws ExecutionException, InterruptedException {
    // Check feature availability. Status will be one of the following:
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    try {
        int featureStatus = summarizer.checkFeatureStatus().get();
        if (featureStatus == FeatureStatus.DOWNLOADABLE) {
            // Download feature if necessary.
            // If downloadFeature is not called, the first inference request
            // will also trigger the feature to be downloaded if it's not
            // already downloaded.
            summarizer.downloadFeature(new DownloadCallback() {
                @Override
                public void onDownloadCompleted() {
                    startSummarizationRequest(articleToSummarize, summarizer);
                }

                @Override
                public void onDownloadFailed(GenAiException e) { /* handle error */ }

                @Override
                public void onDownloadProgress(long totalBytesDownloaded) {}

                @Override
                public void onDownloadStarted(long bytesDownloaded) {}
            });
        } else if (featureStatus == FeatureStatus.DOWNLOADING) {
            // Inference request will automatically run once feature is
            // downloaded. If Gemini Nano is already downloaded on the
            // device, the feature-specific LoRA adapter model will be
            // downloaded quickly. However, if Gemini Nano is not already
            // downloaded, the download process may take longer.
            startSummarizationRequest(articleToSummarize, summarizer);
        } else if (featureStatus == FeatureStatus.AVAILABLE) {
            startSummarizationRequest(articleToSummarize, summarizer);
        }
    } catch (ExecutionException | InterruptedException e) {
        e.printStackTrace();
    }
}

void startSummarizationRequest(String text, Summarizer summarizer) {
    // Create task request
    SummarizationRequest summarizationRequest =
        SummarizationRequest.builder(text).build();

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest, newText -> {
        // Show new text in UI
    });

    // You can also get a non-streaming response from the request
    // ListenableFuture<SummarizationResult> summarizationResult
    //         = summarizer.runInference(summarizationRequest);
    // String summary = summarizationResult.get().getSummary();
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close();

How the model handles different input types

When the text input is specified as InputType.CONVERSATION, the model expects input in the following format:

<name>: <message>
<name2>: <message2>
<name>: <message3>
<name3>: <message4>

This enables the model to produce a more accurate summary by providing a better understanding of the conversation and interactions.

Supported features and limitations

Input must be under 4000 tokens (or approximately 3000 English words). If the input exceeds 4000 tokens, consider these options:

Prioritize summarization of the first 4000 tokens. Testing indicates this usually yields good results for longer inputs. Consider turning on auto truncation by calling setLongInputAutoTruncationEnabled so the extra input is automatically truncated.
Segment the input into groups of 4000 tokens, and summarize them individually.
Consider a cloud solution that is more suitable for the larger input.

For InputType.ARTICLE, input must also be over 400 characters, with the model performing best when the article is at least 300 words.

The GenAI Summarization API supports English, Japanese, and Korean, and is defined in SummarizerOptions.Language.

Availability of the specific feature configuration (specified by SummarizerOptions) may vary depending on the particular device's configuration and the models that have been downloaded to the device.

The most reliable way for developers to ensure that the intended API feature is supported on a device with the requested SummarizerOptions is to call the checkFeatureStatus() method. This method provides the definitive status of feature availability on the device at runtime.

Common setup issues

ML Kit GenAI APIs rely on the Android AICore app to access Gemini Nano. When a device is just setup (including reset), or the AICore app is just reset (e.g. clear data, uninstalled then reinstalled), the AICore app may not have enough time to finish initialization (including downloading latest configurations from server). As a result, the ML Kit GenAI APIs may not function as expected. Here are the common setup error messages you may see and how to handle them:

Example error message	How to handle
AICore failed with error type 4-CONNECTION_ERROR and error code 601-BINDING_FAILURE: AICore service failed to bind.	This could happen when you install the app using ML Kit GenAI APIs immediately after device setup or when AICore is uninstalled after your app is installed. Updating AICore app then reinstalling your app should fix it.
AICore failed with error type 3-PREPARATION_ERROR and error code 606-FEATURE_NOT_FOUND: Feature ... is not available.	This could happen when AICore hasn't finished downloading the latest configurations. Keep network connection and wait for a few minutes to a few hours. Note that if the device's bootloader is unlocked, you'll also see this error—this API does not support devices with unlocked bootloaders.
AICore failed with error type 1-DOWNLOAD_ERROR and error code 0-UNKNOWN: Feature ... failed with failure status 0 and error esz: UNAVAILABLE: Unable to resolve host ...	Keep network connection, wait for a few minutes and retry.