Document scanner with ML Kit on Android

Use the ML Kit document scanner API to easily add a document scanner feature to your app.

Feature              Details
SDK name             play-services-mlkit-document-scanner
Implementation       The models, scanning logic, and UI flow are dynamically downloaded by Google Play services.
App size impact      ~300 KB download size increase.
Initialization time  Users might have to wait for the models, logic, and UI flow to download before first use.

Try it out

Try the sample app to see an example of how this API is used.

Before you begin

  1. In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections.

  2. Add the dependency for the ML Kit document scanner library to your module's app-level Gradle file, which is usually app/build.gradle:

dependencies {
   // …
   implementation 'com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1'
}
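
Step 1 above can be sketched as the following project-level build.gradle fragment. This is a minimal sketch that assumes the buildscript/allprojects layout named in step 1. Projects generated by recent Android Studio versions usually declare repositories in settings.gradle instead, in which case google() belongs there. Any other repositories your project already lists stay as they are.

```groovy
// Project-level build.gradle (assumed layout; adapt to your project).
buildscript {
    repositories {
        google()  // Google's Maven repository
        // ... your other repositories
    }
}

allprojects {
    repositories {
        google()  // Google's Maven repository
        // ... your other repositories
    }
}
```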

Document Scanner configuration

The document scanner user flow (which includes a dedicated viewfinder screen and a preview screen) is provided by the SDK. The viewfinder and preview screens support the following customizable controls:

  • importing from the photo gallery
  • setting a limit to the number of pages scanned
  • scanner mode (to control the feature sets in the flow)

You can retrieve both PDF and JPEG files for your scanned documents.

Instantiate GmsDocumentScannerOptions to configure the scanner options:

Kotlin

val options = GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)
    .setPageLimit(2)
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    .setScannerMode(SCANNER_MODE_FULL)
    .build()

Java

GmsDocumentScannerOptions options = new GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)
    .setPageLimit(2)
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    .setScannerMode(SCANNER_MODE_FULL)
    .build();

Scan documents

After creating your GmsDocumentScannerOptions, get an instance of GmsDocumentScanner. You can then start the scanner activity using the Activity Result APIs introduced in AndroidX.

When scanning is complete, a GmsDocumentScanningResult object gives access to the number of pages scanned, the URIs of the page images in JPEG format, and the URI of the PDF, according to the formats requested via setResultFormats:

Kotlin

val scanner = GmsDocumentScanning.getClient(options)
val scannerLauncher = registerForActivityResult(StartIntentSenderForResult()) { result ->
  if (result.resultCode == RESULT_OK) {
    // Parse the scan result from the returned intent; it may be null.
    val scanResult = GmsDocumentScanningResult.fromActivityResultIntent(result.data)
    scanResult?.getPages()?.let { pages ->
      for (page in pages) {
        // URI of the JPEG image for this page.
        val imageUri = page.getImageUri()
      }
    }
    scanResult?.getPdf()?.let { pdf ->
      val pdfUri = pdf.getUri()
      val pageCount = pdf.getPageCount()
    }
  }
}

scanner.getStartScanIntent(activity)
  .addOnSuccessListener { intentSender ->
    scannerLauncher.launch(IntentSenderRequest.Builder(intentSender).build())
  }
  .addOnFailureListener {
    ...
  }

Java

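GmsDocumentScanner scanner = GmsDocumentScanning.getClient(options);
ActivityResultLauncher<IntentSenderRequest> scannerLauncher =
  registerForActivityResult(
    new StartIntentSenderForResult(),
    result -> {
      if (result.getResultCode() == RESULT_OK) {
        // Parse the scan result from the returned intent; it may be null.
        GmsDocumentScanningResult scanResult =
          GmsDocumentScanningResult.fromActivityResultIntent(result.getData());
        if (scanResult == null) {
          return;
        }
        // Page images are available only when JPEG output was requested.
        if (scanResult.getPages() != null) {
          for (Page page : scanResult.getPages()) {
            Uri imageUri = page.getImageUri();
          }
        }

        // The PDF is available only when PDF output was requested.
        Pdf pdf = scanResult.getPdf();
        if (pdf != null) {
          Uri pdfUri = pdf.getUri();
          int pageCount = pdf.getPageCount();
        }
      }
    });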

scanner.getStartScanIntent(activity)
  .addOnSuccessListener(intentSender ->
    scannerLauncher.launch(new IntentSenderRequest.Builder(intentSender).build()))
  .addOnFailureListener(...);

Tips to improve performance

Generating document files takes time and processing power, so request only the output formats (JPEG, PDF, or both) that you actually need.