Derm Foundation serving API

This document describes the Application Programming Interface (API) for Derm Foundation when deployed as an HTTPS service endpoint, referred to as the service in this document.

Overview

The serving source code for Derm Foundation can be built and hosted on any API management system, but it is specifically designed to take advantage of Vertex AI prediction endpoints. Therefore, it conforms to Vertex AI's required API signature and implements a predict method.

The service is designed to support micro batching, which is not to be confused with batch jobs. For every image in the request that is processed successfully, the service returns a one-dimensional embedding vector, with predictions appearing in the same order as the images in the request. Refer to the sections on the API request, response, and micro batching for details.

You can provide dermatology images to the service either directly within the request (inlined) or by providing a reference to their location. Inlining the images in the request is not recommended for large-scale production use; see the Inlined images section for details. When using data storage links, the service expects corresponding OAuth 2.0 bearer tokens so it can retrieve the data on your behalf. For detailed information on constructing API requests and the different ways to provide image data, refer to the API request section.

To invoke the service, consult the Request section, compose a valid request JSON, and send a POST request to your endpoint. If you haven't already deployed Derm Foundation as an endpoint, the easiest way to do so is through Model Garden. The following script shows a sample cURL command that you can use to invoke the service. Set LOCATION, PROJECT_ID, and ENDPOINT_ID to target your endpoint:

LOCATION="your endpoint location"
PROJECT_ID="your project ID"
ENDPOINT_ID="your endpoint ID"
REQUEST_JSON="path/to/your/request.json"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${ENDPOINT_ID}:predict" \
-d "@${REQUEST_JSON}"

Request

An API request can include multiple instances, each conforming to this schema. Note that this schema is based on the Vertex AI PredictSchemata standard and is a partial OpenAPI specification. The complete JSON request has the following structure:

{
  "instances": [
    {...},
    {...}
  ]
}

The service accepts dermatology images in two ways:

  • Directly within the HTTPS request: You can include image data as base64-encoded bytes using the input_bytes JSON field; see the Inlined images section for details.

  • Indirectly via storage links: You can provide links to the images stored in Google Cloud Storage (GCS) using the gcs_uri JSON field.

To illustrate these methods, the following example JSON request shows both input_bytes and gcs_uri in one request. In a real-world scenario, you'll typically use only one of these options for all images within a single request:

{
  "instances": [
    {
      "input_bytes": "your base 64 encoded image bytes"
    },
    {
      "gcs_uri": "gs://your-bucket/path/to/image.png",
      "bearer_token": "your-bearer-token"
    }
  ]
}
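
When you reference images in GCS, you need to attach a bearer token that the service can use to read the objects. The following is a minimal sketch, assuming the google-auth library and Application Default Credentials; the bucket path is a placeholder:

import json

import google.auth
import google.auth.transport.requests

# Obtain an OAuth 2.0 bearer token from the environment's default
# credentials so the service can read the image from GCS on your behalf.
credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
credentials.refresh(google.auth.transport.requests.Request())

request = {
    'instances': [
        {
            # Placeholder bucket path; point this at your own image.
            'gcs_uri': 'gs://your-bucket/path/to/image.png',
            'bearer_token': credentials.token,
        }
    ]
}

with open('request.json', 'w') as f:
  json.dump(request, f)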

Inlined images

You can inline the images in the API request as a base64-encoded string in the input_bytes JSON field. However, keep in mind that most API management systems enforce a limit on the maximum size of request payloads. When Derm Foundation is hosted as a Vertex AI Prediction endpoint, Vertex AI quotas apply.

To optimize the request size, you should compress the images using common image compression codecs. If you require lossless compression, use PNG encoding. If lossy compression is acceptable, use JPEG encoding.

Here is a code snippet for converting compressed JPEG image files from the local file system into a base64-encoded string:

import base64

def encode_file_bytes(file_path: str) -> str:
  """Reads a file and returns its contents as a base64-encoded string."""

  with open(file_path, 'rb') as imbytes:
    # Decode to str so the result can be embedded in a JSON request.
    return base64.b64encode(imbytes.read()).decode('utf-8')

Here is another code snippet that converts uncompressed image bytes into the lossless PNG format and then into a base64-encoded string:

import base64
import io
import numpy as np
import PIL.Image

def convert_uncompressed_image_bytes_to_base64(image: np.ndarray) -> str:
  """Converts an uncompressed image array to a base64-encoded PNG string."""

  with io.BytesIO() as compressed_img_bytes:
    with PIL.Image.fromarray(image) as pil_image:
      pil_image.save(compressed_img_bytes, 'png')
    # Decode to str so the result can be embedded in a JSON request.
    return base64.b64encode(compressed_img_bytes.getvalue()).decode('utf-8')
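
As a usage example, the encode_file_bytes helper above can assemble a request file for a micro batch of inlined images; the file paths here are hypothetical:

import json

# Hypothetical local image files; replace with your own paths.
image_paths = ['image1.jpg', 'image2.jpg']

request = {
    'instances': [
        {'input_bytes': encode_file_bytes(path)} for path in image_paths
    ]
}

with open('request.json', 'w') as f:
  json.dump(request, f)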

Response

An API response can include multiple predictions that correspond to the order of the instances in the request. Each prediction conforms to this schema. Note that this schema is based on the Vertex AI PredictSchemata standard and is a partial OpenAPI specification. The complete JSON response has the following structure:

{
  "predictions": [
    {...},
    {...}
  ],
  "deployedModelId": "model-id",
  "model": "model",
  "modelVersionId": "version-id",
  "modelDisplayName": "model-display-name",
  "metadata": {...}
}

Each request instance can independently succeed or fail. On success, the corresponding prediction JSON includes an embedding field; on failure, it includes an error field. Here is an example of a response to a request with two instances, where the first instance succeeded and the second failed:

{
  "predictions": [
    {
      "embedding": [0.1, 0.2, 0.3, 0.4]
    },
    {
      "error": {
        "description": "Some actionable text."
      }
    }
  ],
  "deployedModelId": "model-id",
  "model": "model",
  "modelVersionId": "version-id",
  "modelDisplayName": "model-display-name",
  "metadata": {...}
}
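
Because each instance succeeds or fails independently, response handling should check each prediction for the field it carries. The following is a minimal sketch, assuming the response JSON has been saved to a local response.json file:

import json

with open('response.json') as f:
  response = json.load(f)

# Each prediction corresponds positionally to an instance in the request and
# contains either an 'embedding' field (success) or an 'error' field (failure).
for i, prediction in enumerate(response['predictions']):
  if 'embedding' in prediction:
    print(f'Instance {i}: embedding of length {len(prediction["embedding"])}')
  else:
    print(f'Instance {i} failed: {prediction["error"]["description"]}')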

Micro batching

The API request supports micro batching. You can request embeddings for multiple images using different instances within the same JSON request:

{
  "instances": [
    {...},
    {...}
  ]
}

Keep in mind that the total number of embeddings you can request in one API call is capped by the service at a fixed limit. A link to the service configuration is coming soon.
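
If you have more images than a single call allows, you can split them into micro batches on the client side. Here is a minimal sketch, assuming a hypothetical MAX_INSTANCES_PER_REQUEST limit and the encode_file_bytes helper from the Inlined images section:

# Hypothetical per-request limit; check the service configuration for the
# actual value.
MAX_INSTANCES_PER_REQUEST = 8

def build_micro_batches(image_paths):
  """Splits images into request payloads that respect the instance limit."""

  requests = []
  for start in range(0, len(image_paths), MAX_INSTANCES_PER_REQUEST):
    chunk = image_paths[start:start + MAX_INSTANCES_PER_REQUEST]
    requests.append({
        'instances': [{'input_bytes': encode_file_bytes(p)} for p in chunk]
    })
  return requests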