Derm Foundation model card
Model documentation:
Derm Foundation
Resources:
- Health AI Developer Foundations arXiv article: https://arxiv.org/abs/2411.15128
- Model on Google Cloud Model Garden: Derm Foundation
- Model on Hugging Face: google/derm-foundation
- GitHub repository (supporting code, Colab notebooks, discussions, and issues): derm-foundation
- Quick start notebook: notebooks/quick_start
- Support: See Contact.
Terms of use:
Health AI Developer Foundations terms of use
Author: Google
Model information
This section describes the Derm Foundation model and how to use it.
Description
Derm Foundation is a machine learning model designed to accelerate AI
development for skin image analysis in dermatology applications. It is
pre-trained on a large number of labeled skin images to produce 6144-dimensional
embeddings that capture dense features relevant for analyzing these images. As a
result, Derm Foundation's embeddings enable the efficient training of AI models
with significantly less data and compute than traditional methods.
How to use
Following are some example code snippets to help you quickly get started running
the model locally. If you want to use the model at scale, we recommend that you
create a production version using
Model Garden.
# Download the sample image and load the required libraries.
from io import BytesIO

from PIL import Image
from huggingface_hub import from_pretrained_keras
import tensorflow as tf

# Download sample image
!wget -nc -q https://storage.googleapis.com/dx-scin-public-data/dataset/images/3445096909671059178.png

# Load the image and re-encode it as PNG bytes
img = Image.open("3445096909671059178.png")
buf = BytesIO()
img.convert('RGB').save(buf, 'PNG')
image_bytes = buf.getvalue()

# Format the input as a serialized tf.train.Example with the PNG bytes
# stored under the 'image/encoded' feature key
input_tensor = tf.train.Example(features=tf.train.Features(
    feature={'image/encoded': tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[image_bytes]))
    })).SerializeToString()

# Load the model directly from Hugging Face Hub
loaded_model = from_pretrained_keras("google/derm-foundation")

# Call inference through the serving signature
infer = loaded_model.signatures["serving_default"]
output = infer(inputs=tf.constant([input_tensor]))

# Extract the embedding vector
embedding_vector = output['embedding'].numpy().flatten()
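The returned embedding can then be used directly as a feature vector. As a
quick sanity check, a minimal sketch (the cosine_similarity helper below is
our own illustration, not part of the model's API):

import numpy as np

# The embedding documented by this model card is 6144-dimensional
assert embedding_vector.shape == (6144,)

# Cosine similarity is one simple way to compare two images via their
# embeddings; here we compare the vector with itself (result: 1.0).
# In practice the second argument would be another image's embedding.
def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embedding_vector, embedding_vector))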
Examples
See the following Colab notebooks for examples of how to use Derm Foundation:
- Quick start notebook in Colab: try the model quickly, running it locally with weights from Hugging Face.
- Linear classifier notebook in Colab: an example of how to use the model to train a linear classifier.
- DERM12345 Embeddings and Demo: a demo using Derm Foundation precomputed embeddings for DERM12345. Special thanks to Abdurrahim Yilmaz for providing this.
Model architecture overview
- The model is a BiT-M ResNet101x3.

Derm Foundation was trained in two stages. The first pre-training stage used
contrastive learning to train on a large number of public image-text pairs from
the internet. The image component of this pre-trained model was then fine-tuned
for condition classification and a couple of other downstream tasks using a
number of clinical datasets (see below).
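The exact pre-training objective is not spelled out here; as a rough
illustration of the contrastive approach (in the style of the ConVIRT/CLIP
family, with all shapes and names assumed rather than taken from the actual
training code), a symmetric image-text loss can look like this:

import tensorflow as tf

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # L2-normalize each batch of embeddings: shape [batch, dim]
    image_emb = tf.math.l2_normalize(image_emb, axis=-1)
    text_emb = tf.math.l2_normalize(text_emb, axis=-1)
    # Scaled pairwise cosine similarities: shape [batch, batch]
    logits = tf.matmul(image_emb, text_emb, transpose_b=True) / temperature
    # Matching image-text pairs sit on the diagonal
    labels = tf.range(tf.shape(logits)[0])
    # Cross-entropy in both directions (image-to-text and text-to-image)
    loss_i2t = tf.keras.losses.sparse_categorical_crossentropy(
        labels, logits, from_logits=True)
    loss_t2i = tf.keras.losses.sparse_categorical_crossentropy(
        labels, tf.transpose(logits), from_logits=True)
    return tf.reduce_mean(loss_i2t + loss_t2i) / 2.0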
Technical specifications
- Model type: BiT-101x3 CNN (Convolutional Neural Network)
- Key publications:
- Model created: 2023-12-19
- Model version: Version: 1.0.0
Performance and validation
Derm Foundation was evaluated for data-efficient accuracy across a range of
skin-related classification tasks. Training a linear classifier on Derm
Foundation's embeddings was substantially more performant (a 10-15% increase
in accuracy) than doing the same with a standard BiT-M model across different
proportions of training data. See the blog post
Health-specific embedding tools for dermatology and pathology
for more details.

Inputs and outputs
- Input: PNG image file, 448 x 448 pixels
- Output: Embedding vector of floating point values (dimensions: 6144)
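Because the documented input is a 448 x 448 PNG, images of other sizes can be
resized before encoding. A minimal sketch using PIL (the bilinear resampling
choice is an assumption, not a documented requirement):

from io import BytesIO
from PIL import Image

# Resize an arbitrary image to the documented 448 x 448 input size
# (bilinear resampling is an assumption, not a documented requirement)
img = Image.open("3445096909671059178.png").convert('RGB')
img = img.resize((448, 448), Image.BILINEAR)
buf = BytesIO()
img.save(buf, 'PNG')
image_bytes = buf.getvalue()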
Dataset details
Training dataset
Derm Foundation was trained in two stages. The first pre-training stage used
contrastive learning to train on a large number of public image-text pairs from
the internet. The image component of this pre-trained model was then fine-tuned
for condition classification and a couple of other downstream tasks using a
number of clinical datasets (see below).
- Base model (pre-training): A large number of health-related image-text pairs
from the public web
- SFT (supervised fine-tuned) model: tele-dermatology datasets from the United
States and Colombia, a skin cancer dataset from Australia, and additional
public images. The images come from a mix of device types, including images
from smartphone cameras, other cameras, and dermatoscopes. The images also
have a mix of image takers; images may have been taken by clinicians during
consultations or self-captured by patients.
Labeling
Labeling sources vary by dataset. Examples include:
- (image, caption) pairs from the public web
- Dermatology condition labels provided by dermatologist labelers funded by
Google
- Dermatology condition labels provided with a clinical dataset based on a
telehealth visit, an in-person visit, or a biopsy
License
The use of Derm Foundation is governed by the
Health AI Developer Foundations terms of use.
Implementation information
Details about the model internals.
Software
Training was done using JAX.
JAX allows researchers to take advantage of the latest generation of hardware,
including TPUs, for faster and more efficient training of large models.
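As a toy illustration of the programming model (unrelated to Derm Foundation's
actual training code), JAX compiles and differentiates plain Python functions:

import jax
import jax.numpy as jnp

# A toy squared-error loss; jax.grad derives its gradient and jax.jit
# compiles the result for the available accelerator (e.g., a TPU)
def loss(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))
g = grad_fn(jnp.zeros(3), jnp.ones((4, 3)), jnp.ones(4))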
Use and limitations
Intended use
- Derm Foundation can reduce the training data, compute, and technical
  expertise necessary to develop task-specific models for skin image analysis.
- Embeddings from the model can be used for a variety of user-defined
  downstream tasks including, but not limited to:
  - Classifying clinical conditions like psoriasis, melanoma, or dermatitis
  - Scoring severity or progression of clinical conditions
  - Identifying the body part the skin is from
  - Determining image quality for dermatological assessment
- For an example of how to use the model to train a classifier, see the
  Linear classifier example notebook; a minimal sketch follows this list.
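A minimal sketch of the idea (not the notebook's exact code): train a simple
linear classifier on frozen embeddings. The arrays all_embeddings and
all_labels are hypothetical placeholders, filled with random data here so the
sketch runs; in practice they would hold your computed embeddings and labels.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical placeholders: all_embeddings stands in for a list of
# 6144-dim vectors produced as in "How to use"; all_labels stands in
# for matching integer condition labels supplied by the developer
all_embeddings = [np.random.rand(6144) for _ in range(100)]
all_labels = np.random.randint(0, 2, size=100)

X = np.stack(all_embeddings)   # shape [num_images, 6144]
y = np.asarray(all_labels)     # shape [num_images]

# A linear classifier over frozen embeddings is often competitive
# and needs far less labeled data than end-to-end training
clf = LogisticRegression(max_iter=1000).fit(X, y)
predictions = clf.predict(X)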
Benefits
- Derm Foundation embeddings enable the efficient training of AI models for
  skin image analysis with significantly less data and compute than
  traditional methods.
- By leveraging the large set of images Derm Foundation was pre-trained on,
  users need less data and can build more generalizable models than they
  could by training on more limited datasets.
Limitations
- Derm Foundation is trained on images captured in real-world environments
  under various lighting and noise conditions. However, its quality can
  degrade in extreme conditions, such as photos that are too light or too
  dark.
- The base model was trained using image-text pairs from the public web.
  These images come from a variety of sources but may be noisy or low
  quality. The SFT (supervised fine-tuned) model was trained on data from a
  limited set of countries (United States, Colombia, Australia, plus public
  images) and settings (mostly clinical). It may not generalize well to data
  from other countries, patient populations, or image types not used in
  training.
- The model is only used to generate embeddings of user-provided data. It
  does not generate any predictions or diagnoses on its own.
- As with any research, developers should ensure that any downstream
  application is validated to understand performance using data that is
  appropriately representative of the intended use setting for the specific
  application (e.g., skin tone/type, age, sex, gender, etc.).