LangExtract is a general-purpose natural language processing (NLP) library that uses large language models (LLMs) to extract structured information from unstructured text and ground it in the source. It is well suited to information extraction, entity recognition, and content structuring, which makes it useful across multiple healthcare use cases. It integrates with a variety of LLMs, including Gemini, so extraction workflows can be built around the model that best fits the task.
Radiology report structuring with RadExtract
One example use case of LangExtract is RadExtract, a specialized implementation for radiology reports built on Gemini 2.5. LangExtract lets users define structured prompt templates for grounded information extraction, so that each extracted item keeps a precise reference to the span of source text it came from.
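To make the idea of grounding concrete, here is a minimal sketch in plain Python. It does not use the LangExtract API: `GroundedSpan`, `ground`, and the sample report are hypothetical names and data invented for illustration, and exact substring matching is a simplifying assumption standing in for whatever alignment the library performs. The point is only that a grounded extraction carries character offsets that can be checked against the original text, and that text which cannot be located is treated as ungrounded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GroundedSpan:
    """Hypothetical record: an extracted item plus its location in the source."""
    label: str
    text: str
    start: int  # inclusive character offset into the source
    end: int    # exclusive character offset into the source


def ground(source: str, label: str, extracted: str) -> Optional[GroundedSpan]:
    """Locate `extracted` in `source` by exact match (an assumption of this sketch).

    Returns None when the text does not occur in the source, so the caller
    can flag the item as ungrounded instead of trusting it.
    """
    start = source.find(extracted)
    if start == -1:
        return None
    return GroundedSpan(label, extracted, start, start + len(extracted))


# Made-up sample report and candidate extractions, as a model might return them.
report = (
    "FINDINGS: The lungs are clear. No pleural effusion. "
    "IMPRESSION: Normal chest."
)
candidates: List[Tuple[str, str]] = [
    ("finding", "The lungs are clear"),
    ("finding", "No pleural effusion"),
    ("finding", "Mild cardiomegaly"),  # not present in the report
]

grounded = [ground(report, label, text) for label, text in candidates]

for (label, text), span in zip(candidates, grounded):
    if span is None:
        print(f"UNGROUNDED {label}: {text!r}")
    else:
        # The offsets must reproduce the extracted text exactly.
        assert report[span.start:span.end] == span.text
        print(f"{label} [{span.start}:{span.end}]: {span.text!r}")
```

Storing offsets rather than only the extracted string is what lets a viewer highlight the supporting passage, and it gives a cheap check against extractions that have no basis in the report.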
RadExtract transforms unstructured radiology narratives into clearly labeled sections, improving the readability and clinical utility of the data. For an example of report structuring with grounding, see the RadExtract demo on Hugging Face.
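The structuring step can be pictured as grouping section-labeled, grounded spans under headers. The sketch below is plain Python, not RadExtract's implementation: `render_sections`, the tuple layout, the section names, and the sample report are assumptions made for illustration, and the spans are hard-coded where a model would normally supply them.

```python
from typing import Dict, List, Tuple

# Hypothetical span layout: (section, text, start, end), offsets into the report.
Span = Tuple[str, str, int, int]


def render_sections(spans: List[Span]) -> str:
    """Group spans by section, in source order, and render them under headers."""
    sections: Dict[str, List[str]] = {}
    # Sorting by start offset keeps items in the order they appear in the
    # report; dicts preserve insertion order, so sections appear in order
    # of first occurrence.
    for section, text, start, end in sorted(spans, key=lambda s: s[2]):
        sections.setdefault(section, []).append(f"- {text} [{start}:{end}]")

    lines: List[str] = []
    for section, items in sections.items():
        lines.append(section.upper())
        lines.extend(items)
    return "\n".join(lines)


# Made-up sample report and labeled spans, deliberately listed out of order.
report = (
    "FINDINGS: The lungs are clear. No pleural effusion. "
    "IMPRESSION: Normal chest."
)
spans: List[Span] = [
    ("impression", "Normal chest", 64, 76),
    ("findings", "The lungs are clear", 10, 29),
    ("findings", "No pleural effusion", 31, 50),
]

structured = render_sections(spans)
print(structured)
```

Keeping the offsets next to each rendered item means the structured view never loses its link back to the original narrative.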
RadExtract is one of many use cases where the LangExtract library could be useful. We encourage you to explore other use cases!