Ship your LLM-powered apps faster and with greater confidence. Stax removes the
headache of AI evaluation by letting you test models and prompts against your
own criteria.
Core features
- Manage and Build Test Datasets: Import production datasets or use Stax
to construct new ones by prompting any major LLM.
- Leverage Pre-Built and Custom Evaluators: Use a suite of default
evaluators for standard metrics like instruction following and verbosity, or
create custom ones to test for nuanced qualities like brand voice or
business logic.
- Make Data-Driven Decisions: Get actionable data on quality, latency, and
token count to identify an effective AI model, prompt, or iteration for your
application.
Stax supports text-based calls to models, with image support coming soon. If
you'd like to see additional support or have other questions, let us know in our
Discord (https://discord.com/invite/googlelabs) or by filling out
this contact form (https://forms.gle/Ef2secxYuNtEYGGPA).
Getting started
Want to know which AI model or prompt works better for your use case? Get started
with Stax:
- Quickstart (/stax/quickstart)
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2025-08-27 UTC.