coSTAR: How We Ship AI Agents at Databricks Fast, Without Breaking Things
Original URL: https://www.databricks.com/blog/costar-how-we-ship-ai-agents-databricks-fast-without-breaking-things
Article Written: March 20, 2026
Added:
Type: project
Summary
The article discusses the coSTAR methodology developed at Databricks for building and deploying AI agents with a focus on automated testing and refinement. It highlights the transition from a slow, manual review process to a rapid, automated testing framework that significantly reduces the time to verify changes. By using MLflow and a structured approach involving scenario definitions, trace capture, and judge assessments, coSTAR enhances development velocity and confidence in the quality of AI agents. The methodology addresses the unique challenges of testing non-deterministic outputs in AI systems.
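The scenario, trace, and judge steps named in the summary can be sketched as a small loop. This is a minimal illustration only, not the article's implementation: the `Scenario` and `Trace` classes, the stub `run_agent`, and the substring-based `judge` are all hypothetical stand-ins, and no MLflow calls are used. In practice the trace would come from a tracing tool and the judge would be an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """A named test case: what the user says and what a good answer must contain."""
    name: str
    user_input: str
    expectation: str  # hypothetical: a keyword the judge looks for


@dataclass
class Trace:
    """A record of what the agent did for one scenario."""
    scenario: Scenario
    steps: list[str] = field(default_factory=list)
    output: str = ""


def run_agent(scenario: Scenario) -> Trace:
    """Hypothetical stub agent: echoes the input in upper case and logs each step."""
    trace = Trace(scenario=scenario)
    trace.steps.append(f"received: {scenario.user_input}")
    trace.output = scenario.user_input.upper()
    trace.steps.append(f"responded: {trace.output}")
    return trace


def judge(trace: Trace) -> bool:
    """Stand-in for an LLM judge: a case-insensitive substring check."""
    return trace.scenario.expectation.lower() in trace.output.lower()


def evaluate(scenarios, agent=run_agent, assess=judge) -> dict[str, bool]:
    """Run every scenario through the agent and assess each captured trace."""
    return {s.name: assess(agent(s)) for s in scenarios}


scenarios = [
    Scenario("greeting", "hello there", "hello"),
    Scenario("refund", "cancel my order", "refund"),
]
results = evaluate(scenarios)
# Failing scenarios are the input to the Refine step.
failures = [name for name, ok in results.items() if not ok]
print(results, failures)
```

Because the judge only reads the trace, the same check can be rerun after every agent change without a manual review pass.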
💠Your Thoughts
best-practices
coSTAR stands for coupled Scenario, Trace, Assess, Refine.
Use an LLM as the judge, driven by a prompt, and run the judges on production traffic data as well.
The agent depends on external tools and infrastructure, and those change too.
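A rough sketch of the prompt-driven judge applied to sampled production traffic, under stated assumptions: the prompt wording, the trace dictionary shape (`id`, `request`, `response`), and the `call_llm` callable are all hypothetical, and `fake_llm` is a canned stand-in so the example runs without a model.

```python
import random

# Hypothetical judge prompt; the wording is illustrative, not taken from the article.
JUDGE_PROMPT = """You are grading an AI agent's response.
Criterion: {criterion}
User request: {request}
Agent response: {response}
Answer with exactly PASS or FAIL."""


def build_judge_prompt(criterion: str, request: str, response: str) -> str:
    """Fill the judge prompt template for one trace."""
    return JUDGE_PROMPT.format(criterion=criterion, request=request, response=response)


def parse_verdict(raw: str) -> bool:
    """Treat any reply that starts with PASS (case-insensitive) as a pass."""
    return raw.strip().upper().startswith("PASS")


def sample_traces(traces, k, seed=0):
    """Pick up to k traces reproducibly, so reruns grade the same sample."""
    rng = random.Random(seed)
    return rng.sample(traces, min(k, len(traces)))


def judge_production(traces, criterion, call_llm, k=2):
    """Grade a sample of production traces; call_llm is any prompt -> text callable."""
    verdicts = []
    for t in sample_traces(traces, k):
        prompt = build_judge_prompt(criterion, t["request"], t["response"])
        verdicts.append((t["id"], parse_verdict(call_llm(prompt))))
    return verdicts


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a real model call (assumption for this sketch)."""
    return "FAIL" if "I don't know" in prompt else "PASS"


production = [
    {"id": "a", "request": "Reset my password", "response": "Here is the reset link."},
    {"id": "b", "request": "Where is my invoice?", "response": "I don't know"},
]
verdicts = dict(judge_production(production, "Response is helpful", fake_llm, k=2))
print(verdicts)
```

Keeping the model call behind a plain callable means the same judge code can grade both test scenarios and live traffic, and can be exercised offline with a fake.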