Dynamic Evaluation Frameworks for AI Agents

Developing evaluation frameworks that adapt as AI capabilities and user expectations change.

Level: product

The article discusses the coSTAR methodology developed at Databricks for building and deploying AI agents with a focus on automated testing and...

The article discusses the evolution of generative AI applications into agentic AI systems at Amazon, highlighting the need for a comprehensive...

The article discusses the complexities of evaluating AI agents, emphasizing the importance of rigorous evaluations (evals) throughout the agent...