Learn how to iterate on your AI capabilities by using production data and evaluation scores to drive improvements.
Once you have a new version of your Prompt object, you need to verify that it’s actually an improvement. The best way to do this is to run an “offline evaluation”: testing your new version against the same ground truth collection you used in the Measure stage.
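The idea of an offline evaluation can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the Axiom SDK: the `GROUND_TRUTH` collection, the `exact_match` scorer, and the `baseline` and `candidate` functions are all hypothetical stand-ins (in practice each capability would call a model using one version of the prompt). It scores both versions against the same collection and compares the averages.

```python
from statistics import mean
from typing import Callable

# Hypothetical ground truth collection: (input, expected label) pairs.
# In practice this would be the same collection used in the Measure stage.
GROUND_TRUTH = [
    ("Where is my refund?", "billing"),
    ("The app crashes on login", "bug"),
    ("How do I export my data?", "how-to"),
    ("I was charged twice", "billing"),
]


def exact_match(output: str, expected: str) -> float:
    """Hypothetical scorer: 1.0 when the output equals the expected label."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_offline_eval(capability: Callable[[str], str], collection) -> float:
    """Run a capability over every record and return its mean score."""
    return mean(exact_match(capability(text), expected) for text, expected in collection)


# Stand-ins for two prompt versions. Real versions would call a model.
def baseline(text: str) -> str:
    return "billing" if "refund" in text.lower() else "bug"


def candidate(text: str) -> str:
    lowered = text.lower()
    if "refund" in lowered or "charged" in lowered:
        return "billing"
    if "how do i" in lowered:
        return "how-to"
    return "bug"


baseline_score = run_offline_eval(baseline, GROUND_TRUTH)
candidate_score = run_offline_eval(candidate, GROUND_TRUTH)

# Only treat the new version as an improvement if it scores strictly higher.
is_improvement = candidate_score > baseline_score
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f} improved={is_improvement}")
```

Because both versions are scored against identical records with the same scorer, any difference in the averages can be attributed to the change itself rather than to a change in the test data.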
The Axiom Console will provide views to compare these evaluation runs side-by-side:
deploy
function.