Dataset Item Versioning
Track dataset changes over time with automatic versioning on every addition, update, or deletion of dataset items.
Complete audit trail: View the full history of changes at the item level to understand what changed and when. Identify unintended edits and revert problematic changes.
Experiment reproducibility: Experiments are automatically tied to the exact dataset state at run time. When dataset items are modified after running an experiment, previous results remain tied to the dataset version they actually ran against.
Dataset evolution: Track how gold-label datasets improve as domain experts refine expected outputs. See exactly what changed and how it affects benchmark results.
Every addition, update, or deletion of dataset items creates a new dataset version, identified by its timestamp. Versioning covers two levels: item-level versioning with full history and diffs, and dataset-level metadata that tracks high-level changes.
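To make the timestamp-based model concrete, here is a minimal conceptual sketch in Python. It is not the product's API or storage format: the `VersionedDataset` and `ItemChange` classes, their methods, and the `ts` helper are hypothetical names invented for illustration. The sketch assumes an append-only change log in which every change carries a timestamp, so the dataset state at any version can be rebuilt by replaying changes up to that timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ItemChange:
    """One recorded change to a dataset item (hypothetical structure)."""
    item_id: str
    timestamp: datetime
    payload: Optional[dict]  # None marks a deletion


@dataclass
class VersionedDataset:
    """Append-only change log; each change defines a new dataset version."""
    changes: list = field(default_factory=list)

    def upsert(self, item_id: str, payload: dict, at: datetime) -> None:
        # Additions and updates are both stored as a new payload for the item.
        self.changes.append(ItemChange(item_id, at, dict(payload)))

    def delete(self, item_id: str, at: datetime) -> None:
        self.changes.append(ItemChange(item_id, at, None))

    def _ordered(self) -> list:
        # sorted() is stable, so changes with equal timestamps keep insertion order.
        return sorted(self.changes, key=lambda c: c.timestamp)

    def state_at(self, version: datetime) -> dict:
        """Rebuild the dataset by replaying changes up to and including `version`."""
        state = {}
        for change in self._ordered():
            if change.timestamp > version:
                break
            if change.payload is None:
                state.pop(change.item_id, None)
            else:
                state[change.item_id] = change.payload
        return state

    def item_history(self, item_id: str) -> list:
        """Return every recorded change for one item, oldest first."""
        return [c for c in self._ordered() if c.item_id == item_id]


def ts(hour: int) -> datetime:
    # Helper for readable example timestamps (assumed date, UTC).
    return datetime(2026, 1, 1, hour, tzinfo=timezone.utc)


dataset = VersionedDataset()
dataset.upsert("item-1", {"input": "2+2", "expected_output": "5"}, at=ts(9))
dataset.upsert("item-2", {"input": "3*3", "expected_output": "9"}, at=ts(10))

# An experiment records the dataset version it ran against.
experiment_version = ts(10)

# Later edits create new versions without altering the earlier ones.
dataset.upsert("item-1", {"input": "2+2", "expected_output": "4"}, at=ts(11))
dataset.delete("item-2", at=ts(12))

pinned = dataset.state_at(experiment_version)
latest = dataset.state_at(ts(12))
```

In this sketch, `pinned` still contains both items with the original expected output for `item-1`, while `latest` contains only the corrected `item-1`. This mirrors how earlier experiment results stay tied to the dataset version they ran against.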
Coming soon: API support for fetching datasets at specific version timestamps and SDK support for running experiments on specific dataset versions.

Fetched April 13, 2026