Viral Topic

ML data versioning backlash

April 4, 2026 · clem

clem argues that Git is the wrong abstraction for most ML data, such as checkpoints, optimizer states, training logs, and agent traces, which need fast, cheap, mutable storage rather than version control. The post pitches Buckets, S3-like storage on the Hugging Face Hub.

Hot take: Git was the wrong abstraction for 90% of ML data.
Checkpoints, optimizer states, training logs, agent traces - none of this needs version control.
It needs fast, cheap, mutable storage.
So we built Buckets. S3-like storage on the @huggingface Hub
clem
Tags: mlops, storage, huggingface
