Hugging Face has launched Storage Buckets, a mutable and non-versioned object storage layer for checkpoints, processed data, logs, and agent traces on the Hub. The company says Xet-based deduplication and cloud pre-warming should make large ML workflows faster and cheaper to operate.
#mlops
A March 15, 2026 r/MachineLearning post introduced preflight, a new PyTorch-oriented CLI that runs 10 pre-training checks such as label leakage, NaN detection, gradient checks, and VRAM estimation before a job starts.
An r/MachineLearning post introduced TraceML, an open-source tool that instruments PyTorch runs with a single context manager and surfaces timing, memory, and rank skew while training is still running. The pitch is practical observability rather than heavyweight profiling.
A well-received r/MachineLearning post introduced GoodSeed as a simpler experiment tracker that stores runs in local SQLite, serves them through a built-in web app, and optionally syncs to a remote API. The project also logs hardware metrics, stdout/stderr, and Git state, and offers a migration path for Neptune users.