Hugging Face has launched Storage Buckets, a mutable and non-versioned object storage layer for checkpoints, processed data, logs, and agent traces on the Hub. The company says Xet-based deduplication and cloud pre-warming should make large ML workflows faster and cheaper to operate.
#mlops
A March 15, 2026 r/MachineLearning post introduced preflight, a lightweight PyTorch validator that reached 56 points and 13 comments by promising a fast pre-training gate for label leakage, NaNs, channel order, dead gradients, class imbalance, and VRAM risk.
A March 15, 2026 r/MachineLearning post introduced preflight, a new PyTorch-oriented CLI that runs 10 pre-training checks such as label leakage, NaN detection, gradient checks, and VRAM estimation before a job starts.
Hugging Face introduced Storage Buckets on March 10, 2026 as non-versioned, S3-like storage for checkpoints, processed data, logs, and agent traces. The feature is built on Xet deduplication and includes pre-warming for AWS and GCP to move hot data closer to compute.
An r/MachineLearning post introduced TraceML, an open-source tool that instruments PyTorch runs with a single context manager and surfaces timing, memory, and rank skew while training is still running. The pitch is practical observability rather than heavyweight profiling.
A well-received r/MachineLearning post introduced GoodSeed as a simpler experiment tracker that stores runs in local SQLite, serves them through a built-in web app, and optionally syncs to a remote API. The project also logs hardware metrics, stdout/stderr, and Git state, and it offers a migration path for Neptune users.