Google’s unlearning audit catches privacy failures with thousands of samples
Original: New framework for auditing machine unlearning
“The model forgot it” is becoming a claim that needs evidence. On June 10, 2026, Google Research introduced a framework for auditing machine unlearning and differential privacy when auditors can only query a model and examine its output samples, with no access to the training run itself.
The work, described in a Google Research post, focuses on Regularized f-Divergence Kernel Tests, presented at AISTATS 2026. Machine unlearning aims to remove specific training data without retraining a model from scratch, a capability tied to privacy law, safety, and model quality. The hard part is proving that removal happened.
Traditional two-sample tests compare output distributions and ask whether they differ. That can be useful, but Google argues it becomes weak and expensive at model scale. Subtle or localized failures can be missed, while harmless distribution changes can be flagged as unsafe. Auditors may need very large sample counts to separate real privacy leakage from random noise.
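For readers who have not seen a two-sample test, the sketch below shows a generic kernel baseline: a permutation test on the maximum mean discrepancy (MMD) between two sets of output samples. It is an illustration of the traditional approach described above, not Google's method; the function names, the Gaussian kernel, the fixed bandwidth, and the permutation count are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2(x, y, bandwidth=1.0):
    # Biased (V-statistic) estimate of the squared MMD between samples x and y.
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

def permutation_test(x, y, n_perm=200, bandwidth=1.0, seed=0):
    # Permutation p-value for H0: x and y come from the same distribution.
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        # Reassign samples to the two groups at random and recompute the statistic.
        if mmd2(pooled[idx[:n]], pooled[idx[n:]], bandwidth) >= observed:
            hits += 1
    # The +1 terms keep the p-value strictly positive and valid.
    return (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(x, y)` on two arrays of shape `(n_samples, n_features)` returns a p-value; a small value suggests the output distributions differ. A test like this only says that two distributions differ, not whether the difference matters for privacy, and it can need many samples when the difference is subtle, which is the weakness the article describes.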
Google’s framework uses f-divergences, including chi-squared, KL, and hockey-stick divergence, with kernel regularization to make the tests tractable. The adaptive approach also reduces manual hyperparameter tuning and avoids sample splitting. For privacy auditing, Google says its hockey-stick-based tester detected violations in a sparse vector technique mechanism known as SVT3 using only a few thousand samples, while previously studied DP-Auditorium techniques required millions of samples to reach a comparable detection rate.
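The hockey-stick divergence matters for privacy auditing because of a standard fact: a mechanism satisfies (ε, δ)-differential privacy exactly when the hockey-stick divergence of order e^ε between its output distributions on any two neighboring datasets is at most δ. The sketch below computes that divergence exactly for small discrete distributions. It is a toy illustration of the quantity being tested, assuming the output probabilities are known; Google's tester instead estimates it from samples with kernel regularization, which this example does not attempt. The function name and the randomized-response example are assumptions for illustration.

```python
import numpy as np

def hockey_stick(p, q, eps):
    # Hockey-stick divergence of order gamma = exp(eps) between two discrete
    # distributions: the sum over outcomes of max(p_i - gamma * q_i, 0).
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    gamma = np.exp(eps)
    return float(np.sum(np.maximum(p - gamma * q, 0.0)))

# Toy mechanism (an assumption for this example): binary randomized response
# that reports the true bit with probability 0.75. Its output distributions on
# the two neighboring inputs are:
out_given_0 = [0.75, 0.25]
out_given_1 = [0.25, 0.75]

# The mechanism is (ln 3, 0)-DP, so the divergence at eps = ln 3 is
# zero up to floating-point rounding.
delta_at_ln3 = hockey_stick(out_given_0, out_given_1, np.log(3.0))

# Claiming the stronger eps = ln 2 with delta = 0 would be a violation:
# the divergence is clearly positive.
delta_at_ln2 = hockey_stick(out_given_0, out_given_1, np.log(2.0))
```

Here `delta_at_ln2` comes out at roughly 0.25, meaning a claim of (ln 2, 0)-DP would be off by that much. An auditor in the black-box setting faces the same question but has to answer it from samples alone.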
The unlearning result is just as pointed. Instead of asking whether an unlearned model exactly matches a retrained “safe” model, Google proposes a three-sample relative test: is the unlearned model closer to the safe retrained model or to the compromised model that memorized sensitive data? In simplified evaluations, only the random-label technique passed. Fine-tuning, pruning, and Selective Synaptic Dampening were found ineffective at truly forgetting the target data.
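The shape of a three-sample relative comparison can be sketched in a few lines. The example below is a rough illustration only: it swaps in a plain MMD distance where Google's work uses regularized f-divergences, and it uses a bootstrap to gauge stability instead of a calibrated hypothesis test. All names, the kernel, the bandwidth, and the bootstrap count are assumptions for this example.

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    # Biased estimate of the squared MMD with a Gaussian kernel.
    def k(a, b):
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def relative_closeness(unlearned, safe, compromised, n_boot=200, bandwidth=1.0, seed=0):
    # Returns (mean_gap, frac_closer_to_safe), where
    #   gap = dist(unlearned, compromised) - dist(unlearned, safe).
    # A positive gap means the unlearned model's outputs sit closer to the
    # safe retrained model than to the compromised one.
    rng = np.random.default_rng(seed)

    def resample(a):
        # Bootstrap resample of the rows of a, with replacement.
        return a[rng.integers(0, len(a), size=len(a))]

    gaps = np.empty(n_boot)
    for i in range(n_boot):
        u, s, c = resample(unlearned), resample(safe), resample(compromised)
        gaps[i] = mmd2(u, c, bandwidth) - mmd2(u, s, bandwidth)
    return float(gaps.mean()), float(np.mean(gaps > 0.0))
```

Each argument is an array of output samples with shape `(n_samples, n_features)`. A fraction near 1 suggests the unlearned model behaves like the safe model, while a fraction near 0 suggests it still resembles the compromised one. The relative framing avoids demanding an exact match to the retrained model, which is the point the article makes.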
This is research, not a production certification scheme. Still, it raises the bar for AI systems trained on sensitive data. Privacy promises will increasingly need audit methods that are statistically grounded, sample-efficient, and usable without privileged access to the full training pipeline.