OpenAI’s Privacy Filter runs locally with 128K context and 97.43% F1 on a corrected benchmark
Original: Introducing OpenAI Privacy Filter
Privacy work usually breaks down at the ugliest point in the pipeline: the unfiltered logs, chat transcripts, tickets, and review queues that have to be cleaned before anyone can safely use them. OpenAI’s April 22 release of Privacy Filter matters because it tries to move that cleanup step back onto the developer’s own machine. Instead of shipping raw text to a remote redaction service, teams can run a small open-weight model locally and mask sensitive spans before indexing, training, logging, or review.
The model is not a chat assistant dressed up as a privacy feature. OpenAI describes it as a bidirectional token-classification model with constrained span decoding, built for one-pass labeling rather than token-by-token generation. The released model has 1.5B total parameters with 50M active parameters, supports up to 128,000 tokens of context, and predicts eight privacy categories: private people, addresses, emails, phones, URLs, dates, account numbers, and secrets such as passwords or API keys. That last category matters for software teams, where a privacy incident often starts with a stray credential, not just a person’s phone number.
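The one-pass labeling flow implies a simple downstream step: take the character spans the classifier predicts and mask them before the text is logged, indexed, or used for training. The sketch below assumes a `(start, end, label)` span format and short category names for illustration; neither is OpenAI's documented output schema.

```python
# Illustrative sketch of applying predicted privacy spans to text before it
# leaves the machine. The (start, end, label) span tuples and the category
# names are assumptions for illustration, not the model's actual schema.

# Stand-ins for the eight categories described in the release.
CATEGORIES = {
    "PERSON", "ADDRESS", "EMAIL", "PHONE",
    "URL", "DATE", "ACCOUNT", "SECRET",
}

def mask_spans(text: str, spans: list[tuple[int, int, str]]) -> str:
    """Replace each predicted span with a [LABEL] placeholder.

    Spans are applied right-to-left so that earlier character offsets
    stay valid as replacements change the string's length.
    """
    out = text
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        if label not in CATEGORIES:
            raise ValueError(f"unknown category: {label}")
        out = out[:start] + f"[{label}]" + out[end:]
    return out

# Hypothetical spans a local classifier might emit for this string:
raw = "Contact jane@example.com, API key sk-abc123"
spans = [(8, 24, "EMAIL"), (34, 43, "SECRET")]
print(mask_spans(raw, spans))  # Contact [EMAIL], API key [SECRET]
```

Masking with typed placeholders rather than deleting the span keeps the redacted text useful for search indexing and model training, since the category of the removed content is preserved.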
The benchmark numbers are strong enough to make the release more than a niche research note. On PII-Masking-300k, OpenAI reports 96% F1 with 94.04% precision and 98.04% recall. On a corrected version of the benchmark, after fixing annotation issues it identified during review, the score rises to 97.43% F1 with 96.79% precision and 98.08% recall. OpenAI also says small amounts of domain-specific fine-tuning lifted one evaluation from 54% to 96%, which suggests the base model is useful but not meant to be the last step for finance, healthcare, or legal workflows.
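The reported numbers are internally consistent: F1 is the harmonic mean of precision and recall, so the three figures for each run can be cross-checked directly.

```python
# Cross-check: F1 is the harmonic mean of precision and recall, so the
# reported triples (precision, recall, F1) should agree with each other.

def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

# Original PII-Masking-300k run: 94.04% precision, 98.04% recall.
print(round(f1(94.04, 98.04), 2))  # 96.0, matching the reported 96% F1

# Corrected-benchmark run: 96.79% precision, 98.08% recall.
print(round(f1(96.79, 98.08), 2))  # 97.43, matching the reported F1
```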
That caveat matters. OpenAI is explicit that Privacy Filter is not an anonymization guarantee, not a compliance certification, and not a substitute for human review in high-stakes settings. But the combination of local execution, Apache 2.0 licensing, and context-aware detection makes it practical infrastructure rather than a demo. If the model holds up outside OpenAI’s own testing, it could become a default pre-processing layer for teams that want AI systems to learn from text without learning too much about the people inside it.