OpenAI Privacy Filter

229 points - last Thursday at 12:14 AM

fzxu22 today at 6:15 AM
Working on this: https://github.com/KevinXuxuxu/anon_proxy, a sort of anonymization proxy to use with LLM providers. It does model-based (OpenAI Privacy Filter) + regex PII detection, and swaps the detected entities back and forth across API requests and responses. With a locally hosted detection model, no PII leaves your local environment. I find it very useful, especially when working on sensitive documents (legal, tax, immigration, etc.); hope you find it helpful as well :)
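A minimal sketch of the placeholder round-trip such a proxy performs (hypothetical and regex-only, handling just emails; the actual project also uses a detection model):

```python
import re

# Hypothetical sketch: detect PII locally, swap in placeholders before the
# API call, and swap the originals back into the provider's response.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[A-Za-z]{2,}\b")

def anonymize(text):
    mapping = {}
    def repl(match):
        placeholder = f"[EMAIL_{len(mapping) + 1}]"
        mapping[placeholder] = match.group(0)  # remember for the reverse pass
        return placeholder
    return EMAIL_RE.sub(repl, text), mapping

def deanonymize(text, mapping):
    # restore the original entities in the response
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

anon, mapping = anonymize("Contact alice@example.com about the filing.")
# anon == "Contact [EMAIL_1] about the filing."
restored = deanonymize(anon, mapping)
```

Since the mapping never leaves the local process, only the placeholder text reaches the provider.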
stratos123 last Thursday at 9:30 AM
There's some interesting technical details in this release:

> Privacy Filter is a bidirectional token-classification model with span decoding. It begins from an autoregressive pretrained checkpoint and is then adapted into a token classifier over a fixed taxonomy of privacy labels. Instead of generating text token by token, it labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.

> The released model has 1.5B total parameters with 50M active parameters.

> [To build it] we converted a pretrained language model into a bidirectional token classifier by replacing the language modeling head with a token-classification head and post-training it with a supervised classification objective.
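The constrained Viterbi step the quote mentions can be sketched roughly like this (a toy BIO decoder, not OpenAI's actual implementation; the label set and scores are made up):

```python
import math

# Toy sketch of constrained Viterbi decoding over BIO tags: per-token label
# scores are combined with hard transition constraints so decoded spans stay
# coherent, i.e. an I-X tag must follow B-X or I-X.
LABELS = ["O", "B-NAME", "I-NAME"]

def allowed(prev, cur):
    if cur.startswith("I-"):
        entity = cur[2:]
        return prev in (f"B-{entity}", f"I-{entity}")
    return True

def viterbi(scores):
    # scores: one {label: log-score} dict per token; I- tags can't start a path
    paths = {
        lab: ([lab], -math.inf if lab.startswith("I-") else scores[0].get(lab, -math.inf))
        for lab in LABELS
    }
    for token_scores in scores[1:]:
        new_paths = {}
        for cur in LABELS:
            score, seq = max(
                (p + token_scores.get(cur, -math.inf), seq)
                for prev, (seq, p) in paths.items()
                if allowed(prev, cur)
            )
            new_paths[cur] = (seq + [cur], score)
        paths = new_paths
    return max(paths.values(), key=lambda v: v[1])[0]

tags = viterbi([
    {"B-NAME": -0.1, "O": -2.0, "I-NAME": -0.05},
    {"I-NAME": -0.1, "O": -1.0, "B-NAME": -2.0},
    {"O": -0.1, "I-NAME": -0.5, "B-NAME": -2.0},
])
# note: the raw argmax for token 0 would be I-NAME, which the constraint forbids
```

This is what makes the spans "coherent": a per-token argmax could emit an I- tag with no opening B- tag, while the constrained decode cannot.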

nl today at 3:51 AM
I'm nowhere near as smart as OpenAI of course, but I did build https://tools.nicklothian.com/webner/index.html, which uses a BERT-based named-entity-recognition model running in your browser to do a subset of PII redaction.

It works pretty well for the use cases I was playing with.

The OpenAI model is small enough that I might enhance my tool to use it.

mplanchard last Thursday at 10:50 AM
It would be nice if their examples weren’t mostly things that are easy to catch with regex, but it’s cool to see it released as an open, local model.
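For reference, the "easy to catch with regex" category looks something like this (illustrative, US-centric patterns, far from exhaustive):

```python
import re

# A few illustrative patterns for the structured-PII cases that don't need
# a model at all (US-centric, deliberately simplified).
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text):
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

redact("Call 555-867-5309 or email bob@example.com, SSN 123-45-6789.")
# -> "Call [PHONE] or email [EMAIL], SSN [SSN]."
```

The harder cases a model earns its keep on are unstructured ones: names, addresses, and free-text identifiers that have no fixed shape.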
maciejzj today at 7:31 AM
On a side note, when I click the link it redirects me to a machine-translated version of the OpenAI website with completely botched meaning: the word “redacted” is translated to the false friend “redagować”, which means to edit/refine text, not to anonymize.
mayneack today at 3:13 AM
Curious how this compares to presidio which mixes regex with a model: https://microsoft.github.io/presidio/
mentalgear last Friday at 7:19 AM
SuperagentLM made on-edge PII redaction models available a few years ago, in sizes 20B, 3B, and 200M. They still seem to be available via their legacy API; well worth checking out as a comparison against this one. https://docs.superagent.sh/legacy/llms/superagent-lm-redact-...
hiAndrewQuinn last Thursday at 2:39 AM
I'm surprised nobody else has commented on this. This is a very straightforward and useful thing for a small locally runnable model to do.
usdogu today at 8:59 AM
Someone has created the reverse of it: https://github.com/chiefautism/privacy-parser
7777777phil last Thursday at 6:14 AM
> The model is available today under the Apache 2.0 license on Hugging Face (opens in a new window) and Github (opens in a new window).

Bringing the Open back to OpenAI...

Havoc last Thursday at 9:38 AM
50M active parameters is impressively light. Is there a similarly light model on the prompt injection side? Most of the mainstream ones seem heavier.
freakynit today at 2:38 AM
Can someone explain how I can reconstruct the original entities if there are, for example, multiple person names?
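One common answer (a hypothetical sketch, not the filter's own mechanism: the redaction model only labels spans, so the round-trip is the caller's job): give each distinct entity an indexed placeholder and keep the mapping locally for the reverse pass.

```python
# Hypothetical sketch: each distinct entity gets an indexed placeholder,
# and the mapping stays local so the response can be rewritten back.
def anonymize_names(text, names):
    forward = {}
    for name in names:
        forward.setdefault(name, f"[PERSON_{len(forward) + 1}]")
        text = text.replace(name, forward[name])
    return text, {ph: name for name, ph in forward.items()}

def restore(text, reverse_map):
    # replace placeholders in the model's response with the originals
    for placeholder, name in reverse_map.items():
        text = text.replace(placeholder, name)
    return text

anon, rmap = anonymize_names("Alice emailed Bob. Bob replied to Alice.", ["Alice", "Bob"])
# anon == "[PERSON_1] emailed [PERSON_2]. [PERSON_2] replied to [PERSON_1]."
```

Because the indices are stable, two different people never collapse into the same placeholder, so the reverse mapping is unambiguous.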
I_am_tiberius today at 7:41 AM
I assume they use this model to be able to train new models with user data.
flashdesk today at 6:12 AM
This is where stochastic approaches start to feel a bit uncomfortable.

Even small mistakes can make something dealing with sensitive data hard to trust. It seems useful as a first pass, but I’d probably still want some deterministic checks or a human in the loop to feel confident using it.

ares623 today at 8:18 AM
This looks actually useful. But can someone help me understand how you address the non-perfect scores: "Privacy Filter achieves an F1 score of 96% (94.04% precision and 98.04% recall)."

How would you actually use this if it can fail to redact 4% of the data? How do you reliably know which 4% failed?
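For what it's worth, the quoted numbers are internally consistent (F1 is the harmonic mean of precision and recall), and the figure for missed redactions is really recall, not F1: 98.04% recall means roughly 2% of true PII spans go uncaught, while precision errors are over-redactions rather than leaks.

```python
# F1 is the harmonic mean of precision and recall; checking the quoted figures.
precision, recall = 0.9404, 0.9804
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # ~0.96, matching the quoted F1 of 96%

missed_span_rate = 1 - recall  # fraction of true PII spans the model misses
```

Knowing *which* spans were missed is indeed the hard part; that's the argument for layering deterministic checks (regexes, allowlists) or review on top.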

ndom91 last Thursday at 10:34 AM
Where's the gguf from Unsloth and co?