Training our own AI models

132 points - today at 4:08 PM

Comments

JimDabell today at 4:36 PM
“Opt-in by default” is an oxymoron. If it’s default then I haven’t opted into anything. It’s been enabled by default.
Waterluvian today at 5:03 PM
PostHog was a system we set up once, generally don't think about, and review from time to time, providing some occasional value. It was mostly harmless to leave around.

But it's apparently yet one more thing we have to be actively suspicious of as it defaults towards an intolerable state. So it's easier to just rip it out of the system and move on.

sixtyj today at 4:15 PM
Most companies would bury this change in a deceptively boring T&Cs update, but we value transparency, so here's what you need to know in an internet-friendly numbered list:

1. Users on our EU cloud instance are opted out by default

2. So too users with agreements that prevent training (e.g. BAA, MSA, or similar)

3. All other users on our US cloud instance are opted in by default

4. We will anonymize all data before it's used for training

5. We will only use data that already exists in your PostHog instance

6. We will do all the model training ourselves, which means...

7. We won't sell or send your data to third-party model providers

8. You can opt out at any time via your org settings in PostHog (admin access required)

9. Training won't start until June 29, so there's plenty of time to decide

Dave_Rosenthal today at 5:44 PM
They say, "our goal here is to improve PostHog as a product for our customers, not to expose or sell models trained on your data" but then don't actually list that as a limitation in the bulleted points.

AFAICT this now gives them default permission to train an LLM on your code (as PostHog telemetry data is inextricably tied to your code), use it, and even sell it if they wanted to (as it's not your data anymore, it's their model). Yikes.

thecatapps today at 5:33 PM
It's probably very obvious by now, but there's something to be said about companies with the "SF Quirky" vibes:

- The OS Redesign

- "Sexy Legal Documents"

- Emails with "<relevant hedgehog meme goes here>" as the subject line

- Having a merch shop with action figures of your CEO

It works both ways. When you're looking for adoption and making very pro-user moves, I guess it can be a benefit. However, when you're now looking to grow revenue and making very anti-user moves, it's insult to injury.

I'm the last person to say that tech "shouldn't be fun" or something overly-broad like that, but if your messaging doesn't match the decisions of leadership, you're gonna have a bad time.

frankest today at 4:45 PM
What a great reminder to build my own analytics and self-host. PostHog just lost a customer. They could easily send an email to each customer asking if we want this. The assumption means they have no product intuition about their own customers, let alone the customers of their customers. Bye.
infecto today at 4:51 PM
Thanks for posting. I had been on the fence for the past few months about switching. The new AI products combined with the weird UIs had been irking me for a while. This is the final nail in the coffin. Opt-in is a terrible business model imo.
tines today at 4:33 PM
“Opt-in by default” = opt-out?
brauhaus today at 4:45 PM
Every day I'm more glad about EU legislation, that's all I have to say for now
abustamam today at 5:25 PM
> Why this is opt out, not opt in

> Put simply, because otherwise we will not have enough data to train a model that's actually useful.

AKA we won't be able to make as much money if we required you to give us permission to use your data.

rad_val today at 5:48 PM
All of them do if you don't do something about it (e.g. migrate to self-hosted solutions). Trusting a ToS in 2026 is as naive as it gets.
freshnode today at 5:03 PM
Why won't companies explain what anonymisation means for them?

PostHog has unfettered logged-in access to some sensitive stuff. What steps are they actually taking to scrub sensitive data from my replay before it is used to train a model?

the__alchemist today at 5:19 PM
How much are they paying the users?
ASinclair today at 5:14 PM
Mostly unrelated but the name of this company makes me think it's a Dick-Pics-as-a-Service provider.
stevoski today at 5:54 PM
I’ve been evaluating PostHog for our company.

I’ve now made our decision. We won’t be using them.

If they are going to position themselves as the non-slimy no-BS guys, they can’t pull this nonsense.

mrcwinn today at 5:14 PM
Gross.

They’ll use your product and your data to later sell a product back to you.

gyoridavid today at 5:27 PM
I feel that the US should step up their legislation game and make sure these companies can't retroactively make rules to steal their users' data. I know it's trendy to hate the EU but their legislation actually protects the users, and not the companies' interests.
jen20 today at 4:48 PM
Perhaps if they hopped on a quick call for five minutes with some customers, they'd realize quite how little appetite there is for putting up with being opted into things automatically in the US but not in the EU.

As an aside, this also means the EU rules are working.

bigstrat2003 today at 4:40 PM
This is the fastest way possible to ensure I will never do business with you, or stop doing business with you if I already am.
tartieret today at 4:08 PM
I initially used PostHog as an alternative to Google Analytics with more privacy. Now they want to use the data for a business purpose. Working hard towards enshittification?
calmbonsai today at 5:14 PM
LOL. You stay classy PostHog.
Henchman21 today at 4:40 PM
You can’t “opt-in” to something that is the default. The choice is made for you, and when the choice is made for you? You haven’t opted in or out?
dzonga today at 5:25 PM
Another would-be excellent product company destroyed, or being slowly destroyed, by VCs and the endless chase for 'growth'.
mikkelam today at 5:25 PM
The enshittification has begun. Time to move on!
TZubiri today at 4:45 PM
Today I was thinking, if I start a company in the LLM tooling space, I would put in the company mission in the incorporation documents that client data will not be used to train.

The temptation and the value is too great, and the opt-in opt-out consent thing ends up being a fuckery where the company tries to trick the user into allowing them to take a look into the data, presumably because they are selling the product at a loss and need an alternative revenue model.

Just make it impossible from the get-go. The fine print would be that the data can be shared out-of-band explicitly, in an email or copy-pasted into a support chatbox, but there would be no mechanism for us to read the data from the databases, much less from the client.

I don't mean it would be an air-tight mechanism like Signal or ProtonMail, if a court order would ask us to produce client info, we would still reserve the right to produce the data, but exceptionally, and definitely not for training models.

slopinthebag today at 4:41 PM
PostHog better transition to an AI company soon because they are one of the SaaS companies that are absolutely cooked by vibe coding. What it does is extremely amenable to LLMs and it's also non-critical for a business, making it an excellent candidate for replacement by in-house solutions. And if it means never having to use their website again that's even better.

I wonder if they regret opensource, considering people will be using LLMs to replace them which have surely trained off of their code.
