There is minimal downside to switching to open models

359 points - yesterday at 8:56 PM

Comments

spiralcoaster today at 5:10 PM

This can't be for real.

The title asserts there is minimal downside to switching to open models, but the article provides zero evidence that this is true, and the author hasn't even attempted it yet. The end of the article states "I’m hoping it’s going to be minimal".

I wonder if I can get a post to the front page with the title: "There are no real barriers to humans colonizing Mars next month". And at the end, "I'm hoping there are no real challenges."

hungryhobbit today at 6:49 PM

What a stupid and pointless article. It's like OP decided "I might go for a walk today" and then wasted his own time writing an essay about how he might go for a walk ... and then wasted hundreds of people's time publishing it!

coffinbirth today at 4:35 AM

> Open models are served via various means, some by the companies that released them and some by third parties like OpenRouter. Unfortunately, both of these routes are dodgier in terms of privacy and data sharing, and I would not feel the same comfort sending API calls containing client or confidential data to them.

That's why I'm using eurouter.ai with the following routing rule for all my requests:

  {
    "model": "glm-5.2",
    "models": [
      "deepseek-v4-pro",
      "deepseek-v4-flash"
    ],
    "provider": {
      "allow_fallbacks": true,
      "data_collection": "deny",
      "data_residency": "EU",
      "max_retention_days": 0,
      "eu_owned": true
    }
  }

Sure, it's quite expensive, but at least on a legal side data privacy is ensured. I trust them more than e.g. Anthropic, OpenAI or OpenRouter.
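
For illustration, here is a rough sketch of how a rule like the one above might be interpreted. The field semantics are assumptions made for this example (primary model first, then the `models` list as ordered fallbacks, and the `provider` block as a filter over provider metadata), not documented eurouter.ai behaviour. The provider metadata keys (`region`, `collects_data`, `retention_days`) and both helper functions are hypothetical.

```python
# Hypothetical sketch: field meanings below are assumptions, not a real API.
RULE = {
    "model": "glm-5.2",
    "models": ["deepseek-v4-pro", "deepseek-v4-flash"],
    "provider": {
        "allow_fallbacks": True,
        "data_collection": "deny",
        "data_residency": "EU",
        "max_retention_days": 0,
        "eu_owned": True,
    },
}


def candidate_models(rule):
    """Return the primary model followed by the fallbacks, in listed order."""
    ordered = [rule["model"]]
    for name in rule.get("models", []):
        if name not in ordered:
            ordered.append(name)
    return ordered


def provider_allowed(provider, policy):
    """Check one provider's (hypothetical) metadata against the privacy policy."""
    # Unknown data-collection status is treated as "collects data".
    if policy.get("data_collection") == "deny" and provider.get("collects_data", True):
        return False
    if "data_residency" in policy and provider.get("region") != policy["data_residency"]:
        return False
    if "max_retention_days" in policy:
        retention = provider.get("retention_days")
        # Unknown retention is treated as a violation.
        if retention is None or retention > policy["max_retention_days"]:
            return False
    if policy.get("eu_owned") and not provider.get("eu_owned", False):
        return False
    return True


# Made-up provider records, only to exercise the filter.
EU_PROVIDER = {"region": "EU", "collects_data": False, "retention_days": 0, "eu_owned": True}
US_PROVIDER = {"region": "US", "collects_data": False, "retention_days": 30, "eu_owned": False}

print(candidate_models(RULE))
print(provider_allowed(EU_PROVIDER, RULE["provider"]))
print(provider_allowed(US_PROVIDER, RULE["provider"]))
```

The filter fails closed: a provider with missing metadata is rejected rather than assumed compliant, which seems like the safer default when the point of the rule is privacy.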

Personally, I find it morally unacceptable to use U.S. AI tools, because I do not want to support them financially and thus support the crimes they are involved in[1].

[1]: https://news.ycombinator.com/item?id=48512339

julianlam today at 12:31 AM

I think it's interesting that people write off open weight models because they're "a few months behind" proprietary models.

I know LLMs move at the speed of light (especially these past few quarters), but if Opus and GPT "a few months ago" were really like open weight models, then there's really no reason to not switch, especially for those who were using these models a few months ago.

Your codebase didn't change, so use the open weight model. Don't move the goalposts.

tumdum_ today at 10:12 AM

I find the attitude shown in this post very surprising. On the one hand, the post starts with a story of adopting Linux and other FOSS. The core of FOSS is giving its users the ability to understand and modify the software they run. On the other hand, the rest of the post is about using a tool (LLM) that the author has no way to modify and no way to understand. Huge matrices of floats are at best comparable to compiled code. But the reality is even worse: it’s actually easier to decompile and understand proprietary software. Not to mention the fact that most of the time users can’t even run the “open” models, since that requires hardware most can’t afford.

How did we get from prizing software freedoms to this?

DrScientist today at 9:48 AM

What's amazing about these models is they are effectively a distillation of the internet in something that can fit onto your local machine [1] and be queried via natural language.

[1] It seems inevitable that decent local models will be possible, as the technology and the hardware are improving at a rate beyond the growth of the knowledge base to be distilled.

GL26 today at 2:15 PM

What makes an open model worse is ultimately the budget: you have access to worse data, not SOTA models, less GPU compute time, and having a good fine-tuning team is extremely expensive. Linux works because the entry barriers are purely on the software side: a lot of contributors all around the world can outclass any OS by contributing to Linux at their own scale. All you need to contribute is a computer and your brain. Open models don't have the same community push; they rely on core resources that not everyone owns, and injecting those into the model costs too much money. Unless there are public breakthroughs in the way we train large open models that make community-led models 10x better, the shift to open models will never happen on a large scale.

Aurornis today at 3:00 AM

The headline says one thing, then the article text says this:

> I’m hoping it’s going to be minimal.

I have multiple subscriptions and I pay per token to try out different LLM providers through OpenRouter. I also run open weight models locally.

I just can’t agree yet. The models from Anthropic and OpenAI really are that much better than anything else. The open weight models must be universally benchmaxxed across the board because my real world experience with them is very different than what the benchmarks imply. I get downvoted a lot for speaking about my experience because I don’t think it’s the reality that people want to hear right now, but it’s true for complex work.

I do think there are a lot of easier tasks that can be handled appropriately by the open weight models in the hands of a skilled operator. If an entire job is simple enough that you wouldn’t hesitate to hand it off to a junior with a little supervision then any model will do. However for a lot of the work I do, even Opus 4.8 on Max requires a lot of attention and extra steering and review to keep it on track. Fable did, too, though to a lesser degree. When I try to use the big open weight models (hosted, because they’re not running at reasonable speeds locally at a quantization I can tolerate) it feels like I spend more time waiting while they burn tokens for output that I probably have to reject anyway, at least for the bigger tasks. I wish they were there, but that’s not the case yet.

anuramat today at 3:07 PM

there is zero downside to not switching though: just use claude while it's good and subsidized, switch if rugpulled

whatever1 today at 5:16 AM

Claude started becoming useful for my coding purposes after it hit version 4.6. After that, sure, there were some nice-to-have additions, but I think if I had 4.6 Sonnet & Opus as open weights, I would not need anything more.

Having played a bit with Fable reinforced the above.

PeterStuer today at 6:31 AM

While I agree with some of the gist of the article, 2 remarks:

1. Unfortunately, in my tests the open models do not (yet?) rival Claude Opus, at least, for software development/engineering and adjacent tasks.

2. Enjoy it while it lasts. I'll be genuinely amazed if these open models are not declared 'illegal' under some security pretense by the end of the year. And I say 'pretense' because the primary driver will be regulatory capture and industry protectionism.

pkulak today at 2:35 AM

Sure. But OpenAI is the same price. Why would I pay $18/month for z.ai when OpenAI is $20/month?

bnj today at 3:17 AM

I’ve been wanting to get better acquainted with local inference but I don’t have the hardware, which has made me think about something I haven’t seen discussed, which is local collaboratives. The economics makes it seem like a group of people joining together to run good hardware and an open model might make sense, but I haven’t seen anything like this mentioned. Have I been missing it?

I think it would be pretty neat to launch a service helping people who wanted to participate in something like that locate one another.

reacharavindh today at 9:16 AM

It was easy to be a rebel and use Linux when it was clearly competent, but needed hacks and extra elbow grease to get it polished for use. IME, the open models are “not there yet” in terms of capability or operational needs. Sure, GLM5.2 looks competent, but I would only be able to get it to run that competently if I had a huge cluster of GPUs. If I am accessing an open model via hosted API, I might as well run a closed model via hosted API. The incentives fall apart in comparison to using Linux 15 years ago.

Don’t get me wrong. I wish I could run a local model and be happy about it. At the moment, I’m not.

mdale today at 12:52 AM

I think the frontier will command a premium for some time, just as slightly better software developers were 10x's vs. their peers because their architecture, development strategies, and code approach compounded quickly. One less error per block of work compounds quickly.

Sure, there may be some cases and reasons for local models, and the industry is so large that they will continue to make progress and gather economic value and users for specific use cases. But the frontier will command the vast majority of the economic value, unlike Linux and open source, where the open model created better economic incentives around development than the proprietary one.

radhitya today at 1:24 AM

Have you read about Opencode Go? They are a great provider for open models like GLM 5.2, Deepseek v4 Pro, and Kimi 2.7 Code. You should give them a shot :-)

_pdp_ today at 9:14 AM

There are downsides depending on how good your harness is. Switching the model is easy enough. Ensuring that the harness continues working the way it did is a completely different thing. This is not just about the prompts but also the general behaviour around the model and its infrastructure.

So while it is not complicated and certainly something that can be solved, it is not plug and play.

That being said, we switched to open weight models earlier this month and the results have been more than positive so far. The cost savings are also hard to dismiss.

c-b today at 1:55 PM

What's confusing to me is that there is no discussion of any downside actually experienced; it's all just theoretical.

arttaboi today at 6:31 AM

I guess this will happen soon. There are two catalysts needed for this to happen:

1. Evals that can quickly tell you how much downside there is to switching

2. Something like OpenRouter that can help you run those evals quickly

Now #2 is starting to become popular, and I think we'll soon see more people adopting a model-agnostic approach. Of course, there will still be high-intelligence use cases where nothing comes close to Claude or GPT.
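
A minimal sketch of what such an eval could look like, assuming each model is wrapped in a plain Python callable that takes a prompt and returns text. The two "models" below are stand-in functions, not real API clients, and the cases and checkers are made up for illustration. The idea is only to show how "how much downside" can be reduced to a difference in pass rates on your own tasks.

```python
from typing import Callable, Dict, List, Tuple

# Each case pairs a prompt with a checker that accepts or rejects an answer.
EvalCase = Tuple[str, Callable[[str], bool]]
Model = Callable[[str], str]


def pass_rate(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of cases whose checker accepts the model's answer."""
    if not cases:
        return 0.0
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def compare(models: Dict[str, Model], cases: List[EvalCase]) -> Dict[str, float]:
    """Run every model over the same cases and return a pass rate per model."""
    return {name: pass_rate(fn, cases) for name, fn in models.items()}


# Stand-in models: replace these with real client calls in practice.
def incumbent(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "Paris"


def candidate(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "I am not sure"


CASES: List[EvalCase] = [
    ("What is 2 + 2?", lambda answer: answer.strip() == "4"),
    ("What is the capital of France?", lambda answer: "Paris" in answer),
]

scores = compare({"incumbent": incumbent, "candidate": candidate}, CASES)
downside = scores["incumbent"] - scores["candidate"]
print(scores, downside)
```

Because the models are just callables, the same cases can be pointed at any provider, which is what makes a model-agnostic setup cheap to re-evaluate.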

linzhangrun today at 1:56 AM

Open source models are still not good enough for now, but with the current speed of one new SOTA every two months, by this time next year we will definitely have cheap open source models at least as good as Fable :)

ZeroGravitas today at 8:57 AM

It seems the best self-hosted models and the worst models served by big providers have some considerable overlap in quality.

Whatever reason people have to run those (cheaper? backwards compatibility once you get something running) surely applies to the open models too, maybe even more so.

myzek today at 7:00 AM

Any tips on which model to use and how to use it? I have 64 GB of RAM and 16 GB of VRAM (I know it's not a lot, it's a gaming GPU) and I'm trying to find a good model to use, but it's a bit of a struggle.

petesergeant today at 10:13 AM

Headline: "There is minimal downside"

Article: "I’m hoping it’s going to be minimal"

peter_retief today at 5:50 AM

What open models are "recommended"?

I like the Linux analogy, I struggled with Linux way back.

Animats today at 6:56 AM

OK, now what? Someone offers open models as a service? That's basically a time-sharing computing business - people at terminals sharing remote computing resources. If you buy your own H100 it will be idle while you're typing or reading or thinking. So sharing makes sense.

But it doesn't have to be an "AI company". It's just a compute service. The companies that offer web hosting could get into this.

PcChip today at 12:23 AM

Is it just me or is half the article missing?

I enjoyed the first part though

DANmode today at 12:09 AM

But, what model are you using?

and what hardware are you using?

epolanski today at 5:15 PM

I unsubscribed from Anthropic and our (EU-based) team is moving to an "ai-server" running opencode + GLM 5.2 and DS4.

There are several benefits:

- we cut AI spending by thousands

- there is one AI server that starts a separate session for each user, with one shared memory/skills/etc., and everybody is involved in reviewing what went wrong and why. The harness finally makes sense and pays off more.

- we can trust that the models are those that we run and not black boxes

- no more money flowing to US narcissistic entrepreneurs and no more business being tied to US legislation

Not gonna lie, GPT 5.5 Pro and Fable 5 were a tiny bit ahead, especially on longer vibecode-style tasks, but it's just not worth it.

OtomotO today at 5:39 AM

I am absolutely pro local and true open source models.

Personally I haven't seen any productivity gain since Opus 4.5 times.

But: I can't fully get behind the opinion that (so called) "open source models" are simply superior and will be in the future, because when I asked some models who they are, they answered with "I am Claude from Anthropic", which could mean they have been trained by exfiltrating Claude.

I have NO moral objection to this, as Anthropic and "Open""AI" also trained their models on anything they could get their hands on.

It's more about the question: can and will these models be updated, even if Anthropic et al fail. Who's gonna pay for training then? What's their incentive? Have we reached a plateau?

cpill today at 2:48 AM

I think once hardware prices come down and these mini DGXs become cheaper, and open models are by then smaller and better still, there is going to be less and less reason to use the providers. CEOs are already complaining that they are costing too much. There are also large organisations like banks which can't use external services and are already looking at in-house hosting. It's a good thing the big AI companies just went IPO, as once the self-hosting trend kicks in they are going bust.

aussieguy1234 today at 1:34 AM

>There was a time not too long ago when using Linux entailed some professional risk. First there was compatibility: you may not have been able to render a Word document or PowerPoint correctly, and you might have had to trust Open Office’s export capability to render docs the way you wanted

For a while during this era, I used to port my laptop's Windows installation into a virtual machine that could run on Linux. It took a bit of hacking away, but I could usually do it in a day or two. Then it's all Linux, with the Windows VM being used for the Microsoft stuff.

blindriver today at 3:29 AM

As someone who has a pretty powerful desktop that I've been using with local open weight models, I think people are greatly exaggerating their quality. Some of them are now useful. They don't compare yet to the online models of ChatGPT, Claude, Gemini, etc. They are still about 18 months behind. I have accomplished useful work with them, like image classification on Gemma4, but they are much, much slower, much, much more expensive, and they don't scale at all.

The $10,000 that an RTX 6000 Blackwell card costs would pay for 500 months of Claude or Codex, which is about 40 years' worth of compute. Obviously they are going to raise their prices, my prediction being to $200-500/month, but that still buys 20 to 50 months of compute, and the hosted services scale very well with more traffic. Single GPUs do not: they are pegged at 100%, and good luck getting one to answer multiple queries at the same time.
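
The arithmetic behind that comparison can be checked with a few lines. This is only a sketch: the $20/month figure is inferred from "$10,000 pays for 500 months", the $200 and $500 figures are the predicted prices from the comment, and electricity, resale value, and the rest of the machine are ignored.

```python
def months_covered(hardware_cost: float, monthly_price: float) -> float:
    """How many months of a subscription the same money would buy."""
    return hardware_cost / monthly_price


CARD_COST = 10_000  # quoted price of the card, in dollars

# 20 is inferred from the comment's numbers; 200 and 500 are its predicted range.
for price in (20, 200, 500):
    months = months_covered(CARD_COST, price)
    print(f"${price}/month: {months:.0f} months (about {months / 12:.1f} years)")
```

At the predicted higher prices the card's cost corresponds to roughly two to four years of subscription rather than forty, so the conclusion depends heavily on where subscription pricing ends up.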

causality0 today at 2:24 AM

I know open models have gotten quite good in many tasks such as coding or composition, but are there any that can access the internet and retrieve data like ChatGPT, Claude, etc can?

I do have to admit I have recently begun wishing I could pay five dollars a month for a "just answer the fucking question" plan that would give me results without the guardrails and without the constant simpering and ego-stroking. I keep finding myself doing a quick evaluation of "is it faster for me to skim search results myself or to construct an elaborate narrative to make an AI give me a real answer".

root_axis today at 4:00 AM

Imagine taking 6 months longer to release your cookie cutter CRUD app.