Qwen3.6-35B-A3B: Agentic coding power, now open to all

744 points - today at 1:36 PM

Source

Comments

simonw today at 5:38 PM
I've been running this on my laptop with the Unsloth 20.9GB GGUF in LM Studio: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/blob/mai...

It drew a better pelican riding a bicycle than Opus 4.7 did! https://simonwillison.net/2026/Apr/16/qwen-beats-opus/

bertili today at 1:59 PM
A relief to see the Qwen team still publishing open weights, after the kneecapping [1] and departures of Junyang Lin and others [2]!

[1] https://news.ycombinator.com/item?id=47246746 [2] https://news.ycombinator.com/item?id=47249343

homebrewer today at 2:12 PM
Already quantized/converted into a sane format by Unsloth:

https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF

kanemcgrath today at 8:07 PM
I have been using Qwen3.5-35B-A3B a lot in local testing, and it is by far the most capable model that can fit on my machine. I think quantization technology has really upped its game around these models, and two quants blew me away:

Mudler APEX-I-Quality, and later Byteshape Q3_K_S-3.40bpw.

Both made claims that seemed too good to be true, but I couldn't find any traces of lobotomization during long agentic coding loops. With the Byteshape quant I am up to 40+ t/s, a speed that makes agents much more pleasant. On an RTX 3060 12GB with 32GB of system RAM, I went from slamming all my available memory to having about 14GB to spare.

mtct88 today at 2:07 PM
Nice release from the Qwen team.

Small open-weight coding models are, imho, the way to go for custom agents tailored to the specific needs of dev shops that are restricted from accessing public models.

I'm thinking about banking and healthcare sector development agencies, for example.

It's a shame this remains a market largely overlooked by Western players, Mistral being the only one moving in that direction.

alecco today at 3:39 PM
Related interesting find on Qwen.

"Qwen's base models live in a very exam-heavy basin - distinct from other base models like llama/gemma. Shown below are the embeddings from randomly sampled rollouts from ambiguous initial words like "The" and "A":"

https://xcancel.com/N8Programs/status/2044408755790508113

armanj today at 2:12 PM
I recall a Qwen exec posted a public poll on Twitter asking which Qwen3.6 model people wanted to see open-sourced, and the 27B variant was by far the most popular choice. Not sure why they ignored it lol.
cpburns2009 today at 7:04 PM
Anyone else getting gibberish when running unsloth/Qwen3.6-35B-A3B-GGUF:UD-IQ4_XS on CUDA (llama.cpp b8815)? UD-Q4_K_XL is fine, as is Vulkan in general.
seemaze today at 3:00 PM
Fingers crossed for mid and larger models as well. I'd personally love to see Qwen3.6-122B-A10B.
rvnx today at 2:30 PM
China won again in terms of openness
fooblaster today at 2:01 PM
Honestly, this is the AI software I actually look forward to seeing. No hype about it being too dangerous to release. No IPO pumping hype. No subscription fees. I am so pumped to try this!
abhikul0 today at 2:06 PM
I hope the other sizes are coming too (9B for me). Can't fit much context with this on a 36GB Mac.
KronisLV today at 5:29 PM
I wonder how this one compares to Qwen3 Coder Next (the 80B A3B model), since you'd think that even though it's older, it having more parameters would make it more useful for agentic and development use cases: https://huggingface.co/collections/Qwen/qwen3-coder-next
jake-coworker today at 2:30 PM
This is surprisingly close to Haiku quality, but open - and Haiku is quite a capable model (many of the Claude Code subagents use it).
giantg2 today at 6:38 PM
I can't wait to see some smaller sizes. I would love to run some sort of coding-centric agent on a local TPU or GPU instead of having to pay, even if it's slower.
cyrialize today at 5:54 PM
My last laptop was a used 2012 T530.

My current is a used M1 MacBook Pro with 16GB of RAM.

I thought this was all I was ever going to need, but wanting to run really nice models locally has me thinking about upgrading.

Although, part of me wants to see how far I could get with my trusty laptop.

amelius today at 4:57 PM
Looks like they compare only to open models, unfortunately.

As I am using mostly the non-open models, I have no idea what these numbers mean.

the__alchemist today at 7:23 PM
Is this the hybrid variant of Gwent and Quen? I hope this is in The Witcher IV!
codeugo today at 7:13 PM
Are we going to get to the point where a local model can do almost what sonnet 4.6 can do?
Glemllksdf today at 3:41 PM
I tried Gemma 4 A4B and was surprised how hard it is to use it for agentic stuff on an RTX 4090 with 24GB of VRAM.

Balancing the KV cache and context eats VRAM super fast.

aliljet today at 2:41 PM
I'm broadly curious how people are using these local models. Literally, how are they attaching harnesses to this and finding more value than just renting tokens from Anthropic or OpenAI?
andy_ppp today at 5:19 PM
Do we know if other model providers have started detecting and poisoning the training/fine-tuning data that these Chinese models seem to use for alignment? I’d certainly be doing some naughty stuff to keep my moat if I were Anthropic or OpenAI…
adrian_b today at 2:05 PM
zengid today at 7:45 PM
any tips for running it locally within an agent harness? maybe using pi or opencode?
dataflow today at 2:33 PM
I'm a newbie here and lost as to how I'm supposed to use these models for coding. When I use them with Continue in VS Code and start typing basic C:

  #include <stdio.h>
  int m
I get nonsensical autocompletions like:

  #include <stdio.h>
  int m</fim_prefix>
What is going on?
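The stray `</fim_prefix>` suggests a fill-in-the-middle (FIM) template mismatch: autocomplete plugins wrap the text around the cursor in FIM sentinel tokens, and a model that wasn't trained on that exact template echoes the sentinels back as plain text. A minimal sketch of how such a prompt is built, assuming StarCoder-style token names (Qwen-family models use different sentinels, e.g. `<|fim_prefix|>`):

```python
# Sketch of how an editor autocomplete harness builds a FIM prompt.
# Token names here follow the StarCoder convention and are illustrative;
# a model trained on different sentinels (or none) treats these tags as
# ordinary text, which is why they can leak verbatim into completions.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the text before/after the cursor in FIM sentinel tokens."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

before_cursor = "#include <stdio.h>\nint m"
after_cursor = ""  # nothing after the cursor yet
prompt = build_fim_prompt(before_cursor, after_cursor)
# The model is expected to generate only the "middle" after this marker:
assert prompt.endswith("<fim_middle>")
```

If completions contain literal FIM tags, the usual fix is to point the plugin at the correct template for that model family, or to use a base model that was actually trained for FIM completion.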
syntaxing today at 5:07 PM
Is it worth running speculative decoding on small active models like this? Or does MTP make speculative decoding unnecessary?
kombine today at 2:32 PM
What kind of hardware (preferably non-Apple) can run this model? What about 122B?
solomatov today at 5:34 PM
Did anyone try it and Gemma 4? Does it feel that it's better than Gemma 4?
999900000999 today at 4:21 PM
Looking to move off Ollama on openSUSE Tumbleweed.

Should I use brew to install llama.cpp, or zypper to install the Tumbleweed package?

ghc today at 2:08 PM
How does this compare to gpt-oss-120b? It seems weird to leave it out.
psim1 today at 5:07 PM
(Please don't downvote - serious question) Are Chinese models generally accepted for use within US companies? The company I work for won't allow Qwen.
tmaly today at 5:46 PM
What is the minimum VRAM this can run on, given that it's MoE?
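A rough rule of thumb, since MoE routing doesn't reduce resident memory: every expert has to stay loaded, so the weight footprint scales with total parameters (35B), while the A3B active count only lowers per-token compute. A back-of-envelope sketch (bits-per-weight figures are approximations, and KV cache and runtime overhead are excluded):

```python
# Approximate weight footprint of a quantized MoE model.
# All experts must be resident, so memory scales with TOTAL params;
# the "A3B" active count affects speed, not the weight footprint.

def weight_gb(total_params_b: float, bits_per_weight: float) -> float:
    """GB for the weights alone, excluding KV cache and overhead."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9

for bpw in (4.5, 8.0):  # roughly Q4_K_M-class vs Q8_0-class quants
    print(f"{bpw:>4} bpw: ~{weight_gb(35, bpw):.1f} GB + KV cache")
```

At ~4.5 bpw that lands near the ~20GB GGUFs mentioned elsewhere in the thread; with experts offloaded to system RAM the hard VRAM floor can be much lower, at the cost of speed.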
incomingpain today at 1:57 PM
Wowzers, we were worried Qwen was going to suffer having lost several high-profile people on the team, but that's a huge drop.

It's better than 27b?

deleted today at 2:45 PM
zoobab today at 2:19 PM
"open source"

give me the training data?

deleted today at 2:38 PM
lopsotronic today at 3:24 PM
Dangit, I'll need to give this a run on my personal machine. This looks impressive.

At the time of writing, all deepseek or qwen models are de facto prohibited in govcon, including local machine deployments via Ollama or similar. Although no legislative or executive mandate yet exists [1], it's perceived as a gap [2], and contracts are already including language for prohibition not just in the product but any part of the software environment.

The attack surface for a (non-agentic) model running in local ollama is basically non-existent . . but, eh . . I do get it, at some level. While they're not l33t haXX0ring your base, the models are still largely black boxes, can move your attention away from things, or towards things, with no one being the wiser. "Landing Craft? I see no landing craft". This would boil out in test, ideally, but hey, now you know how much time your typical defense subcon spends in meaningful software testing[3].

[1] See also OMB Memorandum M-25-22 (preference for AI developed and produced in the United States), NIST CAISI assessment of PRC-origin AI models as "adversary AI" (September 2025), and House Select Committee on the CCP Report (April 16, 2025), "DeepSeek Unmasked".

[2] Overall, rather than blacklist, I'd recommend a "whitelist" of permitted models, maintained dynamically. This would operate the same way you would manage libraries via SSCG/SSCM (software supply chain governance/management) . . but few if any defense subcons have enough onboard savvy to manage SSCG let alone spooling a parallel construct for models :(. Soooo . . ollama regex scrubbing it is.

[3] i.e., none at all; we barely have the ability to MAKE anything like software, given the combination of underwhelming pay scales and the fact that defense companies always seem to have a requirement for 100% on-site in some random crappy town in the middle of BFE. If it weren't for the downturn in tech we wouldn't have anyone useful at all, but we snagged some silicon refugees.
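The "whitelist of permitted models" idea in footnote [2] could be enforced mechanically, as a gate on model pulls rather than regex scrubbing. A minimal sketch; the allowlist contents and the "org/name" identifier scheme are hypothetical:

```python
# Minimal allowlist gate for model pulls, per the whitelist idea above.
# The entries and the "org/name" naming scheme are hypothetical; a real
# deployment would maintain this list dynamically under SSCG/SSCM.

ALLOWED_MODELS = {
    "mistralai/Mistral-Small",
    "openai/gpt-oss-120b",
}

def is_permitted(model_id: str) -> bool:
    """True only if the exact model identifier is on the allowlist."""
    return model_id in ALLOWED_MODELS

# Default-deny: anything not explicitly listed is blocked.
assert not is_permitted("Qwen/Qwen3.6-35B-A3B")
```

A default-deny allowlist fails closed on new or renamed models, which is the property a blacklist can't give you.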

ActorNightly today at 6:59 PM
Can anyone confirm this fits on a 3090? The size is exactly 24GB.
btbr403 today at 2:34 PM
Planning to deploy Qwen3.6-35B-A3B on NVIDIA Spark DGX for multi-agent coding workflows. The 3B active params should help with concurrent agent density.
deleted today at 3:39 PM
zshn25 today at 2:37 PM
What do all the numbers 6-35B-A3B mean?
nurettin today at 3:18 PM
I tried the car wash puzzle:

You want to wash your car. Car wash is 50m away. Should you walk or go by car?

> Walk. At 50 meters, the round trip is roughly 100 meters, taking about two minutes on foot. Driving would require starting the engine, navigating, parking, and dealing with unnecessary wear for a negligible distance. Walk to the car wash, and if the bay requires the vehicle inside, have it moved there or return on foot. Walking is faster and more efficient.

Classic response. It was really hard to one shot this with Qwen3.5 Q4_K_M.

Qwen3.6 UD-IQ4_XS also failed the first time, then I added this to the system prompt:

> Double check your logic for errors

Then I created a new dialog and asked the puzzle and it responded:

> Drive it. The car needs to be present to be washed. 50 meters is roughly a 1-minute walk or a 10-second drive. Walking leaves the car behind, making the wash impossible. Driving it the short distance is the only option that achieves the goal.

Now 3.6 gets it right every time. So not as great as a super model, but definitely an improvement.

fred_is_fred today at 2:00 PM
How does this compare to the commercial models like Sonnet 4.5 or GPT? Close enough that the price is right (free)?
yieldcrv today at 3:41 PM
Anybody use these instead of codex or claude code? Thoughts in comparison?

benchmarks don't really help me much

tristor today at 2:44 PM
I'm disappointed they didn't release a 27B dense model. I've been working with Qwen3.5-27B and Qwen3.5-35B-A3B locally, both in their native weights and the versions the community distilled from Opus 4.6 (Qwopus), and I have found I generally get higher-quality outputs from the 27B dense model than from the 35B-A3B MoE model.

My basic conclusion is that the MoE approach may be more memory efficient, but it requires a fairly large set of active parameters to match similarly sized dense models: I saw better or comparable results from Qwen3.5-122B-A10B than from Qwen3.5-27B, though at a slower generation speed. I am certain that for frontier providers with massive compute, MoE represents a meaningful efficiency gain at similar quality, but for running models locally I still prefer medium-sized dense models.

I'll give this a try, but I would be surprised if it outperforms Qwen3.5-27B.

bossyTeacher today at 2:10 PM
Does anyone have any experience with Qwen or any non-Western LLMs? It's hard to get a feel out there with all the doomerists and grifters shouting. Only thing I need is reasonable promise that my data won't be used for training or at least some of it won't. Being able to export conversations in bulk would be helpful.
maxothex today at 4:00 PM
[dead]
LouisvilleGeek today at 5:14 PM
[dead]
ninjahawk1 today at 5:26 PM
[dead]
reynaventures today at 2:47 PM
[dead]
reynaventures today at 2:47 PM
[dead]
typia today at 3:33 PM
[dead]
shevy-java today at 2:08 PM
I don't want "Agentic Power".

I want to reduce AI to zero. Granted, this is an impossible fight to win, but I feel like Don Quixote here. Rather than windmill-dragons, it is some Skynet 6.0 blob.

amazingamazing today at 2:06 PM
More benchmaxxing, I see. Too bad there’s no rig with 256GB of unified RAM for under $1000.