DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost

230 points - today at 1:02 PM

Source

Comments

embedding-shape today at 2:24 PM
I'm not sure you need a "DeepSeek native coding agent" to take advantage of DeepSeeks cache, yesterday as the Codex quota usage issue still wasn't solved for me, I wrote a tiny little bridge so I could use DeepSeek V4 Pro via Codex, and seems most of everything I did was basically cached as far as I can tell: https://i.imgur.com/7eKn6wN.png (2026-05-23 Input (Cache hit): 39,123,200 tokens, Input (Cache miss) 1,692,286), and the bridge is doing not special, just massage the DeepSeek API shape into what Codex expects, nothing particular about caching at all.

Besides being even better at the caching, I'm not sure what benefits you'd get compared to just firing up OpenCode with the DeepSeek API yourself, it'll similarly do caching for sure and also "talks directly to api.deepseek.com" if that matters, and you'll get a much more mature harness.

agrippanux today at 6:23 PM
This website seems to have been generated by Codex - I asked Codex to create an HTML overview of a feature for my team and it made an overly produced monstrosity - complete with the same large stat boxes that were for the most part devoid of meaningful information - using the same font, colors, layout, hero section, etc. It was also terrible on mobile just like this is.

In the end I had Claude produce a one-page html file that was 95% of the way there and it took minor editing to clearly explain the intent of the feature.

skeledrew today at 2:30 PM
Not a fan of that page. The animated typing and resulting continuous resize of the example keeps moving the content beneath it down and up. Such bad UX.
arikrahman today at 7:11 PM
Saw nix suffixed and was excited a new dotfiles was about to hit the market.
carterschonwald today at 6:59 PM
i cant find anything substantiated in the code that actually differentiates it from any other harness.

my fork of oh my pi that i have a lot of experiments in, is lterally designed to only work well with models that have decent reasoning levels, like deep seek models. check it out!

https://github.com/cartazio/oh-punkin-pi/blob/main/scripts/b... — thats the install script for after clone

fair warning: tis my dog food test bed as i build even fancier stuff

declan_roberts today at 2:47 PM
I love the focus on cache hit efficiency. Hats off to the deekseek team for creating a great product that maximizes cost efficiency for the user.
wg0 today at 6:21 PM
Performance is horrible when you type but caching is magical.

Extremely pro consumer tool. I have been hammering it hard with 97% cache utilization and barely $0.03 dollar spent for me constantly exploring a codebase.

storus today at 5:42 PM
Can it instruct DeepSeek during an LLM call to start removing old tool calls from the context instead of waiting for the LLM call to finish if the context size approaches DeepSeek's dumb zone? Claude Code can't do that, /compact can only happen after the LLM call; it's often preferable to start cleaning up context during an LLM call, especially when tool calls are huge like reading markdown files; implementation-wise all that is needed is to start removing earliest <tool call start> ... <tool call end> and replacing them just with some log entry stating this tool call was already performed, then re-running KV cache prefill (so the "online" compaction would get 0.5s latency hit every time it's performed). That way one can read 1000 files in one LLM call.
unshavedyak today at 3:35 PM
It's pretty funny, i'm a $200/m Claude subscriber and i've had little need to use anything else. However the more Claude has been restricting my workflow (notably around the recent IDE/-p usage change) the more i've been wanting to go elsehwere.

I'm concerned since i really want SOTA reasoning, but DeepSeek still has me interested.

schaefer today at 3:14 PM
Okay, I'm curious.

From the FAQ, I see:

>Can I point it at a self-hosted / private DeepSeek endpoint?

>Yes. Since 0.30 we accept non-standard key prefixes for self-hosted DeepSeek endpoints. Just point `baseUrl` at your internal address — the loop, cache strategy, and tool protocol are unchanged.

But my question is: If I use Reasonix to talk to a deepseek endpoint through openrouter, am I still getting the cache-hit benifits of this agent harness?

danborn26 today at 4:44 PM
High caching rates for coding agents can drastically reduce latency and API costs. I am curious to see how the caching strategy handles context invalidation across multiple files.
nextaccountic today at 6:15 PM
> Tool-call repair

> Tool arguments the model produces occasionally have JSON typos, unclosed quotes, or shape mismatches. Reasonix runs a schema-aware repair pass before dispatch so malformed args still execute.

So Deepseek API doesn't have a structured output option where you give a grammar and the model promises the output will follow this grammar?

Or it does, but it's buggy?

imagetic today at 4:29 PM
mmaunder today at 3:22 PM
Unusable thanks to the top animation pushing the rest of the site down repeatedly as you’re trying to read.
singiamtel today at 3:39 PM
I would've liked benchmarks against other harnesses showing the caching performance
mmarcant today at 6:22 PM
"byte-stable prefix cache" -- give us your codebase in a way that's even EASIER for us to train on.
hebetude today at 3:56 PM
Wow the UI looks exactly what I vibe coded yesterday. What a coincidence
hirako2000 today at 2:32 PM
Good timing given the cost spike across other frontier models.
singingtoday today at 6:09 PM
That site does not render correctly on my android. Lots of text on the right breaking the reactive layout.
m101 today at 5:15 PM
For those of you that use deepseek v4 occasionally, what harness do you use it with? I’m only familiar with claude code and codex.

Any comments on what you can or cannot rely on it for relative to cc and codex would be appreciated too!

theanonymousone today at 2:39 PM
Isn't caching a server-side thing? How does the agent affect it, significantly at least?
yalogin today at 4:16 PM
Can someone give me a eli5 version of what this is? It really sounds useful to Claude subscribers.

Is this improving the cache hit and hence overall efficiency of coding workflows?

Does it also let me host a local llm (deepseek)? What are model min requirements for this?

fouric today at 4:34 PM
I don't think it's particularly effective to create a new coding agent when there's existing open-source agents (especially extremely extensible ones like Pi) that already optimize for cache hits, have far larger communities, and work for providers other than Deepseek.

I specifically use multiple different models and providers, so this wouldn't be useful for me.

And it contributes to the problem of each person vibe-coding their own, incompatible, half-baked tool in a space, instead of contributing to a small set of tools and expanding them.

It'd be better to just extend an existing tool.

ricardobeat today at 4:31 PM
> The loop is append-only, engineered around DeepSeek's byte-stable prefix cache — long sessions hold 90%+ cache hit and input-token cost collapses to ~1/5. Terminal-first, leave it running.

AI marketing slop. This is how all models and coding harnesses work, isn't it?

The author claims (in another AI-written post):

> LangChain — along with every generic agent framework I checked — rebuilds the prompt every turn. Timestamps get injected. History gets reordered. Tool schemas re-serialize with different whitespace.

I haven't touched LangChain in a long, long time, but don't think any of the current harnesses, Claude Code, Pi, Crush, OpenCode etc do that except if you change configuration? Keeping the context stable for caching is a very basic principle and not a wild innovation.

This posing as DeepSeek-specific is also a mystery.

hmokiguess today at 4:33 PM
Click on the download page, it's hilarious. It has a lot of information about the "smart probe" on the download and it's a realtime probe you can rerun.

That's the pinnacle of AI slop over engineered garbage in my opinion. All of that information is noise.

pkulak today at 4:23 PM
Doesn't Pi Agent do exactly this? Assuming "append only" means they do some kind of compaction as well.
deleted today at 2:26 PM
quotemstr today at 3:57 PM
> no reordering, no marker-based compaction

Is this really the behavior you want? Yes, doing tool-result clearing and such will blow your cache, but if you do it only occasionally, it's still likely a win. Yes, cache hits are good, but not so good that it's okay to be profligate with context to preserve those precious, precious KVs.

Hfuffzehn today at 4:50 PM
This is really tickling the conspiracy theorist part of my brain.

"Independent open-source project · not affiliated with DeepSeek" "Reasonix only targets DeepSeek because..." "Why DeepSeek only? Can I swap to Claude / GPT? It's a design choice, not a limitation"

The lady doth protest too much, methinks?

Nicely timed shortly after the making the rebate permanent anouncement.

Could just be Chinese devs trying to help western devs with some software and a western facing marketing campaign to raise awareness. Could be DeepSeek astroturfing. Could be "someone" in China trying to get more access to western data.

Who knows?

andai today at 4:56 PM
But Claude made the website?
am17an today at 4:31 PM
This Claude front end skill is now soon to be slop.
sergiotapia today at 2:47 PM
What AI model did you use for the website design? This is the second one I see with the exact same font and color scheme. Just curious because Claude models lean towards purples for example. Thank you!
ankitwarbhe today at 5:09 PM
you created it yourself ?
canadiantim today at 2:29 PM
So what's best low cost coding agent these days? Kimi 2.6? Qwen's latest closed model? Composer 2.5? DeepSeek?
deleted today at 3:21 PM
WhereIsTheTruth today at 6:10 PM
Y'all should not be writing js/ts/slop/npm based clis anymore

It's the agentic era, pick a better option

Just stop

aplomb1026 today at 5:39 PM
[flagged]
benjiro3000 today at 5:32 PM
[dead]
the_mitsuhiko today at 2:59 PM
[dead]