Show HN: Axe – A 12MB binary that replaces your AI framework
104 points - today at 1:49 PM
I built Axe because I got tired of every AI tool trying to be a chatbot.
Most frameworks want a long-lived session with a massive context window doing everything at once. That's expensive, slow, and fragile. Good software is small, focused, and composable... AI agents should be too.
Axe treats LLM agents like Unix programs. Each agent is a TOML config with a focused job. Such as code reviewer, log analyzer, commit message writer. You can run them from the CLI, pipe data in, get results out. You can use pipes to chain them together. Or trigger from cron, git hooks, CI.
What Axe is:
- 12MB binary, two dependencies. no framework, no Python, no Docker (unless you want it)
- Stdin piping, something like `git diff | axe run reviewer` just works
- Sub-agent delegation. Where agents call other agents via tool use, depth-limited
- Persistent memory. If you want, agents can remember across runs without you managing state
- MCP support. Axe can connect any MCP server to your agents
- Built-in tools. Such as web_search and url_fetch out of the box
- Multi-provider. Bring what you love to use.. Anthropic, OpenAI, Ollama, or anything in models.dev format
- Path-sandboxed file ops. Keeps agents locked to a working directory
Written in Go. No daemon, no GUI.
What would you automate first?
Comments
I have a known workflow to create an RPG character with steps, lets automate some of the boilerplate by having a succession of LLMs read my preferences about each step and apply their particular pieces of data to that step of the workflow, outputting their result to successive subdirectories, so I can pub/sub the entire process and make edits to intermediate files to tweak results as I desire.
Now that's cool!
The first question that comes to mind is: how do you think about cost control? Putting a ton in a giant context window is expensive, but unintentionally fanning out 10 agents with a slightly smaller context window is even more expensive. The answer might be "well, don't do that," and that certainly maps to the UNIX analogy, where you're given powerful and possibly destructive tools, and it's up to you to construct the workflow carefully. But I'm curious how you would approach budget when using Axe.
Aside but 12 MB is ... large ... for such a thing. For reference, an entire HTTP (including crypto, TLS) stack with LLM API calls in Zig would net you a binary ~400 KB on ReleaseSmall (statically linked).
You can implement an entire language, compiler, and a VM in another 500 KB (or less!)
I don't think 12 MB is an impressive badge here?
A Proper self-contained, self improving AI@home with the AI as the OS is my end goal, I have a nice high spec but older laptop I am currently using as a sacrificial pawn experimenting with this, but there is a big gap in my knowledge and I'm still working through GPT2 level stuff, also resources are tight when you're retired. I guess someone will get there this year the way things are going, but I'm happy to have fun until then.
I just don't see this in the readme… It is not in the Features section at least.
Anyway, i have MCP server that can post inline comments into Gitlab MR. Would like to try to hook it up to the code reviewer.
OP, what have you used this on in practice, with success?
I'm a bit skeptical of this approach, at least for building general purpose coding agents. If the agents were humans, it would be absolutely insane to assign such fine-grained responsibilities to multiple people and ask them to collaborate.
It looks like Axe works the same way: fire off a request and later look at the results.
how would you say this compares to similar tools like google’s dotprompt? https://google.github.io/dotprompt/getting-started/
One thing I’ve noticed when experimenting with agent pipelines is that the “single-purpose agent” model tends to make both cost control and reasoning easier. Each agent only gets the context it actually needs, which keeps prompts small and behavior easier to predict.
Where it gets interesting is when the pipeline starts producing artifacts instead of just text — reports, logs, generated files, etc. At that point the workflow starts looking less like a chat session and more like a series of composable steps producing intermediate outputs.
That’s where the Unix analogy feels particularly strong: small tools, small contexts, and explicit data flowing between steps.
Curious if you’ve experimented with workflows where agents produce artifacts (files, reports, etc.) rather than just returning text.
Tiny note: there's a typo in your repo description.