Show HN: Mcp2cli - One CLI for every API, 96-99% fewer tokens than native MCP
79 points - today at 5:18 AM
Every MCP server injects its full tool schemas into context on every turn: 30 tools cost ~3,600 tokens/turn whether the model uses them or not. Over 25 turns with 120 tools, that's 362,000 tokens just for schemas.
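The arithmetic behind those headline numbers can be sketched. The per-tool token costs below come from the post; the usage pattern (one `--list` per session, plus one `--help` for each tool the agent actually invokes) is my assumption, not a measurement:

```python
# Back-of-envelope model of the post's numbers. Per-tool costs are the post's
# estimates; the on-demand usage pattern is an assumed, not measured, model.
TOKENS_PER_SCHEMA = 120   # ~3,600 tokens / 30 tools, re-sent every turn
LIST_TOKENS = 16          # per tool, from `mcp2cli --list`
HELP_TOKENS = 120         # one-time `--help` per tool actually used

def native_cost(n_tools, n_turns):
    """Native MCP: every schema is injected into context on every turn."""
    return n_tools * TOKENS_PER_SCHEMA * n_turns

def on_demand_cost(n_tools, tools_used=3):
    """CLI discovery: one --list, then --help only for the tools used."""
    return n_tools * LIST_TOKENS + tools_used * HELP_TOKENS

native = native_cost(120, 25)   # 360,000 -- close to the post's 362,000
saved = 1 - on_demand_cost(120) / native
print(f"native: {native}, savings: {saved:.1%}")
```

Under these assumptions the 120-tool, 25-turn case saves roughly 99%, in line with the post's claim; the exact figure depends on how many tools the agent actually inspects.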
mcp2cli turns any MCP server or OpenAPI spec into a CLI at runtime. The LLM discovers tools on demand:
mcp2cli --mcp https://mcp.example.com/sse --list # ~16 tokens/tool
mcp2cli --mcp https://mcp.example.com/sse create-task --help # ~120 tokens, once
mcp2cli --mcp https://mcp.example.com/sse create-task --title "Fix bug"
No codegen, no rebuild when the server changes. Works with any LLM; it's just a CLI the model shells out to. Also handles OpenAPI specs (JSON/YAML, local or remote) with the same interface. Token savings are real, measured with cl100k_base: 96% for 30 tools over 15 turns, 99% for 120 tools over 25 turns.
It also ships as an installable skill for AI coding agents (Claude Code, Cursor, Codex): `npx skills add knowsuchagency/mcp2cli --skill mcp2cli`
Inspired by Kagan Yilmaz's CLI vs MCP analysis and CLIHub.
Comments
- https://github.com/apify/mcpc
- https://github.com/chrishayuk/mcp-cli
- https://github.com/wong2/mcp-cli
- https://github.com/f/mcptools
- https://github.com/adhikasp/mcp-client-cli
- https://github.com/thellimist/clihub
- https://github.com/EstebanForge/mcp-cli-ent
- https://github.com/knowsuchagency/mcp2cli
- https://github.com/philschmid/mcp-cli
- https://github.com/steipete/mcporter
- https://github.com/mattzcarey/cloudflare-mcp
- https://github.com/assimelha/cmcp

I consider this a bug. I'm sure the chat clients will fix this soon enough.
Something like: on each turn, a subagent searches available MCP tools for anything relevant. Usually, nothing helpful will be found and the regular chat continues without any MCP context added.
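The commenter's per-turn filtering idea could be sketched like this. Everything here is hypothetical: the keyword-overlap scoring and the data shapes are illustrative, not any chat client's real API:

```python
# Hypothetical sketch: score each tool's description against the user message
# and inject only matching schemas; on most turns nothing matches and the
# turn proceeds with zero MCP context. Stopword list and scoring are made up.
STOP = {"the", "a", "an", "in", "is", "to", "of", "what's"}

def tokenize(text):
    return {w.strip("?.,!").lower() for w in text.split()} - STOP

def relevant_tools(message, tools, threshold=1):
    """Return only the tools whose description overlaps the message."""
    words = tokenize(message)
    return [t for t in tools
            if len(words & tokenize(t["description"])) >= threshold]

tools = [
    {"name": "create_task", "description": "create a task in the tracker"},
    {"name": "get_weather", "description": "fetch the weather forecast"},
]
print(relevant_tools("What's the weather in Paris?", tools))
```

A real implementation would likely use embeddings or a small model rather than keyword overlap, but the shape is the same: filter before injecting, not after.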
As an aside: this is a cool idea, but the prose in the readme and the above post seems to be fully generated, so who knows whether any of it is actually true.
It works by schematising the upstream and synchronising the data locally, with a common query language on top, so the longer-term goals are more about avoiding API limits and escaping the confines of the MCP query feature set, i.e. token savings on reading the data itself (in many cases, savings can be upwards of thousands of times fewer tokens).
Looking forward to trying this out!
Tell me the hottest day in Paris in the
coming 7 days. You can find useful tools
at www.weatherforadventurers.com/tools
And then the tools URL can simply return a list of URLs in plain text, like /tool/forecast?city=berlin&day=2026-03-09 (returns the highest temp and rain probability for the given day in the given city).
Each of which returns its data in plain text.

What additional benefits does MCP bring to the table?
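A minimal sketch of that plain-text scheme, using only the Python stdlib. The routes mirror the comment's hypothetical example, and the forecast data is made up:

```python
# Toy server for the plain-text tool scheme: /tools lists URL templates in
# plain text, /tool/forecast answers a query in plain text. All data is fake.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

FORECASTS = {("berlin", "2026-03-09"): "high: 11C, rain: 40%"}

class PlainTextTools(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/tools":
            # One line per tool: a URL template plus a short description.
            body = ("/tool/forecast?city=<city>&day=<YYYY-MM-DD> "
                    "(highest temp and rain probability)\n")
        elif url.path == "/tool/forecast":
            q = parse_qs(url.query)
            body = FORECASTS.get((q["city"][0], q["day"][0]), "no data")
        else:
            body = "not found"
        data = body.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # keep request logging quiet
        pass

server = HTTPServer(("127.0.0.1", 0), PlainTextTools)  # ephemeral port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"
print(urllib.request.urlopen(
    base + "/tool/forecast?city=berlin&day=2026-03-09").read().decode())
```

The model can discover and call this with nothing but an HTTP fetch, which is the commenter's point; what MCP adds on top (sessions, auth, streaming, typed schemas) is the open question.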
Anthropic discusses MCPs eating up context, and solutions, here: https://www.anthropic.com/engineering/code-execution-with-mc...
I built one specifically for Cognition's DeepWiki (https://crates.io/crates/dw2md) -- but it's rather narrow. Something more general like this clearly has more utility.
So I don't see why a typical productivity app would build a CLI rather than an MCP. Am I missing anything?
If the service is using more tokens to produce the same output from the same query, but over a different protocol, then the service is a scam.
You might as well directly create a CLI tool that works with the AI agents which does an API call to the service anyway.
If you want humans to spend time reading your prose, then spend time actually writing it.