My AI Adoption Journey

216 points - today at 7:04 PM

Comments

mjr00 today at 8:42 PM

> Break down sessions into separate clear, actionable tasks. Don't try to "draw the owl" in one mega session.

This is the key one I think. At one extreme you can tell an agent "write a for loop that iterates over the variable `numbers` and computes the sum" and they'll do this successfully, but the scope is so small there's not much point in using an LLM. On the other extreme you can tell an agent "make me an app that's Facebook for dogs" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.

A lot of successful LLM adoption for code is finding this sweet spot. Overly specific instructions don't make you feel productive, and overly broad instructions you end up redoing too much of the work.

libraryofbabel today at 10:19 PM

This is such a lovely balanced thoughtful refreshingly hype-free post to read. 2025 really was the year when things shifted and many first-rate developers (often previously AI skeptics, as Mitchell was) found the tools had actually got good enough that they could incorporate AI agents into their workflows.

It's a shame that AI coding tools have become such a polarizing issue among developers. I understand the reasons, but I wish there had been a smoother path to this future. The early LLMs like GPT-3 could sort of code enough for it to look like there was a lot of potential, and so there was a lot of hype to drum up investment and a lot of promises made that weren't really viable with the tech as it was then. This created a large number of AI skeptics (of whom I was one, for a while) and a whole bunch of cynicism and suspicion and resistance amongst a large swathe of developers. But could it have been different? It seems a lot of transformative new tech is fated to evolve this way. Early aircraft were extremely unreliable and dangerous and not yet worthy of the promises being made about them, but eventually with enough evolution and lessons learned we got the Douglas DC-3, and then in the end the 747.

If you're a developer who still doesn't believe that AI tools are useful, I would recommend you go read Mitchell's post, and give Claude Code a trial run like he did. Try and forget about the annoying hype and the vibe-coding influencers and the noise and just treat it like any new tool you might put through its paces. There are many important conversations about AI to be had, it has plenty of downsides, but a proper discussion begins with close engagement with the tools.

EastLondonCoder today at 9:07 PM

This matches my experience, especially "don’t draw the owl" and the harness-engineering idea.

The failure mode I kept hitting wasn’t just "it makes mistakes", it was drift: it can stay locally plausible while slowly walking away from the real constraints of the repo. The output still sounds confident, so you don’t notice until you run into reality (tests, runtime behaviour, perf, ops, UX).

What ended up working for me was treating chat as where I shape the plan (tradeoffs, invariants, failure modes) and treating the agent as something that does narrow, reviewable diffs against that plan. The human job stays very boring: run it, verify it, and decide what’s actually acceptable. That separation is what made it click for me.

Once I got that loop stable, it stopped being a toy and started being a lever. I’ve shipped real features this way across a few projects (a git like tool for heavy media projects, a ticketing/payment flow with real users, a local-first genealogy tool, and a small CMS/publishing pipeline). The common thread is the same: small diffs, fast verification, and continuously tightening the harness so the agent can’t drift unnoticed.

sho_hn today at 8:42 PM

Much more pragmatic and less performative than other posts hitting frontpage. Good article.

keyle today at 10:16 PM

It's amusing how everyone seems to be going through the same journey.

I do run multiple models at once now. On different parts of the code base.

I focus solely on the less boring tasks for myself and outsource all of the slam dunk and then review. Often use another model to validate the previous models work while doing so myself.

I do git reset still quite often but I find more ways to not get to that point by knowing the tools better and better.

Autocompleting our brains! What a crazy time.

zubspace today at 10:41 PM

It's so sad that we're the ones who have to tell the agent how to improve by extending agent.md or whatever. I constantly have to tell it what I don't like or what can be improved or need to request clarifications or alternative solutions.

This is what's so annoying about it. It's like a child that does the same errors again and again.

But couldn't it adjust itself with the goal of reducing the error bit by bit? Wouldn't this lead to the ultimate agent who can read your mind? That would be awesome.

underdeserver today at 9:42 PM

> At a bare minimum, the agent must have the ability to: read files, execute programs, and make HTTP requests.

That's one very short step removed from Simon Willison's lethal trifecta.

taikahessu today at 10:47 PM

Do you have any ideas on how to harness AI to only change specific parts of a system or workpiece? Like "I consider this part 80/100 done and only make 'meaningful' or 'new contributions' here" ...?

cal_dent today at 10:27 PM

Just wanted to say that was a nice and very grounded write up; and as a result very informative. Thank you. More stuff like this is a breath of fresh air in a landscape that has veered into hyperbole territory both in the for and against ai sides

senko today at 9:50 PM

For those wondering how that looks in practice, here's one of OP's past blog posts describing a coding session to implement a non-trivial feature: https://mitchellh.com/writing/non-trivial-vibing (covered on HN here: https://news.ycombinator.com/item?id=45549434)

dudewhocodes today at 10:44 PM

Refreshing to read a balanced opinion, from a person who has significant experience and grounding in the real world.

davidw today at 9:48 PM

This seems like a pretty reasonable approach that charts a course between skepticism and "it's a miracle".

I wonder how much all this costs on a monthly basis?

raphinou today at 8:46 PM

I recently also reflected on the evolution of my use of ai in programming. Same evolution, other path. If anyone is interested: https://www.asfaload.com/blog/ai_use/

butler14 today at 9:03 PM

I'd be interested to know what agents you're using. You mentioned Claude and GPT in passing, but don't actually talk about which you're using or for which tasks.

mwigdahl today at 8:33 PM

Good article! I especially liked the approach to replicate manual commits with the agent. I did not do that when learning but I suspect I'd have been much better off if I had.

deleted today at 10:09 PM

fix4fun today at 8:52 PM

Thanks for sharing your experiences :)

You mentioned "harness engineering". How do you approach building "actual programmed tools" (like screenshot scripts) specifically for an LLM's consumption rather than a human's? Are there specific output formats or constraints you’ve found most effective?

0xbadcafebee today at 10:02 PM

> I'm not [yet?] running multiple agents, and currently don't really want to

This is the main reason to use AI agents, though: multitasking. If I'm working on some Terraform changes and I fire off an agent loop, I know it's going to take a while for it to produce something working. In the meantime I'm waiting for it to come back and pretend it's finished (really I'll have to fix it), so I start another agent on something else. I flip back and forth between the finished runs as they notify me. At the end of the day I have 5 things finished rather than two.

The "agent" doesn't have to be anything special either. Anything you can run in a VM or container (vscode w/copilot chat, any cli tool, etc) so you can enable YOLO mode.

apercu today at 9:55 PM

I find it interesting that this thread is full of pragmatic posts that seem to honestly reflect the real limits of current Gen-Ai.

Versus other threads (here on HN, and especially on places like LinkedIn) where it's "I set up a pipeline and some agents and now I type two sentences and amazing technology comes out in 5 minutes that would have taken 3 devs 6 months to do".

polyrand today at 9:59 PM

> a period of inefficiency

I think this is something people ignore, and is significant. The only way to get good at coding with LLMs is actually trying to do it. Even if it's inefficient or slower at first. It's just another skill to develop [0].

And it's not really about using all the plugins and features available. In fact, many plugins and features are counter-productive. Just learn how to prompt and steer the LLM better.

[0]: https://ricardoanderegg.com/posts/getting-better-coding-llms...

jonathanstrange today at 9:48 PM

There are so many stories about how people use agentic AI but they rarely post how much they spend. Before I can even consider it, I need to know how it will cost me per month. I'm currently using one pro subscription and it's already quite expensive for me. What are people doing, burning hundreds of dollars per month? Do they also evaluate how much value they get out of it?

jeffrallen today at 9:48 PM

> babysitting my kind of stupid and yet mysteriously productive robot friend

LOL, been there, done that. It is much less frustrating and demoralizing than babysitting your kind of stupid colleague though. (Thankfully, I don't have any of those anymore. But at previous big companies? Oh man, if only their commits were ONLY as bad as a bad AI commit.)

whatifnomoney today at 10:36 PM

[dead]

xyst today at 9:18 PM

[flagged]

vonneumannstan today at 8:31 PM

For the AI skeptics reading this, there is an overwhelming probability that Mitchell is a better developer than you. If he gets value out of these tools you should think about why you can't.

therein today at 8:28 PM

[flagged]