Ask HN: Are we ready for vulnerabilities to be words instead of code?

4 points - last Thursday at 9:07 PM


Until now, security has been math. Buffer overflows, SQL injections, crypto flaws β€” deterministic, testable, formally verifiable.

But we're giving agents terminal access and API keys now. The attack vector is becoming natural language. An agent gets "socially engineered" by a prompt; another hallucinates fake data and passes it down the chain.

Trying to secure these systems feels like trying to write a regex that catches every possible lie. We've shifted the foundation of security from numbers to words, and I don't think we've figured out what that means yet.

Is anyone thinking about actual architectural solutions to this? Not just "use another LLM to guard the LLM" β€” that feels like circular logic. Something fundamentally different.

(Not a native English speaker, used AI to clean up the grammar.)

Comments

aetherps yesterday at 7:53 PM
The sql injection analogy is actually pretty apt. we had parameterized queries as a systematic defense -- the question is whats the equivalent for prompt injection. right now the answer is layered: input validation, output filtering, least-privilege scoping, and critically actually testing your prompts against known attack patterns before deployment. you can run your system prompt through aiunbreakable.com/scanner for free -- it will flag which injection categories you're vulnerable to....
raw_anon_1111 last Thursday at 11:52 PM
It's really not that hard to secure agents. Just give them tightly scoped API keys, and put them in front of your API, treating them like you would a user, instead of behind it.

If I were ever to use Claude in a production environment for an AWS account, for instance, you'd best believe the role it was running under, with temporary access keys, would be the bare minimum.
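A minimal sketch of the "treat the agent like a user" idea: every tool call goes through the same authorization gate an external API client would. The scope names and dispatch here are hypothetical, stand-ins for whatever your API actually enforces.

```python
# Bare-minimum grant for this agent, analogous to a tightly scoped API key.
AGENT_SCOPES = {"s3:read", "logs:read"}

def call_tool(action: str, *args):
    """Authorize the agent's request before dispatching, like any client."""
    if action not in AGENT_SCOPES:
        raise PermissionError(f"agent not authorized for {action!r}")
    # ... dispatch to the real implementation here ...
    return ("ok", action, args)

call_tool("s3:read", "reports/2024.csv")   # allowed
# call_tool("s3:delete", "reports/")       # raises PermissionError
```

The point is that the check lives in your API, outside the model, so a successful prompt injection still can't exceed the granted scopes.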

lielcohen last Thursday at 9:43 PM
To be clear - I'm not really talking about my personal laptop. I'm thinking about where this is going at scale. When companies start replacing entire teams with agents (and looking at the layoffs, that's clearly the direction), those agents will need real access to production systems. That's the scenario where "just don't give it access" stops being an answer.
nine_k last Thursday at 9:13 PM
Scams and "social engineering", as known for a long time, could be a good approximation.
stephenr yesterday at 5:37 AM
If at this point you (where you may be a person or a company) still think relying on spicy autocomplete is a smart decision, I can't fucking help you, and you deserve whatever bad things happen to you.

This is akin to saying "we are fully committed to slapping together sql queries directly from request data, but I wonder if it's risky?"

Part of security awareness is knowing when something is simply not worth the risks.
