Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel
74 points - today at 4:17 PM
For an example of a review (picked pretty much at random) see: https://sashiko.dev/#/patchset/20260318151256.2590375-1-andr...
The original patch series corresponding to that is: https://lkml.org/lkml/2026/3/18/1600
Edit: Here's a simpler and better example of a review: https://sashiko.dev/#/patchset/20260318110848.2779003-1-liju...
I'm very glad they're not spamming the mailing list.
That's cool. Another interesting metric, however, would be the false positive ratio: like, I could just build a bogus system that simply marks everything as a bug and then claim "my system found 100% of all bugs!"
In practice, a bug-finding system's precision matters as much as its recall: if human reviewers get spammed with piles of alleged bug reports by something like Sashiko, most of which turn out not to be bugs at all, that noise ties up resources and could undermine trust in the usefulness of the system.
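To make the precision/recall point concrete, here is a minimal sketch with made-up numbers (the patch IDs, the count of real bugs, and both "reviewers" are hypothetical, not Sashiko data). It shows how a reviewer that flags everything gets perfect recall but terrible precision:

```python
def precision_recall(flagged, real_bugs):
    """Return (precision, recall) for a set of flagged patch IDs,
    measured against the set of patches that contain a real bug."""
    true_positives = len(flagged & real_bugs)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(real_bugs) if real_bugs else 0.0
    return precision, recall


# Hypothetical data: 100 patches, 5 of which contain a real bug.
all_patches = set(range(100))
real_bugs = {3, 17, 42, 58, 91}

# The "bogus" reviewer from the comment above: flag every patch.
flag_everything = precision_recall(all_patches, real_bugs)

# A hypothetical selective reviewer: 6 patches flagged, 4 of them real bugs.
selective = precision_recall({3, 17, 42, 58, 60, 77}, real_bugs)

print(f"flag everything: precision={flag_everything[0]:.2f}, recall={flag_everything[1]:.2f}")
print(f"selective:       precision={selective[0]:.2f}, recall={selective[1]:.2f}")
```

With these numbers the flag-everything reviewer finds all 5 bugs but 95 of its 100 reports are noise, while the selective one misses a bug but two thirds of its reports are real.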
I think the table might be slightly inside-out? The Status column appears to show internal pipeline states ("Pending", "In Review") that really only matter to the system, while Findings are buried in the column on the far right. For example, one reviewed patchset with a critical and a high finding is just casually hanging out below the fold. I couldn't immediately find a way to filter or search for severe findings.
It might help to separate unreviewed patches from reviewed ones, and somehow wire the findings into the visual hierarchy better. Or perhaps I'm just off base and this is targeting a very specific Linux kernel community workflow/mindset.
Just my 1c.
(Also tests can be focused per defect.. which prevents overload)
From some of the changes I'm seeing: This looks like it's doing style and structure changes, which for a codebase this size is going to add drag to existing development. (I'm supportive of cleanups.. but doing them on an automated basis is a bad idea)
I.e. https://sashiko.dev/#/message/20260318170604.10254-1-erdemhu...
We've already seen bug bounty programs shut down because of AI spam; I think it was curl? Or some other project I don't remember right now.
I think AI tools should be required, by law, to verify that what they report is actually a true bug rather than some hypothetical, hallucinated context-dependent not-quite-a-real-bug bug.