We stopped AI bot spam in our GitHub repo using Git's --author flag

440 points - yesterday at 3:24 PM

Comments

captn3m0 yesterday at 4:07 PM

This has a security implication which is overlooked. Contributors to a repository have higher rights, such as avoiding approval requirements for fork PR runs. GitHub warns in the docs:

> When requiring approvals only for first-time contributors (the first two settings), a user that has had any commit or pull request merged into the repository will not require approval. A malicious user could meet this requirement by getting a simple typo or other innocuous change accepted by a maintainer, either as part of a pull request they have authored or as part of another user's pull request.

halapro yesterday at 6:39 PM

Screw GitHub for letting this happen. If they implemented some very basic requirements to comment and open PRs we wouldn't be here.

Also please let us delete PRs just like we can delete issues.

silverwind yesterday at 3:52 PM

PR spam is a major problem for repos that run bounties. Maybe GitHub should temporarily block accounts from raising PRs if something like 95%+ of their PRs are getting rejected.
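
A rough sketch of the rejection-rate idea in this comment, in Python. The function name, the default 95% cutoff, and the minimum-sample guard are assumptions made for illustration; none of this reflects an existing GitHub feature.

```python
def should_pause_pr_access(rejected: int, total: int,
                           threshold: float = 0.95,
                           min_prs: int = 20) -> bool:
    """Return True if an account's PR rejection rate meets the threshold.

    min_prs is a hypothetical guard so that a newcomer with one or two
    rejected PRs is not blocked on a tiny sample.
    """
    if total < min_prs:
        return False
    return rejected / total >= threshold
```

The minimum-sample guard is one possible design choice: without it, a single rejected first PR would count as a 100% rejection rate.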

krupan yesterday at 5:31 PM

This is what we get for telling everyone how amazing AI is at writing code. It started with the people selling AI and for some reason tons of independent developers, some quite well respected in our field, piled on. Facebook now laying people off and saying it's because AI is just so good adds more fuel to the fire. Now you have a bunch of people fully confident that their AI friend is pumping out amazing code and submitting it to projects that are completely overwhelmed

arecsu yesterday at 4:01 PM

Makes me wonder if an ELO-based system would work to mitigate these issues. Inputs would be things like: people who successfully merged PRs into a project, who had real issues acknowledged, the quality of their responses as measured by other users' reactions, etc., possibly multiplied by the importance of the project where the activity took place. It wouldn't be about human vs AI, but about actually helpful, effective contributions vs low-effort/spammy ones. Issues and PRs could be sorted and filtered by their ELO score. I'm saying ELO as an analogy for "a score based on the context", not really a 1:1 translation of the ELO system.

Negative score would be reports from other users because of spammy content or not acknowledged issues, with a middle ground of neutral score (+-0) or little positive score to issues or whatever with clear good intention, but couldn't reach a proper merged PR or were not issues (e.g. issue existed but wasn't the correct repo to be addressed, PR was good but needed other stuff to be implemented prior to it, maybe in the long run, etc)
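
One way the scoring described above could be sketched in Python. All point values, the activity names, and the `project_weight` multiplier are hypothetical; the comment does not specify numbers, and this is a plain weighted sum rather than a real Elo rating.

```python
from dataclasses import dataclass

# Hypothetical point values; the comment gives no concrete numbers.
POINTS = {
    "merged_pr": 10.0,          # PR successfully merged
    "acknowledged_issue": 4.0,  # issue confirmed as real
    "good_faith": 1.0,          # well-intended but not merged / wrong repo
    "spam_report": -8.0,        # reported by other users as spam
}


@dataclass
class Activity:
    kind: str                    # one of the keys in POINTS
    project_weight: float = 1.0  # assumed importance multiplier of the project


def reputation(activities: list[Activity]) -> float:
    """Sum the weighted points over all of a user's recorded activities."""
    return sum(POINTS[a.kind] * a.project_weight for a in activities)
```

Issues and PRs could then be sorted by the author's `reputation(...)` value, which is the filtering step the comment suggests.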

thih9 yesterday at 4:56 PM

> It's not a contract job— it's our optional way of saying thank you to the community.

The writing style in their onboarding doc has common AI tells (in the quote: em dashes, “it’s not A, it’s B” sentence).

I can understand that, perhaps they want to fight fire with fire or don’t have time as they already say. Still, it all feels like inadequate half measures to me.

infinitifall yesterday at 4:55 PM

Is the solution to everything simply more catgirls [1]? Proof-of-work was, after all, about countering email spam. PR spam is but the latest in that long tradition.

1- https://anubis.techaro.lol

hiccuphippo yesterday at 3:58 PM

The irony of the .ai domain.

zer0tonin yesterday at 3:53 PM

> Should we stop giving fun test tasks to our job candidates?

Yes

foresto yesterday at 9:49 PM

> If the email matches their GitHub account, GitHub links the commit to their profile and grants them contributor status.

When the article mentioned email matching, I was concerned that it would break down when a contributor's email address changes. (I have contributed to more than a few projects over the years, using email addresses that no longer exist.)

However, it looks like they're not using the email address recorded in the author's original git commit, but instead a GitHub-generated address whose unique parts are the GitHub user ID and username. That should survive authors changing their email addresses. It would still break down if a contributor loses access to their account and has to create a new one, but that's probably less common.

jart yesterday at 5:09 PM

This is a great example of the toxic effect money has on open source. Reward people with respect and recognition instead. Weird anonymous accounts no one's ever heard of will leave, because someone (or something) who's concealing their identity has nothing to gain from recognition. Honestly GitHub should have a real names policy. Because if you're not Satoshi Nakamoto then there's only three reasons I can think of to be anonymous on GitHub: (1) to avoid obtaining your employer's authorization, (2) to spam, harass, and engage in toxic behaviors, or (3) you're not even human. All three of these are the last things I want when engaging on the GitHub platform. Don't get me wrong, I love robots. But I'm perfectly capable of talking to the robot on my own. I don't want to talk to your robot. I also don't want people slipping me intellectual property below the board without their employer's consent. And I certainly don't enjoy all the hate and harassment. GitHub has tried to help with the last part, by making overt displays of hate something that can get you in trouble. The issue is that people just get more guilesome with more anonymous accounts, because the issue was never disrespect (which can actually be strategic and pro-social if we look at Torvalds' career), but rather bad faith participation. If GitHub can guarantee that all its users are human, real-name, good-faith actors, then we might be able to start talking about open bounties.

nubinetwork yesterday at 6:44 PM

While git has always allowed this, I don't really like the idea that someone can write some code, slap my name on it, and push it to their repo.

ildari yesterday at 3:24 PM

Hi HN community, I wanted to share our approach to reducing the amount of AI slop PRs and issues in our repo. We enabled the "require prior contribution" flag on GH and created a CI script that creates a tiny commit co-authored with you if you pass a captcha on our website. It worked really well, and we were able to block at least 500 bots in the first week. Sharing a screenshot from Cloudflare: https://archestra.ai/hn-comment-cloudflare-challenge-outcome...

bykhun today at 3:14 AM

You should release this as a service.

_joel yesterday at 4:13 PM

Wouldn't it be trivial to farm the stats needed to pass the bot checker's threshold?

aizk yesterday at 4:40 PM

I'm not sure why gh hasn't already implemented stricter measures / filters / tools for PRs. It would cut down on spam and also help save their servers that can't handle the increased AI load!

embedding-shape yesterday at 5:13 PM

Sounds kind of weird that the blog post complains about `poisoning the conversation with pointless "implementation plans"` when literally they ask for that, after attaching $900 USD bounty to a very under-specified issue, and even replies with "Do you have an implementation plan in mind?" to some of the first "attempters". Sounds like they got exactly what they'd been asking for, and even before LLMs if you pulled something similar, the effects would have been similar.

agunapal yesterday at 5:34 PM

My first thought after reading the blog was, let me share the blog with Claude and ask it how bots can circumvent this.

imo AI bots have significantly affected OSS and we need better qualitative measures to define success

xivzgrev yesterday at 9:01 PM

I like how they are taking a stand against vanity metrics. Rare to see that these days

Muromec yesterday at 4:25 PM

How is the status revoked without rewriting git history?

rglullis yesterday at 5:19 PM

'I will take "problems that could easily be solved by implementing a Pfand system" for $200, Alex.'

Seriously. Just ask for a US$10 deposit for each PR. If the PR is accepted (not even merged, just accepted as "this is a good effort"), give it back. Hell, give double the amount for good effort and you've got yourself a cheap way to attract good contributors.

Best case, bots will balk at the payment. Worst case, the funds can be used to hire someone specifically for triage.

cemoktra yesterday at 7:03 PM

AI company annoyed by AI ... Surprise

optionalsquid yesterday at 4:20 PM

I don't have a better solution, unfortunately, but it doesn't seem like the spam problem has been solved. It has just been moved from pull requests to commits:

Currently, more than 10% of all commits in the archestra repo are essentially noise (369 of 3521 commits), accounting for more than half of all commits in the last month (303 of 578 commits).

But maybe (probably) the amount of such commits will go down over time, compared to the growing amounts of AI slop
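
The two proportions quoted above can be checked with a few lines of Python. The commit counts are the ones given in the comment; the variable names are only illustrative.

```python
# Counts as quoted in the comment above.
noise_total, commits_total = 369, 3521   # whole repository history
noise_month, commits_month = 303, 578    # last month only

share_total = noise_total / commits_total
share_month = noise_month / commits_month

print(f"all time:   {share_total:.1%}")   # roughly 10.5%
print(f"last month: {share_month:.1%}")   # roughly 52.4%
```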

exabrial yesterday at 5:30 PM

Signed Commits from known authors would also help!

zzzeek yesterday at 4:20 PM

so...they are manually re-setting the "interaction limits" over and over again, since they are only temporary?

why not use hooks to automatically reject issue comments / PRs etc. from users that didn't go through onboarding, rather than repurposing GH features that aren't really designed for that use (and are hence in danger of being changed someday)?

opengrass yesterday at 5:22 PM

submitting attempts — but soon...

not just this issue — but the entire repo.

contributors like @ethanwater, @developerfred, and @Geetk172 — people actively working on bounties — were getting buried.

two identity fields — author and committer — and they can be different people.

metric growth — a substantial part of

kazinator yesterday at 7:09 PM

> Final Words

> While GitHub reports massive metric growth — a substantial part of which is AI-generated — we as an open source project team have to do the heavy lifting of cleaning up AI slop from our repository and come up with esoteric workarounds to keep the level of legitimacy of our open source audience.

AI generated slop!

kittikitti yesterday at 10:20 PM

There's got to be a concept to differentiate the industry plants who start an "open source" project that has enough funding for a $900 bug bounty. They are speaking and developing in the language of corruption and they don't even know it. Of course you will receive AI bot spam, but unfortunately it will continue if you don't take a hard look in the mirror.

karel-3d yesterday at 8:45 PM

I don't understand how clicking "I agree" a few times will stop the AI bots?

The captcha - maybe.

metalliqaz yesterday at 6:06 PM

Why does this company use the Slashdot logo?

yieldcrv yesterday at 8:50 PM

reindeer games

xdennis yesterday at 6:08 PM

It's quite ironic to complain about AI slop in a piece that's quite clearly AI slop.

Soon there will be no more AI doomer comments. The bots will take over that job too.

---

I'm working for an open source company, and my God, are 95% of contributions useless.

There are really dumb ones where the bot writes 10 paragraphs about how he implemented the feature, but the entire changeset is adding one line to .gitignore or adding a CLAUDE.md file.

There are even worse ones where the bot submits 3000 lines of code that seemingly works, but you have to spend an hour to figure out why it doesn't work.

The dumb ones are so much better.

ramon156 yesterday at 4:01 PM

See, this is an article that uses dashes correctly. It adds value, creates a bit of buildup

standbyme yesterday at 5:54 PM

cool

syezdin yesterday at 6:58 PM

Interesting

9front yesterday at 8:05 PM

Musk before the verdict: "It's not okay to steal a charity"

Altman after the verdict: "It's okay to steal a charity"

IshKebab yesterday at 4:20 PM

That's a neat way to interface with GitHub's authentication system, but I don't see how they've solved the fundamental problem because their whitelisting process is just "click ok fine 10 times". Why won't the slop peddlers just do that too?

delduca yesterday at 4:00 PM

For now…

petterroea yesterday at 4:05 PM

What I see is a (clever) hack, and GitHub continuing to provide good tools to its users.