GitHub's Fake Star Economy
210 points - today at 8:26 AM
Comments
Are VCs just that lazy about making investment decisions? Is this yet another side-effect of ZIRP[2] and too much money chasing a return? Is nobody looking too hard, in the hope of catching the next rocket to the moon?
From the outside, investing based on GitHub stars seems insane. Like, this can't be a serious way of investing money. If you told me you were going to invest my money based on GitHub stars, I'd laugh, and then we'd have an awkward silence while I realize there isn't a punchline coming.
Here are the things I look at in order:
* last commit date. Newer is better
* age. Old is best if still updated. New is not great, but tolerable if commits aren't rapid
* issues. Not the count, mind you, just looking at them. How are they handled, what kind of issues are lingering open.
* some of the code. No one is evaluating all of the code of libraries they use. You can certainly check some!
What do stars tell me? They're an indirect variable driven by the things above (real engagement and third-party interest), or otherwise fraud. The only way to tell which is to look at the things I listed anyway.
I always treated stars as a bookmark ("I'll come back to this project") and never thought of them as a quality metric. Years ago, when this problem first surfaced, I was surprised (though in retrospect I shouldn't have been) that they had become a substitute for quality.
I hope the FTC comes down hard on this.
Edit:
* commit history: just browse the history to see what's there. What kind of changes are made and at what cadence. (A rough sketch of pulling these signals from the API follows below.)
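For concreteness, here is a minimal sketch, not the commenter's own tooling, of pulling those signals from the public GitHub REST API. The endpoints and response fields are standard API v3; the helper name `repo_signals` and the example repo are illustrative only.

```python
import datetime
import requests

def repo_signals(owner: str, repo: str) -> dict:
    """Fetch the checklist signals (last push, age, issues, recent commits) for one repo."""
    base = f"https://api.github.com/repos/{owner}/{repo}"
    meta = requests.get(base, timeout=10).json()
    commits = requests.get(f"{base}/commits", params={"per_page": 30}, timeout=10).json()

    now = datetime.datetime.now(datetime.timezone.utc)
    pushed = datetime.datetime.fromisoformat(meta["pushed_at"].replace("Z", "+00:00"))
    created = datetime.datetime.fromisoformat(meta["created_at"].replace("Z", "+00:00"))

    return {
        "days_since_last_push": (now - pushed).days,        # last commit date: newer is better
        "age_years": round((now - created).days / 365, 1),  # old but still updated is best
        "open_issues": meta["open_issues_count"],           # not just the count: go read them
        "recent_commits": [c["commit"]["message"].splitlines()[0] for c in commits[:10]],
        "stars": meta["stargazers_count"],                  # last, and least
    }

print(repo_signals("psf", "requests"))
```

There is deliberately no scoring function here: the point of the checklist is to read the issues and the code, not to reduce them to yet another number.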
Build a SaaS and you'll have "journalists" asking if they can include you in their new "Top [your category] Apps in [current year]" list: you just have to pay $5k for first place, $3k for second, and so on (with a promotional discount on first place, since it's your first interaction).
You'll get "promoters" offering to grow your social media following, which is one reason companies may not even realize that some of their own top accounts and GitHub stars are mostly bots.
You'll get "talent scouts" claiming they can find you experts exactly in your niche, but in practice they just scrape and spam profiles with matching keywords on platforms like LinkedIn once you show interest, while simultaneously telling candidates that they work with companies that want them.
And in hiring, you'll see candidates sitting in interview farms, quite clearly in East Asia, connecting through Washington D.C. IPs, presenting themselves under generic European names in front of synthetic camera backgrounds, somehow acing every question, with CVs that already list experience with every technology you mention in the job post (not hyperbole; I've seen exactly this happen).
If a metric or signal matters, there is already an ecosystem built to fake it, and faking it becomes operationalized, just another part of doing business.
Specifically, someone submitted a library that was only a few days old, clearly entirely AI-generated, and not particularly well built.
In my reply declining to list it, I noted my concerns, among them that it had "zero stars". The author was very aggressive and, in his rant of a reply, asked how many stars he needed. I declined to answer; that's not how this works. Stars are a consideration, not the be-all and end-all.
You need real world users and more importantly real notability. Not stars. The stars are irrelevant.
This conversation happened on GitHub, and since then other developers have wandered into that thread and demanded I set a star-count definition for my "vague notability requirement". I'm not going to; it's intentionally vague. When a metric becomes a target, it ceases to be a good metric, as they say.
I don't want the page to get overly long, and if I just listed everything with X star count I'd certainly list some sort of malware.
I am under no obligation to list your library. Stop being rude.
It’s more expensive to compute, but the resulting scores would be more trustworthy unless I’m missing something.
Why am I not surprised that big capital corrupts everything? Also, Goodhart's law applies again: "When a measure becomes a target, it ceases to be a good measure."
HN folks: what reliable, diverse signals do you use to quickly evaluate a repo's quality? For me it's: maintenance status, age, elegance of the API, and maybe commit history.
PS: From the article:
> instead tracks unique monthly contributor activity - anyone who created an issue, comment, PR, or commit. Fewer than 5% of top 10,000 projects ever exceeded 250 monthly contributors; only 2% sustained it across six months.
> [...] recommends five metrics that correlate with real adoption: package downloads, issue quality (production edge cases from real users), contributor retention (time to second PR), community discussion depth, and usage telemetry.
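A hedged sketch of how one could approximate the "unique monthly contributor activity" metric the article describes, using standard GitHub API v3 endpoints. Pagination, rate limits, and the fact that the issues endpoint's `since` parameter filters on update time (so this over-counts slightly) are all glossed over; the function name and example window are illustrative only.

```python
import requests

def monthly_contributors(owner: str, repo: str, since: str, until: str) -> int:
    """Count distinct logins who opened an issue/PR, commented, or committed in a window."""
    base = f"https://api.github.com/repos/{owner}/{repo}"
    users = set()

    # Issues and PRs (the issues endpoint includes PRs); `since` keys on update time.
    for item in requests.get(f"{base}/issues",
                             params={"state": "all", "since": since, "per_page": 100},
                             timeout=10).json():
        users.add(item["user"]["login"])

    # Issue and PR comments since the window start.
    for c in requests.get(f"{base}/issues/comments",
                          params={"since": since, "per_page": 100},
                          timeout=10).json():
        users.add(c["user"]["login"])

    # Commits in the window; `author` is None when the email maps to no account.
    for c in requests.get(f"{base}/commits",
                          params={"since": since, "until": until, "per_page": 100},
                          timeout=10).json():
        if c.get("author"):
            users.add(c["author"]["login"])

    return len(users)

print(monthly_contributors("psf", "requests",
                           "2024-01-01T00:00:00Z", "2024-02-01T00:00:00Z"))
```

Even this crude version is harder to inflate than stars, since every unit of "activity" leaves a public, inspectable trail.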
I think as a proxy it fails completely: astroturfing aside, stars don't guarantee popularity (and I bet the correlation is very weak; a lot of very fundamental system libraries have few stars). Stars don't guarantee quality either.
And given that you can read the code, stars seem like a completely pointless proxy. I'm teaching myself to skip the stars, skim the code, and evaluate the quality of both architecture and implementation. I've found that quite a few times I prefer a less-"starry" alternative after looking directly at the repo content.
We should do a hall of shame!
* https://arxiv.org/abs/2412.13459 (2024/2025) - Six Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Spams, and Malware
As a side note, it's kind of disheartening that every time there is a metric tied to popularity, there will be some among us who try to game it for profit, basically manipulating our natural bias.
It's also always a bit sad how the parasocial nature of the modern web makes us interface like machines via simple widgets, becoming mechanical robots ourselves, rationalising I/O through simple metrics and forgetting that the map is never the territory.
In my opinion, nothing could be more wrong. GitHub's own ratings are easily manipulated and don't necessarily measure the quality of the project itself, but rather its popularity. The problem is that popularity is rarely directly proportional to quality.
I'm building a product, and I'm seeing that what matters is distribution and communication rather than the development itself.
Unfortunately, a project's popularity is often directly proportional to the communication "built" around it and inversely proportional to its actual quality. This isn't always the case, but it often is.
Moreover, adopting effective and objective project evaluation tools is quite expensive for VCs.
GitHub stars are akin to 'link popularity' or PageRank, which is ripe for abuse.
One way around it is to trust well known authors/users more. But it's hard to verify who is who. And accounts get bought/closed/hacked.
Another way is to open up the algorithm so individuals and groups can shape it, so there's no universal answer for everyone.
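A toy illustration of the first idea, with entirely made-up weights and numbers: weight each star by a crude trust score for the stargazer, so ten thousand week-old throwaway accounts count for less than a few hundred established ones. The `Stargazer` fields and the `trust` formula are hypothetical placeholders, not anyone's real algorithm.

```python
from dataclasses import dataclass

@dataclass
class Stargazer:
    account_age_days: int
    followers: int
    active_own_repos: int

def trust(u: Stargazer) -> float:
    # Hypothetical weighting: saturating credit for age, followers, and real activity.
    return (min(u.account_age_days / 365, 3)
            + min(u.followers / 50, 2)
            + min(u.active_own_repos, 5))

def weighted_stars(stargazers: list[Stargazer]) -> float:
    return sum(trust(u) for u in stargazers)

bots = [Stargazer(account_age_days=7, followers=0, active_own_repos=0)] * 10_000
humans = [Stargazer(account_age_days=2_000, followers=80, active_own_repos=12)] * 300

print(weighted_stars(bots))    # ~192: ten thousand fresh accounts barely register
print(weighted_stars(humans))  # ~2880: a small real community dominates
```

Of course, once such a score mattered, the farms would start aging and cross-following their accounts, which is exactly the PageRank-style arms race the comment alludes to.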
> When nobody is forking a 157,000-star repository, nobody is using it
That is completely untrue. I don't fork a repo when I use it, only when I want to contribute to it (and I usually clean up my forks).
It does feel like everything is a scam nowadays though. All the numbers seem fake; whether it's number of users, number of likes, number of stars, amount of money, number of re-tweets, number of shares issued, market cap... Maybe it's time we focus on qualitative metrics instead?
We figured out a workaround: limit activity to prior contributors only, and add a CI job that pushes a co-authored commit after you pass a captcha on our website. It cut the AI slop by 90%. Full write-up: https://archestra.ai/blog/only-responsible-ai
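A rough approximation of that "prior contributors only" gate, my own sketch using the standard GitHub contributors endpoint rather than the implementation from the linked write-up; it ignores pagination beyond the first 100 contributors, and the triage labels are invented.

```python
import requests

def is_prior_contributor(owner: str, repo: str, login: str) -> bool:
    """True if the login already has merged commits in the repo (first page only)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contributors"
    contributors = requests.get(url, params={"per_page": 100}, timeout=10).json()
    return any(c["login"] == login for c in contributors)

def triage_new_issue(owner: str, repo: str, author: str) -> str:
    if is_prior_contributor(owner, repo, author):
        return "allow"                 # known contributor: leave the issue open
    return "request-verification"      # new account: ask for the captcha-backed step first

print(triage_new_issue("psf", "requests", "some-new-account"))
```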
I guess it's like fake followers on other social media platforms.
To me, it just reflects a behaviour that is typical of humans: in many situations, we make decisions in fields we don't understand, so we evaluate things poorly.
I'd give a lot of credit to Microsoft and the GitHub team if they went on a major ban/star-removal wave across affected repos, akin to how Valve occasionally does a major sweep across CS2 banning verified cheaters.
I paid GitHub for years to keep my repos private...
But then, I don't participate in the stars "economy" anyway; I don't star and I don't count stars, so I'm probably irrelevant for this study.
It’s supposed to get people to actually try your product. If they like it, they star it. Simple.
At that point, forcing the action just inflates numbers and strips them of any meaning.
Gaming stars so you can showcase them as a positive signal for the product is just SHIT.
> Runa Capital publishes the ROSS (Runa Open Source Startup) Index quarterly, ranking the 20 fastest-growing open-source startups by GitHub star growth rate. Per TechCrunch, 68% of ROSS Index startups that attracted investment did so at seed stage, with $169 million raised across tracked rounds. GitHub itself, through its GitHub Fund partnership with M12 (Microsoft's VC arm), commits $10 million annually to invest in 8-10 open-source companies at pre-seed/seed stages based partly on platform traction.
This all smells like BS. If you are going to do an analysis, you need to do some sound maths on the amount of investment a project gets in relation to its GitHub stars.
All this says is that stars are considered in some ways, which is very far from saying that you buy fake stars and then you get investment.
This smells like bait for hating on people who get investment.
> As one commenter put it: "You can fake a star count, but you can't fake a bug fix that saves someone's weekend."
I'm curious what the research says here: can you actually structurally undermine the gamification of social-influence scores? And I'm pretty sure fake bug fixes are almost trivial to generate with LLMs.
“gstack is not a hypothetical. It’s a product with real users:
75,000+ GitHub stars in 5 weeks
14,965 unique installations (opt-in telemetry, so real number is at least 2x higher)
305,309 skill invocations recorded since January 2026
~7,000 weekly active users at peak”
GitHub stars are a meaningless metric but I don’t think a high star count necessarily indicates bought stars. I don’t think Garry is buying stars for his project.
People star things because they want to be seen as part of the in-crowd, who knows about this magical futuristic technology, not because they care to use it.
Some companies are buying stars, sure, but the methodology for identifying it in this article is bad.