Was my $48K GPU server worth it?

74 points - last Monday at 7:33 PM

Comments

freediddy today at 6:02 PM

In the last year, I have bought an M3 Ultra Mac Studio with 512 GB, a MacBook Pro M5 Max with 128 GB, and an RTX 6000 Pro. I have spent around $25k so far, not including electricity. I figured that in the worst case I can sell them within the next year and only take a haircut, as opposed to losing my entire investment.

Compared to running locally, just paying for tokens would have been much cheaper and much faster. I've been running Gemma4:31b and Qwen3.5 and 3.6, getting local LLMs to solve AMC 8/10 math questions, and it's about 10-100x slower than just doing it online. When I tried it with ChatGPT late last year, it took about one night and $25 to solve about 1000 questions. Using my RTX 6000 and M3 Ultra with Gemma4:31b on both, drawing about 800 watts (600 for the RTX and 200 for the M3 Ultra), it answered about 40 questions in 7 hours, and I haven't checked how good the answers are yet.
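For anyone who wants to put those numbers side by side, here is a rough sketch of the per-question arithmetic. The question counts, hours, watts, and the $25 are taken from the comment; the electricity price is my own assumption (it is not given anywhere in the thread), and hardware cost is ignored entirely.

```python
# Figures quoted in the comment above.
CLOUD_COST_USD = 25.0      # ChatGPT run
CLOUD_QUESTIONS = 1000
LOCAL_QUESTIONS = 40
LOCAL_HOURS = 7.0
LOCAL_WATTS = 800.0        # 600 W RTX + 200 W M3 Ultra

# ASSUMPTION (not from the thread): residential electricity price.
ASSUMED_USD_PER_KWH = 0.15


def energy_kwh(watts: float, hours: float) -> float:
    """Energy drawn at a constant power level, in kilowatt-hours."""
    return watts * hours / 1000.0


local_energy_kwh = energy_kwh(LOCAL_WATTS, LOCAL_HOURS)
local_kwh_per_question = local_energy_kwh / LOCAL_QUESTIONS
local_usd_per_question = local_kwh_per_question * ASSUMED_USD_PER_KWH
cloud_usd_per_question = CLOUD_COST_USD / CLOUD_QUESTIONS
local_questions_per_hour = LOCAL_QUESTIONS / LOCAL_HOURS

print(f"local energy:       {local_energy_kwh:.2f} kWh total")
print(f"local electricity:  ${local_usd_per_question:.4f} per question")
print(f"cloud tokens:       ${cloud_usd_per_question:.4f} per question")
print(f"local throughput:   {local_questions_per_hour:.1f} questions/hour")
```

Under that assumed electricity price, the power bill alone per question lands in the same range as the quoted cloud price per question, before counting any of the hardware.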

At the very least I'm going to try to sell my M3 Ultra if I can find a reliable place to sell it without getting ripped off by scammers.

dekhn today at 6:51 PM

I can't imagine spending $48K on a home GPU server, but I did just splurge and buy a PC with an RTX 5090, specifically to hold the largest models you can fit in 32GB. It's a top-of-the-line PC with water-cooled high-end CPUs, 64GB RAM, and an RTX 5090 for $5K. To me the jury is still out on whether this was a worthwhile investment, but I do expect to use this machine for a decade. I don't run it at 100% power (it's mostly idle, except for times when I'm training or doing batch inference). It has the nice property of being Blackwell generation, similar to the machines we use at work.

It just scares me to own a box that is $48K in my house, especially if it breaks, or gets stolen.

shout5 today at 6:53 PM

Moral of the story: if you are spending this much on a server do not cheap out on risers and buy 3M.

datadrivenangel today at 6:21 PM

I did the math, at least for a MacBook Pro, and for inference it's definitely not worth it.

- https://www.williamangel.net/blog/2026/05/17/offline-llm-ene...
- Discussion: https://news.ycombinator.com/item?id=48168198

amarant today at 6:53 PM

The research that's presented in another article on the same site is way more interesting than the Betteridge's law article linked here. It'll be very useful in my own latest project if this research is incorporated into some model I can rent by the token!

0xbadcafebee today at 6:13 PM

So the answer is: "TBD if I can actually make money to pay this back"

hasteg today at 6:00 PM

Just curious, OP (if you're the one posting) -- what do you mean by independent researcher? What are you researching, and are you making $$ from it or living off previously built-up savings? Seems like an interesting path. What research have you looked into so far?

jameson today at 6:14 PM

The idea is similar to maintaining on-prem vs cloud

Cloud is optimized for development velocity, but because it is a high-margin business, on-prem eventually becomes the more attractive option

It could be too late, but it might be worth looking into tax savings if you have a business. Depreciation of an asset counts as a loss and may be deductible from your income. (I'm NOT a tax expert)

Aurornis today at 6:03 PM

This is a difficult calculation to make because you wouldn't rent time on the exact same system in the cloud. Depending on what you're running, a bigger server with better inter-GPU interconnects in the cloud might complete the task so much faster that the additional per-hour expense is more than covered.
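A toy sketch of that trade-off, with made-up numbers: every rate and runtime below is hypothetical and only there to show how a pricier but faster machine can end up cheaper per job.

```python
def job_cost(usd_per_hour: float, hours: float) -> float:
    """Total rental cost of one job."""
    return usd_per_hour * hours


def break_even_speedup(slow_usd_per_hour: float, fast_usd_per_hour: float) -> float:
    """Speedup the pricier machine needs for both jobs to cost the same."""
    return fast_usd_per_hour / slow_usd_per_hour


# HYPOTHETICAL figures, chosen only to illustrate the point.
SLOW_RATE, SLOW_HOURS = 2.0, 10.0   # cheaper box, slow GPU interconnect
FAST_RATE, FAST_HOURS = 6.0, 2.5    # pricier box, 4x faster on this job

slow_total = job_cost(SLOW_RATE, SLOW_HOURS)
fast_total = job_cost(FAST_RATE, FAST_HOURS)
needed_speedup = break_even_speedup(SLOW_RATE, FAST_RATE)

print(f"slow box: ${slow_total:.2f}, fast box: ${fast_total:.2f}")
print(f"fast box wins once it is more than {needed_speedup:.1f}x faster")
```

In this made-up case the fast box costs three times as much per hour, so it comes out ahead whenever the job runs more than three times faster on it.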

tombert today at 5:58 PM

I have four old 24GB Nvidia cards. They're not great, but they're not useless either. The problem is that I haven't really figured out a good way to actually use them.

Genuine question: would anyone here recommend a specific motherboard to best utilize these cards?

jmyeet today at 6:10 PM

So some things have changed since this rig was first built (2024). The most relevant is that the $6800 RTX 6000 Ada 48GB has arguably been supplanted by the $9500 RTX 6000 Pro 96GB.

The Ada has a memory bandwidth of 960GB/s. The Pro has 1.8TB/s and about 40-50% better performance, so it is at least equivalent in processing power, much better in memory bandwidth (important for inference), and can hold larger models on a single card.
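To make the bandwidth point concrete, here is a back-of-the-envelope sketch using a common rule of thumb: for single-stream decoding of a dense model, each generated token has to read roughly all the weights once, so bandwidth divided by weight size gives a loose upper bound on tokens per second. It ignores KV cache traffic, compute limits, and batching, and the 30 GB model size is a hypothetical figure, not something from the thread.

```python
def decode_tokens_per_second_upper_bound(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough ceiling on single-stream decode speed for a dense model.

    Rule of thumb only: assumes every token reads all weights once and
    that memory bandwidth is the sole bottleneck.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Bandwidth figures quoted in the comment above.
ADA_BANDWIDTH_GB_S = 960.0
PRO_BANDWIDTH_GB_S = 1800.0

# HYPOTHETICAL model footprint that fits on either card.
ASSUMED_WEIGHTS_GB = 30.0

ada_bound = decode_tokens_per_second_upper_bound(ADA_BANDWIDTH_GB_S, ASSUMED_WEIGHTS_GB)
pro_bound = decode_tokens_per_second_upper_bound(PRO_BANDWIDTH_GB_S, ASSUMED_WEIGHTS_GB)

print(f"Ada ceiling: {ada_bound:.0f} tok/s")
print(f"Pro ceiling: {pro_bound:.0f} tok/s")
print(f"ratio:       {pro_bound / ada_bound:.3f}x")
```

The ratio between the two ceilings is just the ratio of the bandwidths, whatever model size you plug in; real throughput will sit below both numbers.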

I've considered buying a rig with 1-2 6000 Pros for similar reasons, but I want to see what happens with this year's Mac Studios with a likely M5 Ultra. Macs have a shared memory architecture, whereas NVIDIA segments the market based on max memory: the biggest consumer card (RTX 5090) has 32GB of VRAM but still excellent memory bandwidth (1.8TB/s). The conventional wisdom seems to be that an RTX 5090 rig will still trounce a Mac Studio. Despite being able to hold larger models and being able to chain Mac Studios over TB5, their lower memory bandwidth (~900GB/s) and lower overall GFLOPS mean they still come out behind.

That being said, the current Mac Studios are relatively long in the tooth, having been released in 2024.

I'm still not sure any of this is really worth it because things are still changing so fast. I think there's a decent chance of a number of large AI companies going bust in the next 2-3 years, such that you'll be able to buy enterprise AI hardware for cents on the dollar, a bit like how Google bought data centers after the dot-com crash.

But anyway, nowadays I'd be looking at the RTX 6000 Pro as the sweet spot, having anywhere from 1-4 in a single server.

The electrical issues the author mentions are interesting. I hadn't really thought about the max amperage on a residential circuit. In a DC, these would typically operate on three-phase power and much higher overall amperage. I wonder if there's a device you can buy that can combine multiple residential circuits into a single power source for a server this power hungry?
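For a feel of the numbers, here is a rough sketch of the circuit math. Everything in it is an assumption on my part and not from the article: a US-style 120 V circuit on a 15 A breaker, the usual 80% derating for continuous loads, and a hypothetical 2500 W server draw.

```python
import math

# ASSUMPTIONS (not from the article): typical US residential circuit.
ASSUMED_VOLTS = 120.0
ASSUMED_BREAKER_AMPS = 15.0
ASSUMED_CONTINUOUS_FACTOR = 0.8   # common derating for continuous loads


def continuous_watts(volts: float, amps: float, factor: float) -> float:
    """Sustained power one circuit can supply under the derating factor."""
    return volts * amps * factor


def circuits_needed(load_watts: float, watts_per_circuit: float) -> int:
    """Whole number of separate circuits required to carry the load."""
    return math.ceil(load_watts / watts_per_circuit)


per_circuit = continuous_watts(
    ASSUMED_VOLTS, ASSUMED_BREAKER_AMPS, ASSUMED_CONTINUOUS_FACTOR
)

# HYPOTHETICAL server draw, for illustration only.
ASSUMED_SERVER_WATTS = 2500.0
needed = circuits_needed(ASSUMED_SERVER_WATTS, per_circuit)

print(f"one circuit, continuous: {per_circuit:.0f} W")
print(f"circuits for {ASSUMED_SERVER_WATTS:.0f} W: {needed}")
```

With these assumed values a multi-GPU box can exceed what one ordinary circuit sustains, which would explain splitting the power supplies across outlets on different breakers.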

doctorpangloss today at 5:45 PM

> Because of this I got a motherboard with slow GPU interconnect. It’s good for running many small experiments in parallel (which is my main use case) but horrible for any models split across gpus.

:( you paid a professional pc builder and you weren't told this?

pelasaco today at 6:21 PM

Out of curiosity, did you check how much it would cost to rent a cage in a colocation space? Having to power your computer from two different outlets sounds wild.

gosub100 today at 5:57 PM

It doesn't cover risk. If one or more GPUs die, who pays for it? If you rent, you are insulated from this risk. But when you own, you might not get the best return policy from the vendor. And if you are actually at fault for breaking it, they have every right to deny a return. Or if your apartment is burglarized or catches fire (possibly from overloading the circuit), you are out the entire investment.