Bugs Rust won't catch

376 points - today at 2:19 AM

Comments

collinfunk today at 4:48 AM
Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.

I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to make calls to fstat and compare the st_dev and st_ino. However, that was mentioned in the article. A side effect that seems less often considered is the performance impact. Here is an example in practice:

  $ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
  $ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; done 2>/dev/null
  $ echo a > file
  $ time cp file copy

  real 0m0.010s
  user 0m0.002s
  sys 0m0.003s
  $ time uu_cp file copy

  real 0m12.857s
  user 0m0.064s
  sys 0m12.702s

I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].

Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true [2]. :)

[1] https://www.gnu.org/prep/standards/standards.html#Semantics

[2] https://github.com/advisories/GHSA-w9vv-q986-vj7x
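A minimal sketch of the fstat-style comparison described above, in Rust. The `same_file` name is hypothetical. It assumes a Unix target and uses `File::metadata` on handles that are already open, plus `MetadataExt::dev` and `MetadataExt::ino`, so neither path string has to be canonicalized.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper: true if both paths refer to the same inode on the
/// same device. Each path is opened once, and the metadata comes from the
/// open handle (fstat), not from resolving the path a second time.
fn same_file(a: &Path, b: &Path) -> io::Result<bool> {
    let file_a = File::open(a)?;
    let file_b = File::open(b)?;
    let meta_a = file_a.metadata()?;
    let meta_b = file_b.metadata()?;
    // Compare st_dev and st_ino instead of comparing path strings.
    Ok(meta_a.dev() == meta_b.dev() && meta_a.ino() == meta_b.ino())
}
```

Two hard links to one file compare equal here, and two separate files with identical contents do not. Opening can still fail on unreadable files or block on FIFOs, so a real tool would need more care than this.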

satvikpendem today at 12:18 PM
Unrelated but also in the category of bugs Rust won't catch (natively), there are crates that allow C++ style contracts, or more generally, dependent typing and can be used to catch issues at compile time rather than runtime. I use this one, anodized.

https://docs.rs/anodized/latest/anodized/

wahern today at 4:37 AM
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing

They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls. Most of those mistakes are exceedingly amateur from the perspective of long-time GNU coreutils (or BSD or Solaris base) developers, issues that were identified and largely hashed out decades ago, notwithstanding the continued long tail of fixes--mostly just a trickle these days--to the old codebases.

hombre_fatal today at 5:28 AM
One thing that's hard about rewriting code is that the original code was transformed incrementally over time in response to real world issues only found in production.

The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.

TFA is a good list of this exact sort of thing.

Before you call people amateur for it, also consider it's one of the most softwarey things about writing software. It was bound to happen unless coreutils had really good technical docs and included tests for these cases that they ignored.

lionkor today at 9:22 AM
I struggle to find anything on this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing with the GNU source for the coreutils. The whole coreutils rewrite is a terrible idea[1] and clearly being done in the wrong way (without the knowledge gained from the previous software).

If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all the mistakes. Embarrassing.

To be clear: I love Rust, I use it for various projects, and it's great. It doesn't save you from bad engineering.

[1]: https://www.joelonsoftware.com/2000/04/06/things-you-should-...

bluGill today at 11:40 AM
I have to partially disagree with applying Hyrum's law here. In the case of coreutils, there's not just the common GNU version. There's also what POSIX says they should do and what the various BSDs do, plus some other implementations from various vendors that we mostly forget about. If what this version of coreutils does differs from GNU in a way that other implementations also differ, breaking the behavior would be a good thing: any script that relies on it is already wrong in ways that matter in the real world, and may matter more in the future, so breaking it now is good. If your script depends on GNU's behavior, then you shouldn't be calling the standard version. You should explicitly specify the GNU version. That is, don't use cp. Use gnu-cp, or whatever name it is commonly installed under. Or check which version of cp you have.

Joker_vD today at 5:54 AM
> The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target.

It's actually somewhat worse than that, because an attacker with write access to a parent directory can mess with hard links as well... sure, that only affects the regular files themselves, but there are basically no mitigations. See e.g. [0] and other posts on the site.

[0] https://michael.orlitzky.com/articles/posix_hardlink_heartac...
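One way to picture the check-then-act pattern from the quote, next to the usual handle-based alternative, as a sketch. Both function names are hypothetical. The second version checks the object it actually opened, but it still follows symlinks at open time and does nothing about the hard-link problem; closing those gaps takes openat-style calls that `std::fs` does not offer.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Racy version: one syscall checks the path, a second one resolves the
/// same path again to read it. The path can be swapped in between.
fn read_if_regular_racy(path: &Path) -> io::Result<Option<Vec<u8>>> {
    if !fs::symlink_metadata(path)?.is_file() {
        return Ok(None);
    }
    // Window: the kernel re-resolves `path` from scratch here.
    fs::read(path).map(Some)
}

/// Handle-based version: open once, then check and read through the same
/// file descriptor, so the check and the action hit the same object.
fn read_if_regular(path: &Path) -> io::Result<Option<Vec<u8>>> {
    let mut file = File::open(path)?;
    if !file.metadata()?.is_file() {
        return Ok(None);
    }
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(Some(buf))
}
```

On an uncontended path both functions return the same result; they only differ when something replaces the path between the check and the read.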

tdiff today at 8:18 AM
Ok if there were some rust guys rewriting coreutils with no experience in linux, but how come Ubuntu accepted it into its mainline?

alkonaut today at 7:49 AM
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing

So does this mean that the original utils had no test harness, and that the rewrite didn't start by creating one either?

Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected (Such as not deleting the current directory)?

This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?

Alternatively: if the original utils _did_ have tests, and there were this many holes in the tests, then maybe there is a massive lack in the original utils test suite?

marcosscriven today at 9:01 AM
That’s a great article, and indeed a very good blog. Just spent ages reading lots of their other articles.

Of the bugs mentioned I think the most unforgivable one is the lossy UTF conversion. The mind boggles at that one!
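For anyone who hasn't run into it, a small sketch of why a lossy conversion is a problem for file names. It assumes a Unix target, where names are arbitrary bytes; both helper names are hypothetical.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Lossy round trip: every invalid UTF-8 sequence becomes U+FFFD, so the
/// result can name a different file than the input did.
fn lossy_round_trip(name: &OsStr) -> OsString {
    OsString::from(name.to_string_lossy().into_owned())
}

/// Byte-preserving round trip: the name is carried as raw bytes and comes
/// back unchanged.
fn exact_round_trip(name: &OsStr) -> OsString {
    OsString::from_vec(name.as_bytes().to_vec())
}
```

With a Latin-1 encoded name such as the bytes `caf\xe9.txt`, the lossy version returns a name containing U+FFFD, while the byte-preserving version returns the original bytes.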

oconnor663 today at 6:05 AM
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username.

That's kind of horrifying. Is there a reliable list somewhere of all the functions that do that? Is that list considered stable?

bayindirh today at 11:02 AM
> This is the largest cluster of bugs in the audit. It’s also the reason cp, mv, and rm are still GNU in Ubuntu 26.04 LTS. :(

This is what grinds my gears. Why all the hate against GNU?

Honestly, this is why I don't learn Rust, and why I didn't bother to read the rest of the article.

misja111 today at 7:13 AM
The root cause of some of the bugs seems to be the opaque nature of some of the Unix APIs. E.g.

> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0.

To me such a get_user_by_name function is like a booby trap, an accident that is waiting to happen. You need to have user data, you have this get_user_by_name function, and then it goes and starts loading shared libraries. This smells like mixing of concerns to me. I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.

z3t4 today at 9:47 AM
To be fair these are mostly gotchas with Linux and not Rust itself, but I guess the std in Rust could handle some of these issues, in that a std should not allow you to shoot yourself in the foot by default.

9fwfj9r today at 5:26 AM
So it's basically failing on:

- necessary atomicity for filesystem operations
- annoying path & string encoding
- inertia for historical behaviors

fschuett today at 5:26 AM
Thanks for the list. I like these lists, so I can put them into a .md file, then launch "one agent per file" on my codebase and see if they can find anything similar to the mentioned CVEs.

Rust won't catch it, but now the agents will.

Edit: https://gist.github.com/fschutt/cc585703d52a9e1da8a06f9ef93c... for anyone who wants to copy this

eb08a167 today at 9:17 AM
I'm totally fine with people experimenting and making amateur attempts at what adult people do. After all, that's how we grow. What I'm actually curious about is how the decision-making chain at Ubuntu got so messed up that this made it into production.

osmsucks today at 8:43 AM
I feel like one of the takeaways here is that Rust protects your code as long as what your code is doing stays predictably in-process. Touching the filesystem is always rife with runtime failures that your programming language just can't protect you from. (Or maybe it also suggests the `std::fs` API needs to be reworked to make some of these occurrences, if not impossible, at least harder.)

On a separate note: I have a private "coretools" reimplementation in Zig (not aiming to replace anything, just for fun), and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see. However, cross-checking uutils I noticed it does have a bunch of unsafe blocks that call into libc, e.g. https://github.com/uutils/coreutils/blob/77302dbc87bcc7caf87.... Thankfully they're pretty minimal, but every such block can reduce the safety provided by a Rust rewrite.

stackedinserter today at 11:48 AM
TIL that

> uutils read it as “send the default signal to PID -1”, which on Linux means every process you can see.

What's the use case for killing all processes you can see?
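The ambiguity in the quoted bug can be sketched as an argument-parsing rule. This assumes the POSIX `kill -signal_number pid...` form, where a first argument with a leading dash names a signal, and `--` ends option parsing so that a negative PID can follow. `parse_kill_args` is a hypothetical helper, not uutils or GNU code, and it ignores signal names, `-s`, and `-l`.

```rust
/// Default signal when none is given (SIGTERM is 15 on Linux).
const SIGTERM: i32 = 15;

/// Hypothetical parser returning (signal, pids) or an error message.
fn parse_kill_args(args: &[&str]) -> Result<(i32, Vec<i32>), String> {
    let (signal, rest) = match args.split_first() {
        // `--` ends options: everything after it is a PID, even `-1`.
        Some((first, rest)) if *first == "--" => (SIGTERM, rest),
        Some((first, rest)) => match first.strip_prefix('-') {
            // A leading dash on the first argument names a signal number.
            Some(spec) => {
                let sig = spec
                    .parse::<u8>()
                    .map_err(|_| format!("invalid signal: {}", first))?;
                (i32::from(sig), rest)
            }
            // No dash: the default signal applies and all arguments are PIDs.
            None => (SIGTERM, args),
        },
        None => (SIGTERM, args),
    };
    if rest.is_empty() {
        return Err("no process ID specified".to_string());
    }
    let pids = rest
        .iter()
        .map(|s| s.parse::<i32>().map_err(|_| format!("invalid PID: {}", s)))
        .collect::<Result<Vec<i32>, String>>()?;
    Ok((signal, pids))
}
```

Under this rule `["-1"]` is signal 1 with no PID, which is an error, while `["--", "-1"]` is the explicit request to send the default signal to PID -1.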

jolt42 today at 4:41 AM
I wonder if Rust becomes more popular with AI as Rust can help catch what AI misses, but then if that's the case then what about Haskell, or Lean, or?

PunchyHamster today at 11:21 AM
Seems like typical pattern of

* Let's rewrite thing in X, it is better

* Let's not look at existing code, X is better so writing it from scratch will look nicer

* Whoops, existing code was written like this for a reason

* Whoops, we re-introduce decade+ old problems that original already fixed at some point

einpoklum today at 8:56 AM
Note:

TOCTOU means "Time-of-check to time-of-use"

See also: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

r2vcap today at 9:18 AM
Just use Fedora :)

micheles today at 5:21 AM
> uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug.

That's the minimum, it is absurd that they did not start from that!

timcobb today at 7:09 AM
The title of this article should be "Rust can't stop you from not giving a fuck" or "Rust can't give a fuck for you."

---

> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing

...

[List of bugs a diligent person would be mindful of, unix expert or not]

---

Only conclusion I can make is, unfortunately, the people writing these tools are not good software developers, certainly not sufficiently good for this line of work.

For comparison, I am neither a unix neckbeard nor a rust expert, but with the magic of LLMs I am using rust to write a music player. The amount of tokens I've sunk into watching for undesirable panics or dropped errors is pretty substantial. Why? Because I don't want my music player to suck! Simple as that. If you don't think about panics or errors, your software is going to be erratic, unpredictable and confusing.

Now, coreutils isn't my hobby music player, it's fundamental Internet infrastructure! I hate sounding like a Breitbart commenter but it is quite shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure. Wow, honestly pathetic. Sorry to be so negative and for this word choice, but "shock" and "disappointment" are mild terms here for me.

Anyway, thanks to the author of this post! This is a red flag that should be distributed far and wide.

immanuwell today at 5:46 AM
rust promised you memory safety and delivered - but turns out the filesystem doesn't care about your borrow checker, and these 44 cves are the receipt

slopinthebag today at 5:21 AM
I find it interesting how people will criticise Rust for not preventing all bugs, when the alternative languages don't prevent those same bugs nor the bugs rust does catch. If you're comparing Rust to a perfect language that doesn't exist, you should probably also compare your alternative to that perfect language as well right?

I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime with the number in this rewrite. Same with the number of memory bugs that are impossible in (safe) Rust.

Don't just downvote me, tell me how I'm wrong.

Analemma_ today at 4:50 AM
I know nobody's perfect and I'm not asking for perfection, but these bugs are pretty alarming? It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they are trying to replace. Or at least didn't have any curiosity about why the GNU tools work the way they do. Otherwise they might've wondered about why things operate on bytes and file descriptors instead of strings and paths.

I hate to armchair general, but I clicked on this article expecting subtle race conditions or tricky ambiguous corners of the POSIX standard, and instead found that it seems to be amateur hour in uutils.

melodyogonna today at 10:40 AM
TL;DR: Rust can't catch logic bugs

rvz today at 5:17 AM
This is what happens when many people hype a technology that solves a specific class of vulnerabilities but is not designed to prevent others, such as logic errors caused by human / AI error.

Granted, the uutils authors are experienced in Rust, but that is not enough for a large-scale rewrite like this, and you can't assume that it's "secure" because of memory safety.

In this case, the post tells us that Unix itself has thousands of gotchas and that re-implementing coreutils in Rust is not a silver bullet. Some of the bugs in Unix (and even in the POSIX standard) are part of the specification, and can later be revealed as vulnerabilities in practice.