Gzip decompression in 250 lines of Rust

124 points - last Tuesday at 6:35 AM

Source

Comments

stgn last Friday at 3:39 PM
> so i wrote a gzip decompressor from scratch

After skimming through the author's Rust code, it appears to be a fairly straightforward port of puff.c (included in the zlib source): https://github.com/madler/zlib/blob/develop/contrib/puff/puf...

nayuki last Friday at 3:02 PM
Just like that author, many years ago I went through the process of understanding the DEFLATE compression standard and producing a concise decompressor for gzip+DEFLATE. Here are the resources I published as a result of that exploration:

* https://www.nayuki.io/page/deflate-specification-v1-3-html

* https://www.nayuki.io/page/simple-deflate-decompressor

* https://github.com/nayuki/Simple-DEFLATE-decompressor

Lerc last Friday at 4:40 PM
The function

  fn bits(&mut self, need: i32) -> i32 { ....
put me in mind of one of my early experiments in Rust. It would be interesting to compare it with an iterator-based form that just called .take(need).

I haven't written a lot of Rust, but one thing I did was to write an iterator that took an iterator of bytes as input and provided bits as output. I then fed it an iterator that gave bytes from a block of memory.

It was mostly as a test to see how much high level abstraction left an imprint on the compiled code.

The disassembly showed it pulling in 32 bits at a time and shifting out the bits pretty much the same way I would have written it in ASM.

I was quite impressed. Although I tested that it was working by counting the bits, and someone criticized it for not using popcount, so I guess you can't have everything.
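
A minimal sketch of the iterator-based form described above, not the article's code and not the original experiment. The names `Bits` and `read_bits` are hypothetical, and it assumes DEFLATE's bit order (least significant bit of each byte first) and `need <= 32`:

```rust
/// Adapts any iterator of bytes into an iterator of bits, yielding the
/// least significant bit of each byte first.
struct Bits<I: Iterator<Item = u8>> {
    bytes: I,
    current: u8,   // byte currently being consumed
    remaining: u8, // bits of `current` not yet yielded
}

impl<I: Iterator<Item = u8>> Bits<I> {
    fn new(bytes: I) -> Self {
        Bits { bytes, current: 0, remaining: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for Bits<I> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        if self.remaining == 0 {
            // Refill from the underlying byte iterator; stop when it ends.
            self.current = self.bytes.next()?;
            self.remaining = 8;
        }
        let bit = self.current & 1 == 1;
        self.current >>= 1;
        self.remaining -= 1;
        Some(bit)
    }
}

/// Pulls `need` bits with `.take(need)` and packs them into an integer,
/// first bit read in the lowest position. Returns `None` on short input.
fn read_bits(bits: &mut impl Iterator<Item = bool>, need: usize) -> Option<u32> {
    let mut value = 0u32;
    let mut got = 0;
    for (i, bit) in bits.by_ref().take(need).enumerate() {
        value |= (bit as u32) << i;
        got += 1;
    }
    (got == need).then_some(value)
}
```

Wrapping a byte source is then `Bits::new(data.iter().copied())`, and each field read is a call like `read_bits(&mut bits, 5)`. Whether the compiler turns this into the same word-at-a-time shifting as a hand-written `bits` function is something to check in the disassembly rather than assume.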

MisterTea last Friday at 2:51 PM
> twenty five thousand lines of pure C not counting CMake files. ...

Keep in mind this is also 31 years of cruft and lord knows what.

Plan 9 gzip is 738 lines total:

  gzip.c 217 lines
  gzip.h 40 lines
  zip.c  398 lines
  zip.h  83 lines
Even the zipfs file server that mounts zip files as file systems is 391 lines.

edit - post a link to said code: https://github.com/9front/9front/tree/front/sys/src/cmd/gzip

> ... (and whenever working with C always keep in mind that C stands for CVE).

Sigh.

jmmv last Friday at 9:09 PM
I was reading this and couldn't stop thinking https://en.wikipedia.org/wiki/Literate_programming

carlos256 last Friday at 5:07 PM
> the only flag we care about is FNAME

The specification does not define an encoding for the file name. Different file systems may impose restrictions on certain names, so FNAME should not be used.
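
For context, a decompressor can step over the name without ever interpreting it. The following is a rough sketch, assuming the header layout from RFC 1952; the function name `deflate_offset` is hypothetical and this is not the article's code:

```rust
// Flag bits of the FLG byte, as defined in RFC 1952.
const FHCRC: u8 = 0x02;
const FEXTRA: u8 = 0x04;
const FNAME: u8 = 0x08;
const FCOMMENT: u8 = 0x10;

/// Returns the offset of the first DEFLATE byte in a gzip member,
/// or `None` if the header is malformed or truncated.
fn deflate_offset(data: &[u8]) -> Option<usize> {
    // Fixed 10-byte header: magic 1f 8b, method 8 (DEFLATE), flags,
    // 4-byte mtime, extra flags, OS.
    if data.len() < 10 || data[0] != 0x1f || data[1] != 0x8b || data[2] != 8 {
        return None;
    }
    let flags = data[3];
    let mut pos = 10;

    if flags & FEXTRA != 0 {
        // Two-byte little-endian length, then that many bytes of extra data.
        let xlen = u16::from_le_bytes([*data.get(pos)?, *data.get(pos + 1)?]) as usize;
        pos += 2 + xlen;
    }
    for flag in [FNAME, FCOMMENT] {
        if flags & flag != 0 {
            // Zero-terminated field: skip the raw bytes without decoding them.
            let len = data.get(pos..)?.iter().position(|&b| b == 0)?;
            pos += len + 1;
        }
    }
    if flags & FHCRC != 0 {
        pos += 2; // CRC16 of the header, not verified in this sketch
    }
    (pos <= data.len()).then_some(pos)
}
```

Because the name is only skipped, the question of its encoding never reaches the decompressor; it matters only to a tool that wants to create a file with that name.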

socalgal2 last Friday at 9:20 PM
That reminds me of a zip file creator in a few lines of JS, now that CompressionStream is a built-in feature of the browser and Node. No need to use some bloated npm lib. But momentum and popularity (and LLMs) will keep people using JSZip for eternity.

up2isomorphism last Friday at 3:17 PM
Another dev who doesn’t show respect for what has been done and expects a particular language to do wonders for him. Also, I don’t see that this is much better in terms of readability.