Debunking Zswap and Zram Myths

159 points - today at 10:39 AM


rini17 today at 8:17 PM
You mean the zswap part of cleancache? But that fell out of the kernel completely, no? And zram gained support for a backing device.

BTW most zram tutorials get this wrong: you are supposed to manually mark idle pages and initiate writeback by periodically writing to /sys/block/zramX/idle and /sys/block/zramX/writeback. Otherwise zram will never write anything to the backing device. It is documented in the kernel docs; it's just that if you expect it to work automatically, you might misread it.
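Roughly, the manual cycle described in Documentation/admin-guide/blockdev/zram.rst looks like this (a sketch, assuming zram0 with CONFIG_ZRAM_WRITEBACK and a backing_dev already configured; run it from a cron job or systemd timer):

```shell
# Manual zram writeback cycle (sketch; needs root and a configured
# backing_dev, otherwise the guard below simply skips everything).
ZDEV=/sys/block/zram0
if [ -w "$ZDEV/writeback" ]; then
  echo all  > "$ZDEV/idle"       # mark every stored page as idle
  # ...time passes; pages accessed in the meantime lose the idle flag...
  echo idle > "$ZDEV/writeback"  # push still-idle pages to backing_dev
fi
```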

patrakov today at 12:30 PM
User here, who also acts as a Level 2 support for storage.

The article contains some solid logic plus an assumption that I disagree with.

Solid logic: you should prefer zswap if you have a device that can be used for swap.

Solid logic: zram + other swap = bad due to LRU inversion (zram becomes a dead weight in memory).

Advice that matches my observations: zram works best when paired with a user-space OOM killer.

Bold assumption: everybody who has an SSD has a device that can be used for swap.

The assumption is simply false, and not due to the "SSD wear" argument. Many consumer SSDs, especially DRAMless ones (e.g., Apacer AS350 1TB, but also seen on Crucial SSDs), under synchronous writes, will regularly produce latency spikes of 10 seconds or more, due to the way they need to manage their cells. This is much worse than any HDD. If a DRAMless consumer SSD is all that you have, better use zram.

0x_rs today at 7:51 PM
Been using zram since it hit the kernel, with the same "petard" priorities and disk-backed swap. I don't remember the details now, but zswap many years ago would not handle hibernation "well" (as well as it can get..), or at least not better than zram plus a distinct hibernation swap at -XXX priority. But zram definitely has some caveats with that setup: it will lead to the disk cache being used and requiring manual flushing, for example after hibernation, because zram-generator (if you're using it) isn't ready yet on resume, from what I recall. This seems like such a neatly written post that I'm going to try going with zswap from now on.

lproven today at 3:13 PM
I wrote about this recently too:

https://www.theregister.com/2026/03/13/zram_vs_zswap/

I prefer zswap to zram and as I linked at the end of the piece, it's not just me:

https://linuxblog.io/zswap-better-than-zram/

Maybe I am overthinking it, but I am wondering whether this piece about myths is in any way a response to my article?

seba_dos1 today at 6:34 PM
A simpler alternative to OOM daemons could be enabling MGLRU's thrashing prevention: https://www.kernel.org/doc/html/next/admin-guide/mm/multigen...

I'm using it together with zram sized to 200% of RAM on a low-RAM phone with no disk swap (plus some tuning like the mentioned clustering knob), and it works pretty well if you don't mind some otherwise preventable kills, but I will happily switch to diskless zswap once it's ready.
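For reference, MGLRU's thrashing prevention comes down to one knob (a sketch, assuming a kernel built with CONFIG_LRU_GEN, roughly 6.1+; the 1000 ms value is illustrative):

```shell
# Enable MGLRU and its thrashing prevention (sketch; needs root, and a
# CONFIG_LRU_GEN kernel -- the guard skips everything otherwise).
LRU=/sys/kernel/mm/lru_gen
if [ -w "$LRU/min_ttl_ms" ]; then
  echo y    > "$LRU/enabled"     # turn MGLRU on if it isn't already
  echo 1000 > "$LRU/min_ttl_ms"  # protect the working set of the last
                                 # 1000 ms: OOM-kill instead of thrashing
fi
```

A nonzero min_ttl_ms is what trades "some otherwise preventable kills" for not thrashing, as the comment describes.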

CoolGuySteve today at 2:15 PM
Would be nice if zswap could be configured to have no backing cache so it could completely replace zram. Having two slightly different systems is weird.

There's not really any difference between swap on disk being full and swap in ram being full, either way something needs to get OOM killed.

Simplifying the configuration would probably also make it easier to enable by default in most distros. It's kind of backwards that the most common Linux distros other than ChromeOS are behind Mac and Windows in this regard.

MrDrMcCoy today at 7:31 PM
Thank you for all your wonderful work, Chris! Just curious: is it feasible to eventually support sending the page clusters to backing swap in their compressed state to further reduce I/O? It's my understanding that the clusters get decompressed before getting sent to disk, which I presume is to simplify addressing.

garaetjjte today at 5:16 PM
>They size the zram device to 100% of your physical RAM, capped at 8GB. You may be wondering how that makes any sense at all – how can one have a swap device that's potentially the entire size of one's RAM?

zram size applies to uncompressed data; real usage grows dynamically (plus static bookkeeping). Most memory compresses well, so you probably want the zram device size even larger than physical RAM.

astrobe_ today at 6:17 PM
> It only really makes sense for extremely memory-constrained embedded systems

Even "mildly" memory constrained embedded systems don't use swap because their resources are tailored for their function. And they are typically not fans [1] of compression either because the compression rate is often unpredictable.

[1] Yes, they typically don't need fans because overheating and using a motor for cooling is a double waste of energy.

prussian today at 2:28 PM
With zram, I can just use zram-generator[0] and it does everything for me; I don't even need to set anything up beyond installing the systemd generator, which on some distros is installed by default. Is there anything equivalent for zswap? Otherwise, I'm not surprised most people are just using zram, even if it's sub-optimal.

[0]: https://crates.io/crates/zram-generator
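There's arguably less to generate for zswap, since it's configured through module parameters rather than a device. A sketch of the runtime knobs (assumes a CONFIG_ZSWAP kernel and an existing swap device; the values are illustrative):

```shell
# zswap runtime configuration via module parameters (sketch; needs root,
# and the guard skips everything on kernels without zswap).
P=/sys/module/zswap/parameters
if [ -w "$P/enabled" ]; then
  echo 1    > "$P/enabled"
  echo zstd > "$P/compressor"        # compression algorithm
  echo 20   > "$P/max_pool_percent"  # cap compressed pool at 20% of RAM
fi
# Or persistently, on the kernel command line:
#   zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20
```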

MaxCompression today at 3:29 PM
One underappreciated aspect of zswap vs zram is the compression algorithm choice and its interaction with the data being compressed.

LZ4 (a common default for both) is optimized for speed at the expense of ratio — typically 2-2.5x on memory pages. zstd can push that to 3-3.5x but at significantly higher CPU cost per page fault.

The interesting tradeoff: memory pages are fundamentally different from files. They contain lots of pointer-sized values, stack frames, and heap metadata — data patterns where simple LZ variants actually perform surprisingly well relative to more complex algorithms. Going beyond zstd (e.g., BWT-based or context mixing) would give diminishing returns on memory pages while destroying latency.

So the real question isn't just "zswap vs zram" but "how much CPU are you willing to spend per compressed page, given your workload's memory access patterns?" For latency-sensitive workloads, LZ4 with zswap writeback is hard to beat.
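The ratio-vs-CPU tradeoff above can be sketched with a toy experiment; gzip compression levels stand in for lz4 vs zstd here (which may not be installed), and the synthetic "page" mimics the repetitive pointer-like data the comment describes:

```shell
# Illustrative only: fast vs. strong compression on a synthetic page of
# incrementing pointer-like hex values. Absolute numbers differ from
# lz4/zstd on real pages; the shape of the tradeoff is the point.
page=$(mktemp)
for i in $(seq 1 512); do printf '0x7f3a00%04x\n' "$i"; done > "$page"
orig=$(wc -c < "$page" | tr -d ' ')
fast=$(gzip -1 -c "$page" | wc -c | tr -d ' ')   # cheap, worse ratio
best=$(gzip -9 -c "$page" | wc -c | tr -d ' ')   # expensive, better ratio
rm -f "$page"
echo "orig=${orig}B fast=${fast}B best=${best}B"
```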

guenthert today at 1:05 PM
So much polemic and no numbers? If it is a performance issue, show me the numbers!

Szpadel today at 6:16 PM
There is one more feature zram has: multiple compression levels. I use a simple bash script to compress with a fast algorithm first and, after 1h, recompress with a much stronger one.

Unfortunately you cannot chain it with any additional layer or offload to disk later on, because recompression breaks idle tracking by resetting the timestamp to 0 (so it's 1970 again).

https://gist.github.com/Szpadel/9a1960e52121e798a240a9b320ec...
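The core of that idea looks roughly like this (a sketch, not the linked script; assumes zram recompression support, i.e. CONFIG_ZRAM_MULTI_COMP on roughly 6.1+, and that recomp_algorithm is set during device setup):

```shell
# Two-tier zram compression (sketch; needs root, guard skips otherwise).
ZDEV=/sys/block/zram0
if [ -w "$ZDEV/recompress" ]; then
  # Register a slower, stronger secondary algorithm in priority slot 1
  # (done at device setup time, alongside comp_algorithm).
  echo "algo=zstd priority=1" > "$ZDEV/recomp_algorithm"
  echo all > "$ZDEV/idle"            # mark everything idle now
  # ...an hour later (cron/systemd timer), recompress what's still idle:
  echo "type=idle" > "$ZDEV/recompress"
fi
```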

Mashimo today at 1:56 PM
Is this advice also applicable to desktop installations?

adgjlsfhk1 today at 12:50 PM
Can you make a follow-up here on the best way to set up swap to support full disk encryption + hibernation?

nephanth today at 12:10 PM
I used to put swap on zram when my laptop had one of those early ssds, that people would tell you not to put swap on for fear of wearing them out

Setup was tedious

tonnydourado today at 3:15 PM
That's a banger article; I don't even like low-level stuff and yet I read the whole thing. Hopefully I'll have the opportunity to use some of it if I ever get around to switching my personal notebook back to Linux.

jitl today at 12:32 PM
thank goodness Kubernetes got support for swap; zswap has been a great boon for one of my workloads
quapster today at 3:04 PM
The interesting meta-point here is how a kernel mechanism turned into cargo-cult tuning advice.

"Use zram, save your SSD" made sense in the era of tiny eMMC, no TRIM, and mystery flash controllers. It also fit a very human bias: disk I/O feels scary and finite, CPU cycles feel free and infinite. So zram became a kind of talisman you enable once and never think about again.

But the kernel isn't optimizing for your feelings about SSD wear, it's optimizing for global memory pressure. zswap fits into that feedback loop, zram mostly sits outside it. Once you see that, the behavior people complain about ("my system thrashes and then dies mysteriously") stops being mysterious: they effectively built a second, opaque memory pool that the MM subsystem can't reason about or reclaim from cleanly.

What's funny is that on modern desktops and servers, the alleged downside of zswap (writing to disk sometimes) is the one thing the hardware is extremely good at, while the downside of zram (locking cold garbage in RAM and confusing reclaim/oom) is exactly what you don't want when the machine is under stress. The folk wisdom never updated, but the hardware and the kernel did.