What are skiplists good for?

257 points - last Friday at 1:57 PM

Comments

carlsverre yesterday at 1:38 PM

(I used to work at SingleStore, and now work at Antithesis)

SingleStore (f.k.a. MemSQL) used lock-free skiplists extensively as the backing storage of their rowstore tables and indexes. Adam Prout (ex CTO) wrote about it here: https://www.singlestore.com/blog/what-is-skiplist-why-skipli...

When SingleStore added a Columnar storage option (LSM tree), L0 was simply a rowstore table. Since rowstore was already a highly optimized, durable, and large-scale storage engine, it allowed L0 to absorb a highly concurrent transactional write workload. This capability was a key part of SingleStore's ability to handle HTAP workloads. If you want to learn more, take a look at this paper which documents the entire system end-to-end: https://dl.acm.org/doi/10.1145/3514221.3526055

cremer yesterday at 9:33 AM

Redis sorted sets are probably the most widely deployed example. Redis uses a skiplist for range queries and ordered iteration paired with a hash table for O(1) lookups. Together they cover the full API at the right complexity for each operation

Skiplists also win over balanced BSTs when it comes to concurrent access. Lock-free implementations are much simplier to reason about and get right. ConcurrentSkipListMap has been in the standard library since Java 6 for exactly this reason and it holds up well under high contention

ozgrakkurt yesterday at 11:50 AM

Some more links that are inside the article:

- More info about skiplists: https://arxiv.org/pdf/2403.04582

- Performance comparison with B-tree ?: https://db.cs.cmu.edu/papers/2018/mod342-wangA.pdf

- Other blog from Anthithesis about writing their own db: https://antithesis.com/blog/2025/testing_pangolin/

Also I find it a bit hard to understand the performance outcome of this setup.

I know formats like parquet and databases like ClickHouse work better when duplicating data instead of doing joins. I guess BigQuery is similar.

The article is great but would be also interesting to learn how performance actually worked out with this.

josephg yesterday at 7:17 AM

For this problem, I’d consider a different approach. You have a fuzzer, and based on some seed it’s generating lots of records. You then need to query a specific record (or set of records) based on the leaf.

I’d just store a table of records with the leaf, associated with the seed. A good fuzzer is entirely deterministic. So you should be able to regenerate the entire run from simply knowing the seed. Just store a table of {leaf, seed}. Then gather all the seeds which generated the leaf you’re interested in and rerun the fuzzer for those seeds at query time to figure out what choices were made.

bob1029 yesterday at 7:28 AM

On practical machines they aren't good for much. To access a value in a skip list you have to dereference way more pointers than in a b+ tree. On paper they're about the same, but in practice the binary tree will tend to outperform. You get way more work done per IO operation.

ahartmetz yesterday at 5:05 AM

Skiplists have some nice properties - the code is fairly short and easy to understand, for one. Qt's QMap used to be skip list based, here's the rationale given for it: https://doc.qt.io/archives/qq/qq19-containers.html#associati...

talideon yesterday at 8:39 PM

Plenty, but these days mostly if you either (a) want a simple implementation or (b) don't have to worry much about cache locality. The problem is that (b) doesn't really exist outside of retrocomputing and embedded systems these days. It's still one of my favourite data structures, just because it's a clever way to get most of the benefits of more complicated datastructures on small systems with minimal code.

winwang yesterday at 6:01 AM

Only somewhat related but there is supposedly a SIMD/GPU-friendly skiplist algo written about here: https://csaws.cs.technion.ac.il/~erez/Papers/GPUSkiplist.pdf

maxtaco yesterday at 7:08 PM

Backpointers to earlier epochs in append-only cryptographic data structures like key transparency logs. If the client last fetched epoch 1000, and the server reports the current epoch is 3000, the server can return log(2000) intermediate epochs, following skip pointers, to provide a hash chain from epoch 3000 to 1000.

https://github.com/foks-proj/go-foks/blob/main/lib/merkle/bi... https://github.com/foks-proj/go-foks/blob/main/lib/merkle/cl...

torginus yesterday at 2:02 PM

>What are skiplists good for

In practice, I have found out, nothing much. Their appeal comes from being simpler to implement than self-balancing trees, while claiming to offer the same performance.

But they completely lack a mechanism for rebalancing, and are incredibly pointer heavy (in this implementation at least), and inserts/deletes can involve an ungodly amount of pointer patching.

While I think there are some append-heavy access patterns where it can come up on top, I have found that the gap between using a BST, a hashtable, or just putting stuff in an array and just sorting it when needed is very small.

mrjn yesterday at 5:30 AM

skiplists form the basis of in-memory tables used by LSM trees, which are themselves the basis of most modern DBs (written post 2005).

teiferer yesterday at 7:16 AM

In the age of agentic programming and the ever increasing pressure to ship faster, I'm afraid this kind of knowledge will become more and more fringe, even moreso than it is today. Who has the time to think through the intricacies of parallel data structures? Clearly we'll just throw more hardware at problems, write yet another service/api/http endpoint and move on to the next hype. The LLM figures out the algorithms and we soon lose the skills to develop new ones. And tell each other the scifi BS myth that "AI" will invent new data structures in the future so we don't even beed humans in the loop.

siddsukh yesterday at 9:32 PM

The skiptree is a great example of constraint-driven invention - you didn't set out to build a new data structure, you had a specific constraint (BigQuery's poor point lookups) and the solution naturally emerged from combining two existing ideas. The part about writing a JavaScript compiler to generate kilobyte-scale SQL queries is underappreciated, too. Most engineers would have switched databases, but building a compiler when output is too complex to write by hand is almost always the right call.

tooltower yesterday at 6:26 AM

In my personal projects, I've used it to insert/delete transactions in a ledger. I wanted to be able to update/query the account balance fast. Like the article says, "fold operations".

medbar yesterday at 10:12 AM

Skiplist operations are local for the most part, which makes it easier to write thread-safe code for than b-trees in practice. Anecdotally, they were a nice implementation problem for my Java class in uni. But I liked working with b-lists more.

Skip trees/graphs sound interesting, but I can't think of any use case for them off the top of my head.

shawn_w yesterday at 8:07 AM

Random access with similar performance to a balanced binary tree, and ordered iteration as simple as a linked list. It's a nice combination. (Of course, so is a binary search of a sorted array, which I lean more towards these days unless doing a lot of random insertions and deletions throughout the life of the mapping).

aaa_aaa yesterday at 10:10 AM

Almost nothing. My friend and I used it once (in a rather obscure problem). Then used simple lists with some tricks with better performance because of the locality etc.

torben-friis yesterday at 9:36 AM

Could someone provide intuitive understanding for why the "express lanes" in a skip list are created probabilistically?

My first instinctive idea would be that there is an optimal distance, maybe based on absolute distance or by function of list size or frequency of access or whatever. Leaving the promotion to randomness is counter intuitive to me.

ncruces yesterday at 8:56 PM

I guess I'm missing the point of this, so I'm probably looking at it wrong.

If you're saving multiple ancestors in ancestors_between why not save all of them?

And if the goal is to access the info for all of the ancestors, what makes it more likely than average that your ancestors aren't on the same layer as yourself (i.e. that e.g. half of them aren in ancestors_between)?

Because, if a fixed ratio (50 or even 10%) of an arbitrary node's ancestors is at the same layer, isn't complexity still the same (only reduced by a constant factor)?

fnordpiglet yesterday at 9:43 AM

A major global bank operated all trading, especially the complex stuff, off of a globally replicated skip list.

locknitpicker yesterday at 5:47 AM

FTA:

> Skiplists to the rescue! Or rather, a weird thing we invented called a “skiptree”…

I can't help but wonder. The article makes no mention of b-trees if any kind. To me, this sounded like the obvious first step.

If their main requirement was to do sequential access to load data, and their problem was how to speed up tree traversal on an ad-hoc tree data structure that was too deep, then I wonder if their problem was simply having tree nodes with too few children. A B+ tree specifically sounds to be right on the nose for the task.

calvinmorrison yesterday at 8:15 PM

Cyrus uses skip lists for its internal db structs.

esafak yesterday at 1:53 PM

What they really wanted was an HTAP. https://en.wikipedia.org/wiki/Hybrid_transactional/analytica...

einpoklum yesterday at 6:00 PM

> Every insert would need to write to both systems, and since we want to analyze the data online (while new writes are streaming in) keeping the two databases consistent would require something like two-phase commit (2PC).

Not convincing. One can write the bulk data which is at first unused - no need to sync anything. Then one writes to the tree DB using, where each node only stores a key to the relevant data.

m00dy yesterday at 11:08 AM

It was really cool to mention its name during tech interviews but not anymore I guess.

hpcgroup yesterday at 3:18 PM

[dead]

linzhangrun yesterday at 7:31 AM

[dead]

jimmypk yesterday at 1:03 PM

[flagged]

feverzsj yesterday at 6:38 AM

If you need a graph db, use a graph db.