I think you would find the splay tree provides similar properties, potentially i...

jsnell · on July 30, 2021

I don't think this is about hot-data optimization as such though, but about very short-term local clustering of keys.

I.e. it's not that a:b:c and a:b:d are globally accessed more often than x:y:z, but that after a:b:c is accessed it's more likely that the next access is a:b:d than x:y:z.

crdrost · on July 30, 2021

It's also that they are different data structures used historically for different things, no?

It'd be interesting to see a high-fanout splay tree, not even 100% sure how it'd work. Do you splay just the one node up to the root and have to split/combine each splay? Or do you only splay the first node in a "chunk" up? Or the whole "chunk"? Is it better to do a B+ type of thing rather than a B-tree type of thing?

bob1029 · on July 30, 2021

In theory, this could provide dramatic performance uplift if the splay is done along lines of an aggregate value function over the node contents, and the contents of the node are selected such that their aggregate value function is well-defined & ordered relative to other nodes. I've got an implementation I might try this with. There are a ton of directions you could take this in depending on specifically what you are trying to optimize for.

Looking at this from a mechanical sympathy perspective - If you are unable to batch tree operations for whatever reason, increasing fanout is a good intermediate answer since you can make your storage system do more effective work per I/O. You can't ever read or write in units smaller than 1 storage block, so maybe shoving more logical items into each block is where you need to look.

Practically, this is probably not feasible in a wide number of use cases. GUID/UUID keys (non-linear) would break this scheme in my mind.

eternalban · on July 30, 2021

The tree structure is invariant across the balancing/search algorithms. Maybe you can do splay on loads and then rebalance. Q is whether this ends up being more costly over time. Now I wonder if there can be a hybric splay with distinct sub-roots.

foobiekr · on July 30, 2021

I love splay trees, skew heaps, etc. data structures that get very little love.

But I’ve actually tried to use splay trees in production code. The performance benefits versus a simple Patricia tree seemed totally absent in actual practice even for things like very frequent adjacent queries. The structure is neat but paying attention to memory layout and cache line size is far, far more important.

bob1029 · on July 30, 2021

I use splay trees for their indirect utility. That is, the splay tree is a trivial way to calculate minimal, incremental updates to an append-only log. With most (all?) other b-tree structures, you would need to rewrite an unsustainable (or otherwise unpredictable) amount of tree data to the append-only log each time it is modified. The magic is that the splay operation redefines the root node every time you access the tree, so you just have to keep track of all the nodes you visited in each transaction batch and then write that sub-tree to the end of the log.

>paying attention to memory layout and cache line size is far, far more important.

Totally agree. For scenarios where I know I am going to have fewer than ~100k of something (assume an array of structs), I typically just do a linear scan over memory. If its a bunch of objects in memory or a cold access to storage, the trees become far more practical.

thomastay · on July 31, 2021

Hey, I'm really curious about how that works. Could you go into a bit more detail into that?

From what I can gather on your comment, it seems that if you are given a splay tree, and a batch of transactions i.e.

  GET 1
  SET 10 "hello"
  GET 15

You would write all the visited nodes, in order? But splaying involves a lot of tree operations and rotations too. I don't get how that works.

bob1029 · on July 31, 2021

> But splaying involves a lot of tree operations and rotations too. I don't get how that works.

Correct, but the whole idea is that over time the same subset of hot nodes is going to trickle to the top of the tree, so you are just rewriting node offsets in your temporary list as you process the batch. A node may be logically modified 100x per batch, and only written to storage once.

zinclozenge · on July 30, 2021

One of the things that surprised me was that linear search on optimally laid out data was frequently faster than binary search with frequent pointer chasing.