Desketching
For an arbitrary query q, parallel comparison computes the index i such that
sketch
(xi-1) ≤sketch
(q) ≤sketch
(xi)
Unfortunately, the sketch function is not in general order-preserving outside the set of keys, so it is not necessarily the case that xi-1 ≤ q ≤ xi. What is true is that, among all of the keys, either xi-1 or xi has the longest common prefix with q. This is because any key y with a longer common prefix with q would also have more sketch bits in common with q, and thus sketch
(y) would be closer to sketch
(q) than any sketch
(xj).
The length longest common prefix between two w-bit integers a and b can be computed in constant time by finding the most significant bit of the bitwise XOR between a and b. This can then be used to mask out all but the longest common prefix.
Note that p identifies exactly where q branches off from the set of keys. If the next bit of q is 0, then the successor of q is contained in the p1 subtree, and if the next bit of q is 1, then the predecessor of q is contained in the p0 subtree. This suggests the following algorithm:
- Use parallel comparison to find the index i such that
sketch
(xi-1) ≤sketch
(q) ≤sketch
(xi). - Compute the longest common prefix p of q and either xi-1 or xi (taking the longer of the two).
- Let l-1 be the length of the longest common prefix p.
- If the l-th bit of q is 0, let e = p10w-l. Use parallel comparison to search for the successor of
sketch
(e). This is the actual predecessor of q. - If the l-th bit of q is 1, let e = p01w-l. Use parallel comparison to search for the predecessor of
sketch
(e). This is the actual successor of q.
- If the l-th bit of q is 0, let e = p10w-l. Use parallel comparison to search for the successor of
- Once either the predecessor or successor of q is found, the exact position of q among the set of keys is determined.
Read more about this topic: Fusion Tree