Exploit indirect addressing in the RAM model.
Review the shortest path algorithm.
In shortest paths, edge lengths are often small integers (say at most \(C\)).
Observe heap behavior:
heap min is nondecreasing over time (monotone property)
at most \(C+1\) distinct values in the heap at once
(because we don't insert \(k+C\) until we delete some \(k\))
Idea: lots of things have same value. Keep in buckets.
How to exploit?
standard heap of buckets: \(O(m\log C)\) (slow), or \(O(m+n\log C)\) with Fibonacci heaps (messy).
Dial’s algorithm: \(O(m+nC)\).
in fact, \(O(m+D)\) for max distance \(D\)
space?
use array of size \(C+1\)
wrap around
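The wrap-around version can be sketched in Python (a minimal sketch, names mine; edge lengths assumed to be integers in \([1,C]\), with decrease-key done lazily by reinsertion):

```python
def dial_shortest_paths(n, adj, s, C):
    # Dijkstra with Dial's bucket queue: O(m + D) time, O(C) buckets.
    # adj[u] = list of (v, w) with integer weights 1 <= w <= C.
    INF = float('inf')
    dist = [INF] * n
    dist[s] = 0
    size = C + 1                       # wrap around: key d lives in bucket d % (C+1)
    buckets = [[] for _ in range(size)]
    buckets[0].append(s)
    done = [False] * n
    live = 1                           # entries currently sitting in buckets
    d = 0                              # candidate min; only increases (monotone)
    while live > 0:
        b = buckets[d % size]
        while b:
            u = b.pop()
            live -= 1
            if done[u]:                # stale entry from a lazy decrease-key
                continue
            done[u] = True
            for v, w in adj[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    buckets[dist[v] % size].append(v)  # old entry left behind as stale
                    live += 1
        d += 1
    return dist
```

Since all live keys lie in \([d, d+C]\), the \(C+1\) circular slots never collide, which is exactly the space trick above.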
Can we improve this?
where are we wasting a lot of effort?
scanning over big empty regions
can we arrange to leap whole regions?
some kind of summary structure?
2-level buckets.
make blocks of size \(b\)
add layer on top: \(nC/b\) block summaries
in each summary, keep count of items in block
insert updates block and summary: \(O(1)\)
ditto decrease key
delete min scans summaries, then scans one block
over whole algorithm, \(nC/b\) scanning summaries
also, scan one block per delete min: \(b\) work
over \(n\) delete mins, work \(nb+nC/b\)
balance, optimize with \(b=\sqrt{C}\)
result: SP in \(O(m+n\sqrt{C})\)
as before, use the wrap-around "space trick" to keep array sizes \(O(C)\)
if we know \(D\) (or binary search for it), make the bucket size \(\sqrt{D/n}\) and get \(O(m+\sqrt{nD})\)
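A sketch of the two-level structure (names mine; for brevity it allocates the full key range instead of using the circular-array trick, and the caller handles decrease-key lazily as in Dial's algorithm):

```python
class TwoLevelBuckets:
    # Monotone integer priority queue: buckets grouped into blocks of
    # size b, with a per-block count as the summary.  delete_min scans
    # block summaries, then scans inside only one block.
    def __init__(self, maxkey, b):
        self.b = b
        nblocks = (maxkey // b) + 1
        self.buckets = [[] for _ in range(nblocks * b)]
        self.count = [0] * nblocks     # summary: number of items per block
        self.blk = 0                   # first possibly-nonempty block

    def insert(self, key, item):       # O(1); keys never go below the last min
        self.buckets[key].append(item)
        self.count[key // self.b] += 1

    def delete_min(self):
        while self.count[self.blk] == 0:         # summary scan: nC/b total
            self.blk += 1
        base = self.blk * self.b
        for key in range(base, base + self.b):   # scan one block: O(b)
            if self.buckets[key]:
                self.count[self.blk] -= 1
                return key, self.buckets[key].pop()
```

Monotonicity is what lets `self.blk` only move forward: a later insert can land in the current block but never in an earlier one.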
Generalize: 3 tiers?
blocks, superblocks, and summary
block size \(C^{1/3}\)
runtime \(O(m+nC^{1/3})\)
how far can this go? To \(m+n\)? No, because the insert cost rises with the number of levels.
Tries.
depth \(k\) tree over arrays of size \(\Delta\)
range \(C\) (details of reduction from \(nC\) omitted)
expansion factor \(\Delta=(C+1)^{1/k}\) (power of 2 simplifies)
insert: \(O(k)\) (also find, delete-non-min, decrease-key)
delete-min: \(O(k\Delta)=O(kC^{1/k})\) to find next element
Shortest paths: \(O(km+knC^{1/k})\)
Balance: \(nC^{1/k}=m\) so \(C=(m/n)^k\) so \(k=\log(C)/\log(m/n)\)
Runtime: \(m\log_{m/n}(C)\)
Space: \(k\Delta=kC^{1/k}=km/n\le (m/n)\log C\) using circular array trick.
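An eager (pre-laziness) sketch of the trie, assuming keys in \([0,\Delta^k)\); per-node subtree counts let delete-min skip empty children (names mine):

```python
class BucketTrie:
    # Depth-k trie over arrays of size delta.  Eager version: insert
    # pushes an item all the way to a leaf, O(k); delete_min scans at
    # most delta children per level, O(k * delta) = O(k * C^(1/k)).
    def __init__(self, k, delta):
        self.k, self.delta = k, delta
        self.root = self._node()

    def _node(self):
        # interior node: subtree size + child array; leaves are plain counts
        return {'count': 0, 'child': [None] * self.delta}

    def _digits(self, key):
        ds = []
        for _ in range(self.k):            # most-significant digit first
            ds.append(key % self.delta)
            key //= self.delta
        return ds[::-1]

    def insert(self, key):
        node = self.root
        digs = self._digits(key)
        for d in digs[:-1]:
            node['count'] += 1
            if node['child'][d] is None:
                node['child'][d] = self._node()
            node = node['child'][d]
        node['count'] += 1
        node['child'][digs[-1]] = (node['child'][digs[-1]] or 0) + 1

    def delete_min(self):
        node, key = self.root, 0
        for _level in range(self.k):
            for d in range(self.delta):    # scan for first nonempty child
                c = node['child'][d]
                if c is None:
                    continue
                if isinstance(c, int):     # leaf: multiplicity count
                    if c == 0:
                        continue
                    node['count'] -= 1
                    node['child'][d] = c - 1
                    return key * self.delta + d
                if c['count'] == 0:        # allocated but empty subtree
                    continue
                node['count'] -= 1
                key = key * self.delta + d
                node = c
                break
```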
Problems: space and time
Idea: be lazy! (Denardo and Fox 1979)
don’t push items down trie until you must
a unique block (array) is active on each level
keep other stuff piled up in buckets in each level
keep count of items in (buckets of) each level (not counting below)
only expand bucket when needed (and update level counts)
Operations:
Insert
walk the item down the trie until it stops in an (inactive) bucket or reaches the bottom
increment level count
\(O(k)\) work
Decrease key
Remove from current bucket
put in proper bucket
maybe descend (and revise level counts)
new bucket must be a prior sibling since monotone
so bucket exists
and no higher levels need be touched
amortization:
items descend a level per touch, but never rise
(critically relies on the monotone property)
so at most \(k\) expansions charged per item
alternatively, consider “potential” as “total heights of items”
inserting at top adds \(k\) potential
then all descents are free
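Spelled out, with height measured from the bottom of the trie:
\[
\Phi = \sum_{\text{items } x} \mathrm{height}(x), \qquad 0 \le \mathrm{height}(x) \le k.
\]
Inserting at the top adds at most \(k\) to \(\Phi\), so insert's amortized cost is \(O(k)\). Each pushdown moves one item down one level, so its \(O(1)\) real cost is paid for by \(\Delta\Phi = -1\): amortized cost zero. Since items never rise (monotonicity), \(\Phi\) only increases at inserts.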
Delete-min
remove item, update bottom level count
advance to new min
rise to first nonempty level
traverse to first nonempty bucket
expand till find min
time:
pushdowns are accounted for in insert cost
identifying min happens during pushdowns
rise to nonempty bucket costs less than pushdowns from it
so, cost to scan only one block
cost \(C^{1/k}\)
space to \(kC^{1/k}\)
Times:
\(O(k)\) insert (charge expansions to insert)
\(O(1)\) decrease key
\(O(C^{1/k})\) delete-min
paths runtime: \(O(m+n(k+C^{1/k}))\), choose \(k = 2\log C/\log \log C\): \(O(m+n(\log C)/\log\log C)\)
great, even if e.g. \(C=2^{32}\)
Further improvement: heap-on-top (HOT) queues get \(O(m+n(\log C)^{1/3})\) time (Cherkassky, Goldberg, and Silverstein, SODA 1997).
Implementation experiments—good model for project
Van Emde Boas, "Design and Implementation of an Efficient Priority Queue", Math. Systems Theory 10 (1977).
Thorup, “On RAM priority queues” SODA 1996.
Idea
in bucket heaps, the problem of finding the next nonempty bucket is itself a heap problem. Recurse!
\(b\)-bit words
\(\log b\) running times
Thorup's paper improves this to \(\log\log n\)
consequence for sorting.
Algorithm.
array of buckets but fast lookup on top.
queue \(Q\) on \(b\) bits is struct
\(Q.\min\) is current min, not stored recursively
array \(Q.low[]\) of \(\sqrt{U}\) queues on the low-order half-words, one per bucket
\(Q.high\), vEB queue on high order bits of elements other than current min in queue
Insert \(x\):
if \(x < Q.\min\), swap
now insert \(x\) in recursive structs
expand \(x=(x_h,x_l)\) high and low half words
If \(Q.low[x_h]\) nonempty, then insert \(x_l\) in it
else, make new queue holding \(x_l\) at \(Q.low[x_h]\), and insert \(x_h\) in \(Q.high\)
note two inserts, but one to an empty queue, so constant time
Delete-min:
need to replace \(Q.\min\)
Look at \(Q.high.\min\); if null, the queue is empty.
else, gives first nonempty bucket \(x_h\)
Delete min from \(Q.low[x_h]\) to finish finding \(Q.\min\)
If results in empty queue, Delete-min from \(Q.high\) to remove that bucket from consideration
Note two delete mins, but second only happens when first was constant time.
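A sketch of the structure with distinct \(b\)-bit keys assumed (names mine; a Python dict stands in for the constant-time hash table discussed below):

```python
class VEB:
    # van Emde Boas priority queue on b-bit keys (distinct keys assumed).
    # Q.min is stored directly, never recursively; Q.high holds the high
    # half-words of the nonempty buckets in Q.low.
    def __init__(self, b):
        self.b = b
        self.min = None
        if b > 1:
            self.half = b // 2       # low half-word width
            self.high = None         # VEB over occupied bucket indices
            self.low = {}            # x_h -> VEB over low half-words
        else:
            self.other = None        # b == 1: at most one non-min key

    def insert(self, x):
        if self.min is None:
            self.min = x
            return
        if x < self.min:
            x, self.min = self.min, x          # swap: new key becomes min
        if self.b == 1:
            self.other = x
            return
        xh, xl = x >> self.half, x & ((1 << self.half) - 1)
        if xh in self.low:
            self.low[xh].insert(xl)            # the one non-constant recursion
        else:
            q = VEB(self.half)
            q.insert(xl)                       # insert into empty queue: O(1)
            self.low[xh] = q
            if self.high is None:
                self.high = VEB(self.b - self.half)
            self.high.insert(xh)               # the one non-constant recursion

    def delete_min(self):
        m = self.min
        if m is None:
            return None                        # queue is empty
        if self.b == 1:
            self.min, self.other = self.other, None
            return m
        if self.high is None or self.high.min is None:
            self.min = None                    # no non-min keys remain
            return m
        xh = self.high.min                     # first nonempty bucket, O(1)
        xl = self.low[xh].delete_min()         # non-constant recursion
        self.min = (xh << self.half) | xl
        if self.low[xh].min is None:           # bucket emptied: that delete
            del self.low[xh]                   # was constant time, so we can
            self.high.delete_min()             # afford this second recursion
        return m
```

Both operations make at most one recursive call on a queue of \(b/2\)-bit keys that isn't constant time, giving the \(O(\log b)\) bound.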
Problem: space
need constant time hash table.
non-trivial complexity theory,
but can manage with randomization or slight time loss.