\documentclass[11pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{epsfig}
\usepackage{psfig}
\newcommand{\handout}[5]{
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf 6.851: Advanced Data Structures } \hfill #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\em #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{#3}{Scribe: #4}{Lecture #1}}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{observation}[theorem]{Observation}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{assumption}[theorem]{Assumption}
% 1-inch margins, from fullpage.sty by H.Partl, Version 2, Dec. 15, 1988.
\topmargin 0pt
\advance \topmargin by -\headheight
\advance \topmargin by -\headsep
\textheight 8.9in
\oddsidemargin 0pt
\evensidemargin \oddsidemargin
\marginparwidth 0.5in
\textwidth 6.5in
\parindent 0in
\parskip 1.5ex
%\renewcommand{\baselinestretch}{1.25}
\renewcommand{\labelenumi}{(\alph{enumi})}
\begin{document}
\lecture{10 --- March 8, 2010}{Spring 2010}{Prof.\ Erik Demaine}{Nicholas Zehender}
\section{Overview}
In the last lecture we covered van Emde Boas trees and $y$-fast trees, integer data structures which support insert, delete, predecessor, and successor operations in $O(\log w)$ time. These data structures are based on the Word RAM model, where we can manipulate a constant number of $w$-bit words in constant time.
In this lecture we covered fusion trees, as described by Fredman and Willard \cite{fw}. A fusion tree is a static data structure storing $n$ $w$-bit integers which supports predecessor and successor queries in $O(\log_w n)$ time per query and $O(n)$ space. There is also a dynamic version of fusion trees which achieves $O(\log_w n+\log\log n)$ time for updates, as described by Andersson and Thorup \cite{at}, but we did not cover this. Depending on whether $n$ is small or large compared to $w$, we can use fusion trees or van Emde Boas trees, and our query time will be $O\left(\min\left(\frac{\log n}{\log w},\log w\right)\right)$. This minimum is never greater than $O(\sqrt{\log n})$, giving us a sublogarithmic bound which depends only on $n$.
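
The $\sqrt{\log n}$ bound on the minimum can be checked with a short case analysis on $\log w$. This is a sketch that ignores constant factors:
\[
\min\left(\frac{\log n}{\log w},\,\log w\right)\le
\begin{cases}
\log w\le\sqrt{\log n} & \text{if } \log w\le\sqrt{\log n},\\[1mm]
\dfrac{\log n}{\log w}<\dfrac{\log n}{\sqrt{\log n}}=\sqrt{\log n} & \text{if } \log w>\sqrt{\log n}.
\end{cases}
\]
The two terms balance when $\log w=\sqrt{\log n}$, which is the case where the bound is tight.
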
\section{Fusion Trees}
A fusion tree is a B-tree with a branching factor of $w^{1/5}$. The depth of such a tree is $\log_{w^{1/5}}(n)=\log(n)/\log(w^{1/5})=5\log(n)/\log(w)$, which is $O(\log_w n)$.
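
As a rough illustration of the depth formula, consider the hypothetical parameters $w=2^{10}$ and $n=2^{20}$ (chosen here only to make the arithmetic clean, not taken from the lecture):
\[
w^{1/5}=2^{2}=4,\qquad
\log_{w^{1/5}}(n)=\log_4\left(2^{20}\right)=10=\frac{5\cdot 20}{10}=\frac{5\log(n)}{\log(w)}.
\]
A binary search tree on the same keys would have depth at least $\log_2 n=20$. The saving is a factor of $\log(w)/5$, so the constant $5$ is absorbed by the $O$-notation and the gain grows with $w$.
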
In order to actually achieve $O(\log(n)/\log(w))$ time for queries, we must deal with each node we encounter in constant time. However, each node holds $w^{1/5}$ keys, each $w$ bits long, which is $w^{6/5}$ bits in total, while in constant time we can only perform a constant number of operations on a constant number of $w$-bit words. To solve this problem, we will only compare the ``interesting bits'' of the keys.
\subsection{Interesting bits}
Consider what a binary trie containing the keys of a node would look like. This trie would have height $w$, the length of the keys, and would have $k=O(w^{1/5})$ leaves, one per key (since the keys are distinct).
``Interesting bits'' are the bit positions corresponding to levels of this trie that contain branching nodes; these are the bits we will look at in order to distinguish keys. Since our trie has $k$ leaves, it has $k-1$ branching nodes, and therefore at most $k-1$ interesting bits (possibly fewer, since multiple branching nodes might be on the same level). We will call the indices of the interesting bits $b_0$ through $b_{r-1}$, where $r\le k-1$ is the number of interesting bits.
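
As a small illustrative example, take $w=4$ and the $k=4$ keys below. The keys, and the convention of numbering bit positions from the least significant bit starting at $0$, are assumptions made here for concreteness:
\[
x_0=0000,\qquad x_1=0010,\qquad x_2=1100,\qquad x_3=1111.
\]
The trie branches at the root (bit $3$), separating $\{x_0,x_1\}$ from $\{x_2,x_3\}$. It then branches twice at the level of bit $1$, separating $x_0$ from $x_1$ and $x_2$ from $x_3$. No branching occurs at bits $2$ or $0$. So there are $k-1=3$ branching nodes but only two interesting bits, at positions $1$ and $3$, because two of the branching nodes share a level.
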
\subsection{Sketches}
To get the sketch of a word $x$, we delete all the bits except the interesting bits. Order is preserved: for our keys $x_i$, we have $sketch(x_i)<sketch(x_j)$ when $i