\documentclass[11pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
%\usepackage{epsfig}
%\usepackage{psfig}
\usepackage{nicefrac}
\usepackage{graphicx}
\usepackage{qtree}
\newcommand{\handout}[5]{
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf 6.897: Advanced Data Structures } \hfill #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\em #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{#3}{Scribe: #4}{Lecture #1}}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{observation}[theorem]{Observation}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{assumption}[theorem]{Assumption}
% 1-inch margins, from fullpage.sty by H.Partl, Version 2, Dec. 15, 1988.
\topmargin 0pt
\advance \topmargin by -\headheight
\advance \topmargin by -\headsep
\textheight 8.9in
\oddsidemargin 0pt
\evensidemargin \oddsidemargin
\marginparwidth 0.5in
\textwidth 6.5in
\parindent 0in
\parskip 1.5ex
%\renewcommand{\baselinestretch}{1.25}
\begin{document}
\lecture{15 --- March 31, 2005}{Spring 2005}{Prof.\ Erik
Demaine}{Austin Clements}
\section{Overview}
In this lecture we introduced the topic of {\em range queries} with
the static {\em range minimum query} (RMQ) and {\em lowest common
ancestor} (LCA) problems.
In the static RMQ problem, we have a static array $A$ of $n$ numbers
and we want to be able to efficiently find the minimum element of any
contiguous subarray of $A$. In other words, we want to answer queries
of the form $RMQ(i,j) = \min(A[i], \ldots, A[j])$. For example, $15$
is the result of RMQ over the indicated range of the following array:
\[17\;0\;\underbrace{36\;16\;23\;\mathbf{15}\;42}\;18\;20\]
To do this we will transform the RMQ problem into an LCA problem such
that an efficient solution to LCA will yield an efficient solution to
RMQ. In LCA we are given a tree and two nodes $a$ and $b$ and we want
to find the lowest node of the tree that is a common ancestor of both
$a$ and $b$.
The static LCA problem was first solved with $O(1)$ time per query and
$O(n)$ space by Harel and Tarjan~\cite{ht84}. Cole and Hariharan developed a
dynamic version of LCA that achieves $O(1)$ time per operation and
allows for insertion and deletion of keys as well as subdivision and
merging of edges~\cite{ch99}. The simpler version of LCA discussed in
this lecture was introduced later by Bender and Farach-Colton and
achieves the same bounds as the original static result~\cite{bf00}.
Using it, we can also solve RMQ with linear space and constant query
time.
We can na\"ively solve RMQ with $O(1)$ time queries by simply storing
a table over $i, j$ of minimum values for each range. This table
requires $O(n^2)$ space and $O(n^2)$ time to compute (using a dynamic
program). The rest of this lecture concerns reducing this to $O(n)$
space and $O(n)$ construction time.
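
One way to write this dynamic program, as a sketch with $M$ a name
introduced here for the table, fills each row from left to right:
\begin{displaymath}
  M[i][i] = A[i], \qquad
  M[i][j] = \min(M[i][j-1],\, A[j]) \quad \text{for } 1 \le i < j \le n .
\end{displaymath}
Each of the $O(n^2)$ entries is computed in $O(1)$ time from the entry
to its left, and a query $RMQ(i,j)$ is then a single lookup of
$M[i][j]$.
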
\section{Reducing RMQ to LCA}
{\em Cartesian trees} are due to Gabow, Bentley, and
Tarjan~\cite{gbt84}. We used Cartesian trees in Lecture 13 for fast
sorting and will now revisit them for the purposes of reducing RMQ to
LCA. A Cartesian tree on the array $A$ has the following recursive structure:
\begin{itemize}
\item {\em Root} -- Minimum element of $A$. Suppose this is $A[i]$.
\item {\em Left subtree} -- Cartesian tree on $A[1], \ldots, A[i-1]$.
\item {\em Right subtree} -- Cartesian tree on $A[i+1], \ldots, A[n]$.
\end{itemize}
Cartesian trees can be constructed in $O(n)$ time, as we saw in
Lecture 13. Figure~\ref{fig:ctree} shows an example Cartesian tree.
\begin{figure}[htbp]
%\centering % Doesn't play well with qtree's centering
\Tree [.0 17 [.\textbf{15} [.16 \framebox{36} 23 ] [.18
\framebox{42} 20 ] ] ]
\caption{The Cartesian tree of $17, 0, 36, 16, 23, 15, 42, 18, 20$.
The same RMQ as before is shown as the LCA of the boxed elements.}
\label{fig:ctree}
\end{figure}
To reduce RMQ to LCA, we put the elements of the array $A$ into a
Cartesian tree. The RMQ of elements $i$ and $j$ is simply the LCA of
the nodes corresponding to $i$ and $j$ in the Cartesian tree. Thus,
if we can solve LCA in $O(1)$ time, we can solve RMQ in $O(1)$ time.
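
To see why this works on the example of Figure~\ref{fig:ctree},
consider the query over positions $3$ through $7$ (the values
$36, 16, 23, 15, 42$) and descend from the root:
\begin{itemize}
\item The root $0$ sits at position $2$, to the left of the range, so
  both endpoints lie in its right subtree.
\item The next node, $15$, sits at position $6$, inside the range, so
  position $3$ falls in its left subtree and position $7$ in its right
  subtree.
\end{itemize}
The node $15$ is therefore the LCA of the nodes for $36$ and $42$. It
is also the minimum of the subarray covered by its subtree (positions
$3$ through $9$), which contains the query range, so it is the minimum
of the query range as well.
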
\section{Lowest Common Ancestor}
Now we just need to know how to solve LCA quickly. Surprisingly, we
will do this by reducing LCA back to RMQ! The trick is that we'll
reduce it to a special case of RMQ, which we'll call ``RMQ$\pm1$''.
RMQ$\pm 1$ is identical to the usual RMQ, except that adjacent values
in the list always differ by exactly $\pm1$. This distinguishing property
makes RMQ$\pm1$ easier to solve.
\subsection{Reduce LCA to RMQ$\pm1$}
To reduce LCA to RMQ$\pm1$, we take the Euler tour of the tree we're
solving LCA for. However, instead of storing the values of the nodes
as we visit them, we store their depths as shown in the second row of
Figure~\ref{fig:tour}. Observe that depths change by $\pm 1$ as we
consider nodes in an Euler tour traversal, because adjacent nodes are
in a parent-child relation.
\begin{figure}[htbp]
\centering
\begin{tabular}%{c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c}
{ccccc|ccccccc|ccccc}
\cline{6-12}
0 & 17 & 0 & 15 & 16 & 36 & 16 & 23 & 16 & 15 & 18 & 42 & 18 & 20
& 18 & 15 & 0 \\
0 & 1 & 0 & 1 & 2 & 3 & 2 & 3 & 2 & 1 & 2 & 3 & 2 & 3 & 2 & 1 & 0 \\
\cline{6-12}
\multicolumn{9}{c}{} & $\Uparrow$
\end{tabular}
\caption{The Euler tour of the tree in Figure~\ref{fig:ctree},
showing the elements and their depths. The box encloses the two
elements we want to find the LCA of and the arrow points to the
element that is both their LCA and the RMQ of either row over the
enclosed range.}
\label{fig:tour}
\end{figure}
By taking the RMQ$\pm1$ over the sequence of depths between the two
elements we want to find the LCA of, we can find the index of the
element with the least depth. This element is precisely the LCA.
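
As a worked example on Figure~\ref{fig:tour}, numbering tour positions
from $1$ (a convention assumed here), the node $36$ first appears at
position $6$ and the node $42$ at position $12$. The depths over
positions $6$ through $12$ are
\[ 3,\; 2,\; 3,\; 2,\; \mathbf{1},\; 2,\; 3 , \]
whose minimum, $1$, occurs at position $10$. The node at position $10$
of the tour is $15$, the LCA. Any occurrence of each node in the tour
may be used; storing the first occurrence of every node is one simple
choice. Note also that the tour has $2n - 1 = 17$ entries for this
tree of $n = 9$ nodes, so the RMQ$\pm1$ instance has size $O(n)$.
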
\subsection{Apply Indirection}
\label{sec:indirection}
Now that we've reduced the problem to RMQ$\pm1$, we need to solve
RMQ$\pm1$ in $O(1)$ time. This time we won't reduce to LCA. Instead,
we'll apply indirection as shown in Figure~\ref{fig:indirection}. We
split our list of $n$ elements into $\nicefrac{2n}{\lg n}$ groups,
each of size $\nicefrac{1}{2} \lg n$. From each group we find the
minimum element and produce a summary structure of the
$\nicefrac{2n}{\lg n}$ minimums, which we store in another RMQ array.
\begin{figure}[htbp]
\centering
\includegraphics{indirection}
\caption{Applying indirection. {\em (a)} The $n$ element list. We
wish to find the minimum element between elements $i$ and $j$.
{\em (b)} The same elements, this time split into
$\nicefrac{2n}{\lg n}$ groups, each of size $\nicefrac{1}{2} \lg
n$. {\em (c)} A summary RMQ data structure on the minimum element
of each group.}
\label{fig:indirection}
\end{figure}
If $i$ and $j$ fall in the same group, the query is a single in-group
query. Otherwise, the minimum between $i$ and $j$ is the minimum of
\begin{enumerate}
\item the elements in $i$'s group between $i$ and the end of the group,
\item the elements in $j$'s group between the beginning of the group
and $j$,
\item and the minimums of all of the groups between $i$'s group and
$j$'s group.
\end{enumerate}
In other words,
\begin{displaymath}
RMQ(i,j) = \min
\begin{cases}
RMQ(i,\infty) \text{ in $i$'s group} \\
RMQ(-\infty,j) \text{ in $j$'s group} \\
RMQ(> \text{$i$'s group}, < \text{$j$'s group}) \text{ in the
summary structure}
\end{cases}
\end{displaymath}
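
As an illustration with hypothetical numbers, take $n = 16$, so that
each group has $\nicefrac{1}{2} \lg n = 2$ elements and there are $8$
groups, and number both positions and groups from $1$. Position $p$
then lies in group $\lceil p/2 \rceil$. For the query $RMQ(3,12)$,
position $3$ is the first element of group $2$ and position $12$ is the
last element of group $6$, so
\begin{displaymath}
  RMQ(3,12) = \min
  \begin{cases}
    RMQ(3,4) \text{ in group $2$} \\
    RMQ(11,12) \text{ in group $6$} \\
    RMQ(3,5) \text{ in the summary structure}
  \end{cases}
\end{displaymath}
The first two terms are in-group queries, and the third covers groups
$3$ through $5$, that is, positions $5$ through $10$.
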
\subsection{Use a Lookup Table for Groups}
By constructing a lookup table mapping from a tuple of $\langle$group
type, $i$, $j$$\rangle$ to the index of the minimum element in the
group between elements $i$ and $j$, we can answer a query for the
minimum element of any range in any group in $O(1)$ time.
We begin by observing that the index of the minimum element is
invariant under translation. Thus, we can subtract the value of the
first element from every element of the group, so that the first
element becomes $0$, without affecting the index of the minimum
element in the group.

After this translation, a group is fully described by the sequence of
differences between adjacent elements, and in the $\pm1$ case each
difference takes one of only two values. A group of
$\nicefrac{1}{2} \lg n$ elements has fewer than $\nicefrac{1}{2} \lg n$
such differences, so there are at most
$2^{\nicefrac{1}{2} \lg n} = \sqrt{n}$ distinct group types. Note that
this is only true because we're solving the $\pm1$ case and not the
general case.
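
For instance, the groups $3, 2, 3, 4$ and $7, 6, 7, 8$ (values chosen
here purely for illustration) both translate to $0, -1, 0, 1$ and share
the difference sequence $-1, +1, +1$. They are therefore of the same
type, and in both the minimum lies at the second position.
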
$i$ and $j$ can each take on one of $\nicefrac{1}{2} \lg n$ possible
values, leading to a total of $(\nicefrac{1}{2} \lg n)^2$ possible
queries. The result of a query requires $O(\lg \lg n)$ bits, the
number of bits required to represent the index of the minimum element.
All together, this means the lookup table requires $O(\sqrt{n} \lg^2 n
\lg \lg n) = o(n)$ bits.
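
To get a feel for the sizes involved, take $n = 2^{20}$ as an example.
Each group then has $\nicefrac{1}{2} \lg n = 10$ elements, there are at
most $\sqrt{n} = 1024$ group types and $10^2 = 100$ queries per type,
and an index into a group of $10$ elements fits in
$\lceil \lg 10 \rceil = 4$ bits:
\[ 1024 \cdot 100 \cdot 4 = 409{,}600 \text{ bits}, \]
which is already well below $n = 1{,}048{,}576$.
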
\subsection{Using RMQ for the Summary Structure}
We've dealt with finding the RMQ of $i$'s group and $j$'s group, but
we still have to find the RMQ over the summary structure. Note that
the elements of this structure may not satisfy the $\pm1$ property, so
general RMQ must be used. Using RMQ recursively would fail to achieve
$O(1)$ queries, and the trivial table would take $O(n^2 / \lg^2 n)$
space on the $\nicefrac{2n}{\lg n}$ summary elements, which is still
superlinear. We need something smarter.

We'll use something similar to the trivial algorithm, but
with a {\em sparse table}. Instead of storing the answers for any
start point and any interval length, we'll only store the answers for
any start point and intervals of length $2^\ell$, where
$\ell=0,\ldots,\lfloor \lg n \rfloor$. This leads to $n$ choices for
the starting point and $O(\lg n)$
choices for the interval length, yielding $O(n \lg n)$ space.
Furthermore, this table can be computed in $O(n \lg n)$ time using
dynamic programming.
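
One way to write this dynamic program, as a sketch with $T$ a name
introduced here for the table, lets $T[s][\ell]$ be the minimum of the
$2^\ell$ elements starting at position $s$ of the array $A$ being
indexed:
\begin{displaymath}
  T[s][0] = A[s], \qquad
  T[s][\ell] = \min\left(T[s][\ell-1],\; T[s + 2^{\ell-1}][\ell-1]\right) .
\end{displaymath}
An interval of length $2^\ell$ is the union of its two halves of length
$2^{\ell-1}$, so each entry takes $O(1)$ time to fill in once the
entries for the previous length are known. Entries whose interval
would run past the end of the array are simply omitted.
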
This is sufficient for computing RMQ because overlapping intervals
will not affect the minimum. As illustrated in
Figure~\ref{fig:regions}, any interval of length $k$ can be broken down
into at most two overlapping ranges of length $2^{\lfloor\lg
k\rfloor}$.
\begin{figure}[htbp]
\centering
\includegraphics{regions}
\caption{A range over $6$ elements can be broken down into two
ranges over $2^2 = 4$ elements.}
\label{fig:regions}
\end{figure}
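
Concretely, in terms of a sparse table $T$ (a name assumed here, with
$T[s][\ell]$ the minimum of the $2^\ell$ elements starting at position
$s$), a query over positions $i \le j$ can be answered as
\begin{displaymath}
  RMQ(i,j) = \min\left(T[i][\ell],\; T[j - 2^\ell + 1][\ell]\right),
  \qquad \ell = \lfloor \lg (j - i + 1) \rfloor .
\end{displaymath}
For the range of $6$ elements in Figure~\ref{fig:regions},
$\ell = \lfloor \lg 6 \rfloor = 2$, and the two lookups cover the first
$4$ and the last $4$ elements of the range. The two intervals cover
the whole range because $2 \cdot 2^\ell > j - i + 1$, and they never
extend beyond it because $2^\ell \le j - i + 1$. This assumes $\ell$
can be found in $O(1)$ time, for example from a precomputed table of
$\lfloor \lg k \rfloor$ values.
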
Thus, we now have an $O(n\lg n)$ space and time algorithm for
computing a structure supporting $O(1)$ RMQ queries over the summary
structure from Section~\ref{sec:indirection}. Since the summary
structure actually contains only $n' = \nicefrac{2n}{\lg n}$ elements,
the space and preprocessing time are $O(n' \lg n') = O(n)$, which is
linear in the original $n$.
\section{Summary}
Computing the Cartesian tree for the original array takes $O(n)$ time
and space. The Euler tour of the tree and its array of depths have
$O(n)$ entries, and, using indirection, the in-group lookup table and
the sparse table over the summary structure are built in $O(n)$ total
time. Thus, construction of the RMQ structure takes $O(n)$
time and space. Each step of the query takes $O(1)$ time and there are a
constant number of steps, so queries are supported in $O(1)$ time.
\bibliographystyle{alpha}
\begin{thebibliography}{77}
\bibitem{bf00} M. A. Bender, M. Farach-Colton, \emph{The LCA Problem
Revisited}, LATIN 2000: 88--94.
% http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/f/Farach=Colton:Martin.html
\bibitem{ch99} R. Cole, R. Hariharan, \emph{Dynamic LCA Queries on
Trees}, SODA 1999: 235--244.
% http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/h/Hariharan:Ramesh.html
\bibitem{gbt84} H. N. Gabow, J. L. Bentley, R. E. Tarjan,
\emph{Scaling and Related Techniques for Geometry Problems}, STOC
1984: 135--143.
% http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/g/Gabow:Harold_N=.html
\bibitem{ht84} D. Harel, R. E. Tarjan, \emph{Fast Algorithms for
Finding Nearest Common Ancestors}, SIAM Journal on Computing,
13(2): 338--355, 1984.
% http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/h/Harel:Dov.html
\end{thebibliography}
\end{document}
% LocalWords: RMQ LCA