\documentclass[11pt,twoside]{article}
\usepackage{macros-course-lectures}
\usepackage{graphics}

%\newcommand{\remove}[1]{}
%%% Already defined
%\newtheorem{falsethm}[theorem]{False Theorem}
\renewcommand{\thepage}{\arabic{page}}

%\makeslides

\begin{document}

%\onlystudents

\lecture{notes5}{Relations}{September 21, 2000}

\section{Relations}

A ``relation'' is a fundamental mathematical notion expressing a
relationship between elements of sets. \\
It's an abstract notion useful in practice for modeling many
different sorts of relationships. \\
It's the basis of the {\em relational data base\/} model, the standard
data model for practical data processing systems.

\subsection{Definitions}

A {\em binary relation} {\em from} 
a set $A$ {\em to} a set $B$ is a subset 
$R\subseteq A\times B$. \\
A binary relation {\em on} a set $A$ is a subset $R\subseteq A^2$
(i.e., a relation from $A$ to $A$).
We often write $a\sim_R b$ or $aRb$ instead of  $(a,b)\in R$. \\
\\
We can also define a {\em ternary} relation on $A$ as a subset
$R\subseteq A^3$ or, in general, an $n$-ary relation as a subset
$R\subseteq A^n$, or $R\subseteq A_1\times A_2 \times\cdots\times A_n$
if the sets $A_i$ are different. \\
\\
Relations are used to model many different things:

\begin{enumerate}
\item \label{classes} 
The relation ``is taking class'' as a subset of
$\{$students at MIT$\}\times\{$classes at MIT$\}$. 
A relation from students to classes.
\item \label{roommates} 
The relation ``is living in the same room'' as a subset of
$\{$students at MIT$\}\times \{$students at MIT$\}$.
A relation on students.
\item \label{reachable} 
The relation ``can drive from first to second city''.
(Not necessarily directly -- just some way, on some roads.)
\item \label{wire} 
A relation on computers: ``are connected (directly) by a wire''.
\item \label{meet} ``meet one another on a given day''
\item \label{likes}  ``likes''
\item \label{le} 
Let $A=\naturals$ and define $a\sim_R b$ iff $a\le b$.
\item \label{finite} 
Let $A={\cal P}(\naturals)$ and define $a\sim_R b$ iff $|a \cap b|$ is finite.
\item \label{dist} Let $A={\Bbb R}^2$ and define $a\sim_R b$ iff $d(a,b)=1$.
\item \label{contain} Let $A={\cal P}(\{1,\cdots,n\})$ and let $a\sim_R b$ if
$a\subseteq b$.
\end{enumerate} 

\subsection{Properties of Relations}

There are several standard properties that are satisfied by (some)
relations.
These occur commonly enough that they are worth identifying.
Sometimes interesting things can be deduced just from the abstract
properties.

A binary relation $R$ on $A$ is:
\begin{enumerate}
\item {\em reflexive} if for every $a\in A$, $a\sim_R a$.
\item {\em irreflexive} if for every $a \in A$, $a\not\sim_R a$.
\item {\em symmetric} if for every $a,b\in A$, $a\sim_R b$ implies $b\sim_R a$.
\item {\em antisymmetric} if for every $a,b\in A$, $a\sim_R b$ and
$b\sim_R a$ implies $a=b$.
\item {\em transitive} if for every $a,b,c\in A$, $a\sim_R b$ and $b\sim_R c$
implies $a\sim_R c$. 
\item {\em asymmetric} if for every $a,b\in A$, $a\sim_R b$ implies 
$\neg(b\sim_R a)$.
\end{enumerate} 
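For a finite relation these properties can be checked mechanically.  A
minimal Python sketch (the predicate names are my own, not from the
notes), with $A$ a set and $R$ a set of ordered pairs, tried on the
$\le$ relation of example~\ref{le} restricted to $\{0,\ldots,5\}$:

```python
# Sketch: checking the standard properties of a finite relation,
# where A is a set and R is a subset of A x A given as ordered pairs.

def is_reflexive(A, R):
    return all((a, a) in R for a in A)

def is_irreflexive(A, R):
    return all((a, a) not in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)

def is_asymmetric(R):
    return all((b, a) not in R for (a, b) in R)

def is_transitive(R):
    # for every (a, b) and (b, c) in R, (a, c) must be in R
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

# Example 7 restricted to a finite set: a <= b on {0,...,5}.
A = set(range(6))
LE = {(a, b) for a in A for b in A if a <= b}
```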

There's no sense to the specific negative prefixes; they just need to
be remembered.  The difference between antisymmetric and asymmetric
relations is that antisymmetric relations may contain pairs $(a,a)$,
i.e., elements can be related to themselves, while in an
asymmetric relation this is not allowed.  Clearly, any asymmetric
relation is also antisymmetric, but not vice versa.

Among our examples:
\begin{itemize}
\item Relation~\ref{roommates} is reflexive, symmetric, transitive. 
\item Relation~\ref{reachable} is reflexive and transitive.  Not
  necessarily symmetric, since roads could be one-way (consider
  Boston), though in practice it usually is.  But definitely not
  antisymmetric.
\item Relation~\ref{wire} is symmetric but not transitive.
Whether it is reflexive is open to interpretation.
\item Relation~\ref{meet} likewise. 
\item Relation~\ref{likes} is (unfortunately) not symmetric.  Not
antisymmetric.  Not transitive.  Not even reflexive!
\item Relation~\ref{le} is reflexive, antisymmetric, transitive.
\item Relation~\ref{finite} is not reflexive.  It is symmetric.  Not
transitive.  Evens$\cap$Odds is finite (empty), but Evens$\cap$Evens
is not.
\item Relation~\ref{dist} is only symmetric.
\item Relation~\ref{contain} is reflexive, antisymmetric and transitive.
\end{itemize} 

\subsection{Representation}

There are different ways of representing relations, e.g., for handling
in a computer program. \\
\\
We can describe by properties, as we did above. \\
That's about all we can do, for infinite sets. \\
But for finite sets, usually we use some method that explicitly
enumerates all the elements of the relation. \\
\\
Some alternatives: Lists, Matrices, Graphs. 

\subsubsection{Lists}

To represent a finite relation from set $A$ to set $B$, just
list all the pairs. \\
\\
Ex: Consider the relation from \\
$A = \{ 0, 1, 2, 3 \}$ to $B = \{ a,b,c \}$\\ defined by the list 
$\{ (0,a), (0, c), (1, c), (2, b), (1,a) \}$. \\

Ex: The divisibility relation on natural numbers $\{ 1, \cdots, 12 \}$
is represented by the list: \\
$\{ (1,1), (1,2), \ldots, (1,12), (2,2), (2,4), \ldots, (2,12), $\\
$ (3,3), (3,6), (3,9), (3,12), (4,4), (4,8), (4,12), (5,5), (5,10), $\\
$(6,6), (6,12), (7,7), (8,8), (9,9), (10,10), (11,11), (12,12) \}$. \\
\\
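The list above can also be generated mechanically; a one-line Python
sketch (the variable name is mine):

```python
# Sketch: the divisibility relation on {1,...,12} as a list of pairs.
divides = [(a, b) for a in range(1, 13) for b in range(1, 13) if b % a == 0]
```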

Reflexivity in this representation: Contains all pairs $(a,a)$. \\
Symmetry: If contains $(a,b)$ then contains $(b,a)$. \\
Transitivity: If contains $(a,b)$ and $(b,c)$ then contains $(a,c)$. \\
\\

\subsubsection{Boolean matrices}

Used for finite sets $A$, $B$ \\
Rows for elements of $A$, columns for $B$. \\
$1$ in position if the pair is in the relation, $0$ otherwise.

Ex: The relation from
$\{0,1,2,3\}$ to $\{a,b,c\}$ of our first example is 
represented by the matrix
\[\begin{array} {r|ccc}
 &a& b&  c \\ 
\hline
0&1& 0&  1\\
1&1& 0&  1\\
2&0& 1&  0\\
3&0& 0&  0\\
\end{array}\]

Ex: The divisibility relation over $\{1,2,\ldots,12\}$ is 
represented by the matrix

\[\begin{array}{r|cccccccccccc}
  &1  &2  &3  &4  &5  &6  &7  &8  &9 &10 &11 &12\\
\hline
1 &1  &1  &1  &1  &1  &1  &1  &1  &1  &1  &1  &1\\
2 &0  &1  &0  &1  &0  &1  &0  &1  &0  &1  &0  &1\\
3 &0  &0  &1  &0  &0  &1  &0  &0  &1  &0  &0  &1\\
4 &0  &0  &0  &1  &0  &0  &0  &1  &0  &0  &0  &1\\
5 &0  &0  &0  &0  &1  &0  &0  &0  &0  &1  &0  &0\\
6 &0  &0  &0  &0  &0  &1  &0  &0  &0  &0  &0  &1\\
7 &0  &0  &0  &0  &0  &0  &1  &0  &0  &0  &0  &0\\
8 &0  &0  &0  &0  &0  &0  &0  &1  &0  &0  &0  &0\\
9 &0  &0  &0  &0  &0  &0  &0  &0  &1  &0  &0  &0\\
10&0  &0  &0  &0  &0  &0  &0  &0  &0  &1  &0  &0\\
11&0  &0  &0  &0  &0  &0  &0  &0  &0  &0  &1  &0\\
12&0  &0  &0  &0  &0  &0  &0  &0  &0  &0  &0  &1\\
\end{array}\]

Reflexivity in this representation: the main diagonal is all $1$s. \\
Symmetry: the matrix is symmetric about the main diagonal. \\
Transitivity: whenever entries $(a,b)$ and $(b,c)$ are $1$, entry
$(a,c)$ is $1$. \\
\\
\\
\subsubsection{Digraphs} 

Digraphs are used for relations where $A = B$, 
i.e., for relations on a finite set $A$.
To represent a relation as a digraph we draw a dot (vertex) 
for each element of $A$, and draw an arrow from first element
to second element of each pair in the relation. \\
The digraph may contain self-loops, i.e. arrows from a dot 
to itself, associated to the elements $a$ such that $(a,a)$ is 
in the relation. \\

The relation from
$\{0,1,2,3\}$ to $\{a,b,c\}$ of the previous examples 
cannot be represented as a digraph because $A\neq B$.

The divisibility relation over $\{1,2,\ldots,12\}$
is represented by the digraph
\begin{center}
\mbox{\includegraphics{/a/class/6.042/spring00/archive/lectures/figures/dividehasse}}
\end{center}

Reflexivity: All nodes have self-loops. \\
Symmetry: all edges are bidirectional. \\
Transitivity: ``short-circuits'': for any sequence of consecutive arrows,
there is a single arrow from the first to the last node. 

\subsection{Paths} 

\begin{definition}
  A \emph{path} in a relation $R$ is a sequence $a_0,\ldots,a_k$ with
  $k \ge 1$ such that $(a_i,a_{i+1}) \in R$ for every $i < k$.  We
  call $k$ the \emph{length} of the path.
\end{definition}

In the graph model, a path is something you can trace out by following
arrows from vertex to vertex, without lifting your pen.  Note that a
singleton vertex is \emph{not} a path of length 0 (this is just for
convenience).

\begin{definition}
  A \emph{simple path} is a path with no repeated vertices.
\end{definition}


\section{Operations on Relations}

\subsection{Inverse}

If $R$ is a relation from $A$ to $B$, then $R^{-1}$ is a relation from
$B$ to $A$ given by $R^{-1} = \set{(b,a) \mid (a,b) \in R}$.  It's
just the relation ``turned backwards.''

Inverse of ``parent-of'' is ``child-of.''


\subsection{Composition}


The {\em composition} of relations $R_1 \subseteq A \times B$ and $R_2
\subseteq B \times C$ is the relation $R_1 \circ R_2 =\{(a,c) \mid
(\exists b) ((a,b) \in R_1) \wedge ((b,c) \in R_2)\}$. \\
Notice that $R_1\circ R_2$ and $R_2\circ R_1$ are different.
(This notation is different from the one
used in the textbook, where $R_2 \circ R_1$ is used for
the above set.)

Examples: \\
Composition of parent-of relation with itself gives grandparent-of. \\
Composition of child-of relation with parent-of relation gives
sibling-of relation. \\
Does composition of parent-of with child-of give married-to/domestic
partners?  No, because it misses childless couples.
\\
\\
Composition of relations is equivalent to matrix multiplication, with
$+$ replaced by $\vee$ (Boolean or) and $*$ replaced by $\wedge$. \\
\\
Ex: \\
Let $R_1$ be relation from first example above: \\
\[\begin{array}{r|ccc}
& a&  b&  c\\
\hline  
0& 1&  0&  1\\
1& 1&  0&  1\\
2& 0&  1&  0\\
3& 0&  0&  0\\
\end{array}
\]

Let $R_2$ be relation from $\{ a, b, c \}$ to $\{ d,e,f \}$ given by:
\[\begin{array}{r|ccc}
& d&  e&  f\\
\hline  
a& 1&  1&  1\\
b& 0&  1&  0\\
c& 0&  0&  1\\
\end{array}
\]

Then $R_1 \circ R_2 = \{ (0,d), ... \}$, 
that is, \\
\[\begin{array}{r|ccc}
& d&  e&  f\\
\hline  
0& 1&  1&  1\\
1& 1&  1&  1\\
2& 0&  1&  0\\
3& 0&  0&  0\\
\end{array}
\]

Same as matrix multiply, using the Boolean operations ``and'' and ``or''. \\
\\
\\
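As a sketch, the Boolean matrix product can be computed in a few lines
of Python; the matrices below are the $R_1$ and $R_2$ of this example,
with rows and columns in the order shown above:

```python
# Sketch: composition as Boolean matrix multiplication, with "or" in
# place of + and "and" in place of *.

def bool_matmul(M1, M2):
    n, m, p = len(M1), len(M2), len(M2[0])
    return [[int(any(M1[i][k] and M2[k][j] for k in range(m)))
             for j in range(p)] for i in range(n)]

R1 = [[1, 0, 1],   # rows 0..3, columns a, b, c
      [1, 0, 1],
      [0, 1, 0],
      [0, 0, 0]]
R2 = [[1, 1, 1],   # rows a..c, columns d, e, f
      [0, 1, 0],
      [0, 0, 1]]
R1R2 = bool_matmul(R1, R2)
```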
Relational composition is associative. \\
Just like matrix multiplication. \\

\begin{lemma}
  \label{associative}
  $R_1 \circ (R_2 \circ R_3) = (R_1 \circ R_2) \circ R_3$
\end{lemma}

The composition $R \circ R$ of $R$ with itself is written $R^2$. \\
Similarly $R^n$ denotes $R$ composed with itself $n$ times. \\
Recursive definition: \\
$R^1 = R$, $R^n = R \circ R^{n-1}$. \\
Actually, we could define $R^0 = \{ (a,a) \mid a \in A \}$ as the base case. \\
We'll just use $R^1$ as the base case, as the book does. \\
\\
We could just as well have defined $R^n$ as $R^{n-1} \circ R$.

Useful observation:

\begin{lemma}
$R^n = \set{(a,b) \mid \mbox{ there is a length $n$ path from $a$ to
    $b$ in $R$}}$
\end{lemma}

Before we start the proof, consider the relation $R$ consisting of the
pairs $(a,b)$, $(a,c)$, $(b,d)$, $(d,e)$, or in matrix form:

\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &1  &0  &0  \\
b &0  &0  &0  &1  &0  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &1  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

$R^2$ will be\\

\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &0  &0  &1  &0  \\
b &0  &0  &0  &0  &1  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &0  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

Draw the graph and find that the only 
paths of length two in $R$ are $a,b,d$
and $b,d,e$, giving exactly the pairs $(a,d)$ and $(b,e)$. Now the proof:

\begin{proof}
  By induction.  The base case is clear---paths of length one are
  edges of $R$.  Suppose it is true for $R^n$; let's prove it for
  $R^{n+1}$.  Suppose there is a path $a_0,\ldots,a_{n+1}$ in $R$.
  This is a path $a_0,\ldots,a_n$ in $R$ followed by a tuple
  $(a_n,a_{n+1})$ of $R$.  Thus (by induction) we have $(a_0,a_n) \in
  R^n$ and $(a_n,a_{n+1})\in R$.  Thus (by definition of composition)
  $(a_0,a_{n+1}) \in R^{n+1}$ as claimed.

  Am I done?  No, I only proved set containment in one direction.  I
  still need to show that if $(a,b) \in R^{n}$ then there is a
  path of length $n$ from $a$ to $b$.  But I'll let you do it: use
  the induction above, but turn the proof backwards.
\end{proof}
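Composition can also be computed directly on the pair representation; a
small Python sketch (names are mine) checks the $R^2$ of the example
above:

```python
# Sketch: composition on sets of pairs; the pairs of R^2 are exactly
# the endpoints of length-2 paths.

def compose(S1, S2):
    # (a, c) is included when some b links (a, b) in S1 to (b, c) in S2
    return {(a, c) for (a, b) in S1 for (b2, c) in S2 if b == b2}

R = {("a", "b"), ("a", "c"), ("b", "d"), ("d", "e")}
R2 = compose(R, R)
```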

\subsection{Closure}

A closure ``extends'' a relation to satisfy some property.  But
extends it as little as possible.

\begin{definition}
  The \emph{closure} of relation $R$ with respect to property $P$ is
  the relation $S$ that (i) contains $R$, (ii) has property $P$, and
  (iii) is contained in \emph{any} relation satisfying (i) and (ii).
  That is, $S$ is the ``smallest'' relation satisfying (i) and (ii).
\end{definition}

There might be no relation satisfying the definition; in that case
the closure is not defined.

\begin{lemma}
  The reflexive closure of $R$ is $S=R \cup \set{(a,a) \mid a \in A}$.
\end{lemma}
\begin{proof}
  It contains $R$ and is reflexive by design.  Furthermore (by
  definition) any relation satisfying (i) must contain $R$, and any
  satisfying (ii) must contain the pairs $(a,a)$, so any relation
  satisfying both (i) and (ii) must contain $S$.
\end{proof}

Ex: the following matrix:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &1  &0  &0  \\
b &0  &0  &0  &1  &0  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &1  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

Will be changed to:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &1  &1  &1  &0  &0  \\
b &0  &1  &0  &1  &0  \\
c &0  &0  &1  &0  &0  \\
d &0  &0  &0  &1  &1  \\
e &0  &0  &0  &0  &1  \\
\end{array}\]


\begin{lemma}
  The symmetric closure of $R$ is $S=R \cup R^{-1}$.
\end{lemma}
\begin{proof}
  This relation is symmetric and contains $R$.  It is also the
  smallest such.  For suppose we have some symmetric relation $T$
  with $R \subseteq T$.  Consider $(a,b) \in R$.  Then $(a,b) \in T$,
  so by symmetry $(b,a) \in T$.  It follows that $R^{-1} \subseteq
  T$.  So $S=R \cup R^{-1} \subseteq T$.
\end{proof}

Ex: the following matrix:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &1  &0  &0  \\
b &0  &0  &0  &1  &0  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &1  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

Will be changed to:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &1  &0  &0  \\
b &1  &0  &0  &1  &0  \\
c &1  &0  &0  &0  &0  \\
d &0  &1  &0  &0  &1  \\
e &0  &0  &0  &1  &0  \\
\end{array}\]
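Both closures are one-liners on the pair representation.  A Python
sketch, using the example relation above (variable names are mine):

```python
# Sketch: reflexive and symmetric closures, computed exactly as in the
# two lemmas: add the diagonal pairs, or add the inverse relation.

A = {"a", "b", "c", "d", "e"}
R = {("a", "b"), ("a", "c"), ("b", "d"), ("d", "e")}

reflexive_closure = R | {(x, x) for x in A}
symmetric_closure = R | {(b, a) for (a, b) in R}
```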

Paths give us the terminology for transitive closure.

\begin{lemma}
  The transitive closure of a relation $R$ is that set 
  \[
  S=\set{(a,b) \mid \mbox{there is a path from $a$ to $b$ in $R$}}.
  \]
\end{lemma}
\begin{proof}
  First let's show $S$ is transitive.  Suppose $(a,b)\in S$ and
  $(b,c) \in S$.  This means that there is an $(a,b)$ path and a
  $(b,c)$ path in $R$.  If we ``concatenate'' them (attach the end of
  the $(a,b)$ path to the start of the $(b,c)$ path), we get an $(a,c)$
  path.  So $(a,c) \in S$.  So $S$ is transitive.

  Now consider any transitive relation $T$ containing $R$.  We have to
  show it contains $S$.  Assume for contradiction that $S \not
  \subseteq T$.  This means that some pair $(a,b) \in S$ but $(a,b)
  \notin T$.  In other words, there is a path $a=a_0,\ldots,a_k=b$ in
  $R$ where $(a_0,a_k) \notin T$.  Call this a ``missing'' path.

  Now we use well ordering (i.e., induction).  We have just shown that
  the set of missing paths is nonempty.  So consider a shortest missing
  path $s_0,\ldots,s_m$.  We will derive a contradiction to this being
  the shortest missing path.

  First suppose $m=1$.  Then $s_0,s_1$ is a path in $R$, so $(s_0,s_1)
  \in R$.  But we know $T$ contains $R$, so $(s_0,s_1)\in T$, a
  contradiction.

  Now suppose $m>1$.  Then $s_0,\ldots,s_{m-1}$ is a path in $R$ ($m-1
  \ge 1$).  But it is shorter than our original ``shortest missing''
  path, so cannot be missing.  Thus $(s_0,s_{m-1}) \in T$.  Also we
  have $(s_{m-1},s_m) \in T$ since $R \subseteq T$.  Thus by
  transitivity of $T$, $(s_0,s_m)\in T$, a contradiction.

  We get a contradiction either way, so our assumption (that $S
  \not\subseteq T$) is false.  This proves the theorem.
\end{proof}

Wait a minute.  Well ordering is applied to sets of \emph{numbers}; we
applied it to a set of paths!  How?  Well, look at the set of
``lengths of missing paths''.  It is nonempty, so has a smallest
element.  There is a path that has this length---so it is a shortest
path.

Ex: the following matrix:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &0  &0  &0  \\
b &0  &0  &0  &1  &0  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &1  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

Which can be represented by a digraph:

$a$ $\rightarrow$ $b$ $\rightarrow$ $d$ $\rightarrow$ $e$, plus an
isolated node $c$.\\

Will be changed to:
\[\begin{array}{r|ccccc}
  &a  &b  &c  &d  &e  \\
\hline
a &0  &1  &0  &1  &1  \\
b &0  &0  &0  &1  &1  \\
c &0  &0  &0  &0  &0  \\
d &0  &0  &0  &0  &1  \\
e &0  &0  &0  &0  &0  \\
\end{array}\]

\subsubsection{Computing the Transitive Closure}
With the reflexive and symmetric closures, it is pretty clear how to
actually build them.  But for the transitive closure, how do we
actually find all the paths we need?

Let's start by finding paths of a given length.  Recall the definition
of $R^n$ as $R$ composed with itself $n$ times.
We proved that\\
$R^n = \set{(a,b) \mid \mbox{ there is a length $n$ path from $a$ to
    $b$ in $R$}}$

This means we can write the transitive closure of $R$ as $R \cup R^2
\cup R^3 \cup \cdots$.  Better (since we know how to do composition), but
still a problem: there are infinitely many terms!

\begin{lemma}
Suppose $A$ has $n$ elements.  If there is a path from $a$ to $b$,
there is a path of length at most $n$ from $a$ to $b$.
\end{lemma}
\begin{proof}
  We'll use well ordering (for a change).  Consider the shortest path
  $a=a_0,a_1,\ldots,a_k=b$ from $a$ to $b$ (we know one exists, so by
  well ordering there is a shortest one).  Suppose $k > n$.  Then
  some element of $A$ appears twice in the list (with more than $n$
  list entries, one must be a repeat).  This means the path is at
  some point circling back to where it was before.  We can cut out
  this cycle from the path and get a shorter path.  This contradicts
  that we had a shortest path.  So we cannot have $k>n$.
\end{proof}

So we don't need infinitely many terms.  It is enough to take paths of
length at most $n$, namely $R^1\cup R^2\cup\cdots\cup R^n$.
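Putting the pieces together, a Python sketch of the transitive closure
as $R \cup R^2 \cup \cdots \cup R^n$, tried on the example relation
above (function names are mine):

```python
# Sketch: transitive closure as the union R ∪ R^2 ∪ ... ∪ R^n,
# where n = |A|, using composition on sets of pairs.

def compose(S1, S2):
    return {(a, c) for (a, b) in S1 for (b2, c) in S2 if b == b2}

def transitive_closure(A, R):
    closure, power = set(R), set(R)
    for _ in range(len(A) - 1):      # accumulate R^2 through R^n
        power = compose(power, R)
        closure |= power
    return closure

A = {"a", "b", "c", "d", "e"}
R = {("a", "b"), ("b", "d"), ("d", "e")}
S = transitive_closure(A, R)
```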


\section{Equivalence relations and partitions}

Now we consider an important special type of relation, called an {\em
  equivalence relation}.  We give an important special case --
modular arithmetic.  We show an application to RSA.

\subsection{Definitions}

\begin{definition}
  An {\em equivalence relation} is a relation that is reflexive,
  symmetric and transitive.
\end{definition}

For example, the ``roommates'' relation is an equivalence relation.
So are ``married to'', ``same size as'', and ``on same Ethernet hub''.
A trivial example is the ``$=$'' relation on natural numbers.
The hallmark of equivalence relations is the word ``same''.  It
provides a way to ``hide'' unimportant differences.

Equivalence relations partition the universe into subsets in a natural
way:\\
\begin{definition}
  A {\em partition} of a set $A$ is a collection of subsets
  $\{A_1,\cdots,A_k\}$ such that any two of them are disjoint (for any
  $i\neq j$, $A_i\cap A_j=\emptyset$) and such
  that their union is $A$. \\
\end{definition}
That is, every element of $A$ is in exactly one subset. \\
\\
Fix an equivalence relation $R$ on $A$. \\
For an element $a\in A$, let $[a]$ denote the set $\{b\in A \given
a\sim_R b\}$. \\
We call this set the {\em equivalence class of $a$ under $R$}.
We call $a$ a {\em representative} of $[a]$. \\

\begin{lemma}
  The sets $[a]$ for $a\in A$ constitute a partition of $A$.  That is,
  for every $a,b\in A$, either $[a]=[b]$ or $[a]\cap [b]=\emptyset$.
\end{lemma}

\begin{proof}
  Fix $a,b\in A$. \\
  If $[a]=[b]$ then we are done, so suppose not. \\
  Let $c$ be any element that is in one of the sets but not the other. \\
  Without loss of generality we can assume that $c \in [b]-[a]$
  (we know that either $c\in [b]-[a]$ or $c\in [a]-[b]$.
  In the latter case we can simply swap $a$ and $b$ and reduce to 
  the first case.).
  \\
  Similarly, if $[a] \cap [b]=\emptyset$ then we are done, so suppose
  not. \\
  Let $d$ be any element of $[a] \cap [b]$. \\
  \\
  We will get a contradiction by showing that $a\sim_R c$ and therefore
  that $c\in [a]$. \\
  \\
  First, $a\sim_R d$ because $d\in [a]$ (note that $d=a$ is a possibility but
  this is ok because $R$ is reflexive). \\
  Second, $d\sim_R b$ and $b\sim_R c$ because both $c,d\in [b]$ and $R$
  is symmetric. \\
  This implies, by transitivity, that $d\sim_R c$. \\
  Finally, by transitivity, $a\sim_R c$ because $a\sim_R d$ and $d\sim_R c$.
\end{proof}

Conversely, any partition $\{A_1,\cdots,A_k\}$ of $A$ defines an
equivalence relation by letting $a\sim_R b$ iff $a$ and $b$ are in the
same $A_i$. \\
(Note ``same''.) \\
This is an equivalence relation:\\
\begin{itemize}
\item Reflexivity: for all $a$ we know $a\in A_i$ for some $i$,
  by definition of partition. Clearly $a$ and $a$ are in the same $A_i$,
  so $a\sim_R a$.
\item
  Symmetry: Assume $a\sim_R b$, that is $a$ and $b$ are in the 
  same $A_i$. Also $b$ and $a$ are in the same $A_i$
  and therefore $b\sim_R a$.
\item
  Transitivity: Assume $a\sim_R b$ and $b\sim_R c$.
  By definition of $\sim_R$, $a$ and $b$ are in the same 
  $A_i$ for some $i$, and $b$ and $c$ are in the same $A_j$
  for some $j$.
  But by definition of partition $b$ cannot be in two different $A_i$'s.
  So, it must be $A_i=A_j$ and $a$ and $c$ are in the same 
  $A_i$, proving $a\sim_R c$.
\end{itemize}

Therefore, we can look at partitions and at equivalence relations as
the same thing. 
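As a quick sketch of this correspondence, the following Python collects
the equivalence classes of ``same remainder mod 3'' on $\{0,\ldots,8\}$
and recovers the partition (the names are mine):

```python
# Sketch: from an equivalence relation to its partition, by collecting
# the equivalence class [a] = {b : a ~ b} of every element.

def related(a, b):          # "same remainder mod 3"
    return a % 3 == b % 3

A = range(9)
classes = {frozenset(b for b in A if related(a, b)) for a in A}
```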

\subsection{Integers modulo $m$}

An equivalence relation on integers (positive, negative and $0$):

\begin{defin}
  If $a$ and $b$ are integers, then we say $a=b \pmod m$ if $m \mid
  (a-b)$.  
\end{defin}

An equivalent formulation says that $a=b+km$ for some integer $k$.  

\begin{theorem}
  Equality modulo $m$ is an equivalence relation.
\end{theorem}

\begin{proof}
  We need to show the relation is reflexive, symmetric, and transitive.
  \begin{itemize}
  \item {Reflexive:} Clearly $a-a=0$ and $m \mid 0$, so $a=a \pmod m$.
  \item {Symmetric:} If $a=b+km$ then $b=a+(-k)m$.
  \item {Transitive:} Suppose $a=b \pmod m$ and $b = c \pmod m$. 
    Then $m \mid (a-b)$ and $m \mid (b-c)$.  So $(a-b)=k_1 m$ and
    $(b-c)=k_2 m$.  So $(a-c)=(a-b)+(b-c)=(k_1+k_2)m$.  Therefore $m
    \mid (a-c)$.
  \end{itemize}
\end{proof}

The equivalence class of $a$ is simply the set 
$[a] = \{ b \in {\Bbb Z} \mid a = b \pmod m \}$, or
$\{ km + a \mid k \in {\Bbb Z}\}$. \\
\\
It turns out that we can extend a lot of standard arithmetic to work
modulo $m$. \\
In fact, we can define notions of sum and product for the equivalence
classes mod $m$. \\
For example, we define $[a]+[b]$, given two equivalence classes mod
$m$, to be the equivalence class $[a+b]$. \\
This is not as obvious as it seems: notice
that the result is given in terms of $a$ and $b$, two selected
representative from the equivalence classes, but that it is supposed
to apply to the equivalence classes themselves.  \\
To prove that this works, we have to show that it doesn't matter which
representatives of the equivalence classes we choose:

\begin{lemma}
  If $a=x \pmod m$ and $b=y \pmod m$ then $(a+b) = (x+y) \pmod m$.  
\end{lemma}

\begin{proof}
  $m \mid (a-x)$ and $m \mid (b-y)$, so $m \mid ((a-x)+(b-y)) =
  (a+b)-(x+y)$.
\end{proof}

It follows that if we are interested only in the result of the addition
modulo $m$ (which is the case, for example, in the RSA cryptosystem),
then we can at any time replace a given number with a different number
equivalent to it, without changing the value (equivalence class) of
the final answer. \\
\\
The same fact can be proven for multiplication.  \\
This is the basis of the fast exponentiation algorithm presented in
the next subsection.
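A quick numerical sanity check of representative-independence, for both
$+$ and $\times$ (the particular numbers are arbitrary choices of
mine):

```python
# Sketch: replacing a and b by other representatives of their classes
# mod m leaves the class of the sum and the product unchanged.

m = 7
a, b = 12, 10                    # some representatives
x, y = a + 3 * m, b - 2 * m      # other representatives of [a] and [b]

sum_ok = (a + b) % m == (x + y) % m
prod_ok = (a * b) % m == (x * y) % m
```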

\subsection{Fast Exponentiation}

Recall that for RSA we wanted to compute the remainder of $x^a$\ on
division by $b$---that is, the equivalence class $\bmod~b$ of $x^a$.


In the section on recursive algorithms, we already saw how to reduce
the number of multiplications substantially.  But there is still a
problem: $x^a$ is a really big number.  It is bigger than $2^a$, so
it has on the order of $2^{128}$ bits if we use a 128-bit key $a$.
It will take a long time to write this number down (if we could find
a place to fit it).

But notice that we only care about the remainder modulo $b$.  So using
modular arithmetic we can solve the large-number problem.  Whenever
the number we are computing gets too big, replace it with a smaller
representative of the same equivalence class.  What is the smallest
representative?  The remainder on division by $b$.  We've argued that
switching representatives doesn't change the answer.  But now all our
math is done on numbers smaller than $b$---i.e., 128-bit numbers.
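A sketch of the resulting algorithm (square-and-multiply with a
reduction at every step; not necessarily the exact version from the
recursive-algorithms section):

```python
# Sketch: compute x^a mod b while keeping every intermediate value
# below b^2, by reducing mod b after each multiplication.

def power_mod(x, a, b):
    result = 1
    x %= b
    while a > 0:
        if a % 2 == 1:           # low bit of the exponent is set
            result = (result * x) % b
        x = (x * x) % b          # square for the next bit
        a //= 2
    return result
```

Python's built-in `pow(x, a, b)` performs the same modular reduction
internally.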

\section{Partial Orders}

A particular type of binary relation. \\
\\
Important in computer science. \\
Applications to task scheduling, \\
Database concurrency control, \\
Logical time in distributed computing \\

\subsection{Definitions}

\subsubsection{Partial Order Definition}

\begin{defin} 
  A binary relation $R \subseteq A \times A$ is a {\bf partial order\/}
  if it is reflexive, transitive, and antisymmetric.
\end{defin}

Quick review:
\begin{itemize}
\item reflexive: $a R a$
\item transitive: $a R b \wedge b R c \Rightarrow a R c$
\item symmetric: $a R b \Rightarrow b R a$
\item antisymmetric: $a R b \wedge b R a \Rightarrow a=b$.
  I.e., $\forall a \neq b$, $a R b \Rightarrow \neg(b R a)$: between
  distinct elements, the relation never holds in both directions!
\end{itemize}

For comparison: \\
Recall that a relation that is reflexive, transitive, and symmetric
is an equivalence relation. \\
\\


The reflexivity, antisymmetry and transitivity properties are abstract
properties that generally describe ``ordering'' relationships. \\ 

So we often write an ordering-style symbol like $\preceq$ instead of
just a letter like $R$, for a partial order relation.  This lets us use
notation similar to $\le$.  For example, we write $a \prec b$ if
$a \preceq b$ and $a \ne b$.  Similarly, we write $b \succeq a$ as
equivalent to $a \preceq b$.

But this could be misleading -- note that $\geq$ is a partial order
on natural numbers, as well as $\leq$.  
If we use the $\preceq$ symbol for $\geq$,
things look really funny. 
In cases like this, better to use $R$.
\\
A partial order is always defined on some set $A$.  The set together
with the partial order is called a ``poset'':
\begin{defin}
  A set $A$ together with a partial order $\preceq$ is called a 
  {\bf poset\/}  $(A, \preceq)$.
\end{defin}



Examples of posets:
\begin{itemize}
\item $A=\naturals, R=\leq$, easy to check reflexive, transitive, antisymmetric
\item $A=\naturals, R=\geq$, same.
\item $A=\naturals, R=<$, NOT because not reflexive
\item $A=\naturals, R=|$ (divides), easy to check reflexive,
  transitive, antisymmetric
\item $A={\cal P}(\naturals), R=\subseteq$, check reflexive: $S \subseteq S$,
  transitive: $S \subseteq S' \wedge S' \subseteq S'' \Rightarrow S
  \subseteq S''$, antisymmetric: $S \subseteq S' \wedge S' \subseteq S
  \Rightarrow S = S'$.
\item $A=$ ``set of all computers in the world'', $R=$ ``is (directly or
  indirectly) connected to''. \\
  NOT a p.o. because it is not true that $a R b \wedge b R a \Rightarrow
  a=b$. \\
  In fact, it is symmetric, transitive.  Equivalence relation.

\item $A=$ ``set of all propositions'', $R=\Rightarrow$, NOT because
  not antisymmetric. Not symmetric either, so not equivalence
  relation.
\end{itemize}


\subsubsection{Directed Acyclic Graphs}

A common source of partial orders in computer science is ``task
graphs''.  You have a set of tasks $A$, and a relation $R$ in which
$aRb$ means ``$b$ cannot be done until $a$ is finished''.
Implicitly, ``if all the things that point at $b$ are done, I can do
$b$.''

This can be nicely drawn as a graph.  We draw an arrow from $a$ to
$b$ if $a R b$.

For example, below is a graph that describes the order in which one
would put on clothes. The set consists of articles of clothing, and
the edges say what should be put on before what.

\setlength{\unitlength}{0.012500in}%
                                %
\begingroup\makeatletter\ifx\SetFigFont\undefined
                                % extract first six characters in \fmtname
\def\x#1#2#3#4#5#6#7\relax{\def\x{#1#2#3#4#5#6}}%
\expandafter\x\fmtname xxxxxx\relax \def\y{splain}%
\ifx\x\y   % LaTeX or SliTeX?
\gdef\SetFigFont#1#2#3{%
  \ifnum #1<17\tiny\else \ifnum #1<20\small\else
  \ifnum #1<24\normalsize\else \ifnum #1<29\large\else
  \ifnum #1<34\Large\else \ifnum #1<41\LARGE\else
  \huge\fi\fi\fi\fi\fi\fi
  \csname #3\endcsname}%
\else
\gdef\SetFigFont#1#2#3{\begingroup
  \count@#1\relax \ifnum 25<\count@\count@25\fi
  \def\x{\endgroup\@setsize\SetFigFont{#2pt}}%
  \expandafter\x
  \csname \romannumeral\the\count@ pt\expandafter\endcsname
  \csname @\romannumeral\the\count@ pt\endcsname
  \csname #3\endcsname}%
\fi
\fi\endgroup
\begin{picture}(220,75)(20,760)
  \thinlines
  \put(165,817){\vector( 0,-1){ 20}}
  \put(240,817){\vector( 0,-1){ 20}}
  \put(240,787){\vector( 0,-1){ 20}}
  \put(165,787){\vector( 0,-1){ 20}}
  \put( 25,797){\vector( 0,-1){ 25}}
  \put( 85,797){\vector( 0,-1){ 25}}
  \put(180,787){\vector( 3,-1){ 51}}
  \put(178,765){\vector( 1, 0){ 53}}
  \put(153,795){\vector(-4,-1){88}}
  \put(155,790){\makebox(0.1111,0.7778){\SetFigFont{5}{6}{rm}.}}
  \put(153,790){\vector(-2,-1){ 32}}
  \put( 20,800){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}left sock}}}
  \put( 20,765){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}left shoe}}}
  \put( 80,800){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}right sock}}}
  \put(235,820){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}shirt}}}
  \put(155,820){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}underwear}}}
  \put(235,790){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}sweater}}}
  \put(235,760){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}jacket}}}
  \put(155,760){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}belt}}}
  \put(155,790){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}pants}}}
  \put( 80,765){\makebox(0,0)[lb]{\smash{\SetFigFont{12}{14.4}{rm}right shoe}}}
\end{picture}

The ``depends on'' graph imposes an ordering on tasks.  But is it a
partial order?  No, because it isn't reflexive or transitive.  But
there is a natural extension: the reflexive transitive closure
\emph{is} a partial order.  It gives the relation ``must be done
before.''

Wait, we know the reflexive transitive closure is reflexive and
transitive, but is it antisymmetric?
What if I add a relation edge from belt to underwear?  In that case my
dependency graph stops making sense: there is no way to get dressed!

What goes wrong?  A cyclic dependency.

\begin{definition}
A \emph{cycle} is a path that ends where it started (i.e., the last
vertex equals the first).
\end{definition}

\begin{definition}
A \emph{directed acyclic graph (DAG)} is a directed graph with no cycles.
\end{definition}

\begin{lemma}
Any partial order, with its self-loops removed, is a DAG.
\end{lemma}
\begin{proof}
  Suppose a partial order $R$ has a cycle $a_1,\ldots,a_k,a_1$ with
  $a_1 \neq a_k$.  Then by transitivity of $R$ (with an induction
  hiding inside) we have $a_1 R a_k$.  We also have $a_k R a_1$.  This
  violates the antisymmetry of $R$, a contradiction.
\end{proof}

\begin{lemma}
The transitive reflexive closure of a DAG is a partial order.
\end{lemma}
\begin{proof}
  Let the DAG be $R$ and its reflexive transitive closure $S$.  We
  already proved that $S$ is the set of pairs $(a,b)$ that are
  connected by a path in $R$.  $S$ is reflexive and transitive; we
  just need to prove it is antisymmetric.  We do so by contradiction.
  Suppose $S$ is not antisymmetric.  Then for some $a \neq b$, we have
  $aSb$ and $bSa$.
  In other words, there is a path from $a$ to $b$ and a path from $b$
  to $a$.  If we concatenate these two paths, we get a cycle from $a$
  back to $a$.  This contradicts the assumption that $R$ is acyclic.
\end{proof}
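The two lemmas can be checked mechanically on a small example.  Here is a
Python sketch (all names hypothetical) that builds the reflexive transitive
closure with Warshall's algorithm and then tests whether the result is
antisymmetric:

```python
# Sketch (names hypothetical): Warshall's algorithm for the reflexive
# transitive closure, plus an antisymmetry check.  The closure of a
# DAG passes the check; add a cycle and it fails.

def reflexive_transitive_closure(vertices, edges):
    reach = {(a, b): a == b for a in vertices for b in vertices}
    for e in edges:
        reach[e] = True
    for k in vertices:                       # Warshall: grow paths via k
        for a in vertices:
            for b in vertices:
                if reach[(a, k)] and reach[(k, b)]:
                    reach[(a, b)] = True
    return {pair for pair, related in reach.items() if related}

def is_antisymmetric(vertices, relation):
    return not any(a != b and (a, b) in relation and (b, a) in relation
                   for a in vertices for b in vertices)

V = ["underwear", "pants", "belt", "jacket"]
E = [("underwear", "pants"), ("pants", "belt"), ("belt", "jacket")]
S = reflexive_transitive_closure(V, E)
print(is_antisymmetric(V, S))    # True: the closure is a partial order

# The belt -> underwear edge from the text creates a cycle.
S2 = reflexive_transitive_closure(V, E + [("belt", "underwear")])
print(is_antisymmetric(V, S2))   # False: aSb and bSa for distinct a, b
```

With the cycle added, the closure relates belt and underwear in both
directions, which is exactly the antisymmetry failure in the proof.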


\subsubsection{Hasse Diagrams}

If we draw the graph of a poset, we may get \emph{lots} of edges
(consider the $\le$ order on some natural numbers).  But many of
those edges are implicit in the transitivity of the order.
We can get a much less messy picture by leaving out these arrows.

Consider the clothing example.  Under the transitive closure,
underwear precedes belt.  But we don't need to see that edge to know
it is there.
Similarly, loops (which express reflexive relationships) aren't shown.

\begin{defin}
  A {\bf Hasse diagram\/} for a poset $(A,\preceq)$ is a graph: \\
  - whose vertices are the elements of $A$, \\
  - whose edges (arrows) represent pairs $(a,b)$, where $a \preceq b$ but
  $a \neq b$, \\
  - that contains no ``transitive edges'', \\
  - for which all pairs in $\preceq$ are obtainable by transitivity from
  edges in the diagram.
\end{defin}

It's the ``transitive reduction'' of the poset.

To convert the diagram above to an ``official form'' Hasse diagram,
we must remove the pants $\rightarrow$ jacket arrow. 
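Pruning the transitive edges can be automated.  Below is a hypothetical
Python sketch: an edge $(a,b)$ is dropped exactly when $b$ is still
reachable from $a$ without using that edge.

```python
from collections import defaultdict

# Sketch (hypothetical helper): compute the Hasse-diagram edges of a
# finite poset given its strict-order edges as a DAG.  An edge (a, b)
# is a "transitive edge" if b is reachable from a without using it.

def hasse_edges(edges):
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)

    def reachable(a, b, skip):
        # Depth-first search from a to b that ignores the edge `skip`.
        stack, seen = [a], set()
        while stack:
            v = stack.pop()
            for w in succ[v]:
                if (v, w) == skip or w in seen:
                    continue
                if w == b:
                    return True
                seen.add(w)
                stack.append(w)
        return False

    return {(a, b) for a, b in edges if not reachable(a, b, (a, b))}

# pants -> jacket is implied by pants -> belt -> jacket, so it goes.
E = {("pants", "belt"), ("belt", "jacket"), ("pants", "jacket")}
print(sorted(hasse_edges(E)))  # [('belt', 'jacket'), ('pants', 'belt')]
```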


\subsubsection{Partial vs. Total Orders}

A partial order is called {\em partial \/} because it is not necessary
that an ordering exist between every pair of elements in the set. \\
\\
E.g., Lshoe and Rshoe have no prescribed ordering between them. \\
\\
E.g., For two sets, it's not necessary that either be a subset of the
other. \\
\\
When there is no prescribed order between two elements we say that they
are ``incomparable''. 

\begin{defin}
  $a$ and $b$ are {\bf incomparable\/} if neither $a \preceq b$ nor $b
  \preceq a$.
\end{defin}

\begin{defin}
  $a$ and $b$ are {\bf comparable\/} if $a \preceq b$ or $b \preceq a$.
\end{defin}

E.g., for subsets of $\naturals$, $\{ 1,2,3 \}$ and $\{ 2,3,4 \}$ are
incomparable.  However, a partial order need not have incomparable
elements.  As a special case, we can have a partial order in which
there is a specified order between every pair of elements.  Such
partial orders are called ``total orders.''

\begin{defin}
  A poset $(S, \preceq)$ is {\bf totally ordered\/} if $(\forall a,b
  \in S) [ a \preceq b \vee b \preceq a ]$.
\end{defin}

Examples:
\begin{itemize}
\item $S=\naturals, \preceq = \leq$
\item $S=\naturals, \preceq = |$: NOT a total order, because neither
  $3|5$ nor $5|3$
\item $S=P(\naturals), \preceq = \subseteq$: NOT a total order, because
  neither $\{3\} \subseteq \{5\}$ nor $\{5\} \subseteq \{3\}$
\item Lexicographic order on pairs of natural numbers, defined by:
  $(a_1,a_2) \preceq (b_1,b_2)$ if and only if either $a_1 < b_1$, 
  or else $a_1 = b_1$ and $a_2 \leq b_2$. \\
  Do some examples here. $(2,37) \preceq (2,38) \preceq (3,2)$. \\
  ``Lexicographic'' because like dictionary.
\end{itemize}
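As a quick check of the lexicographic example: Python happens to compare
tuples by exactly the same rule (first coordinates decide, ties fall to the
next coordinate), so the ordering can be tried directly.

```python
# Python tuple comparison is lexicographic, matching the definition
# above: decide on the first coordinates, break ties with the next.
print((2, 37) <= (2, 38))   # True: first coordinates tie, 37 <= 38
print((2, 38) <= (3, 2))    # True: 2 < 3; second coordinates ignored
print((3, 2) <= (2, 38))    # False
```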

The Hasse diagram of a total order looks like a line. 

This has been the basic material. \\
Book has a bit more: \\
minimal, maximal elements (nothing smaller, nothing larger) \\
lower bounds, upper bounds (for a subset -- $\leq$ everything in the
subset, or $\geq$) \\
lub, glb, \\
lattices. \\
Read these parts. 
\\

\subsection{Topological Sorting}

Sometimes when we have a partial order, e.g., of tasks to be
performed, we want to obtain a consistent total order\.  That is, an
order in which to perform all the tasks, one at a time, so as not to
conflict with the precedence requirements. \\ 
\\
The task of finding an ordering that is consistent with a partial
order is known as ``topological sorting''.
Maybe because the sort is based only on the shape (topology) of the
poset, and not on the actual values.

\begin{defin}
  A {\bf topological sort\/} of a finite poset $(A, \preceq)$ is a
  total ordering of all the elements of $A$, $a_1, a_2, \cdots, a_n$
  \\ in such a way that $\forall i < j$, either $a_i \prec a_j$ or
  $a_i$ and $a_j$ are incomparable in $(A, \preceq)$ (equivalently, it
  is not the case that $a_j \prec a_i$).
\end{defin}

For example, underwear, shirt, Lsock, Rsock, pants, sweater, Lshoe, Rshoe,
belt, jacket is a topological sort of how to dress. \\
\\
One of the nice facts about posets is that such an ordering always
exists and is even easy to find:

\begin{theorem}
  Every finite poset has a topological sort.
\end{theorem}

We'll prove this by induction. \\
On the number of elements in the domain of the poset. \\
The basic idea is to pick a ``smallest'' element to start and then
proceed recursively.

\begin{defin}
  A {\bf minimal\/} element $a$ in a poset $(A, \preceq)$ is one for
  which $(\forall b \in A) [ a \preceq b \vee a \mbox{ and } b \mbox{ are
    incomparable }]$.  Equivalently, it is an element $a$ for which no $b
  \prec a$.
\end{defin}

Notice that there can be more than one minimal element. There are 4 in
the clothes example: Lsock, Rsock, underwear, shirt.

\begin{lemma}
  Every finite poset $(A,\preceq)$ has a minimal element.
\end{lemma}
\begin{proof}
By induction on the number of elements in the poset.  The base case
of a one-element poset is clear.  Suppose the claim is true for all
size-$n$ posets.  Consider any size-$(n+1)$ poset.  Delete some
element $a$ from the poset.  This leaves a poset on $n$ elements (you
can verify this).  So the smaller poset has a minimal element $m$.

Now consider two cases.  First suppose $a\not\prec m$.  Then $m$ is a
minimal element of the whole poset.  Now suppose $a \prec m$.  Then I
claim $a$ is a minimal element of the whole poset.

To prove this claim we argue by contradiction.  If $a$ is not minimal,
then some $b \prec a$.  But then, by transitivity, since $a\prec m$,
we have $b \prec m$.  (Note $b \neq a$, so $b$ survives the deletion
of $a$, and $b \neq m$ by antisymmetry.)  But this would prevent $m$
from being minimal in the poset with $a$ removed.  This contradicts
the minimality of $m$.
\end{proof}


Note that an infinite poset might have no minimal element.  Consider
$\integers$.  


Now we can prove the existence of the topological sort.

\begin{proof}
  By (ordinary) induction. \\
  $P(n)$ = any poset with $n$ elements has a topological sort. 
  \\
  Base case $(P(1))$: for a poset with one element the one element makes a
  topological sort. \\
  \\
  Inductive step: Assume $(P(n))$. Consider a poset of $n+1$ elements. \\
  By lemma it has a minimal element $u$. \\
  Consider the same relation on $A-\{u\}$. \\
  It is easy to check that it is still reflexive, transitive, and
  antisymmetric. \\
  So by the inductive hypothesis, $A-\{u\}$ has a topological sort. \\
  We put $u$ first and now have a topological sort on the whole thing. \\
  (According to the definition of topological sort -- $u$ can't be in
  the wrong order w.r.t. any of the later elements.) \\
  \\
  Thus every finite poset has a topological sort, by induction.
\end{proof}
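The induction is effectively an algorithm: repeatedly find a minimal element
of what remains and emit it.  A small Python sketch (names hypothetical; the
poset is given by its set of strict pairs):

```python
# Sketch of the proof's recipe: repeatedly pull out a minimal element.
# `precedes` holds the strict pairs (a, b) with a before b.

def topological_sort(elements, precedes):
    remaining, order = set(elements), []
    while remaining:
        # A minimal element: nothing remaining strictly precedes it.
        # One always exists, by the lemma on finite posets.
        m = next(a for a in remaining
                 if not any((b, a) in precedes for b in remaining))
        order.append(m)
        remaining.remove(m)
    return order

clothes = {"underwear", "pants", "belt", "jacket", "shirt"}
prec = {("underwear", "pants"), ("pants", "belt"),
        ("belt", "jacket"), ("shirt", "jacket")}
order = topological_sort(clothes, prec)
# Every constrained pair appears in the right order.
print(all(order.index(a) < order.index(b) for (a, b) in prec))  # True
```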

E.g.: Dressing example: \\
Construct ordering by picking one at a time. \\
In each case, look at the remaining partial order for remaining elements. \\
Lsock, shirt, sweater, Rsock, underwear, pants, Lshoe, belt, jacket, Rshoe

\remove{
E.g., subsets of $\{ 1,2,3,4 \}$ \\
$\emptyset$ is the unique minimal element, then have choices, e.g., do: \\
$\{ 1 \}$, $\{ 2 \}$, $\{ 1,2 \}$ \\
$\{ 3 \}$, $\{ 1,3 \}$, $\{ 2,3 \}$, $\{ 1,2,3 \}$, \\
$\{ 4 \}$, $\{ 1,4 \}$, $\{ 2,4 \}$, $\{ 3,4 \}$, \\
$\{ 1,2,4 \}$, $\{ 1,3,4 \}$, $\{ 2,3,4 \}$, $\{ 1,2,3,4 \}$
}%end remove

In fact, every partial order (even an infinite one) extends to a total
order; this is Szpilrajn's extension theorem.  For infinite posets,
though, the proof uses a form of the axiom of choice, a hotly debated
mathematical axiom.

\subsection{Parallel Task Scheduling}

When elements of a poset are tasks that need to be done and the partial
order is precedence constraints, topological sorting provides us with
a legal way to execute tasks sequentially, i.e., without violating any
precedence constraints. \\
But what if we have the ability to execute more than one task at the
same time? \\
For example, say tasks are programs, partial order indicates data
dependence, and we have a parallel machine with lots of processors
instead of a sequential machine with only one. \\
How should we schedule the tasks? \\
Our goal should be to minimize the total {\em time\/} to complete all
the tasks. \\
(For simplicity, say all the tasks take the same amount of time and
all the processors are identical.)
\\
So: \\
Given a finite poset of tasks, how long does it take to do them all,
in an optimal parallel schedule? \\
Assume that all tasks take 1 unit of time and we have an unlimited number
of processors. \\
\\
Use partial order concepts to analyze this problem. \\
\\
On clothes example, we could do all the minimal elements first (Lsock,
Rsock, underwear, shirt), remove them and repeat. \\
Need lots of hands, or maybe dressing servants. \\
(So do pants and sweater next, and then Lshoe, Rshoe, and belt, and
finally jacket.) \\
\\
Can't do any better, because the sequence underwear, pants, belt, jacket
must be done in that order. \\
A sequence like this is called a chain.

\begin{defin}
  A {\bf chain\/} in a poset is a sequence of elements of the domain,
  each of which is smaller than the next in the partial order 
  ($\preceq$ and $\neq$). 
\end{defin}

Clearly, parallel time $\geq$ length of any chain. \\
This is true, because if we used less time, then by pigeonhole two
tasks in the chain would have to be done at the same time, but by
definition of chain this violates precedence constraints. \\
\\
A longest chain is also known as a {\em critical path}. \\
\\
So we need at least $t$ steps, where $t$ is the length of the longest
chain. \\
Fortunately, it is always possible to use only $t$:

\begin{theorem}
  Given any finite poset $(A, \preceq)$ for which the longest chain
  has length $t$, it is possible to partition $A$ into $t$ subsets,
  $A_1, A_2, \cdots, A_t$ such that 
  $(\forall i \in [1,t]) (\forall a \in A_i) 
  (\forall b \preceq a, b \neq a) 
  [ b \in A_1 \cup A_2 \cup \cdots \cup A_{i-1} ]$.
\end{theorem}

That is, can divide up the tasks into $t$ ``groups'' so that for
each group $i$, all tasks that have to precede tasks in $i$ are in
smaller-numbered groups. 

\begin{corollary}
  For $(A, \preceq)$ and $t$ as above, it is possible to schedule all
  tasks in $t$ steps.
\end{corollary}

\begin{proof}
  $\forall i$, schedule all elements of $A_i$ at time $i$. \\
  This satisfies the precedence requirements, because all tasks that
  must precede a task are scheduled at preceding times. \\
\end{proof}

\begin{corollary}
  parallel time = length of longest chain
\end{corollary}

So it remains to prove the partition theorem:

\begin{proof}
  Use the rule: \\
  Put each $a$ in set $A_i$, where $i$ is the length of the longest chain
  ending at $a$. \\
  \\
  See that this works: \\
  It gives just $t$ sets, because the longest chain has length $t$. \\
  \\
  Need to show $(\forall i) ( \forall a \in A_i)$ \\
  $(\forall b \preceq a, b \neq a) 
  [ b \in A_1 \cup A_2 \cup \cdots \cup A_{i-1} ]$. \\
  \\
  Proof by contradiction: \\
  If $b \not\in A_1 \cup A_2 \cup \cdots \cup A_{i-1}$ then there is a chain
  of length $\geq i$ ending in $b$, \\
  which means there is a chain of length $\geq i+1$ ending in $a$, \\
  which means $a \not \in A_i$. \\
  This is a contradiction.
\end{proof}
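The partition rule in the proof translates directly into code.  In this
hypothetical Python sketch, each task's group number is the length of the
longest chain ending at it; tasks in the same group are incomparable and can
run in parallel.

```python
# Sketch of the partition rule: put each task in group A_i, where i is
# the length of the longest chain ending at that task.

def schedule_levels(elements, precedes):
    level = {}
    def longest_chain_to(a):
        # Number of elements in the longest chain ending at a.
        if a not in level:
            below = [b for b in elements if (b, a) in precedes]
            level[a] = 1 + max((longest_chain_to(b) for b in below),
                               default=0)
        return level[a]
    for a in elements:
        longest_chain_to(a)
    t = max(level.values())
    return [[a for a in elements if level[a] == i] for i in range(1, t + 1)]

tasks = ["underwear", "shirt", "pants", "sweater", "belt", "jacket"]
prec = {("underwear", "pants"), ("pants", "belt"), ("belt", "jacket"),
        ("shirt", "sweater"), ("sweater", "jacket"), ("pants", "jacket")}
for i, group in enumerate(schedule_levels(tasks, prec), start=1):
    print(i, sorted(group))   # 4 time steps: the longest chain has length 4
```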

So with an unlimited number of processors, the time to complete all
the tasks is the length of the longest chain.  \\
\\
It turns out that this theorem is good for more than parallel
scheduling. It is usually stated as follows.

\begin{corollary}
  If $t$ is the length of the longest chain in a poset $(A, \preceq)$
  then $A$ can be partitioned into $t$ antichains.
\end{corollary}

\begin{defin}
  An {\bf antichain\/} is a set of incomparable elements.
\end{defin}

This is a corollary because we showed before how to partition the tasks
into sets that could be done at the same time, which means that these
were incomparable. (If any two comparable then one would have to
precede other and so couldn't be done at same time.)

Review:
\begin{itemize}
\item {\bf chain\/}: all elements comparable (total order)
\item {\bf antichain\/}: all elements incomparable
\end{itemize}

\subsection{Dilworth's Theorem}

We can use this to prove a famous result about posets:

\subsubsection{The Basics}

\begin{theorem}
  (Dilworth) $\forall t$, every poset with $n$ elements must have a
  chain of size $ > t$ or an antichain of size $\geq \frac{n}{t}$.
\end{theorem}

Ex: Dressing poset.  $n = 10$.
Try $t = 3$.  Has chain of length $4$. \\
Try $t = 4$.  Has no chain of length $5$, but has antichain of size 
$4 \geq \frac{10}{4}$. \\

\begin{proof}
  Assume there is no chain of length $> t$. \\
  So, longest chain has length $\leq t$. \\
  Then by previous result, the $n$ elements can be partitioned into
  $\leq t$ antichains. \\
  (Cannot assume exactly $t$, since could have some empty.) \\
  So, by pigeonhole, there is an antichain with $\geq \frac{n}{t}$
  elements. \\
  As needed.
\end{proof}

For instance: 

\begin{corollary}
  Every poset with $n$ elements has a chain of length $> \sqrt{n}$
  or an antichain of size $\geq \sqrt{n}$.
\end{corollary}

Set $t= \lfloor\sqrt{n}\rfloor$ and this is immediate.
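The corollary's dichotomy can be observed computationally.  This Python
sketch (function name hypothetical) reuses the longest-chain grouping from
the scheduling section: if no chain is longer than $\sqrt{n}$, pigeonhole
forces a large group, and each group is an antichain.

```python
import math

# Sketch of the corollary: an n-element poset has a chain of length
# > sqrt(n) or an antichain of size >= sqrt(n).

def dilworth_witness(elements, precedes):
    level = {}
    def depth(a):
        # Length of the longest chain ending at a.
        if a not in level:
            below = [b for b in elements if (b, a) in precedes]
            level[a] = 1 + max((depth(b) for b in below), default=0)
        return level[a]
    for a in elements:
        depth(a)
    n, t = len(elements), max(level.values())
    if t > math.sqrt(n):
        return ("chain", t)
    # Pigeonhole: n elements fall into <= t <= sqrt(n) groups, so some
    # group (an antichain) has >= n/t >= sqrt(n) elements.
    sizes = [sum(1 for a in elements if level[a] == i)
             for i in range(1, t + 1)]
    return ("antichain", max(sizes))

# Nine pairwise-incomparable elements: no long chain, big antichain.
print(dilworth_witness(list("abcdefghi"), set()))       # ('antichain', 9)
# A four-element chain: n = 4 and 4 > sqrt(4).
chain_prec = {("a", "b"), ("b", "c"), ("c", "d")}
print(dilworth_witness(["a", "b", "c", "d"], chain_prec))  # ('chain', 4)
```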

Posets arise in all sorts of contexts, and when they do Dilworth's
theorem can have interesting implications. 

\remove{
\subsubsection{Increasing and Decreasing Sequences}

A mathematical example: \\
In any sequence of $n$ different numbers, there is either an
increasing subsequence of length $> \sqrt{n}$ or
a decreasing subsequence of length $\geq \sqrt{n}$. \\
\\
E.g., 6 4 7 9 1 2 5 3 8 has the decreasing sequence 6 4 1 and
the increasing sequence \\ 1 2 3 8.
That is, it has both. \\
\\
We can get this using Dilworth's theorem; the trick is to define the 
appropriate poset. \\
\\
The domain is the set of values in the sequence. \\
\\
For the ordering, define $a \preceq b$ if either $a = b$, or else
($a < b$ and $a$ comes before $b$ in the sequence). \\
\\
Check this is reflexive (stated explicitly), 
transitive, and antisymmetric. \\
\\
A chain corresponds to a sequence that increases in value and moves to
the right, that is, an increasing sequence. \\
\\
But what does an antichain correspond to?

\begin{lemma}
  If $a,b$ are incomparable and $a > b$ then $a$ precedes $b$ in the
  sequence. 
\end{lemma}

\begin{proof}
  By contradiction. \\
  Assume $b$ precedes $a$. \\
  Then $b < a$ and $b$ precedes $a$, so $b \preceq a$ in the partial
  order. \\
  This contradicts our assumption that $a$ and $b$ are incomparable.
\end{proof}

Extend this to more than 2 elements:

\begin{lemma}
  If $a_1, a_2, \cdots , a_t$ are incomparable and $a_1 > a_2 > \cdots
  > a_t$ then $a_1, a_2, \cdots a_t$ form a decreasing subsequence.
\end{lemma}

\begin{proof}
  $(\forall i)$,
  the fact that $a_i > a_{i+1}$ implies that
  $a_i$ precedes $a_{i+1}$ in the sequence, by the previous lemma.
\end{proof}

So given an antichain, arrange the elements so that they are in
decreasing order and the ``left of'' relation follows, giving a
decreasing subsequence.

Dilworth's theorem implies that there is either a chain of size 
$> \sqrt{n}$ or an antichain of size $\geq \sqrt{n}$. \\
By the analysis just above,
this yields either an increasing sequence of length $> \sqrt{n}$
or a decreasing subsequence of length $\geq \sqrt{n}$. 
}%end remove

\subsubsection{Number of Comparisons for Database Setup and Searching}

Consider a ``database'' of records with (distinct) numerical keys. \\
Suppose when records are inserted, some comparisons of these keys
occur, and the database maintains a record of these comparisons. \\
\\
E.g., at one extreme, database could be kept totally sorted. \\
E.g., totally unsorted. \\
\\
Each comparison costs something -- the time to perform it. \\
\\
Now suppose that after this setup, a search is to be performed on the
database to determine if a record with a particular key $k$ appears
anywhere in the database.  \\
The search is supposed to be based on comparisons only. \\
Then the worst-case number of comparisons it takes depends on the
database partial order. \\
\\
E.g., if db is totally sorted, can compare with middle elements, then
middle of halves, etc.  \\
Binary search.  \\
Approximately log n comparisons needed. \\
\\
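The sorted case can be made concrete.  A minimal binary-search sketch in
Python (names hypothetical) that also counts its three-way key comparisons:

```python
# Sketch: binary search over a totally sorted database, counting the
# (three-way) key comparisons it performs.

def search_sorted(keys, k):
    """Return (found, number_of_comparisons_used)."""
    lo, hi, comparisons = 0, len(keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1          # one three-way comparison of k vs keys[mid]
        if keys[mid] == k:
            return True, comparisons
        elif keys[mid] < k:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, comparisons

db = list(range(0, 2000, 2))           # 1000 sorted keys
found, used = search_sorted(db, 1001)  # an absent key
print(found, used)                     # about log2(1000), i.e. roughly 10
```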
E.g., if db totally unsorted, then in the worst case, have to compare
with every element.  \\
This is because each comparison with an element in the db gives no
information about whether the sought element might be any of the
others in the db. \\
\\
In fact, if there is an antichain of size $k$, then in the worst case,
the new key could be any of the elements of the antichain. \\
So in the worst case, have to compare with all $k$, since each one
gives no information about the others. \\
\\
See what Dilworth's theorem says about the database partial order. \\
For any $t$, either have a chain of length $> t$, or an antichain of
size $\geq \frac{n}{t}$.  \\
Setting up a chain of length $> t$ requires at least $t$ comparisons. \\
The antichain requires at least $\frac{n}{t}$ comparisons. \\
So this says that we have to pay the cost sometime -- tradeoff --
either we do $> t$ comparisons while setting up the db, or at
least $\frac{n}{t}$ in the worst case to search the db. \\
\\
More refined analysis is possible. \\
That's because reliably setting up a long chain, say length $t$, in
the database actually requires {\em more} than $t$ comparisons.  \\
Because in the worst case, the comparisons won't come out in the most
desirable way, setting up a linear chain. \\
Ex: Compare $a$ and $b$, find out $a > b$.  Then compare $b$ and $c$.
Would be nice if we found out $b > c$.  But we might not -- might get
$c > b$.  Then we don't have a chain of length $3$, but still an
antichain of size $2$.  To get a chain, need an additional comparison. \\
\\
Return to this in analysis of algorithms.

\subsection{Logical Time in Distributed Systems}

One last use of p.o.  \\
No use of powerful theorems here -- just showing an important place
where a p.o. is used as a mathematical model for a real computational
situation.  \\
\\
A distributed computation consists of a sequence of ``steps''
occurring at each of a collection of $n$ ``processes''. \\
In addition to performing local steps, the processes can send messages
to each other. \\
So, have $\ms{send}$ steps and $\ms{receive}$ steps. \\
\\
The processes are mostly working independently. \\
The only interaction is by means of the messages. \\
So, the only dependence among steps (feeding data from one to another,
using results to make decisions, etc.) is by: \\
\begin{enumerate}
\item
  Successive steps of same process.
\item
  Send vs. receive of same message.
\end{enumerate}
Taking reflexive transitive closure leads to partial order describing
the dependence among the computation steps. \\
\\
Basic theorem:  Any topological sort of this p.o. describes a possible
sequence in which all the distributed steps could occur. \\
(No changes in what happens based on reordering incomparable steps.) \\
\\
Example of what one might do with this p.o.: \\
Assign ``logical time'' to every step. \\
Instead of an actual real time. \\
The time is just any assignment of real numbers to the steps that is
consistent with the p.o.:  when one step precedes another in the p.o.,
it must have a smaller logical time. \\
\\
Behaves like real time:  Formally, can reorder the steps in logical
time order, still consistent with the p.o.\\
Then the system is behaving as if the logical time were actually real
time. \\
\\
Can be maintained by simple distributed algorithm. \\
(Piggybacks time on message, adjusts local times in response to what
is heard in some messages.) \\
\\
Although logical time isn't the same as actual real time, for many
purposes, it can be used as an adequate substitute. \\
Logical time can be used, e.g., to timestamp files to keep updated
copies consistent.
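The ``simple distributed algorithm'' alluded to above is often called a
Lamport clock.  A minimal Python sketch, assuming one counter per process
and a stamp piggybacked on each message:

```python
# Sketch of logical time (a Lamport clock): each process keeps a
# counter, stamps each step, and on receiving a message advances its
# counter past the stamp piggybacked on that message.

class Process:
    def __init__(self):
        self.clock = 0

    def local_step(self):
        self.clock += 1
        return self.clock        # logical time of this step

    def send(self):
        self.clock += 1
        return self.clock        # stamp piggybacked on the message

    def receive(self, stamp):
        self.clock = max(self.clock, stamp) + 1
        return self.clock

p, q = Process(), Process()
t1 = p.local_step()      # p: time 1
t2 = p.send()            # p: time 2; the message carries stamp 2
t3 = q.receive(t2)       # q: max(0, 2) + 1 = 3
print(t1 < t2 < t3)      # True: dependent steps get increasing times
```

Incomparable steps may receive times in either order, which is exactly why
any topological sort of the dependence partial order is a possible execution.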

\end{document}

\endinput

Problems based on this lecture:

1. (Some basic p.o. defs -- minimal and maximal elements, lower and
upper bounds, etc.)
Section 6.6, Problem 20
and/or Section 6.2, Problem 22

2. (Parallel task scheduling)
Recall the theorem from class that says that if a task graph is 
a partial order with maximum chain length $t$, then the
whole set of tasks can be performed by an unlimited collection of
parallel processors in time $t$.
This theorem was only for the case where all tasks take the same
amount of time, say $1$ time unit.
(a) Apply this theorem to produce an optimal schedule for the
following task graph.
[[[give some graph.  Don't need to make up a cute story -- just some
letters will do]]]
(b) Generalize the theorem to the case where some tasks take $1$ time
unit and others take $2$ time units.
[[[Seems this should work by splitting the time-$2$ tasks into two
tasks, using this to come up with a new length measure for chains,
then applying the old theorem. 
Should take some arguing that everything works, but it seems nice.???]]]

3. This doesn't actually use Dilworth's theorem.
But the students will get to think about chains and antichains.

Suppose a database is stored using a partial order on keys.  
Your job is to design a search strategy to determine if a new key $k$
appears in the database.  
The only thing you are allowed to do in the strategy is to compare two
keys (either two within the database, or the new key $k$ vs. a key in
the database).
The result of such a comparison adds extra edges to the partial order.
The results of earlier comparisons can be used to determine which
comparisons are made later.

For each of the partial orders given below, describe a strategy for
searching for an arbitrary key $k$ in the database.
Your strategy should use as few comparisons as you can achieve in the
worst case.
Say how many comparisons it takes in the worst case, and explain why.

(a) Consists of $2$ antichains $A$ and $B$, each with $5$ elements. 
Each of the $5$ elements of $A$ is known to be less than each of the
$5$ elements of $B$.
[[[Can draw diagram]]]

[[[I get 10 comparisons.  Can achieve by direct comparison of new key
with each element.  Doesn't seem we can do better than that, though --
every key needs to be compared to $k$ directly or indirectly.  If we
assume $k$ is between all the elements of $A$ and all the elements of
$B$, then it seems we get no help from the existing edges.  Re-think
this -- if it looks too hard, we could default to the simpler case
where we are only allowed to compare $k$ to an element of the db.]]]

(b) Consists of $2$ chains, each of length $5$.  No relationships are
known between elements of the $2$ chains, however.
[[[Can draw diagram]]]

[[[Here I get $6$, by treating the two chains separately.]]]

