\documentclass[10pt]{article}
\usepackage{amsfonts}
\usepackage{fullpage}
\usepackage{amssymb}
\begin{document}
\input{preamble.tex}
\setlength{\parindent}{0pt}
\setlength{\parskip}{1ex plus 0.5ex minus 0.2ex}
\lecture{22}{Apr 29, 2009}{Madhu Sudan}{Jessica Yuan}
\section{Last Time}
Last time, we talked about what a propositional proof system is. It's a way of proving that a formula is a tautology (i.e., it's always true). Very quickly, we switched from proving formulas that are always true to formulas that are always false. Without loss of generality, we focused on CNF formulas, since we can transform formulas into CNF form with only a linear blow-up.
\smallskip
A proof system is something that can determine whether a CNF formula $F$ is either
\begin{itemize}
\item unsatisfiable: there exists some proof $\pi$ that is efficiently (poly-time) verifiable deterministically
\item satisfiable: there should be no fake proof that can fool us
\end{itemize}
\smallskip
Why do we care about propositional proof systems?
1) P vs. NP. If there is no proof system that can always provide proofs of polynomial size, this implies P $\neq$ NP.
2) Understand limits of math reasoning
3) SAT-solving. In industrial applications, there are really good algorithms for SAT-solving in linear time, and the best ones are based on a proof system called \emph{resolution}.
\medskip
We then looked at some examples of proof systems and decided to focus on resolution. This is the proof system where given clauses $C \vee x$ and $D \vee \bar{x}$, we can obtain the clause $C \vee D$. Additionally, we can use a weakening mechanism, where given clause $C$, we can obtain the weaker clause $C \vee D$. This weakening mechanism is not necessary for proving unsatisfiability, but it is often useful.
\smallskip
We looked at some examples of formula families, such as Graph Tautology, and decided to focus on Pigeonhole Principle. Pigeonhole Principle ($PHP^m_n$) lets us show that if there are $m$ pigeons and $n$ holes, where $m > n$, there must be two pigeons that share a hole. The unsatisfiable formula consists of two parts:
\begin{itemize}
\item Each pigeon must be in a hole: $\bigwedge_{i \in [m]} \bigvee_{j \in [n]} x_ij$
\item No two pigeons share a hole: $\bigwedge_{i_1 \neq i_2\in [m], j \in [n]} (\overline{x_{i_1j}} \vee \overline{x_{i_2j}})$
\end{itemize}
Combined, this gives us the formula:
\[ \bigwedge_{i \in [m]} \bigvee_{j \in [n]} x_ij
\wedge
\bigwedge_{i_1 \neq i_2\in [m], j \in [n]} (\overline{x_{i_1j}} \vee \overline{x_{i_2j}}) \]
Today, we'll focus on how hard it is to prove that this formula is unsatisfiable.
\section{Resolution Proof Size}
We're interested in the size of resolution proofs, i.e. the total number of symbols. We can also count the number of clauses in the proof (i.e. the length of the derivation of the contradiction). This tends to be easier to deal with than size, and because length is linearly related to size, we can get a good idea of the size of a proof given its length. The width of the clauses is also a good tool for studying the length of proofs.
The length of a refutation proof $\pi$ is denoted as $L(\pi)$. The length of the proof showing that a formula $F$ is unsatisfiable is the length of the shortest proof that proves that $F$ is unsatisfiable:
\[ L (F \vdash 0) = \min_{\pi : F \vdash 0} \{L(\pi)\}\]
Note that if $n$ is the number of variables in $F$, $L (F \vdash 0) \lesssim 2^n$.
The width of the proof is similarly denoted as $W (F \vdash 0) = \min_{\pi : F \vdash 0} \{W(\pi)\}$. Note that $W (F \vdash 0) \leq n$.
\medskip
Any resolution proof can be represented by a directed acyclic graph by drawing edges to clauses from the clauses used to derive them. If this graph is a tree, we call it a \emph{tree-like refutation}, and the length of the shortest tree-like refutation is $L_T(F \vdash 0)$. Generally, there is a big difference between tree-like proof lengths and general proof lengths. We can also hit the formula with an arbitrary restriction (e.g. $x$ = 1), and we get the refutation of the restricted formula.
\medskip
We would like to show that $W (F \vdash 0)$ is large $\rightarrow L (F \vdash 0)$ is large. Then, we will use this fact to prove that
\[L(PHP^{n+1}_n \vdash 0) = \exp(\Omega(n))\]
Resolution proofs are extremely well-studied. Resolution is one of the easiest non-trivial proof systems to deal with. Also, the best SAT-solvers tend to be built on resolution.
We want to prove lower bounds on length, but we'll first focus on width. Let's look at a connection between width and length:
If a refutation is narrow (width $\leq W$), then it must also be short. Length $\leq (2 |$Variables in $F|)^W$
More interestingly, Ben-Sasson and Wigderson in 1999 in their paper ``Short Proofs Are Narrow," they presented the idea that if there is a short proof, there must also exist a reasonably narrow proof. (Sidenote: the terms \emph{proof} and \emph{refutation} will be used interchangeably.)
We can quantify this for tree-like refutations:
\begin{theorem}[BW '99] Define $W(F)$ to be the width of the largest clause in formula $F$. Then,
\[W(F \vdash 0) \leq W(F) + log_2 L_T(F \vdash 0)\]
\end{theorem}
\begin{corollary}
$L_T(F \vdash 0) \geq 2^{W(F\vdash0) - W(F)}$
\end{corollary}
But tree-like resolutions are significantly weaker than general resolutions, so we'll want the general case:
\begin{theorem}[BW '99] Let $n$ be the number of variables in formula $F$. Then,
\[W(F \vdash 0) \leq W(F) + \sqrt{8 n * ln( L(F \vdash 0))}\]
\end{theorem}
\begin{corollary}
\[L_T(F \vdash 0) = \exp \left( \frac{(W(F\vdash0) - W(F))^2}{8n} \right) \]
\end{corollary}
We'll use the general theorem, but we'll only be proving the weaker tree-like theorem. The idea behind the proof for the general theorem is the same as for the weaker one, but the calculations are messier.
One way of thinking about this is that the width is upper-bounded by the logarithm of the length in Theorem 1. $n$ is about the logarithm of the worst-case proof. You can think of it as
\[\textrm{width} \lesssim \sqrt{ln(\textrm{worst-case})*ln(\textrm{actual-case})}\]
You might be thinking, maybe we can do better. Unfortunately, this bound is essentially tight, as proven by Bonet and Galesi in 1999.
To get anything interesting, we need a few things:
\begin{itemize}
\item We need $W(F)$ to be small.
\item We need $W(F\vdash 0) = \omega(\sqrt{n * ln(n)})$.
\end{itemize}
Now, we'll prove Theorem 1 and show how to use the general theorem to get the lower bound for the refutation of Pigeonhole Principle. We'll need two lemmas.
\begin{lemma}
If $W(F_{\upharpoonright x} \vdash D) = w$, then $W(F \vdash D \vee \bar{x}) = \max (W(F), w + 1)$.
\end{lemma}
Note that $F_{\upharpoonright x}$ is the formula $F$ with the variable $x$ set to true.
\begin{proof}
Let $F$ = $C_1 \wedge C_2 \wedge ..., C_m$.
Suppose the refutation for $F_{\upharpoonright x} \vdash D$ looks like $\pi = D_1, D_2, ..., D_C$.
The refutation for $F \vdash D \vee \bar{x}$ would involve listing all the clauses in $F$, then list all the $D$ clauses with $\bar{x}$ appended to them: $\pi' = C_1, C_2, ..., C_m, D_1 \vee \bar{x}, D_2 \vee \bar{x}, ..., D_C \vee \bar{x}$.
This is always allowed by the rules of resolution, and we can prove this by cases:
\begin{itemize}
\item Case 1: $D_i \vee \bar{x} \in F$. In this case, $D_i \vee \bar{x}$ is trivially allowed because it was already in $F$.
\item Case 2: $D_i \in F$. In this case, $D_i \vee \bar{x}$ is allowed because we're only weakening the clause $D_i$.
\item Case 3: Otherwise, it must be the case that there exist two clauses $D_j$ and $D_k$ such that $D_j$ and $D_k$ resolve to $D_i$, or else there is no way for $D_i$ to be in the refutation for $F_{\upharpoonright x} \vdash D$. By induction, we know that $D_j \vee \bar{x}$ and $D_k \vee \bar{x}$ are in the refutation for $F \vdash D \vee \bar{x}$. We can thus use the resolution rule to get $D_i \vee \bar{x}$.
\end{itemize}
\end{proof}
\begin{lemma}
If $W(F_{\upharpoonright x} \vdash 0) \leq w - 1$ and $W(F_{\upharpoonright \bar{x}} \vdash 0) \leq w$, then $W(F\vdash 0) \leq \max(w, W(F))$.
\end{lemma}
\begin{proof}
Let's apply Lemma 1 to $F_{\upharpoonright x}$.
\begin{itemize}
\item 1. From $F$, derive $\bar{x}$ in width $\leq \max(W(F), w)$.
\item 2. Resolve every $C \in F$ containing $x$ with $\bar{x}$. This gives us $F_{\upharpoonright \bar{x}}$.
\item 3. Conclude that $W(F_{\upharpoonright \bar{x}} \vdash 0) \leq w$.
\end{itemize}
\end{proof}
Now we want to go back and prove Theorem 1 (Ben-Sasson and Wigderson).
\begin{proof}
We'll prove by induction over $b$ and the number of variables $n$ that if $L_T(F \vdash 0 ) \leq 2^b$, then $W(F\vdash 0) \leq W(F) + b$.
Base cases
\begin{itemize}
\item $n \leq W(F)$. This case is trivial, because the width can never be larger than the number of variables.
\item $b = 0$. The length of refutation is 1, so the refutation must be just 0, which is possible if F contained the 0 clause.
\end{itemize}
Inductive step
Consider the tree-like resolution $L_T(F \vdash 0)$. By induction, $L_T(F \vdash 0) \leq 2^b$.
The last step of this refutation $\pi$ must be the derivation of some variable $x$ and another derivation of $\bar{x}$. Let's call these derivations $\pi_x$ and $\pi_{\bar{x}}$.
$L(\pi) = L(\pi_x) + L(\pi_{\bar{x}}) + 1$. So at least one of these subderivations has length $\leq 2^{b-1}$. Without loss of generality, let's suppose it is $\pi_x$. $L(\pi_x) \leq 2^{b-1}$. This means that $L(F_{\upharpoonright x}) \leq 2^{b-1}$ as well.
Apply the inductive hypothesis. We get
\[ W (F_{\upharpoonright x} \vdash 0) \leq W(F_{\upharpoonright x}) + b - 1 \leq W(F) + b - 1\]
We also know that $L(\pi_{\bar{x}} \leq 2^b$ and $L(F_{\upharpoonright \bar{x}}) \leq 2^b$. Applying the inductive hypothesis there, we get
\[ W (F_{\upharpoonright \bar{x}} \vdash 0) \leq W(F_{\upharpoonright \bar{x}}) + b \leq W(F) + b\]
By Lemma 6, we conclude that $W(F \vdash 0) \leq W(F) + b$.
\end{proof}
Note that in each step in our induction, the length sort of doubles, so at the end, we have a really narrow proof, but it might be of exponential length. In fact, for tree-like resolution, we know that this length blow-up is necessary. It's an open question as to whether this is necessary for general resolution.
Why doesn't this work in the general case? It fails at $L(\pi) = L(\pi_x) + L(\pi_{\bar{x}}) + 1$, because generally, the two derivations $\pi_x$ and $\pi_{\bar{x}}$ will share some steps and clauses. But the general idea of hitting the formula with $x = 0$ and $x = 1$ is still the same. However, it's trickier to find a good $x$, the calculations are messier, and the bounds are not as good.
Now we can finally prove the lower bound for $PHP$ resolution.
As a reminder, $PHP^m_n$ consists of clauses
\begin{itemize}
\item $P^i = \bigvee_{j \in [n]} x_ij$ ``pigeon $i$ must be in some hole"
\item $H^{i_1, i_2}_j = \overline{x_{i_1, j}} \vee \overline{x_{i_2,j}}$ ``hole $j$ does not hold both pigeons $i_1$ and $i_2$"
\end{itemize}
When $m > n$, $PHP$ is unsatisfiable. We'll focus on the hardest case to prove, which is when $m = n + 1$.
\begin{theorem}[Haken '85]
$L(PHP^{n+1}_n \vdash 0) = \exp(\Omega(n))$
\end{theorem}
We want to use the Ben-Sasson and Wigderson machinery described in Theorem 3, but $PHP$ currently does not satisfy the requirements that $W(PHP)$ is small and that $W(PHP\vdash 0) = \omega(\sqrt{n * ln(n)})$. To get around this, we want to make the formula ``sparser."
We can do this by thinking about PHP as a bipartite graph $G = (U \vee V, E)$, where $U$ is the set of the $m$ vertices on the left corresponding to the $m$ pigeons, $V$ is the set of the $n$ vertices on the right corresponding to the $n$ holes, and $E$ is the set of edges connecting a left vertex (pigeon) to a right vertex (hole). Plus, we define $N(u)$ to be the set of neighbors a pigeon vertex $u \in U$ has. Likewise, for a set of pigeon vertices $U$, $N(U)$ is the set of all vertices that are a neighbor to at least one vertex in $U$. Note that all vertices in $N(u)$ or $N(U)$ are hole vertices.
Each edge $x_{u,v}$ in the graph connects a pigeon vertex $u$ with a hole vertex $v$. $x_{u,v}$ is set to true if pigeon $u$ goes into hole $v$. Then, we can say that the graph is satisfiable if there is an assignment of edges such that $PHP(G)$ is satisfied:
\[PHP(G) = \bigwedge_{u \in U} \bigvee_{v \in N(u)} x_{u,v} \wedge \bigwedge_{v \in V} \bigwedge_{u \neq u' \in N(v)} \overline{x_{u,v}} \vee \overline{x_{u', v}}\]
The simplest way of thinking about PHP is to consider $G$ as a complete bipartite graph. However, to prove a lower bound on the proof complexity of PHP, it is not necessary to use a complete bipartite graph. In fact, if $G' = (U \vee V, E')$ has $E' \subseteq E$, then $L(PHP(G') \vdash 0) \leq L(PHP(G) \vdash 0)$. In other words, if we can find a lower bound for a sparser version of $G$, the lower bound also applies to $G$ itself.
Suppose $G$ has constant left degree $d \geq 2$. Then $PHP(G)$ is a $d$-CNF formula with $d*m$ variables, meaning it has a small width and small number of variables. This would satisfy the requirements for using the Ben-Sasson and Wigderson machinery.
The problem is, we want to make the graph sparser, but maintain the difficulty of the Pigeonhole Problem in order to obtain a high lower bound. The reason why the Pigeonhole Problem is so difficult is that it's a global problem. All the pigeon variables have to be observed. If there are $n$ pigeons and $n$ holes, the problem is satisfiable, but once there are $n+1$ pigeons, it is not.
So what we want is a sparse graph with good connectivity, which hopefully makes the graph harder to satisfy and gives us a better lower bound. This type of graph is called an \emph{expander}.
$G = (U \vee V, E)$ is a $(d, s, e)$-expander if
\begin{itemize}
\item the left degree of the graph $\leq d$ ($d$ is the outdegree bound of the pigeon vertices)
\item $\forall U' \subseteq U$ $|U'| \leq s \rightarrow |N(U')| \geq e|U'|$
\end{itemize}
Basically, what this definition says is that for all subsets $U'$ of $U$, the set of pigeon vertices, up to a limiting size $s \geq |U'|$, the number of hole vertex neighbors that this subset $U'$ has is greater than some multiplicative expansion factor $e$ of the size of the subset $|U'|$. In other words, if you look at a small number of pigeons, there's $e$ times that number of holes that they can go into.
\medskip
Actually, we'll need something stronger called a \emph{unique neighbor expander}.
$G$ is a $(d,s,e)$-unique neighbor expander if
\begin{itemize}
\item the left degree of the graph $\leq d$
\item $\forall U' \subseteq U$ $|U'| \leq s \rightarrow |\delta(U')| \geq e|U'|$ where $v \in \delta(U')$ if $|N(v) \cap U'| = 1$.
\end{itemize}
This definition is slightly different from the definition for the general expander. $e$ is no longer the expansion factor of the neighbor vertices, but rather the \emph{boundary} vertices. These are the vertices in $N(U')$ that only have one neighbor in $U'$, hence the name ``unique neighbor." In other words, these are the holes that only allow one certain pigeon from $U'$, i.e. no two pigeons are fighting over any hole in this set of boundary vertices.
Proposition: Any $(d, s, k)$-expander is a $(d, s, 2k - d)$-unique neighbor expander.
Now, we'll introduce and prove the following two lemmas and then use them in conjuction with Ben-Sasson and Wigderson to prove a lower bound for PHP.
\begin{lemma}[A]
For a $(d,s,e)$-unique neighbor expander where $e \geq 1$, $W(PHP(G) \vdash 0) \geq (s*e)/2$.
\end{lemma}
\begin{lemma}[B]
There is a $c > 1$ such that $\forall$ sufficiently large $n$, there are $(5, \frac{n}{c}, 1)$-unique neighbor expanders.
\end{lemma}
Lemma B can be proven probabilistically. Just take five neighbors of each node on the left, and with high probability, you'll end up with a unique neighbor expander.
To prove Lemma A, we start by defining a ``progress measure" $\mu$: \{clauses\} $\rightarrow \mathbb{N}$ with the following properties.
\begin{itemize}
\item $\mu($axioms$) \leq 1$
\item $\mu($final empty clause $0) \geq $ large
\item $\mu$ only increases gradually.
\item For a medium progress clause $D_i$, its width $W(D_i) \geq$ large
\end{itemize}
As before, let $\mathbb{H}$ denote all the ``hole" axioms and $P$ denote all the ``pigeon" axioms. $P^u$ refers to the clause that describes which holes pigeon $u$ can go into (e.g. $P^4: x_{4,1} \vee x_{4,3}$ if there are only edges to holes 1 and 3). Then we can define such a progress measure $\mu$:
\[\mu(D) = \min\{|U'| : \bigwedge_{u \in U'} P^u \wedge H \vDash D\}\]
To confirm that this measure satisfies the properties:
\begin{itemize}
\item $\mu($axioms$) \leq 1$. This is satisfied because for any hole axiom $H^v$, $\mu(H^v)$ = 0. For any pigeon axiom $P^u$, $\mu(P^u)$ = 1.
\item $\mu($final empty clause $0) \geq $ large. It turns out that $\mu(0) > s$, and the $s$ we'll be using is the $s$ in the (5, $\frac{n}{c}$, 1)-unique neighbor expander guaranteed to us by Lemma B. By the definition of unique neighbor expanders, if we have $s$ or fewer pigeons, we are guaranteed to have at least that many unique neighbors. So, we must be considering more than $s$ pigeons if we want to achieve a proof of unsatisfiability.
\item $\mu$ only increases gradually. Quantitatively, if $D_j$ and $D_j$ resolve to $D_i$, then $\mu(D_i) \leq \mu(D_j) + \mu(D_k)$. This ensures that there is some $D_i$ with a medium-sized $\mu(D_i)$
\item For a medium progress clause $D_i$, its width $W(D_i) \geq$ large. By the previous requirement, we know that there must be a medium progress clause $D$ such that $\frac{s}{2} \leq \mu(D_i) \leq s$, since each derivation cannot increase the progress measure by more than doubling. We then consider the width of this clause $D$.
\end{itemize}
For this $D$, fix $U'$ of size $\mu(D)$ such that $\bigwedge_{u \in U'} P^u \wedge H \vDash D$.
We then make the claim that $\forall v \in \delta(U')$ $\exists$ variable $x_{u, v}$ in $D$. We can prove this by contradiction.
\begin{proof}
Suppose $\exists v* \in \delta(U')$ such that there is no variable of the form $x_{u, v*}$ in $D$. Because $v* \in \delta(U')$, we know it has a unique neighbor $u* \in U'$. We also know that by the definition of the progress measure $\mu$, if we remove $u*$ from $U'$, we should no longer be able to derive $D$; otherwise, $U'$ would not be the minimal subset.
Thus, it must be the case that there is some truth-value assignment $\alpha$ such that
\begin{itemize}
\item $\alpha(\bigwedge_{i \in U' \backslash \{u*\}} P^i) = 1$
\item $\alpha(\mathbb{H}) = 1$
\item $\alpha(D) = 0$
\end{itemize}
In other words, there should be some truth-value instance where the axioms (minus $u*$'s pigeon axiom) are satisfied but the derived clause $D$ is not. Now, we can manipulate the values $x_{u,v}$ a little. (Recall that $x_{u,v}$ is true iff pigeon $u$ goes into hole $v$.)
First, we can set $x_{u, v*}$ to false, for all $u$. This won't affect $\alpha(D)$, because by our assumption, there is no variable of the form $x_{u, v*}$ in $D$. This won't affect the pigeon axioms, since $v*$ only has one neighbor $u*$, who is not included in the axioms. And this can only serve to strengthen the hole axioms, since they are negations.
Then, we can set $x_{u*, v*}$ to true. Again, this won't affect $\alpha(D)$ or the pigeon axioms. This also isn't affecting any of the hole axioms, because we set all other $x_{u, v*}$ to false, so any hole axiom involving $v*$ is true, regardless of the value of $x_{u*, v*}$. (Recall that each hole axiom is of the form $\overline{x_{i_1j}} \vee \overline{x_{i_2j}}$.)
Now we have $\alpha(P^{u*}) = 1$, because pigeon $u*$ was able to find a hole. We can combine that with $\alpha(\bigwedge_{i \in U' \backslash \{u*\}} P^i) = 1$ to get $\alpha(\bigwedge_{i \in U'} P^i) = 1$. However, now we have
\begin{itemize}
\item $\alpha(\bigwedge_{i \in U'} P^i) = 1$
\item $\alpha(\mathbb{H}) = 1$
\item $\alpha(D) = 0$
\end{itemize}
This is a contradiction, since we should be able to derive $D$ from the pigeon and hole axioms.
\end{proof}
Now we know that the width of $D$ is at least as great as $|\delta(U')|$. But $|\delta(U')| \geq e|U'| \geq (s*e)/2$. Hence, $W(D) \geq (s*e)/2$ and Lemma A follows.
\medskip
To finish up the proof for the lower bound of PHP, we plug this information into the Ben-Sasson and Wigderson equation. Recall:
\[L_T(F \vdash 0) = \exp \left( \frac{(W(F\vdash0) - W(F))^2}{8x} \right) \]
Note that in this formula, $x$ is the number of variables. If $n$ is the number of holes and $n+1$ is the number of pigeons, then the number of variables is on the order of $5n$, given that we are using the $(5, \frac{n}{c}, 1)$-unique neighbor expander. $W(F)$ is $O(1)$, because each hole axiom has width 2 and each pigeon axiom has width 5. $W(F\vdash0)$ is at least $(s*e)/2$, or $\Omega(n)$. Plug it all in, and we get
\[L_T(F \vdash 0) = \exp \left( \frac{(\Omega(n))^2}{n} \right) = \exp(\Omega(n)) \]
\end{document}