\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage{amsfonts}
\usepackage{amssymb}
%\usepackage{epsfig}
\begin{document}
\input{preamble.tex}
\lecture{17}{April 8, 2009}{Madhu Sudan}{Jean Yang}
%%%% body goes in here %%%%
\section{Overview}
Last lecture we showed $PERM \in IP$, which implies $\#P \subseteq IP$, by constructing an $IP$ protocol that verifies a polynomial using a curve. In this lecture we review the method for showing $PERM \in IP$ and apply it first to show $\#SAT \in IP$ and then to show $PSPACE \subseteq IP$.
In the second half of lecture we discuss ``knowledge'': its definition and its relationship to cryptography and complexity.
\subsection{Administrative notes}
\begin{itemize}
\item No lecture on Monday.
\item Madhu will be away; Swastik will lecture on PCP.
\end{itemize}
\section{Review from last time: $\#P \in IP$}
We showed $PERM \in IP$ using a polynomial construction sequence. A nice feature of this method is that it works for \emph{any} problem that we can deconstruct into a sequence of polynomials, so we can also apply it to show $\#SAT \in IP$ and, eventually, $PSPACE \subseteq IP$.
\subsection{Polynomial construction sequence}
Recall from last lecture that we can compute the permanent over a field $\mathbb{F}$ using the sequence of polynomials $P_0, P_1, \dots, P_{\ell-1}, P_{\ell}$ (with $\ell = n$), where $P_i$ is the permanent of an $i \times i$ matrix. We have the following properties:
\begin{tabular}{lll}
Each polynomial has degree & $\leq$ & $d$\\
Each polynomial has number of variables & $\leq$ & $m$\\
$P_0$ is computable in time & $\leq$ & $t$\\
$P_i$ is computable in time $t$ with an oracle for $P_{i-1}$, with \# calls & $\leq$ & $w = n$\\
\end{tabular}
We get the last property from the downward self-reducibility of the permanent: $P_n(M) = \sum_{i=1}^n m_{1i} P_{n-1}(M_i)$, where $M_i$ is the minor obtained by deleting row $1$ and column $i$.
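This self-reduction is easy to exercise directly; a minimal brute-force sketch in Python (purely illustrative, not part of the protocol):

```python
def perm(M):
    """Permanent via expansion along the first row:
    perm(M) = sum_i M[0][i] * perm(M_i), where the minor M_i
    deletes row 0 and column i."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for i in range(n):
        # Minor M_i: remove the first row and the i-th column.
        minor = [row[:i] + row[i+1:] for row in M[1:]]
        total += M[0][i] * perm(minor)
    return total
```

For the identity matrix the permanent is $1$; for the all-ones $n \times n$ matrix it is $n!$.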
Given $\overline{a} = (a_1, \dots, a_m) \in \mathbb{F}^m$ and $b \in \mathbb{F}$, we can prove interactively that $P_\ell(\overline{a}) = b$ in time polynomial in $(\ell, d, m, t, w)$ and $\log \vert \mathbb{F} \vert$, provided that $\vert \mathbb{F} \vert$ is sufficiently large.
We showed in the previous lecture a method for verifying $PERM$ by running a curve through the space and then asking about a random point on the curve. We can work backwards from $P_{n-1}$.
\subsection{Typical phase of interaction}
We show how to verify the question ``$P_i(a^{(i)}) = b^{(i)}$?'' for polynomial $P_i$ in a sequence of polynomials.
To verify $P_n(M) = a$, we compute the $w$ inputs $M_1, M_2, \dots, M_w$ on which to use the oracle for $P_{n-1}$, so that we can easily compute $P_n(M)$ from $P_{n-1}(M_1), \dots, P_{n-1}(M_w)$. For verification purposes we use a curve $C: \mathbb{Z}_p \rightarrow \mathbb{Z}_p^v$ (where $\mathbb{Z}_p^v$ is our input space) such that $C(1) = M_1, \dots, C(w) = M_w$. Note $\mbox{deg}(C) \leq w$ (in fact $\mbox{deg}(C) = w - 1$ suffices).
The interaction goes as follows:
\begin{itemize}
\item The verifier asks the prover ``$P_{n-1} \circ C = ?$''
\item The prover comes back with an answer $h$, a univariate polynomial of degree $\leq d \cdot w$.
\item The verifier checks that ``$P_n(M) = a$'' is consistent with $P_{n-1}(M_i) = h(i)$, then picks a $t$ at random and (recursively) verifies $P_{n-1}(C(t)) = h(t)$.
\end{itemize}
The interaction looks like this:
\begin{tabular}{lcl}
& ``$P_i(a^{(i)}) = b^{(i)}$?'' & \\
Verifier & & Prover\\
& $\stackrel{C, h}{\longleftarrow}$ & Compute $v_1, \dots, v_w$ s.t. $P_i(a^{(i)})$ can be computed\\
& & from $P_{i-1}(v_1), \dots, P_{i-1}(v_w)$. Compute curve $C$ such that\\
& & $C(j) = v_j$ and $h \leftarrow P_{i-1} \circ C$.\\
Verify $C(j) = v_j$ for each $j$. & & \\
Verify $b^{(i)} = f_i(h(1), \dots, h(w))$. & & \\
Pick $t_0 \in \mathbb{F}$ at random; & $\stackrel{t_0}{\longrightarrow}$ & \\
$a^{(i-1)} = C(t_0)$; $b^{(i-1)} = h(t_0)$. & & \\
\end{tabular}
The notes for last lecture explain why we can verify with high probability. We can keep walking backwards and verifying until we get to the base case, at which point we'll know the answer with high probability.
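A toy version of the curve trick, with hypothetical parameters: the verifier interpolates a low-degree curve through the $w$ query points over a small prime field and then collapses them into a single random point. This is an illustrative sketch, not the protocol itself:

```python
import random

P = 101  # a small prime; the real protocol needs |F| large

def interpolate(points, t):
    """Lagrange-evaluate at t the lowest-degree polynomial through
    the given (x, y) pairs, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (t - xj) % P
                den = den * (xi - xj) % P
        # Divide by den using Fermat inversion mod the prime P.
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def curve_through(values, t):
    """Coordinate-wise curve C with C(j) = values[j-1]; each coordinate
    is a polynomial of degree <= w - 1 in t."""
    w, v = len(values), len(values[0])
    return tuple(interpolate([(j + 1, values[j][k]) for j in range(w)], t)
                 for k in range(v))

# The verifier checks that C passes through the query points...
points = [(3, 5), (7, 2), (4, 4)]          # toy M_1, M_2, M_3 in F_p^2
assert all(curve_through(points, j + 1) == points[j] for j in range(3))
# ...then reduces the w questions to one question at a random t_0.
t0 = random.randrange(P)
a_next = curve_through(points, t0)          # the single point C(t_0)
```

The key point this illustrates: $w$ oracle questions become one question about the composed univariate polynomial, at the cost of degree $\leq d \cdot w$.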
\subsection{IP protocol for $\#SAT$}
Though we already get $\#P \subseteq IP$ because $PERM$ is $\#P$-complete, we can see this more directly via a similar $IP$ protocol for $\#SAT$. We will also use a similar method to show $PSPACE \subseteq IP$.
\begin{theorem}
Let $\#3SAT$ be the problem of counting the satisfying assignments of formulae of the form $\phi = c_1 \wedge c_2 \wedge \dots \wedge c_m$, where each clause is a disjunction of three literals, e.g.\ $c_j = x_{i_1} \vee x_{i_2} \vee x_{i_3}$. Then
$\#3SAT \in IP$.
\end{theorem}
\begin{proof}
We can write a $\#3SAT$ formula naturally as a set of polynomials with the desired polynomial parameters.
We want a clause $c_j$ to be satisfied $\iff$ the corresponding polynomial expression equals $1$, and we want the formula to be satisfied $\iff$ every clause is satisfied. To convert $\phi$ to a polynomial, we have:
\begin{tabular}{rcl}
$x_1 \dots x_n$ & $\rightarrow$ & $y_1 \dots y_n$, where $x_i$ is true $\iff$ $y_i = 1$\\
$\overline{x_i}$ & $\rightarrow$ & $1 - y_i$\\
$c_j$ & $\rightarrow$ & $1 - (1 - y_{i_1})(1 - y_{i_2})(1 - y_{i_3}) = \hat{p}_j$ (negated literals use $1 - y_i$)\\
$\phi$ & $\rightarrow$ & $\prod_{j} \hat{p}_j = \hat{p}(y_1, \dots, y_n)$\\
\end{tabular}
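The clause arithmetization can be checked mechanically on all $0/1$ assignments; a minimal Python sketch for an all-positive clause (a negated literal just substitutes $1 - y_i$):

```python
from itertools import product

def clause_poly(y1, y2, y3):
    # 1 - (1-y1)(1-y2)(1-y3): equals 1 iff at least one y_i is 1,
    # i.e. iff the clause x1 v x2 v x3 is satisfied.
    return 1 - (1 - y1) * (1 - y2) * (1 - y3)

# Agreement with the boolean OR on every 0/1 assignment:
for y in product([0, 1], repeat=3):
    assert clause_poly(*y) == int(any(y))
```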
To count the satisfying assignments we must compute $\sum_{y \in \{0,1\}^n} \hat{p}(y)$. We can go from many variables to few variables by computing each polynomial from the previous polynomial, looking at only $2$ places:
\begin{tabular}{lll}
$\hat{p}_0$ & = & $\hat{p}$\\
$\hat{p}_1$ & = & $\hat{p}_0(y_1, \dots, y_{n-1}, 0) + \hat{p}_0(y_1, \dots, y_{n-1}, 1)$\\
& \vdots & \\
$\hat{p}_i$ & = & $\hat{p}_{i-1}(y_1, \dots, y_{n-i}, 0) + \hat{p}_{i-1}(y_1, \dots, y_{n-i}, 1)$\\
& \vdots & \\
$\hat{p}_n()$ & = & $\hat{p}_{n-1}(0) + \hat{p}_{n-1}(1)$\\
\end{tabular}
For an interactive proof of the value of the constant $\hat{p}_n$, we walk backwards and ask about the previous function at two points, drawing a curve (here a line, since $w = 2$) through them. We know $\hat{p}_0$ is of low degree, and summation does not increase the degree.
A perhaps surprising point is that we can pick the random value $t$ from a large field: even though the original question is about $\{0,1\}$ assignments, the polynomial identities continue to hold when we evaluate at arbitrary field elements.
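The summing-out sequence can be simulated end-to-end on a toy formula; a sketch assuming the two-clause example $\phi = (x_1 \vee x_2 \vee x_3) \wedge (\overline{x_1} \vee x_2 \vee x_3)$:

```python
from itertools import product

def p_hat(y1, y2, y3):
    """Arithmetization of (x1 v x2 v x3) ^ (~x1 v x2 v x3)."""
    c1 = 1 - (1 - y1) * (1 - y2) * (1 - y3)
    c2 = 1 - y1 * (1 - y2) * (1 - y3)      # ~x1 becomes 1 - y1
    return c1 * c2

def step(f):
    """One step of the sequence: p_i(...) = p_{i-1}(..., 0) + p_{i-1}(..., 1)."""
    return lambda *ys: f(*ys, 0) + f(*ys, 1)

p = p_hat
for _ in range(3):        # sum out y3, then y2, then y1
    p = step(p)
count = p()               # p_hat_n(): the number of satisfying assignments

# Cross-check against brute-force counting.
brute = sum(1 for y in product([0, 1], repeat=3)
            if (y[0] or y[1] or y[2]) and (not y[0] or y[1] or y[2]))
assert count == brute == 6
```

In the actual protocol the verifier never computes these sums itself; it asks the prover for each partial-sum polynomial and spot-checks at random field points.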
\end{proof}
\section{$PSPACE \subseteq IP$}
Now we view a $PSPACE$ computation as computing the last polynomial in a sequence of polynomials. We show how to take a generic $PSPACE$ computation and produce an $IP$ protocol by providing
\begin{enumerate}
\item a polynomial construction sequence that, given starting and ending configurations, can determine if we can get from the starting to ending configuration in a specified number of steps, and
\item a polynomial construction to compute each polynomial in the above sequence while meeting the requirements.
\end{enumerate}
\begin{theorem}
$PSPACE \subseteq IP$.
\end{theorem}
\begin{proof}
Fix a $PSPACE$ language $L$ and a machine $M$ for it using space $O(s)$. We want to know whether, starting from configuration $c_0$, we reach configuration $c_f$ in $2^s$ steps.
We want to build a polynomial $p_s(x, y)$ in the bits of $x = (x_1, \dots, x_s)$ and $y = (y_1, \dots, y_s)$ such that, for $x, y \in \{ 0, 1 \}^s$,
\begin{tabular}{ll}
$p_s(x, y)$ & $= 1$ if $x \rightarrow y$ in $2^s$ steps\\
& $= 0$ otherwise\\
\end{tabular}
To relate this polynomial to something smaller, we can look at the configuration $z$ reached at the midpoint, after $2^{s-1}$ steps:
\begin{tabular}{ll}
$p_{s-1}(x, z)$ & $= 1$ if $x \rightarrow z$ in $2^{s-1}$ steps\\
& $= 0$ otherwise\\
\end{tabular}
Note that for a deterministic machine there is exactly one such midpoint $z$, so we can just sum over the possible $z$'s:
$$p_s(x, y) = \sum_{z \in \{ 0, 1 \}^s} p_{s-1}(x, z) \cdot p_{s-1}(z, y)$$
We have $p_0(x, y), \dots, p_s(x, y)$, where $p_0(x, y)$ indicates whether we can get from $x$ to $y$ in one step. The final question is ``$p_s(c_0, c_f) = 1$?''
$p_0$ is a polynomial of low degree ($O(s)$, because it is a product of $O(s)$ local consistency conditions), exists, and is computable. Each term of the form $p_{s-1}(x, z) \cdot p_{s-1}(z, y)$ raises the degree in $z$, but we sum over all $z \in \{ 0, 1 \}^s$, so the degree in $x$ and $y$ does not go up from iteration to iteration! We are left with the problem of needing the values of exponentially many terms in order to take the sum.
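Before arithmetizing, the midpoint recursion itself is just divide-and-conquer reachability; a boolean sketch on a toy machine (illustrative only; the real construction replaces the booleans with low-degree polynomials):

```python
def reach(x, y, s, step):
    """p_s(x, y): does x reach y in exactly 2^s steps?
    Midpoint recursion: p_s(x, y) = OR_z [ p_{s-1}(x, z) AND p_{s-1}(z, y) ]."""
    if s == 0:
        return step(x) == y          # p_0: the one-step transition relation
    return any(reach(x, z, s - 1, step) and reach(z, y, s - 1, step)
               for z in range(4))    # z ranges over all configurations

# Toy machine on configurations {0, 1, 2, 3}: each step increments mod 4.
step = lambda x: (x + 1) % 4
assert reach(0, 0, 2, step)          # 2^2 = 4 steps: back to the start
assert not reach(0, 1, 2, step)      # 4 steps from 0 lands on 0, not 1
```

The recursion as written re-explores both halves and takes exponential time; the point of the protocol is that the verifier never evaluates it, but instead checks the prover's claimed partial sums.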
We want to get to $p_s$ through the polynomial sequence $p_0, p_1, \dots, p_{s-1}, p_s$, but to do so we have to resolve the issue that $p_i$ needs $2^s$ values of $p_{i-1}$. We can use the $\#SAT$ trick and break the sum up into another sequence of polynomials. To compute $p_s$ from $p_{s-1}$, we define a new sequence of polynomials
\begin{tabular}{ll}
$Q_s^0(x, z, y)$ & $= p_{s-1}(x, z) \cdot p_{s-1}(z, y)$\\
& \vdots \\
$Q_s^i(x, (z_1, \dots, z_{s-i}), y)$ & $= Q_s^{i-1}(x, (z_1, \dots, z_{s-i}, 0), y) + Q_s^{i-1}(x, (z_1, \dots, z_{s-i}, 1), y)$\\
\end{tabular}
At the end, we have $p_s = Q_s^s$, in which all of $z$ has been summed out. Let $c$ be the degree in each variable. We have:
\begin{tabular}{lll}
degree & $\leq$ & $4 \cdot c \cdot s$\\
length & $\leq$ & $O(s^2)$\\
width & = & 2\\
time & = & $O(s)$\\
\# variables & $\leq$ & $3s$\\
\end{tabular}
Thus our sequence is good over any sufficiently large field.
\end{proof}
\subsection{Notes on the implications and importance of this result}
Previously, the class $IP$ was not thought to be this powerful: there was an expectation that the number of rounds of interaction could be collapsed to a constant. The $IP$ protocol for the permanent was the first very counterintuitive result and triggered a series of other proofs.
\section{(Zero) knowledge}
In class we introduced the concept of zero knowledge with the graph isomorphism example and then defined it more formally using the idea of replacing the interaction with the prover by a simulator. The professor's notes contain more information about information theory and results in zero knowledge that are not included here.
\subsection{Definition by example}
Consider the graph isomorphism problem: given graphs $G$ and $H$, decide whether $G \simeq H$. That is, we are asking
\begin{tabular}{ll}
$\exists \pi$ & $: V(G) \rightarrow V(H)$\\
$\pi$ & one-to-one\\
& such that $x \leftrightarrow y$ in $G$ $\iff \pi(x) \leftrightarrow \pi(y)$ in $H$
\end{tabular}
For a zero knowledge proof, we want the following properties with respect to the verifier $V$:
\begin{itemize}
\item \textbf{Completeness.} If $G_0 \simeq G_1$ then $V$ must accept w.h.p.
\item \textbf{Soundness.} If $G_0 \not\simeq G_1$ then $V$ must reject w.h.p.
\item \textbf{Zero knowledge.} If $G_0 \simeq G_1$ then $V$ must not learn the isomorphism, or anything other than the fact that one exists.
\end{itemize}
Consider the following interactive proof [Goldreich, Micali, Wigderson]:
\begin{tabular}{lll}
& $(G_0, G_1)$ & \\
Verifier & & Prover\\
& $\stackrel{H}{\longleftarrow}$ & $b \in_R \{ 0, 1 \}$; pick a random one-to-one\\
& & $\pi : V(G_b) \rightarrow V(G_b)$ and let $H \leftarrow \pi(G_b)$\\
choose $b' \in_R \{ 0, 1 \}$ & $\stackrel{b'}{\longrightarrow}$ & \\
& $\stackrel{\pi'}{\longleftarrow}$ & $\pi' \leftarrow \pi$ if $b' = b$\\
& & $\pi' \leftarrow \pi \circ \pi_0$ if $b = 0$, $b' = 1$\\
verify $H = \pi'(G_{b'})$ & & $\pi' \leftarrow \pi \circ \pi_0^{-1}$ if $b = 1$, $b' = 0$\\
\end{tabular}
Here $\pi_0$ denotes the prover's secret isomorphism, with $\pi_0(G_1) = G_0$.
If $G_0 \simeq H$ and $H \simeq G_1$, then the prover can provide a permutation $\pi'$ such that $\pi'(G_{b'}) = H$ for either challenge; otherwise the prover must count on the verifier choosing $b' = b$. A cheating prover (when one of these isomorphisms is missing) can answer at most one of the two challenges, so it succeeds with probability at most $\frac{1}{2}$ per round. We can repeat the interaction to amplify soundness; the prover uses a fresh $H$ each time.
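A full honest run of the protocol can be simulated; a Python sketch with toy graphs, representing a graph as a frozenset of edges and the prover's secret isomorphism as $\pi_0$ with $\pi_0(G_1) = G_0$ (all concrete names and graphs here are illustrative):

```python
import random

def apply_perm(pi, G):
    """Relabel the vertices of edge set G according to permutation pi (a dict)."""
    return frozenset(frozenset(pi[v] for v in e) for e in G)

def compose(p, q):
    """(p o q)(v) = p(q(v))."""
    return {v: p[q[v]] for v in q}

def invert(p):
    return {w: v for v, w in p.items()}

n = 4
G0 = frozenset([frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})])
pi0 = {0: 1, 1: 0, 2: 2, 3: 3}      # prover's secret isomorphism: pi0(G1) = G0
G1 = apply_perm(invert(pi0), G0)    # construct G1 so the isomorphism holds

for _ in range(20):                  # independent rounds amplify soundness
    # Prover: pick b and a random relabeling pi, commit to H = pi(G_b).
    b = random.randrange(2)
    vs = list(range(n)); random.shuffle(vs)
    pi = dict(enumerate(vs))
    H = apply_perm(pi, (G0, G1)[b])
    # Verifier: random challenge b'.
    b_prime = random.randrange(2)
    # Prover: reveal pi' with pi'(G_{b'}) = H.
    if b_prime == b:
        pi_p = pi
    elif b == 0:                     # need a map G1 -> H = pi(G0)
        pi_p = compose(pi, pi0)
    else:                            # need a map G0 -> H = pi(G1)
        pi_p = compose(pi, invert(pi0))
    assert apply_perm(pi_p, (G0, G1)[b_prime]) == H   # verifier accepts
```

Note that in every branch the revealed $\pi'$ is a single uniformly random permutation mapping $G_{b'}$ to a random relabeling $H$, which is why the verifier learns nothing about $\pi_0$.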
Why did we do this proof when simply providing the isomorphism would have sufficed? The motivation is from cryptography: we want to be able to convince others of facts without revealing additional information.
\subsection{Definition of zero knowledge}
\textbf{Anecdote.} Micali put a problem on a problem set asking for a proof that the verifier doesn't know anything at the end of the interaction. Cook said he didn't know how to prove that one can't know something. This led Goldwasser, Micali, and Rackoff to come up with a set of definitions that allow you to show this.
We want to be able to formally define the notion ``You learn \emph{nothing} from the fact that $(G_0, G_1)$ are isomorphic.''
We use the concept of a ``simulation.'' At the end of an interaction, we have a transcript of the interaction, which is a random variable. If the verifier can produce the same distribution of transcripts by sampling on its own, without the prover, then we have a zero-knowledge proof.
\begin{enumerate}
\item Fix verifier's random coins $R$.
\item The transcript is still a random variable, with some distribution $D_R$.
\end{enumerate}
If the verifier can sample from $D_R$ on its own, then the verifier gains no knowledge from the prover. The ``simulator'' samples from $D_R$.
Simulator for graph isomorphism:
\begin{tabular}{lll}
Verifier & & Simulator\\
& $\stackrel{b}{\longrightarrow}$ & pick $\pi \in_R S_n$\\
& $\stackrel{H}{\longleftarrow}$ & $H \leftarrow \pi(G_b)$\\
& $\stackrel{\pi}{\longleftarrow}$ & \\
\end{tabular}
We get exactly the same distribution as with a prover! Note, however, that in the graph isomorphism proof, the fact that $H$ is fixed \emph{before} the verifier responds is important for \emph{convincing} the verifier, but it does not change the distribution of transcripts. The ordering of the messages only has convincing power; it matters for soundness, not for the distribution.
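The simulator is only a few lines; a self-contained sketch (toy graphs, hypothetical names) that outputs an accepting transcript $(H, b, \pi)$ using no prover and no isomorphism:

```python
import random

def relabel(pi, G):
    """Apply the vertex permutation pi (a dict) to the edge set G."""
    return frozenset(frozenset(pi[v] for v in e) for e in G)

def simulate(G0, G1, n):
    """Sample a transcript with no prover: pick b and pi first,
    then set H = pi(G_b). Distribution matches the real interaction."""
    b = random.randrange(2)
    vs = list(range(n)); random.shuffle(vs)
    pi = dict(enumerate(vs))
    return relabel(pi, (G0, G1)[b]), b, pi

G0 = frozenset([frozenset({0, 1}), frozenset({1, 2})])
G1 = frozenset([frozenset({0, 2}), frozenset({2, 1})])
H, b, pi = simulate(G0, G1, 3)
assert relabel(pi, (G0, G1)[b]) == H   # an accepting, prover-free transcript
```

Crucially, the simulator chooses $b$ \emph{before} producing $H$, exploiting exactly the ordering freedom discussed above.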
\subsection{Complexity theory of knowledge}
We have the following classes of zero knowledge problems, where $PZK \subseteq SZK \subseteq CZK$:
\begin{itemize}
\item \textbf{Perfect zero knowledge.} Simulator can sample \emph{exactly} from same distribution as interaction.
\item \textbf{Statistical zero knowledge.} The simulator can sample from a distribution that is statistically close (say, within $2^{-n}$) to the real interaction. The simulator produces $D_R' \approx_{\epsilon} D_R$, meaning $\sum_x \vert D_R'(x) - D_R(x) \vert \leq 2 \epsilon$. Equivalently, $\forall$ tests $T : \{ 0, 1 \}^n \rightarrow \{ 0, 1 \}$ we have $\vert \Pr_{x \sim D_R} [T(x) = 1] - \Pr_{x \sim D_R'} [T(x) = 1] \vert \leq \epsilon$.
\item \textbf{Computational zero knowledge.} Polynomial time does not suffice to distinguish the simulated distribution from the real one. The simulator produces a $D_R''$ such that $\forall$ poly-time algorithms $A$, $\vert \Pr_{x \sim D_R} [A(x) = 1] - \Pr_{x \sim D_R''} [A(x) = 1] \vert \leq \epsilon$.
Which is relevant where?
\begin{tabular}{llll}
``$x \in L$'' & in & $PZK$? & \\
& & $SZK$? & [complexity]\\
& & $CZK$? & [cryptography]\\
\end{tabular}
\end{itemize}
\subsection{Some notes on $SZK$}
$SZK$ is closed under complementation: $L \in SZK \iff \overline{L} \in SZK$. This is surprising and nice because it gives us a complete promise problem.
Let $C$ be a circuit with inputs $x_1, \dots, x_n$ and outputs $y_1, \dots, y_m$. When $x_1, \dots, x_n \in \{ 0, 1 \}$ are uniform and independent, $C(x)$ induces \emph{some} distribution on $m$-bit strings.
Let $\vert \vert C - D \vert \vert$ be the distance between the two distributions. We have
$$\vert \vert C - D \vert \vert = \frac{1}{2} \sum_{y \in \{ 0, 1 \}^m} \vert \Pr_x [C(x) = y] - \Pr_x [ D(x) = y ] \vert$$
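For small circuits this distance can be computed by brute force; a sketch with two toy ``circuits'' written as Python functions (illustrative only):

```python
from itertools import product
from collections import Counter

def output_dist(C, n):
    """Distribution of C(x) over uniform x in {0,1}^n."""
    counts = Counter(C(x) for x in product([0, 1], repeat=n))
    return {y: c / 2 ** n for y, c in counts.items()}

def stat_dist(C, D, n):
    """||C - D|| = (1/2) * sum_y |Pr[C(x) = y] - Pr[D(x) = y]|."""
    pc, pd = output_dist(C, n), output_dist(D, n)
    return 0.5 * sum(abs(pc.get(y, 0) - pd.get(y, 0))
                     for y in set(pc) | set(pd))

C = lambda x: (x[0],)           # a uniform output bit
D = lambda x: (x[0] & x[1],)    # biased: outputs 1 with probability 1/4
assert stat_dist(C, D, 2) == 0.25
```

With thresholds $\frac{2}{3}$ and $\frac{1}{3}$ as below, this pair $(C, D)$ would land in neither $SD_{YES}$ nor $SD_{NO}$; the promise excludes such instances.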
We get the following complete promise problem for $SZK$:
\begin{tabular}{ll}
$SD_{YES}$ & $= \{ (C, D) \ \vert \ ||C - D|| \geq \frac{2}{3} \}$\\
$SD_{NO}$ & $= \{ (C, D) \ \vert \ ||C - D|| \leq \frac{1}{3} \}$\\
\end{tabular}
The class $SZK$ is worth examining further. We also have $SZK \subseteq AM$.
\end{document}