\documentclass[12pt]{article} \usepackage{wide,me} \parindent0pt

### Fingerprints by Polynomials

Good for fingerprinting “composable” data objects.

• check if $P(x)Q(x)=R(x)$
• $P$ and $Q$ of degree $n$ (means $R$ of degree at most $2n$)
• mult in $O(n\log n)$ using FFT
• evaluation at fixed point in $O(n)$ time
• Random test:
• $S \subseteq F$
• pick random $r \in S$
• evaluate $P(r)Q(r)-R(r)$
• suppose this poly not 0
• then degree $2n$, so at most $2n$ roots
• thus, prob (discover root) $\le 2n/|S|$
• so, eg, sufficient to pick random int in $[0,4n]$
• Note: no prime needed (but needed for $Z_p$ sometimes)
• Again, major benefit if polynomial implicitly specified.

Small problem:

• degree $n$ poly can generate huge values from small inputs.
• Solution 1:
• If poly is over $Z_p$, can do all math mod $p$
• Need $p$ exceeding coefficients, degree
• doesn't change number of roots
• $p$ need not be random---pick once in advance
• Solution 2:
• Work in $Z$, deduce nonzero value from schwartz-zippel
• deduce nonzero mod random $q$ (as in string matching)
• so do all computation mod random $q$
• $q$ range must exceed bits (not value) of coeff.
• Again, major benefit if polynomial implicitly specified.

String checksum:

• treat as degree $n$ polynomial
• eval a random $O(\log n)$ bit input,
• prob. get 0 small

Multivariate:

• $n$ variables
• degree of term: sum of vars' degrees
• total degree $d$: max degree of term.
• Schwartz-Zippel: fix $S \subseteq F$ and let each $r_i$ random in $S$ $\Pr[Q(r_i)=0 \mid Q \ne 0] \le d/|S|$ Note: no dependence on number of vars!
Proof:
• induction. Base done.
• $Q \ne 0$. So pick some (say) $x_1$ that affects $Q$
• write $Q=\sum_{i \le k} x_1^iQ_i(x_2,\ldots,x_n)$ with $Q_k() \ne 0$ by choice of $k$
• $Q_k$ has total degree at most $d-k$
• By induction, prob $Q_k$ evals to 0 is at most $(d-k)/|S|$
• suppose it didn't. Then $q(x)=\sum x_1^i Q(r_2,\ldots,r_n)$ is a nonzero univariate poly.
• by base, prob. eval to 0 is $k/|S|$
• add: get $d/|S|$
• why can we add? $$\begin{eqnarray*} \Pr[E_1] &= &\Pr[E_1 \cap \overline{E_2}]+\Pr[E_1 \cap E_2]\\ &\le & \Pr[E_1 \mid \overline{E_2}] + \Pr[E_2] \end{eqnarray*}$$

### Perfect matching

• Define
• by max-flow techniques, get $m\sqrt{n} \le n^{2.5}$
• Edmonds matrix: variable $x_{ij}$ if edge $(u_i,v_j)$
• determinant nonzero if PM
• poly nonzero symbolically.
• so apply Schwartz-Zippel
• Degree is $n$
• So number $r\in(1,\ldots,n^2)$ yields 0 with prob. $1/n$
• Determinant can be computed with $n^{2.376...}$ ops.

Wait, det may be huge!

• Shwartz-Zippel applies in any field
• So pick work mod $p$, with $p>n^2$ to keep failure prob. small
• Note all coefficients of determinant polynomial are 1
• So determinant polynomial is still (symbolically) nonzero mod $p$

### Finding the Matching

We now have a way to decide if a matching exists.

• But can we find it?

Self reducibility

• General technique for solving construction/optimization via decision
• Use repeated calls to decision to “steer towards” output solution

For perfect matching

• Remove an edge
• Check if still have perfect matching.
• If not, put it back
• When done, all that's left is perfect matching.
• time is $m$ determinants
• not great time, but really simple algorithm!

### Network Coding

A communication task:

• $2n$-vertex bipartite graph with source and sink vertices
• source has $n$ messages to send to $t$
• each edge can carry one message in one time step
• one solution: send along perfect matching
• requires computing perfect matching

Coding solution

• think of messages as integers mod $p$
• $s$ sends one message to each LHS
• each LHS sends its message to all RHS neighbors
• each RHS neighber sends $t$ a random linear combination of what it got, mod $p$
• can $t$ “decode” the message?
• yes (assuming it knows the random coefficients).
• What is the linear operator mapping $n$ inputs (from $s$) to $n$ outputs (at $t$)?
• it's the Edmonds matrix of the bipartite graph, with random coefficients in it
• we proved this has nonzero determinant whp
• which means it is invertible
• so $t$ can invert the linear function to reconstruct input.

Multiple receivers

• same bipartite core
• $s$ has $k< n$ messages
• multiple $t_i$ each connected to $k$ distinct RHS vertices
• can send to any one $t_i$ over a perfect matching to that $t_i$
• but coding argument above says RLC also sends to $t_i$ ($k\times k$ transfer matrix)
• but this is true for all receivers simultaneously
• since each can receive, all can receive.

This is the key idea of network coding. It generalizes to arbitrary communication graphs.