\documentclass[11pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{hyperref}
\usepackage{algorithmic}
\usepackage{algorithm}
\usepackage{complexity}
\usepackage{graphicx}
\usepackage{palatino} %my favourite font
\usepackage{makeidx}
\makeindex

\newcommand{\handout}[5]{
  \noindent
  \begin{center}
  \framebox[\textwidth]{
    \vbox{
      \hbox to \textwidth { {\bf #2}\hfill {\bf Computational Complexity } }
      \vspace{4mm}
      \hbox to \textwidth { {\Large \hfill #5 \hfill} }
      \vspace{2mm}
      \hbox to \textwidth { {\em #3 \hfill #4} }
    }
  }
  \end{center}
  \vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{Instructor: #3}{Scribe: #4}{#1}}

\newtheorem{theorem}{Theorem}
\newtheorem*{theorem*}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{observation}[theorem]{Observation}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{subclaim}[theorem]{Subclaim}

% my custom commands
\newcommand{\inparen}[1]{\left(#1\right)}             %\inparen{x+y} is (x+y)
\newcommand{\inbrace}[1]{\left\{#1\right\}}           %\inbrace{x+y} is {x+y}
\newcommand{\insquar}[1]{\left[#1\right]}             %\insquar{x+y} is [x+y]
\newcommand{\inangle}[1]{\left\langle#1\right\rangle} %\inangle{A} is <A>
\newcommand{\abs}[1]{\left|#1\right|}                 %\abs{x} is |x|
\newcommand{\norm}[1]{\left\Vert#1\right\Vert}        %\norm{x} is ||x||
\newcommand{\union}{\cup}
\newcommand{\Union}{\bigcup}
\newcommand{\intersection}{\cap}
\newcommand{\Intersection}{\bigcap}
\newcommand{\super}[2]{#1^{\inparen{#2}}}             %\super{G}{i-1} is G^{(i-1)}
\newcommand{\setdef}[2]{\inbrace{{#1}\ : \ {#2}}}
\newcommand{\inv}[1]{#1^{-1}}
\newcommand{\inrpdt}[2]{\left\langle{#1},{#2}\right\rangle} %\inrpdt{x}{y} is <x,y>
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\iso}{\cong}
\newcommand{\bea}[1]{\begin{eqnarray*} #1 \end{eqnarray*}}
\newcommand{\legendre}[2]{\inparen{\frac{#1}{#2}}}
\newcommand{\zo}{\inbrace{0,1}}
\newcommand{\Rot}{\text{Rot}}
\newcommand{\tensor}{\otimes}

% Commands specific to this file
% TODO: Find the right way to typeset group index
\DeclareMathOperator{\Sym}{Sym}
\newcommand{\gpidx}[2]{\insquar{#1 : #2}}                    %\gpidx{H}{K} is [H : K]
\newcommand{\gpigs}[2]{\gpidx{\super{G}{#1}}{\super{G}{#2}}} %Group index of G super ...
\newcommand{\llhd}{\!\!\lhd\!\!\lhd}
\newcommand{\roundoff}[1]{\left\lfloor #1 \right\rceil}
% \newcommand{\ceil}[1]{\lceil #1 \rceil}
\newcommand{\floor}[1]{\lfloor #1 \rfloor}
\newcommand{\F}{\mathbb{F}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}

%for expected value
\newcommand{\Ex}{{\bf E}}

%for algorithms
\renewcommand{\algorithmicrequire}{\textbf{Input:}}

%Dirac's notation
\newcommand{\bra}[1]{\left\langle#1\right|}
\newcommand{\ket}[1]{\left|#1\right\rangle}

%ZigZag Product
\newcommand{\zigzag}{\textcircled{z}}

\begin{document}

\lecture{Constructing Expanders: The Zig-Zag Product}{CS640}{Manindra Agrawal}{Ramprasad Saptharishi}

Over the last few classes we were introduced to expander graphs and saw how they can be used to reduce the number of random bits required for amplification. The crucial requirement there is that, given a vertex $v$ of the expander, we should be able to pick a random neighbour of $v$ quickly. Though the graph is of huge size ($2^m$ vertices), we want the neighbour computation to be fast. This is captured by {\em rotation maps}.

\section{Rotation Maps}
\index{expanders!rotation maps}

\begin{definition}
Let $G$ be a $d$-regular $n$ vertex graph.
The rotation map, denoted by $\Rot_G$, is the map $\Rot_G:V\times\inbrace{1,2,\cdots, d}\longrightarrow V\times\inbrace{1,2,\cdots, d}$ such that $\Rot_G(u,i) = (v,j)$ if the $i$-th neighbour of $u$ is the vertex $v$ and the $j$-th neighbour of $v$ is $u.$
\end{definition}

From the definition it is clear that this map is a permutation on $V\times \inbrace{1,2,\cdots, d}$, and in fact an involution: applying $\Rot_G$ twice successively gives the identity. Therefore, the map can be represented as a symmetric permutation matrix of dimension $nd\times nd.$

We would want our expander graphs to have rotation maps that are computable in time $\poly(\log n, \log d).$

\section{Constructing Expander Graphs}

Expander graphs are available in plenty. In fact, the following result states that almost all graphs are expanders.
\index{expanders!random graphs}

\begin{theorem}
A random $d$-regular graph on $n$ vertices has its second largest eigenvalue less than $9/10$ with high probability.
\end{theorem}

Therefore, the existence of such graphs is clearly not an issue; what we want is to construct them explicitly. A few such constructions were discovered early on, and all of them had a similar flavour: simple to describe, but really hard to prove to be good expanders, or even to equip with efficient rotation maps. To illustrate this, consider the following graph. Let $p$ be a prime and consider a graph on $p$ vertices. Connect each vertex $i$ to $i-1$, $i+1$ and $i^{-1}\bmod{p}$ (with a self-loop on $0$). This actually forms an expander graph!

People had been looking for explicit constructions of a family of constant degree expander graphs for a long time. The general idea was to take a small expander graph and somehow blow it up, or enlarge it, to give a larger expander graph.
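As a concrete illustration of these definitions, here is a sketch (in Python, not part of the original notes) of a rotation map for the mod-$p$ graph above, together with a check that it is indeed an involution:

```python
# A sketch of the rotation map of the mod-p graph described above:
# vertex i is adjacent to i-1, i+1 and i^{-1} mod p, with a self-loop on 0.
# Edge labels are 0, 1, 2 here (the notes use 1-based labels).

def neighbours(i, p):
    """The three neighbours of vertex i, indexed by labels 0, 1, 2."""
    inv = pow(i, p - 2, p) if i != 0 else 0  # i^{-1} mod p; self-loop at 0
    return [(i - 1) % p, (i + 1) % p, inv]

def rot(u, a, p):
    """Rot_G(u, a) = (v, b): the a-th edge of u leads to v, whose b-th edge leads back to u."""
    v = neighbours(u, p)[a]
    # Edge 0 (u -> u-1) is reversed by edge 1 of v, and vice versa;
    # edge 2 (inversion) is its own reverse, since (i^{-1})^{-1} = i.
    b = {0: 1, 1: 0, 2: 2}[a]
    return (v, b)

# Rot_G is an involution: applying it twice is the identity.
p = 11
assert all(rot(*rot(u, a, p), p) == (u, a) for u in range(p) for a in range(3))
```

Note that the involution property is exactly what makes the corresponding $nd\times nd$ permutation matrix symmetric.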
We shall now look at certain {\em graph products}.

\section{Graph Products}
\index{expanders!graph products}

We shall look at some ways in which we can take two graphs $G$ and $H$ and obtain a bigger graph by taking a product of them.

\subsection{Powering}
\index{expanders!graph products!powering}

Suppose $G$ and $H$ are $(n,d,\lambda)$ and $(n,d',\lambda')$ expanders respectively. Define the adjacency matrix of the product graph $GH$ to be the product of their adjacency matrices. This may not be a $0,1$ matrix, but think of it as a multigraph, where $A_{ij}=k$ means that there are $k$ edges between vertices $i$ and $j.$ The resulting graph is again an $n$ vertex graph, of degree $dd'.$

What about the eigenvalue? Notice that the new normalized adjacency matrix also has $\ket{\hat{1}}$ as an eigenvector with eigenvalue $1.$ Take any other eigenvector orthogonal to it. The adjacency matrix of $H$ shrinks it by $\lambda'$, and then that of $G$ shrinks it by $\lambda.$ Therefore, the second largest eigenvalue is at most $\lambda\lambda'$, and the resulting graph $GH$ is an $(n,dd',\lambda\lambda')$ expander. We shall denote the product of $G$ with itself $k$ times by $G^k.$

\subsection{Tensor Product}
\index{expanders!graph products!tensor product}

Let $G$ be an $(n,d,\lambda)$ expander and $H$ an $(n',d',\lambda')$ expander. The tensor product $G\tensor H$ is defined as follows:
\begin{itemize}
\item The vertex set of $G\tensor H$ is $V_G\times V_H.$ Vertices can be thought of as pairs $(u,u')$ with $u\in G$ and $u'\in H.$
\item Vertex $(u,u')$ is connected to $(v,v')$ if and only if $(u,v)\in G$ and $(u',v')\in H.$
\end{itemize}
It is clear from the definition that the number of vertices is $nn'$ and the degree is $dd'.$ It is not hard to check that the adjacency matrix of the new graph is the tensor product of the adjacency matrices of the old graphs, and therefore the eigenvalues and eigenvectors are the corresponding products as well.
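The two product rules can be checked numerically. The following sketch (assuming \texttt{numpy} is available; the two graphs, a lazy $4$-cycle and $K_4$, are illustrative stand-ins and not from the notes) verifies that powering multiplies the second eigenvalues while tensoring takes their maximum:

```python
# A numerical sanity check of the powering and tensoring rules, using
# normalized adjacency matrices of two small 3-regular graphs on 4 vertices.
# The particular graphs are illustrative choices, not from the notes.
import numpy as np

def second_eig(M):
    """Second largest eigenvalue, in absolute value, of a symmetric matrix."""
    ev = np.sort(np.abs(np.linalg.eigvalsh(M)))[::-1]
    return ev[1]

# G: 4-cycle with a self-loop at each vertex (3-regular), normalized.
G = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]]) / 3
# H: the complete graph K4 (3-regular), normalized.
H = (np.ones((4, 4)) - np.eye(4)) / 3

lam, lam2 = second_eig(G), second_eig(H)

# Powering: the second eigenvalue of GH is at most lam * lam2.
assert second_eig(G @ H) <= lam * lam2 + 1e-9

# Tensoring: the second eigenvalue of the tensor product is max(lam, lam2).
assert abs(second_eig(np.kron(G, H)) - max(lam, lam2)) < 1e-9
```

Both graphs here are circulant, so their matrices commute and the powering bound is attained with equality; in general it is only an upper bound.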
Therefore, the second largest eigenvalue of $G\tensor H$ is $\max\inbrace{\lambda,\lambda'}$, and the resulting graph $G\tensor H$ is an $(nn',dd',\max\inbrace{\lambda,\lambda'})$ expander.

Let us now look at what the two products give us. Powering is good in the sense that it reduces $\lambda$; the number of vertices, however, remains the same. The tensor product, on the other hand, increases the number of vertices and doesn't change the eigenvalue much, since it remains the maximum of those of the two graphs and is therefore under control. The problem with both operations is that the degree of the graph keeps increasing, and with just these two products there is no hope of getting a family of constant degree expanders.

The degree blows up because of the freedom that both products allow. In the case of powering, a step in $GH$ is equivalent to taking one edge in $H$ and then one in $G.$ So in essence you have complete freedom in choosing any edge of $G$, as long as it is incident on the vertex you are on after the $H$-edge. In the case of tensoring, the two moves are essentially parallel, one in $G$ and one in $H$, which again allows tremendous freedom. What we need to do is reduce this freedom: somehow make the edge in one of the graphs ``influence'' the other edge. This is exactly what happens in the {\em zig-zag} product, which finally solves the problem of degree blow-up.

\section{The Zig-Zag Product}
\index{expanders!graph products!zig-zag product}

Suppose $G$ is an $(N,D,\lambda_1)$ expander and $H$ is a $(D,d,\lambda_2)$ expander. We can then define the zig-zag product of the two graphs, denoted by $G\zigzag H.$ We shall first describe it informally so that the reader gets a good picture of what it is.

Take the graph $G.$ Each vertex $u$ has degree $D$ and therefore has $D$ edges going out of it.
Now replace each vertex $u$ by a {\em cloud} of $D$ vertices, where the $i$-th new vertex $u_i$ represents the $i$-th edge going out of $u.$ The vertex set of $G\zigzag H$ is therefore of size $ND$, since each vertex of $G$ is blown up into a $D$ vertex cloud. To aid understanding, we shall write the new vertices as pairs $(u,i)$, where $(u,i)$ corresponds to the $i$-th vertex in the cloud of $u.$

We thus have $ND$ vertices, grouped into clouds, with as yet no edges between them. We now define the edges. Every edge of the zig-zag product corresponds to a $3$-step walk, as follows. Start at a vertex $(u,i)$, the $i$-th vertex in the cloud of $u.$ Think of the vertices of this cloud as the vertices of $H.$ Take one edge of $H$ to go from $(u,i)$ to $(u,j)$, where $j$ is a neighbour of $i$ in $H.$ Now you are at the vertex $(u,j).$ The vertex $j$ corresponds to the $j$-th edge going out of $u$; take that edge in $G$ to go to a vertex $v$ of $G$, thereby moving from the cloud of $u$ to the cloud of $v.$ This step enters $v$ through some edge of $v$, say its $k$-th edge, which corresponds to the vertex $(v,k)$ of the cloud; that is where we end up. Finally, think of the cloud of $v$ as the graph $H$ again and go to some neighbour of $k$, say $l$, ending at $(v,l).$ This three step walk defines a single edge of $G\zigzag H$: we connect $(u,i)$ and $(v,l)$ by an edge.

Let us first find out the degree of the new graph. How many neighbours does $(u,i)$ have? Let us see how much freedom we have. From $(u,i)$ we can take any edge of $H$ for the first step of the walk; that gives us $d$ choices. The vertex we reach then determines the intercloud edge, so there is no choice there. After that, we take one more edge of $H$ in the new cloud, which again gives us $d$ choices.
Therefore, the degree of the new graph is $d^2.$ In fact, we can say that $(u,i)$'s $(a,b)$-th neighbour is the vertex $(v,l)$ reached when the first step of the walk takes the $a$-th edge of $H$ out of $i$, and the last step takes the $b$-th edge of $H$ out of the entry point into the cloud of $v.$ Therefore, the edge labels can also be thought of as tuples $(a,b)$ with $1\leq a,b\leq d.$

Let us now define the edges formally. Recall that the rotation map $\Rot_{G\zigzag H}$ takes a vertex and an edge label, and returns the destination vertex together with the label of the edge through which we entered it. To compute $\Rot_{G\zigzag H}\inparen{(u,i),(a,b)}$: let $\Rot_H\inparen{i,a} = \inparen{j,a'}$, then let $\Rot_G\inparen{u,j} = \inparen{v,j'}$, and finally let $\Rot_H\inparen{j',b} = \inparen{k,b'}.$ Then define $\Rot_{G\zigzag H}\inparen{(u,i),(a,b)}$ to be $\inparen{(v,k),(b',a')}.$ This rotation map defines all the edges of $G\zigzag H.$ Now for the bounds on the eigenvalues.

\subsection{Eigenvalue Bounds}

We now need to argue that the second largest eigenvalue isn't too large. Note that we don't need the eigenvalue to actually decrease: graph powering reduces the eigenvalue a lot, so if we reduce it enough through powering and then use the zig-zag product to bring the degree down, the slight increase in the eigenvalue will not cause too much trouble. The following theorem bounds the eigenvalue of the new graph.

\begin{theorem}
The graph $G\zigzag H$ is an $(ND,d^2,\lambda)$ expander, where $\lambda = 1 - (1 - \lambda_1)(1 - \lambda_2)^2.$
\end{theorem}
\begin{proof}
The adjacency matrix of $G\zigzag H$ isn't as difficult as it seems. Every edge of the new graph corresponds to a $3$ step walk.
You start at $(u,i)$ and go to a neighbour $j$ of $i$ within the cloud of $u$, that is, to $(u,j).$ This is the adjacency matrix of $H$ acting on the second coordinate while keeping the first coordinate fixed, which is captured by the matrix $I\tensor H$, where $H$ here denotes the normalized adjacency matrix of the graph $H.$ From $(u,j)$, you go to the $j$-th neighbour of $u$, say $v$, entering $v$ through its $k$-th edge, and hence you end up at $(v,k).$ That is, $u$'s $j$-th neighbour is $v$, and $u$ is the $k$-th neighbour of $v.$ This is exactly the rotation map of $G$: it takes $(u,j)$ and returns $(v,k).$ Let $R_G$ be the matrix representing this rotation map; it captures the second step. The third step again keeps the first coordinate fixed and moves the second coordinate according to $H$; therefore the third step is $I\tensor H$ as well. Hence, the normalized adjacency matrix of $G\zigzag H$ is $(I\tensor H)R_G(I\tensor H).$

We need to bound the second largest eigenvalue of $Z = (I\tensor H)R_G(I\tensor H).$ Write $H = (1 - \lambda_2)J + \lambda_2 C$, where $J$ is the matrix with all entries equal to $1/D.$
\bea{
Z & = & (I\tensor H)R_G(I\tensor H)\\
& = & (I\tensor((1 - \lambda_2)J + \lambda_2 C))R_G(I\tensor((1 - \lambda_2)J + \lambda_2 C))\\
& = & (1 - \lambda_2)^2(I\tensor J)R_G(I\tensor J) + (1 - \lambda_2)\lambda_2(I\tensor C)R_G(I\tensor J)\\
& & +\ (1 -\lambda_2)\lambda_2(I\tensor J)R_G(I\tensor C) + \lambda_2^2(I\tensor C)R_G(I\tensor C)\\
& = & (1 - \lambda_2)^2(I\tensor J)R_G(I\tensor J) + \inparen{1 - (1 - \lambda_2)^2}E
}
where $\inparen{1 - (1 - \lambda_2)^2}E$ collects the last three terms. It can be shown that $\norm{E}\leq 1$; we leave the verification, using the eigenvectors, to the reader.
As for the other term, look at the matrix $(I\tensor J)R_G(I\tensor J).$ Thinking of it graphically, the first $(I\tensor J)$ preserves the first coordinate and sends the second coordinate to ``any'' vertex of $H$; then $R_G$ moves along ``any'' edge of $G$; and the final $(I\tensor J)$ randomizes the second coordinate again. Hence, as can also be shown formally, $(I\tensor J)R_G(I\tensor J) = G\tensor J.$ Now the only eigenvalues of $J$ are $1$ and $0$ (with $0$ of multiplicity $D-1$), so the second largest eigenvalue of $J$ is $0$, and therefore the second largest eigenvalue of $G\tensor J$ is $\lambda_1.$ Therefore,
$$
\lambda \leq (1 - \lambda_2)^2\lambda_1 + 1 - (1-\lambda_2)^2 = 1 - (1 - \lambda_1)(1 - \lambda_2)^2
$$
\end{proof}

\subsection{Revisiting Probability Amplification}
\index{expanders!boosting $\RP$ success}

Recall the setting where we discussed amplifying the success probability of an $\RP$ algorithm. We had an algorithm that used $m$ random bits and whose error was less than, say, $\delta.$ We used $m+k\log D$ bits to get the error down to $2^{-\Omega(k)}.$ Think of this as a first block of $m$ bits to pick the first vertex, followed by $k$ chunks of $\log D$ bits each. To these $k$ chunks of $\log D$ bits, we can apply the procedure recursively: setting $m' = \log D$, this looks like $k$ repeated applications of the naive algorithm that uses $m'$ random bits. Therefore, we can take an expander on $2^{m'} = D$ vertices, say of degree $d$, and do a random walk on it to reduce the number of random bits again. We would then be using $m + \log D + \log d + \log d + \cdots$ bits. Grouping the first two chunks gives a chunk of $m + \log D = \log\inparen{2^mD}$ random bits, followed by a long run of $\log d$ chunks. This is like a random walk on an expander with $2^mD$ vertices and degree $2^{O(\log d)}$, say $d^2.$ And this is essentially what we get from the zig-zag product!
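The three-step rotation map $\Rot_{G\zigzag H}$ defined earlier is straightforward to implement given the rotation maps of $G$ and $H$. Below is a sketch; the concrete $G$ (a $5$-cycle with a self-loop at each vertex, so $D=3$) and $H$ (a triangle, $d=2$) are hypothetical stand-ins with $0$-based labels, and only the composition pattern, zig then $G$-step then zag, is from the notes:

```python
def rot_H(i, a):
    """Rotation map of H: a triangle on {0,1,2} (2-regular), labels 0 and 1."""
    return ((i + 1) % 3, 1) if a == 0 else ((i - 1) % 3, 0)

def rot_G(u, a):
    """Rotation map of G: a 5-cycle with a self-loop at each vertex (3-regular)."""
    if a == 0:
        return ((u - 1) % 5, 1)
    if a == 1:
        return ((u + 1) % 5, 0)
    return (u, 2)  # the self-loop is its own reverse

def zigzag_rot(rG, rH):
    """Rotation map of G (z) H: a zig (H-step), a G-step, and a zag (H-step)."""
    def rot(vertex, label):
        (u, i), (a, b) = vertex, label
        j, a2 = rH(i, a)    # zig: H-step inside the cloud of u
        v, k = rG(u, j)     # G-step between clouds; k labels the entry edge into v
        l, b2 = rH(k, b)    # zag: H-step inside the cloud of v
        return ((v, l), (b2, a2))
    return rot

Z = zigzag_rot(rot_G, rot_H)
# G (z) H has N*D = 15 vertices and degree d^2 = 4, and Z is again an involution.
assert all(Z(*Z((u, i), (a, b))) == ((u, i), (a, b))
           for u in range(5) for i in range(3)
           for a in range(2) for b in range(2))
```

The involution check shows why the returned label is $(b',a')$ with the two components swapped: undoing the walk means undoing the zag first.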
Thus, we can think of the probability amplification as happening in two steps: start with an expander on $2^m$ vertices of degree $D$, where $D$ is not necessarily constant. Then use the smaller expander on $D$ vertices to reduce the number of random bits even further. This corresponds to using the expander that is the zig-zag product of the two expanders.

\section{An Efficiently Computable Family of Constant Degree Expanders}
\index{expanders!family of constant degree expanders}

Using the three graph products we discussed, we can now go ahead and construct a family of efficiently computable expander graphs of constant degree. To start with, pick a small constant sized graph with $\lambda< 1/2$; this can be found by brute force. Let $H$ be a $(d^8,d,1/2)$ expander. Define $G_1 = HH.$ Then $G_1$ is a $d^8$ vertex $d^2$-regular graph. Now, iteratively define $G_k$ for $k\geq 2$ as
$$
G_k = (G_{k-1}\tensor G_{k-1})^2\zigzag H
$$
The following claim is straightforward to check.

\begin{claim}
The graphs $\inbrace{G_i}$ form a family of expander graphs, each of degree $d^2.$ Moreover, $G_k$ is a $(d^{8(2^k - 1)},d^2,1/4)$ expander, and the rotation map of $G_k$ can be computed in time polynomial in the logarithm of the size of its vertex set, and is hence efficient.
\end{claim}

\printindex
\end{document}