\begin{document} \documentclass[11pt]{article} \usepackage{latexsym} \usepackage{amsmath} \usepackage{amssymb} \usepackage{amsthm} \usepackage{hyperref} \usepackage{algorithmic} \usepackage{algorithm} \usepackage{complexity} \usepackage{graphicx} \usepackage{palatino} %what a beautiful font! \newcommand{\handout}[5]{ \noindent \begin{center} \framebox[\textwidth]{ \vbox{ \hbox to \textwidth { {\bf #2}\hfill {\bf Computational Number Theory } } \vspace{4mm} \hbox to \textwidth { {\Large \hfill #5 \hfill} } \vspace{2mm} \hbox to \textwidth { {\em #3 \hfill #4} } } } \end{center} \vspace*{4mm} } \newcommand{\lecture}[4]{\handout{#1}{#2}{Instructor: #3}{Scribe: #4}{Lecture #1}} \newtheorem{theorem}{Theorem} \newtheorem{theorem*}{Theorem} \newtheorem{corollary}[theorem]{Corollary} \newtheorem{lemma}[theorem]{Lemma} \newtheorem{observation}[theorem]{Observation} \newtheorem{proposition}[theorem]{Proposition} \newtheorem{definition}[]{Definition} \newtheorem{claim}[theorem]{Claim} \newtheorem{fact}[]{Fact} \newtheorem{subclaim}[theorem]{Subclaim} % my custom commands \newcommand{\inparen}[1]{\left(#1\right)} %\inparen{x+y} is (x+y) \newcommand{\inbrace}[1]{\left\{#1\right\}} %\inbrace{x+y} is {x+y} \newcommand{\insquar}[1]{\left[#1\right]} %\insquar{x+y} is [x+y] \newcommand{\inangle}[1]{\left\langle#1\right\rangle} %\inangle{A} is <A> \newcommand{\abs}[1]{\left|#1\right|} %\abs{x} is |x| \newcommand{\norm}[1]{\left\Vert#1\right\Vert} %\norm{x} is ||x|| \newcommand{\union}{\cup} \newcommand{\Union}{\bigcup} \newcommand{\intersection}{\cap} \newcommand{\Intersection}{\bigcap} \newcommand{\super}[2]{#1^{\inparen{#2}}} %\super{G}{i-1} is G^{(i-1)} \newcommand{\setdef}[2]{\inbrace{{#1}\ : \ {#2}}} \newcommand{\inv}[1]{#1^{-1}} \newcommand{\inrpdt}[2]{\left\langle{#1},{#2}\right\rangle}%\inrpdt{x}{y} is <x,y>. \newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\iso}{\cong} \newcommand{\bea}[1]{\begin{eqnarray*} #1 \end{eqnarray*}} \newcommand{\legendre}[2]{\inparen{\frac{#1}{#2}}} % Commands specific to this file % TODO: Find the right way to typeset group index \DeclareMathOperator{\Sym}{Sym} \newcommand{\gpidx}[2]{\insquar{#1 : #2}} %\gpidx{H}{K} is [H : K] \newcommand{\gpigs}[2]{\gpidx{\super{G}{#1}}{\super{G}{#2}}} %Group index of g super ... \newcommand{\llhd}{\!\!\lhd\!\!\lhd} \newcommand{\roundoff}[1]{\left\lfloor #1 \right\rceil} % \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\F}{\mathbb{F}} \newcommand{\N}{\mathbb{N}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\Z}{\mathbb{Z}} \renewcommand{\R}{\mathbb{R}} \renewcommand{\C}{\mathbb{C}} %for algorithms \renewcommand{\algorithmicrequire}{\textbf{Input:}} % Problems we look at \newcommand{\GIso}{\lang{Graph\text{-}Iso}} %Without \text, ugly minus instead of hyphen. \newcommand{\GAut}{\lang{Graph\text{-}Aut}} \newcommand{\SStab}{\lang{Set\text{-}Stab}} %for Quantum computing \newcommand{\q}[1]{\left|#1\right\rangle} \lecture{10: Distinct Degree Factoring}{CS681}{Piyush P Kurur}{Ramprasad Saptharishi} \section*{Overview} Last class we left of with a glimpse into distant degree factorization. This class, we shall look at the details of it and also how it can be used as an irreducibility test. \section{DDF: The Problem} We are given a polynomial $f$ over a finite field $\F_p$ of degree say $n.$ We want to factor them in to degree $1$ factors, degree $2$ factors etc. To make this more explicit, let us assume that $f$ factorizes as $$ f = g_{11}g_{12}\cdots g_{1m_1}g_{21}\cdots g_{dm_d} $$ where each $g_{ij}$ is an irreducible factor of $f$ of degree $i.$ Hence given $f$, we want to return $\prod_j g_{ij}$ for each $i.$ That is, return the factor of $f$ that is the product of all degree $i$ irreducible factors of $f.$ And we want to do this for all $i$. This is called {\em distinct degree factorization.} We want an efficient algorithm for this. But before we start thinking of algorithms, what do we mean by efficient? It is the usual 'polynomial time in input length' but what is the input length? We are looking at $f(X) = \F_p[X]$ and each coeffiecient of the polynomial is from $\F_p.$ And since $\F_p$ contains just $p$ elements, we can encode them in binary using $\log p$ bits. Hence the input size is about $n\log p.$ Thus we are interested in algorithms that have running time of $(n\log p)^c$ for some constant $c.$ \section{Extracting Square-free Parts} When we said that $f$ factorizes as $g_{11}\cdots g_{dm_d}$, it is very much possible that there are some $g{ij} = g_{ik}$ or in other words the square of $g_{ij}$ divides $f.$ The first step of any factoring algorithm is to remove such multiple factors and make $f$ {\em square free} (make sure that $f$ is not divisible by any square).\\ Suppose we were looking at polynomials over $\Z[X]$ or something. Then we have a very nice way of checking for such multiple factors. We know that if $f$ has a repeated root $\alpha$ , then the derivative $f'$ has $\alpha$ as a root as well. Infact, if $(x-\alpha)^m$ divided $f$ then $(x-\alpha)^{m-1}$ will divide $f'.$ And hence, the gcd of $f$ and $f'$ will have $(x-\alpha)^{m-1}$ as a factor. Thus, we can divide $f$ by this gcd and get rid of higher multiplicity terms. But in the case of $\Z[X]$, we had an interpretation of real numbers, limits and hence we could define what a derivative is. But when it comes to finite fields, what does differentiation mean? How can we use this technique to get extract the square-free part?\\ Note that we don't need the notion of a derivative as a tangent or something. That is where the limits come in. What we need is a condition where if some $g^m$ divides $f$ then $g^{m-1}$ should divide $f'$ for the $f'$ that we are going to define. Thus, since we are just in the realms of polynomials, we shall use the rules of differentiation as just a formula. Thus let $D$ be a map that sends $X^m$ to $mX^{m-1}.$ Now extend this linearly to all polynomials to get $$ D\inparen{a_0 + a_1X + a_2X^2 + \cdots a_mX^m} = a_1 + 2a_2X + \cdots ma_mX^{m-1} $$ Now, we leave the following things to be proven as an exercise. \subsection*{Exercise} \begin{enumerate} \item $D(f+g) = D(f)+D(g)$ \item $D(fg) = fD(g) + gD(f)$ \item $D(f^m) = mf^{m-1}D(f)$ \item {\bf Theorem: } If $h$ is a factor of $f$ such that $h^m\mid f$ then $h^{m-1}\mid g.$ And further, if $h^m$ is the highest power of $h$ that divides $f$ then $h^{m-1}$ is the highest power of $h$ that divides $f'.$ \end{enumerate} Once we have these properties, we can extract the square-free part of $f$ easily. Given an $f$ construct the formal derivative $f'.$ Let $g = gcd(f,f').$ The polynomial $f/g$ consists is the square free extraction of $f.$\\ There is, however, a small catch here. But we shall get back to it in the end of the class and discuss it in the next class in detail. For the moment, let us proceed with distinct degree factorization. \section{Distinct Degree Factorization} Given $f\in \F_p[X]$ of degree $n$, we want to get the distant degree factorization of $f.$ That is, we want to get $g_1,g_2,\cdots ,g_n$ such that $f = g_1g_2\cdots g_n$ where each $g$ is the product of all irreducible factors of $f$ of degree $i.$ And we want to do this efficiently. \\ The key idea is the formula we proved last class: $$ X^{p^m} - X = \prod_{\substack{f\in \text{Irr}(\F_p,d)\\d\mid m}} f(X) $$ Firstly, let us look at the case when $m=1.$ Then $X^p - 1$ is the product of all irreducible polynomials over $\F_p$ of degree $1.$ And thus, in particular, it contains the degree $1$ irreducible factors of $f$ within it. Thus, $g_1 = gcd(f,X^p - X)$ will be the product of all degree $1$ irreducible factors of $f$ as $X^p - X$ has just degree $1$ irreducible factors and all those that divide $f$ will be a part of the gcd. Thus, we have obtained the required $g_1.$ Now, call $f_2 = f/g_1$ and now let $g_2 = gcd(f_2,X^{p^2} - X)$ and this will have precisely all degree $2$ irreducible factors of $f$ (degree $1$ factors won't appear since we have removed all degree $1$ factors by dividing by $g_1$). Thus, we can repeat this procedure. This is a naive algorithm and has a pretty serious problem. The trouble is that the algorithm would have to compute $gcd(f_i,X^{p^i}-X)$ which is a HUGE degree polynomial. We want our running time as $(n\log p)^c$ but this naive algorithm takes about $O(p)$ time! And this is definitely not acceptable since even finding all linear irreducible factors might take $O(p)$ time! We definitely need to change this. But the polynomial $X^{p^i} - X$ is a nice polynomial and hence allows a nice simple trick to make the algorithm polynomial time. \\ Let us look at the gcd algorithm. The first step is to compute $X^{p^i} -X \bmod{f_i}$ and the problem with this is that the degree of the polynomial is too large to apply the naive algorithm. But we can do far better in the case of this special polynomial. Note that $X^{p^i} - X \bmod{f_i} = X^{p^i}\bmod{f_i} - X\bmod{f_i}$ and the difficulty was in computing $X^{p^i} \bmod{f_i}.$ This can be done quite efficiently using the technique of repeated squaring. \subsection{Repeated Squaring} In general, we have a polynomial $f$ of degree $n$ and we want to calculate $X^M\bmod{f}$ in time $(n\log M)^c$ for some constant $c.$ The idea is pretty simple. Firstly assume $M$ to be a power of $2.$ Then we can do the following. Start with $X$, and square it. And square it again and so on. The moment the degree goes beyond $n$, take it modulo $f$ to reduce it's degree back to less than $n.$ And continue this squaring process, for $\log M$ steps. Thus, we would have computed $X^M\bmod{f}$ in time $(n\log M)^c.$ Now what do we do for $M$ that is not a power of two? The answer is simple again. Just look at the binary representation of $M$. Say $m = b_0 + b_12 + b_22^2 + \cdots b_{l}2^l.$ Then $$ X^M = X^{b_0}(X^{b_1})^{2^1}(X^{b_2})^{2^2}\cdots (X^{b_l})^{2^l} $$ And since each $b_i$ is either $1$ or $0$, $X^{b_i}$ is either $1$ or $X.$ Thus, we can just compute $X^{2^i}\bmod{f}$ for $0\leq i\leq \log M$ and multiply all those residues that have $b_i$ as $1.$ \begin{algorithm} \caption{\sc{Evaluate $X^M \bmod{f}$ using repeated squaring}} \begin{algorithmic}[1] \STATE Let $M = b_0 + b_12 + \cdots b_l2^l$, the binary representation of $M,$ where $l =\floor{\log M}$. \STATE $g_0 = 1$ and $g = b_0X.$ \FOR{$i = 1$ to $l$} \STATE $g_i = (g_{i-1})^2\bmod{f}$ \IF{$b_i = 1$} \STATE $g = g\cdot g_i\bmod{f}$ \ENDIF \ENDFOR \STATE {\bf return} $g$ \end{algorithmic} \end{algorithm} Now, with this repeated squaring algorithm, we can do distinct degree factorization. The algorithm is given at the end of the lecture. \section{A Catch and a Hint} First let us get to the catch that we had mentioned earlier. Our method for extracting the square free part of $f$ involved taking the formal derivative and then the gcd of $f$ with $f'.$ But strange things can happen in a finite field. For example, let us take the polynomial $f(X) = X^{2p} - 3X^p + 5.$ The formal derivative is $2pX^{2p - 1} - 3pX^{p-1}$ which is $0$ in $\F_p$! What do we do now? It is easy to see that the derivative can become $0$ if the only surviving terms of the polynomial have degree that's a multiple of a prime. Then this polynomial $f(X)$ can be thought of as $g(X^p)$ and we can go on to do our algorithm for $g$ instead of $f.$ In our example, $g(X) = X^2 - 3X + 5$ which is well behaved. All we need to do is get the factors of $g$ and replace every occurance of $X$ by $X^p.$ And more over, since we are in $\F_p$, it is easy to check that $f(X) = g(X^p) = (g(X))^p.$ \\ So much for the small catch. Now here is a hint. The algorithm for distinct degree factorization immediately gives us an algorithm to test if a given polynomial over $\F_p[X]$ is irreducible. The reduction is very straightforward and the students are encouraged to think about it. We shall discuss this in the next class. \begin{algorithm} \caption{\sc{Distinct Degree Factorization}} \begin{algorithmic}[1] \REQUIRE $f(X) \in \F_p[X]$ of degree $n.$ \STATE $f_0 = f.$ \FOR{$i=1$ to $n$} \STATE Using repeated squaring, compute $s_i = X^{p^i} \bmod{f_{i-1}}$ \STATE Compute $g_i = gcd(f_{i-1},s_i - X).$ \STATE $f_i = f_{i-1}/g_i$ \ENDFOR \STATE {\bf return} $\inbrace{g_1,g_2,\cdots, g_n}.$ \end{algorithmic} \end{algorithm} \end{document}