\begin{document} \documentclass[11pt]{article} \usepackage{latexsym} \usepackage{amsmath} \usepackage{amssymb} \usepackage{amsthm} \usepackage{hyperref} \usepackage{algorithmic} \usepackage{algorithm} \usepackage{complexity} \usepackage{graphicx} \usepackage{palatino} %what a beautiful font! \newcommand{\handout}[5]{ \noindent \begin{center} \framebox[\textwidth]{ \vbox{ \hbox to \textwidth { {\bf #2}\hfill {\bf Computational Number Theory } } \vspace{4mm} \hbox to \textwidth { {\Large \hfill #5 \hfill} } \vspace{2mm} \hbox to \textwidth { {\em #3 \hfill #4} } } } \end{center} \vspace*{4mm} } \newcommand{\lecture}[4]{\handout{#1}{#2}{Instructor: #3}{Scribe: #4}{Lecture #1}} \newtheorem{theorem}{Theorem} \newtheorem{theorem*}{Theorem} \newtheorem{corollary}[theorem]{Corollary} \newtheorem{lemma}[theorem]{Lemma} \newtheorem{observation}[theorem]{Observation} \newtheorem{proposition}[theorem]{Proposition} \newtheorem{definition}[]{Definition} \newtheorem{claim}[theorem]{Claim} \newtheorem{fact}[]{Fact} \newtheorem{subclaim}[theorem]{Subclaim} % my custom commands \newcommand{\inparen}[1]{\left(#1\right)} %\inparen{x+y} is (x+y) \newcommand{\inbrace}[1]{\left\{#1\right\}} %\inbrace{x+y} is {x+y} \newcommand{\insquar}[1]{\left[#1\right]} %\insquar{x+y} is [x+y] \newcommand{\inangle}[1]{\left\langle#1\right\rangle} %\inangle{A} is <A> \newcommand{\abs}[1]{\left|#1\right|} %\abs{x} is |x| \newcommand{\norm}[1]{\left\Vert#1\right\Vert} %\norm{x} is ||x|| \newcommand{\union}{\cup} \newcommand{\Union}{\bigcup} \newcommand{\intersection}{\cap} \newcommand{\Intersection}{\bigcap} \newcommand{\super}[2]{#1^{\inparen{#2}}} %\super{G}{i-1} is G^{(i-1)} \newcommand{\setdef}[2]{\inbrace{{#1}\ : \ {#2}}} \newcommand{\inv}[1]{#1^{-1}} \newcommand{\inrpdt}[2]{\left\langle{#1},{#2}\right\rangle}%\inrpdt{x}{y} is <x,y>. \newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\iso}{\cong} \newcommand{\bea}[1]{\begin{eqnarray*} #1 \end{eqnarray*}} \newcommand{\legendre}[2]{\inparen{\frac{#1}{#2}}} % Commands specific to this file % TODO: Find the right way to typeset group index \DeclareMathOperator{\Sym}{Sym} \newcommand{\gpidx}[2]{\insquar{#1 : #2}} %\gpidx{H}{K} is [H : K] \newcommand{\gpigs}[2]{\gpidx{\super{G}{#1}}{\super{G}{#2}}} %Group index of g super ... \newcommand{\llhd}{\!\!\lhd\!\!\lhd} \newcommand{\roundoff}[1]{\left\lfloor #1 \right\rceil} % \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\F}{\mathbb{F}} \newcommand{\N}{\mathbb{N}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\Z}{\mathbb{Z}} \renewcommand{\R}{\mathbb{R}} \renewcommand{\C}{\mathbb{C}} %for algorithms \renewcommand{\algorithmicrequire}{\textbf{Input:}} % Problems we look at \newcommand{\GIso}{\lang{Graph\text{-}Iso}} %Without \text, ugly minus instead of hyphen. \newcommand{\GAut}{\lang{Graph\text{-}Aut}} \newcommand{\SStab}{\lang{Set\text{-}Stab}} %for Quantum computing \newcommand{\q}[1]{\left|#1\right\rangle} \lecture{7: Towards Factorization over Finite Fields}{CS681}{Piyush P Kurur}{Ramprasad Saptharishi} \section*{Overview} We shall slowly move into factorization of univariate polynomials (polynomials with just one variable) over finite fields. We are given a finite field $K$ and a polynomial over one variable $X$ whose coefficients are from $K.$ We are to find the factorization of this polynomial into irreducible factors. Before we get into this question, we need to first understand if it even makes sense. How can we be sure that such a factorization exists? And even if it did, how do we know if it is unique? We shall first answer a lot of questions in the algebra related to it before going to factorization as such. \section{Rings, Ideals, Factorization etc.} We know that integers can be uniquely factorized into product of prime powers. However, not all rings are as well-behaved as the integers are. We first need to ask if the algebraic structure has this property of unique factorization. Let us look at an example where this fails. Look at the set of integers modulo $8$. This is called $\Z_8$ and we know that this forms a ring. Suppose we look at polynomials over this ring, polynomials of a single variable $X$ whose coefficients come from $\Z_8,$ does this ring have the property of unique factorization? Here is a counter example in $\Z_8[X].$ $$ X^2 - 1 = (X-1)(X+1) = (X-3)(X+3) $$ But $\Z_8$ is a bad ring, in the sense that non-zero elements can multiply to give $0$ ($2\times 4 = 0$ here). As for another example, look at the set of all number of the form $a+b\sqrt{-5}$ where $a,b\in \Z.$ This forms a ring and over this ring $6 = 2\cdot 3 = (1+\sqrt{-5})(1 - \sqrt{-5}).$ Hence it's not always true that factorization is unique. However, fortunately for us, we have unique factorization over $K[X]$ whenever $K$ is a field. \begin{definition} A ring is said to be an integral domain if and only if there are no non-trivial zero divisors. That is, if $a,b\in R$ such that $ab=R$, then either $a=0$ or $b=0.$ \end{definition} The ring $\Z$ is an integral domain but the ring of integers modulo $6$ is not (since $2\times 3 = 0$ over the ring). Inorder to define factorization, we need a notion of primes over arbitrary rings. Let us first look at the definition of primes over integers. Let us first look at the wrong definition. \\ \noindent A number $p$ is said to be prime if for all $a$ that divides $p$, either $a=1$ or $a=p.$\\ This translates to the ring definition of a maximal ideal and not a prime ideal. This is the common definition in school but generalization based on this is erraneous. Though it happens to correct over the set of integers, it is not true in general. Here is the right definition. \begin{definition} A number $p$ is said to be a prime if and only if for all $a,b$ such that $p$ divides $ab$, either $p$ divides $a$ or $p$ divides $b.$ \end{definition} Thus this gives the definition of prime ideals in the setting of rings. \begin{definition} An ideal $\mathfrak{a} \subseteq R$ is said to be prime if and only if for all $a,b\in R$ such that $ab\in \mathfrak{a}$, either $a\in \mathfrak{a}$ or $b\in \mathfrak{a}.$ Any element $p\in R$ that generates a prime ideal is called a prime element of $R.$ \end{definition} \begin{definition} An ideal $\mathfrak{a} \subseteq R$ is said to be maximal if and only if for every ideal $\mathfrak{a'}\supseteq \mathfrak{a}$, either $\mathfrak{a'} = 1R = R$ or $\mathfrak{a'} = \mathfrak{a}.$ \end{definition} This basically means that no proper ideal of $R$ properly contains $\mathfrak{a}.$ Note that not all prime ideals are maximal. We were just lucky that this was true on $\Z$ and hence both definitions of prime numbers were equivalent. This is not true over arbitrary rings. \begin{definition} An ideal $\mathfrak{a} \subseteq R$ is said to be a principle ideal if the ideal is generated by a single element. That is, $\mathfrak{a} = aR$ for some $a\in R.$ \end{definition} \begin{definition} An integral domain $R$ is said to be a \begin{itemize} \item principle ideal domain (PID) if every ideal in it is principle (every ideal is generated by a single element). \item unique factorization domain (UFD) if every element can be uniquely factorized in to product of prime elements of the ring. \end{itemize} \end{definition} We already saw an example of a ring (and a domain) that was not a UFD. Here is an example of a ring that is not a PID. Consider a field $K$ and look at the ring of polynomials on two variables $X,Y$ over this field. This is denoted by $K[X,Y].$ In this field, look at the ideal generated by $X$ and $Y.$ That is, the set of polynomials of the form $Xf(X,Y) + Yg(X,Y)$, those polynomials that do not have a constant term. This is clearly an ideal but this isn't principle. A similar example is over $\Z[X]$ and the ideal being $(p,X)$ where $p$ is any prime number. \begin{fact} For any field $K$, $K[X]$ is a PID. \end{fact} \begin{fact} Any PID is also a UFD \end{fact} The two facts together tell us that we can indeed talk of factorization of polynomials in $K[X].$ Another useful fact is the following, and this helps us see that factorization makes sense even on multivariate polynomials. \begin{fact} If $R$ is a UFD, so is $R[X].$ \end{fact} The following theorems are very useful. \begin{theorem} A ring $R$ is a field if and only if the only ideals of $R$ are the $0$ ideal and the whole ring $R$. \end{theorem} \begin{proof} First we shall show that a field has no non-trivial ideals. Suppose The field had some ideal $I$ that contained some element $x\neq 0$. Since it is a field, the inverse of $x$ exists. Since $I$ is an ideal and $x\in I$ would mean that $xa \in I$ for all $a\in R$ and in particular $xx^{-1} = 1 \in I.$ But if $1\in I$, then for every element $a$ in the field, $1a \in I$ which would then force $I$ to be the entire field. \\ As for the other direction, suppose the ring $R$ was not a field. We want to show that there exists some non-trivial ideal in this ring. Since we assumed that it isn't a field, there must be some non-zero element $a$ whose inverse does not exist. Look at the ideal generated by it, $aR.$ This ideal certainly contains $a$ and it cannot $1$ since if it did, it would mean that $a$ is invertible. And hence this is an ideal that is non-zero and also not the whole of $R$; a non-trivial ideal. \end{proof} \begin{theorem} For any ring $R$ \begin{enumerate} \item if an ideal $\mathfrak{a}$ is prime, then $R/\mathfrak{a}$ is an integral domain. \item if an ideal $\mathfrak{a}$ is maximal, then $R/\mathfrak{a}$ is a field. \end{enumerate} \end{theorem} \begin{proof} We have to show that if $\mathfrak{a}$ is prime, then $R/\mathfrak{a}$ is an integral domain. Suppose not, then there exists two non-zero elements $a,b$ such that $ab=0$ in $R/\mathfrak{a}.$ This means that $a \bmod{\mathfrak{a}} \neq 0$ and $a\bmod{\mathfrak{a}} \neq 0$ but $ab\bmod{\mathfrak{a}} = 0$ or in other words $ab \in \mathfrak{a}$ but neither $a$ nor $b$ belongs to $\mathfrak{a}.$ This contradicts the assumption that $\mathfrak{a}$ and hence $R$ has to be an integral domain.\\ As for the case when $\mathfrak{a}$ is maximal, assume that $R/\mathfrak{a}$ is not a field. Then there exists some non-zero element that is not invertible. Look at the ideal generated by this element. As in the earlier theorem, this is a non-trivial ideal (neither $0$ nor the entire ring). But in the map from $R$ to $R/\mathfrak{a}$, ideals of $R/\mathfrak{a}$ corresponds to ideals in $R$ that contain $\mathfrak{a}.$ Since we just found a non-trivial ideal in $R/\mathfrak{a}$, this would translate to a non-trivial ideal in $R$ that properly contains $\mathfrak{a}$ thus contradicting the maximality of $\mathfrak{a}.$ Thus $R$ has to be a field. \end{proof} \subsection{Some Insights} This is not completely a part of the course but it would be useful to know this to understand factorization. In any ring, we can talk of a tower of prime ideals. What this means is a series of the form $0 \subseteq I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq R$ such that each ideal $I_j$ is a prime ideal. The number $n$ is called the Krull Dimension of the ring $R.$ The Krull Dimension is actually a local property but for it is well defined for rings like $K[X_1,X_2,\cdots,X_n]$ (where $K$ is a field) and $\Z[X].$ If we were to look at $K[X,Y]$, we have the tower $0\subseteq (X) \subseteq (X,Y) \subseteq K[X,Y].$ The krull dimension of this ring is $2.$ Similarly the ring of polynomials on $n$ variables over a field $K$ will have a krull dimension of $n.$ And the ring $\Z[X]$ has the tower $0 \leq (p) \leq (p,X) \leq \Z[X]$ and hence has krull dimension $2.$ We shall see soon that factorization of polynomials in $\Z[X]$ is so similar to factorization of polynomials in $K[X,Y].$ \\ We need to understand the concept of finite fields, extensions, etc before we get into factorization. We shall first spend some time on this. \section{Finite Fields} We shall be studying properties of fields that have finite number of elements in them. A few things to keep in mind, we shall prove them soon, is that any finite field has its cardinality to be a power of prime. There cannot exist a finite field whose cardinality is divisible by two distinct primes. And infact, for any prime $p$ and $\alpha$, there is exactly one field of size $p^\alpha.$ (and note that this isn't true on the infinite setting. $\R$ and $\C$ both have infinite number of elements but are clearly different) \begin{definition} A field $E$ is called an extension of a field $K$ if $E$ is a field that contains $K.$ This (also) is denoted by $E/K.$ \end{definition} There is a notion of a degree of a field extension but one needs to be familiar with vector spaces to completely understand this. We shall dwell a little on it. \subsection{Vector Spaces} \begin{definition} A vector space $V$ over a field $K$, with an additive structure and multiplication by elements of $K$ (scalars), satisfies the following conditions: \begin{itemize} \item $(V,+)$ is an additive abelian (commutative) group (additive closure, inverse, identity) \item For any vector $v\in V$ and scalar $\alpha \in K$, the element $\alpha v$ is also a vector. \item For any vectors $u,v\in V$ and scalar $\alpha \in K$, we have $\alpha(u+v) = \alpha u + \alpha v.$ \item For any vector $u$ and scalars $\alpha,\beta \in K$, we have $(\alpha + \beta)u = \alpha u + \beta u$ and $\alpha(\beta u) = (\alpha \beta)u.$ \end{itemize} \end{definition} Let us look at a few examples to get ourself familiar with this notion. $\C$ forms a vector space over $\R.$ Clearly the above properties are satisfied. Another example is the plane $\R^2$, set of point $(x,y)$ where both coordinates are from the reals. Scalar multiplication is defined as $\alpha(x,y) = (\alpha x, \alpha y).$ Another example is the ring of polynomials $K[X]$ over $K$ where $K$ is a field. Scalar multiplication is just multiplying every coefficient by the scalar. \\ Next we need a notion of linear independance. \begin{definition} A set $\inbrace{v_1, v_2, \cdots, v_k}$ is said to be linearly independant if the only way $$ c_1v_1 + c_2v_2 + \cdots + c_kv_k = 0 $$ can happen for scalars $c_i$ is when all the $c_i$'s are zero themselves. That is, no non-trival linearl combination of these vectors is zero. \end{definition} For example, let us look at each of our examples stated and find a linearly independant set. In $\C$ over $\R$, look at the set $\inbrace{3,2+i}.$ Suppose $c_1(3) + c_2(2+i) = 0$, then $(3c_1 + 2c_2) + c_2i = 0$ and this is possible only when both $c_1$ and $c_2$ are zero. Hence the set is linearly independant. And again, look at the set $\inbrace{(1,0),(0,1)}$ in $\R^2.$ This again is linearly independant since the only way $c_1(1,0) + c_2(0,1) = (c_1,c_2) = (0,0)$ is when both $c_1$ and $c_2$ are zero. In the third example, look at the set $\inbrace{1,X,X^2}.$ $c_1 + c_2X + c_3X^2$ can be the zero polynomial if and only if all the $c_i$'s are zero. \\ This is the notion of linear independance. With a little bit of thought, any vector that can be represented as a linear sum from such a set is infact uniquely represented so. For example, let us assume that $\inbrace{v_1, v_2,\cdots, v_k}$ was a linearly independant set. Let $v = c_1v_1 + c_2v_2 + \cdots + c_kv_k.$ Suppose this could be represented as a linear sum in a different way, we shall obtain a contradiction. \bea{ v & = & c_1v_1 + c_2v_2 + \cdots + c_kv_k\\ & = & c_1'v_1 + c_2'v_2 + \cdots + c_k'v_k\\ \implies 0 &= &(c_1 - c_1')v_1 + \cdots + (c_k - c_k')v_k } And if the two representations were indeed different, there is atleast one $i$ such that $c_i \neq c_i' \implies (c_i - c_i')\neq 0$ but this would give a non-trivial linaer combination of the $v_i$'s to become zero. This contradicts our assumption that they were linearly independant. Hence such linear representations are unique. \\ An example is that every point $(x,y)$ can be represented uniquely as a linear sum of $(1,0)$ and $(0,1)$ (it is just $x(1,0) + y(0,1)$). The students are encouraged to also check it for $\C$ with our linearly independant set being $\inbrace{3,2+i}.$ Let us look at our example of $K[X]$. We saw that $\inbrace{1,X,X^2}$ was a linearly independant subset but the term $X^5$ can never be written as a linear sum of $1,X,X^2.$ Thus, the set $\inbrace{1,X,X^2}$ doesn't cover or {\em span} $X^5.$ Since $X^5$ is not spanned by the set $\inbrace{1,X,X^2}$, we can add it to the set and it would still be linearly independant. We can keep adding elements to our linearly independant set in this way by picking up some vector that is not spanned by it and adding it. This process can go on indefinitely as well. For the moment let us look at the case where this process stops after finite number of steps. Now we have a set that is linearly independant and it also spans the entire space. An example would be to look at $\C$ over $\R.$ Start with $3$. The linear span of this is just elements of the form $3c$ where $c$ is a real number. Hence it does not span elements like $2+i.$ Hence we can add $2+i$ to this set and still have a linearly independant set. Now this set $\inbrace{3,2+i}$ is linearly independant and also spans the entire space. Any complex number $a+ib$ is equal to $b(2+i) + \frac{a - 2b}{3}3.$ Such a set that spans the space and also is linearly independant is called a {\em basis} of the vector space $V$ over $K$. And every vector in the vector space can be expressed as a linear combination of the basis elements, and uniquely so. The number of basis elements is called the dimension of the vector space. But wait, how do we know that every basis will have the same number of elements? Is it possible that I can find three complex numbers that are linearly independant over $\R$ and span $\C$? The answer is no. It is not so hard to see that all basis must have the same number of elements. Thus the dimension of the vector space is independant of the choice of basis is hence well-defined.\\ The vector space $K[X]$ over $K$ has infinite dimension and its basis could be chosen as $\inbrace{1,X,X^2,\cdots}.$ And a polynomial, say $80 + 2X + 0X^3 + 3X^4$ can be represented by the tuple $(80,2,0,3,0,0,0,\cdots)$ and every polynomial has a corresponding tuple. Suppose we choose $\inbrace{1,i}$ as a basis for $\C$ over $\R$, then every element $a+bi$ can be expressed as $a(1) + b(i).$ Now we can represent the number $a+bi$ as the tuple $(a,b).$ So essentially, a vector space $V$ over $K$ is one where each element in it can be represented as a tuple, whose entries come from $K.$ The arity of the tuple is the dimension of the vector space. Thus, in the finite setting, if $V$ is a finite dimensional (say $d$-dimensional) vector space over a finite field $K$, then the number of elements of $V$ is $|K|^d.$ This is clear since it is just the number of $d$-tuples whose entries come from $K.$ \subsection{Field Extensions} Let $E/K$ be a field extension. This just means that both $E$ and $K$ are fields and that $E$ contains $K.$ Now observe that for all $\alpha,\beta \in K$ and $u,v\in E$, $\alpha(u+v) = \alpha u + \alpha v$ and $(\alpha + \beta) u = \alpha u + \beta u$ etc. Thus all the conditions to call this a vector space hold. Thus, we can think of $E$ as a vector space over $K.$ An example of this we have seen already. $\C$ is a field that contains $\R.$ And $\C$ actually is a vector space over $\R.$ Another example would be to look at $$ \Q[\sqrt{2}] = \setdef{a+b\sqrt{2}}{a,b\in \Q} $$ It is easy to check that this is a field and this clearly contains $\Q.$ And this also naturally forms a vector space over $\Q.$ \begin{definition} The dimension of $E$ as a vector space over $K$ is called the degree of the extension $E/K$. This is denoted by $[E:K].$ \end{definition} $\C$ over $\R$ is a 2-dimensional extension. $\R$ over $\Q$ is an infinite dimensional extension. $\Q[\sqrt{2}]$ over $\Q$ is a $2$ dimensional extension. \subsubsection*{Adjoining Elements: An informal discussion} The field $\C$ is just taking $\R$ and adding the element $i$ to it. Once we add $i$ to $\R$, we just take all possible linear combinations, products, inverses to make it a field. We let the set $\R \cup {i}$ grow into the smallest field containing $\R$ and $i.$ This is formally referred to as $\R(i)$, the field got by adjoining $i$ to $\R.$ It is easy to check that $\Q(\sqrt{2})$ is infact $\Q[\sqrt{2}] = \setdef{a+b\sqrt{2}}{a,b\in Q}.$ And similarly one can also check that $\Q(\sqrt[3]{2}) = \setdef{a + b\sqrt[3]{2} + c\sqrt[3]{2^2}}{a,b,c\in \Q}.$ From this is it easily seen that $\Q(\sqrt[3]{2})$ is a degree $3$ extension over $\Q.$ Given such an adjointed field extension, is it easy to find out the degree? The answer is yes. All we need to do is choose an easy basis for the vector space. For example, let us look again at $\Q(\sqrt[3]{2}).$ Let $\alpha = \sqrt[3]{2}.$ We want the degree of the extension $\Q(\alpha)/\Q.$ Now consider the set $\inbrace{1,\alpha, \alpha^2, \alpha^3,\cdots...}.$ When does this fail to be a linearly independant subset? We know that $\alpha^3 - 2 = 0$ and hence it loses its linear independance after $\alpha^3.$ This is because $\alpha$ was a root of $X^3 - 2$, a degree $3$ polynomial over the $\Q.$ Instead if we were to look any $\alpha$, any equation of linear dependance would look like $a_0 + a_1\alpha + a_2\alpha^2 + \cdots a_k\alpha^k = 0$ and this would just mean that $\alpha$ is a root of the polynomial $a_0 + a_1X + a_2X^2 + \cdots a_kX^k = 0.$ Thus, the degree of such an extension $\Q(\alpha)/\Q$ is just the degree of the smallest degree polynomial of which $\alpha$ is a root. $\C = \R(i)$ and $i$ has $X^2 + 1$ as its minimum polynomial and thus $[\C:\R] = 2.$ If we were to look at $\Q(\pi)$, $\pi$ is not a root of any polynomial with coefficients in $\Q$ (this is also referred as '$\pi$ is transcendental'). Thus the set $\inbrace{1,\pi,\pi^2,\cdots}$ would be an infinite linearly independant subset. And hence the extension $\Q(\pi)$ over $\Q$ is of infinite degree. \\ We shall look at more properties of finite fields and extensions next time. \end{document}