\documentclass[11pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{hyperref}
\usepackage{algorithmic}
\usepackage{algorithm}
\usepackage{graphicx}
\newcommand{\handout}[5]{
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf Algebra and Computation } \hfill Course Instructor: #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\em #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{Lecturer: #3}{Scribe: #4}{Lecture #1}}
\newtheorem{theorem}{Theorem}
\newtheorem*{theorem*}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{observation}[theorem]{Observation}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{subclaim}[theorem]{Subclaim}
% my custom commands
\newcommand{\inparen}[1]{\left(#1\right)} %\inparen{x+y} is (x+y)
\newcommand{\inbrace}[1]{\left\{#1\right\}} %\inbrace{x+y} is {x+y}
\newcommand{\insquar}[1]{\left[#1\right]} %\insquar{x+y} is [x+y]
\newcommand{\inangle}[1]{\left\langle#1\right\rangle} %\inangle{A} is
\newcommand{\abs}[1]{\left|#1\right|} %\abs{x} is |x|
\newcommand{\norm}[1]{\left\Vert#1\right\Vert} %\norm{x} is ||x||
\newcommand{\union}{\cup}
\newcommand{\Union}{\bigcup}
\newcommand{\intersection}{\cap}
\newcommand{\super}[2]{#1^{\inparen{#2}}} %\super{G}{i-1} is G^{(i-1)}
\newcommand{\setdef}[2]{\inbrace{{#1}\ : \ {#2}}}
\newcommand{\inv}[1]{#1^{-1}}
% Commands specific to this file
% TODO: Find the right way to typeset group index
\DeclareMathOperator{\Sym}{Sym}
\newcommand{\gpidx}[2]{\insquar{#1 : #2}} %\gpidx{H}{K} is [H : K]
\newcommand{\gpigs}[2]{\gpidx{\super{G}{#1}}{\super{G}{#2}}} %Group index of g super ...
\newcommand{\llhd}{\!\!\lhd\!\!\lhd}
% \newcommand{\ceil}[1]{\lceil #1 \rceil}
\newcommand{\floor}[1]{\lfloor #1 \rfloor}
\newcommand{\F}{\mathbb{F}}
\newcommand{\Z}{\mathbb{Z}}
% \newcommand{\R}{\mathbb{R}}
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
%for algorithms
\renewcommand{\algorithmicrequire}{\textbf{Input:}}
\begin{document}
\lecture{15: Bivariate Factorization: Missing Pieces}{V. Arvind}{V.
Arvind}{Ramprasad Saptharishi}
\section{Overview}
Last class we presented bivariate factorization, but we made some
assumptions at the outset. The hope was that with some preprocessing,
these assumptions can be guaranteed. This class we shall see what
those preprocessing steps are.

After that, we shall discuss a Hensel-lifting perspective on Newton's
root-finding algorithm.
\section{The Missing Pieces}
The algorithm relies on the assumption that $f$ and $f(x,0)$ are
square-free, since we want the pseudo-gcd of the factors to be $1.$ We
need to make sure that we can pull out repeated factors at the
beginning.
\subsection{$f$ is square free}
In the univariate case, this was easy: we just had to take the
derivative and divide by the gcd. The multivariate case is a little
trickier. The first step is to remove the {\em content} of each
variable from the polynomial.
Think of the polynomial $f$ as an element of $F[y][x]$, a univariate
polynomial in $x$ with coefficients coming from $F[y].$ The
$y$-content of $f$ is defined as the $\gcd$ of the coefficients of the
polynomial when considered as one in $F[y][x].$
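The $y$-content computation can be sketched with sympy (a hypothetical
toy example; the polynomial $f$ below is chosen so that its
$y$-content is $y+1$):

```python
from functools import reduce
from sympy import symbols, gcd, Poly, expand

x, y = symbols('x y')
# hypothetical example whose y-content is (y + 1)
f = (y + 1)*x**2 + (y + 1)**2*x + (y + 1)

# view f in F[y][x]: the coefficients are polynomials in y
coeffs = Poly(f, x).all_coeffs()
# the y-content is the gcd of those coefficients
y_content = reduce(gcd, coeffs)
```

The $x$-content is computed symmetrically, via `Poly(f, y)`.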
The $x$-content and the $y$-content are clearly factors of $f$, and we
can factorize them using univariate factorization. Thus we may assume
that
$$
f = f_1^{e_1}f_2^{e_2}\cdots f_k^{e_k}
$$
where each $f_i$ is an irreducible factor whose $x$-content and
$y$-content are both $1.$ Write this as $f = f_1^e h$ with $f_1 \nmid
h.$ Then,
$$
\pderiv{f}{x} = ef_1^{e-1}h\pderiv{f_1}{x} + f_1^e\pderiv{h}{x}
$$
Suppose both $\pderiv{f}{x}$ and $\pderiv{f}{y}$ are identically zero.
This can happen only if every exponent of $x$ and of $y$ is a multiple
of $p.$ Hence $f(x,y) = g(x^p,y^p)$; this is easy to check, and
factoring $f$ then reduces to factoring $g.$
We can now assume without loss of generality that $\pderiv{f}{x}$ is
non-zero. Now suppose that $\pderiv{f_1}{x}$ is non-zero; then,
clearly from the above equation, the largest power of $f_1$ that
divides $\pderiv{f}{x}$ is $f_1^{e-1}.$
Let $u = \pderiv{f}{x}$, $v = \pderiv{f}{y}$, $u' = f/\gcd(f,u)$ and
$v' = f/\gcd(f,v)$, whenever the derivatives are non-zero. The bad
news is that, since some of the $\pderiv{f_i}{x}$ could be zero, $u'$
misses the factors that are polynomials in $x^p$. The good news is
that these are the only factors that $u'$ and $v'$ would miss.
Hence, factorize $u'$ and $v'$, and then divide $f$ by the collected
factors. We are then assured that the remaining factors must be
polynomials in $x^p$ and $y^p$. We can then make the transformation
and recurse.
Thus, we can ensure that $f$ does not have any repeated factors.
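The core reduction, dividing out $\gcd(f, \pderiv{f}{x})$, can be
sketched with sympy. This is a hypothetical example over a field of
characteristic $0$, so the $x^p$ subtlety from the lecture does not
arise:

```python
from sympy import symbols, diff, gcd, cancel, expand

x, y = symbols('x y')
# hypothetical example with a repeated factor (x + y)**3
f = (x + y)**3 * (x*y - 1)

u = diff(f, x)
g = gcd(f, u)                     # picks up the repeated part (x + y)**2
square_free_part = cancel(f / g)  # (x + y)*(x*y - 1), each factor once
```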
\subsection{$f(x,0)$ is square free}
Though we have $f(x,y)$ square-free, substituting $0$ for $y$ could
cause certain factors to collapse; $f(x,0)$ could have repeated
roots. The trick is to make a small change of variables to ensure that
it is square-free.
Replace $f(x,y)$ by $f_\beta(x,y) = f(x,y+\beta).$ We need to show
that there exists a $\beta$ such that $f(x,\beta) = f_\beta(x,0)$ is
square free.
Let $f'$ denote $\pderiv{f}{x}.$ If $f' = 0$, then reverse the roles
of $x$ and $y$ (if both derivatives are zero, then $f$ is a polynomial
in $x^p$ and $y^p$). Note that $\gcd(f,f') \neq 1$ if and only if
$\operatorname{Res}_x(f,f')=0.$ And since the resultant is a
polynomial in $y$ of degree at most $2d^2$, it can have at most $2d^2$
roots in $F.$ Hence if $|F| > 2d^2$, we can just substitute $2d^2+1$
values for $y$; for at least one of them the resultant is non-zero,
and thus $f_\beta(x,0)$ would be square-free.
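The resultant criterion for choosing $\beta$ can be checked with sympy
(a hypothetical small example; the polynomial and the bad values of
$\beta$ are chosen for illustration):

```python
from sympy import symbols, diff, resultant

x, y = symbols('x y')
# hypothetical square-free bivariate polynomial;
# f(x, 0) = x**2 * (x - 1) has a repeated root, so beta = 0 is bad
f = (x - y) * (x + y) * (x - 1)

# f(x, beta) has a repeated factor iff Res_x(f, df/dx) vanishes at y = beta
R = resultant(f, diff(f, x), x)

bad0 = R.subs(y, 0)    # 0: beta = 0 collapses (x - y) and (x + y)
bad1 = R.subs(y, 1)    # 0: beta = 1 collapses (x - y) and (x - 1)
good = R.subs(y, 2)    # non-zero: f(x, 2) is square free
```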
Hence, all that is left is the case when $|F| \leq 2d^2.$ The trick is
to go to a larger field and work there. Suppose $F = \F_q$; choose a
prime $t$ such that $q^t > 2d^2$ and $t>\deg f.$ Replace $F$ by
$\F_{q^t}$ (just find an irreducible polynomial of degree $t$ and work
in $\F_q$ modulo that polynomial). In this larger field, the
irreducible factors could split even further:
$$
f = f_1'f_2'\cdots f_{k'}'
$$
where bunches of these factors correspond to the original factors. To
study these bunches, we need an important map known as the {\em
Frobenius map}:
\begin{eqnarray*}
\sigma:\F_{q^t} &\longrightarrow & \F_{q^t}\\
a & \mapsto & a^q
\end{eqnarray*}
Note that the map fixes every element of $\F_q$ pointwise, and is an
automorphism. This can be naturally extended to the ring $\F_q[x,y].$
And since $f_1 \in \F_q[x,y]$, the Frobenius map will fix it. We are
interested in finding the bunch of $f_i'$ that correspond to $f_1.$
Suppose $f_1'\mid f_1$, then by the automorphism, $\sigma(f_1') \mid
f_1$, $\sigma^2(f_1') \mid f_1$ and so on.
Since $\sigma^t(f_1') = f_1'$, the smallest $r$ such that
$\sigma^r(f_1') = f_1'$ must divide $t.$ Since $t$ is chosen to be a
prime, either $r=t$ or $r=1.$ If $r=t$, then each of the $t$ distinct
polynomials of the form $\sigma^i(f_1')$ would be a factor of $f_1.$
But since $t > \deg f$, all of them cannot fit inside $f$.
Hence $r=1$, and thus the factorization does not split further in
$\F_{q^t}.$ We can now hunt for a $\beta$ in this larger field to make
$f_\beta(x,0)$ square-free.
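The field-extension step can be made concrete with sympy. This is a
hypothetical toy instance with $q=3$, $t=2$ (in the lecture one would
additionally need $q^t > 2d^2$, $t$ prime and $t > \deg f$): build
$\F_{q^t}$ as $\F_q[x]$ modulo a brute-forced irreducible polynomial,
and check that the Frobenius map fixes $\F_q$ while moving a proper
extension element:

```python
from sympy import symbols, Poly

x = symbols('x')
q, t = 3, 2   # hypothetical small parameters

# brute-force an irreducible monic polynomial of degree t over F_q;
# F_{q^t} is then F_q[x] modulo this polynomial
m = None
for c1 in range(q):
    for c0 in range(q):
        cand = Poly(x**t + c1*x + c0, x, modulus=q)
        if cand.is_irreducible:
            m = cand
            break
    if m is not None:
        break

# the Frobenius map a -> a**q fixes every element of F_q (Fermat)
base_fixed = all(pow(c, q, q) == c % q for c in range(q))

a = Poly(x, x, modulus=q)   # a generator of the extension over F_q
sigma_a = (a**q) % m        # its Frobenius image: a non-trivial conjugate
```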
\section{Hensel Lifting and Newton-Raphson}
Suppose we are given a polynomial $f(x) \in \Z[x]$; we want to find a
root of $f$ efficiently by successive approximations. We shall do this
using Hensel lifting.

Pick a small prime $p$ such that $f(x)$ is square-free modulo $p.$
\begin{eqnarray*}
f(x) & = & f_0 + f_1x + f_2x^2 + \cdots + f_nx^n\\
f(x+h) & = & \sum_{i=0}^n f_i(x+h)^i\\
& = & \sum_{i=0}^n f_i\inparen{x^i + ihx^{i-1} + \cdots }\\
& = & f(x) + hf'(x) + h^2P(x,h)
\end{eqnarray*}
Now, using Berlekamp's algorithm, find an $x$ such that $f(x) \equiv 0
\pmod{p}.$ Suppose there exists an $\hat{x}$ such that $\hat{x} \equiv
x \pmod{p}$ and $f(\hat{x}) = 0$; then $\hat{x} = x + ap$ for some
integer $a$. And hence
\begin{eqnarray*}
f(\hat{x}) & = & f(x) + apf'(x) + a^2p^2P(x,ap)\\
\implies 0 \equiv f(\hat{x}) & \equiv & f(x) + apf'(x) \pmod{p^2}
\end{eqnarray*}
Since $f(x) \equiv 0\pmod{p}$, it makes sense to talk about
$\inparen{f(x)/p}.$ Thus, if we were to choose $a =
\inparen{-f(x)/p}\insquar{f'(x)}^{-1}$, the above equation would be
satisfied.
\begin{eqnarray*}
a & = & \inparen{\frac{-f(x)}{p}}\insquar{f'(x)}^{-1}\pmod{p}\\
\implies \hat{x} & = & x - f(x)\insquar{f'(x)}^{-1} \pmod{p^2}
\end{eqnarray*}
Thus, from a root modulo $p$, we have gone up to one modulo $p^2$,
with $\hat{x}$ as our next approximation.
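The lifting step above can be sketched in Python. This is a
hypothetical toy instance with $f = x^2 - 7$ and $p = 3$; each pass of
the loop squares the modulus, exactly as in the derivation:

```python
def poly_eval(coeffs, x, mod):
    """Evaluate sum_i coeffs[i] * x**i modulo mod (increasing degree)."""
    return sum(c * pow(x, i, mod) for i, c in enumerate(coeffs)) % mod

def hensel_lift(coeffs, p, r, steps):
    """Lift a root r of f modulo p to a root modulo p**(2**steps),
    via the update r <- r - f(r) * f'(r)^(-1) at each doubling."""
    deriv = [i * c for i, c in enumerate(coeffs)][1:]   # coefficients of f'
    mod = p
    for _ in range(steps):
        mod *= mod                       # p -> p^2 -> p^4 -> ...
        fr = poly_eval(coeffs, r, mod)
        dr = poly_eval(deriv, r, mod)
        r = (r - fr * pow(dr, -1, mod)) % mod   # the Hensel/Newton step
    return r, mod

# hypothetical example: x**2 - 7 has the root 1 modulo 3
root, mod = hensel_lift([-7, 0, 1], p=3, r=1, steps=2)
```

This needs $f'(r)$ to be invertible modulo $p$, which is exactly the
square-freeness assumption above.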
Newton-Raphson has a similar expression. You are given a function $f$
and you choose a starting point $x$. The next approximation is
obtained by drawing the tangent to the curve $f$ at $(x,f(x))$ and
taking the point where this tangent meets the $x$-axis.

The following picture makes this clear.
\begin{center}
\includegraphics[width=2.5in]{newton.png}
\end{center}
If the $x$-coordinate of $C$ is $\hat{x}$, our next approximation,
\begin{eqnarray*}
f'(x) & = & \frac{f(x)}{x - \hat{x}}\\
\implies \hat{x} & = & x - \frac{f(x)}{f'(x)}
\end{eqnarray*}
which is exactly what we got in the Hensel Lifting method.
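For comparison, the floating-point version of the same update is a
few lines of Python (a hypothetical example with $f(x) = x^2 - a$,
i.e.\ computing square roots):

```python
def newton_sqrt(a, x0=1.0, iters=10):
    """Newton's iteration x <- x - f(x)/f'(x) for f(x) = x**2 - a."""
    x = x0
    for _ in range(iters):
        x = x - (x * x - a) / (2 * x)   # here f'(x) = 2x, a real division
    return x
```

Note the division by $f'(x)$ is a true division of reals, where the
Hensel version used an inverse modulo $p$.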
Newton's method, however, requires floating point arithmetic (since
division by $f'(x)$ is actual division, unlike the inverse modulo $p$
in the Hensel lifting case), though it enjoys the ease of not having
to find an inverse modulo a number.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "lecture15"
%%% End:
\end{document}