Logic: Lecture 18, 20 October 2015
----------------------------------

Undecidability of First Order Logic

The decision problem
--------------------

Given a FOL formula phi, can we decide if phi is valid?

- Require an algorithm that halts and says yes or no
- The question "is phi satisfiable" is equivalent, since phi is
  valid iff ~phi is not satisfiable

For propositional logic, we can build a truth table over the
finite vocabulary of phi to decide satisfiability/validity.
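
The truth table method can be sketched in a few lines of Python.
This is only an illustration: representing a formula as a Python
function over an assignment, and the names is_valid and
is_satisfiable, are assumptions of the sketch, not notation from
the lecture.

```python
from itertools import product


def is_valid(formula, variables):
    """Return True iff formula is true under every truth assignment."""
    return all(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def is_satisfiable(formula, variables):
    """phi is satisfiable iff ~phi is not valid."""
    return not is_valid(lambda v: not formula(v), variables)


# p -> (q -> p), written with "or" and "not"
weakening = lambda v: (not v["p"]) or (not v["q"]) or v["p"]
# p and ~p
contradiction = lambda v: v["p"] and not v["p"]

weakening_is_valid = is_valid(weakening, ["p", "q"])
contradiction_is_sat = is_satisfiable(contradiction, ["p"])
```

The loop always terminates because a formula with k variables has
only 2^k assignments.  No such finite enumeration is available
for FOL interpretations.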

For FOL, the question is undecidable.  The proof is a reduction
from the halting problem for two register machines (based on
Ben-Ari, Mathematical Logic for Computer Science, 3rd Ed, Chap
12.1).


Two register machines
---------------------

- Two registers, X and Y, each can hold a natural number from the
  set {0,1,2,...}

- A program is a sequence of instructions:

L0: Instruction 0
L1: Instruction 1
L2: Instruction 2
...
Ln: Instruction n

There are four types of instructions:

X := X + 1
Y := Y + 1
if X == 0 then go to Li else X := X-1
if Y == 0 then go to Li else Y := Y-1

The first two types of instructions increment X and Y
respectively and control moves to the next instruction.  The next
two types of instructions conditionally decrement X and Y.  If
the register to be decremented is 0, control shifts to the
location specified in the "go to", otherwise the decrement
succeeds and control moves to the next instruction.  The "go to"
can make control flow backwards or forwards, so we can implement
if-then-else and while loops using this conditional decrement.

The machine starts at L0 with X, Y both set to 0.  The machine
halts if it executes Ln.

For example, the following program does not halt.

L0: X := X+1
L1: if Y == 0 then go to L1 else Y := Y-1
L2: Y := Y+1

The following program halts and never executes instructions
L2,L3,L4.

L0: X := X+1
L1: if Y == 0 then go to L5 else Y := Y-1
L2: Y := Y+1
L3: if Y == 0 then go to L5 else Y := Y-1
L4: if Y == 0 then go to L2 else Y := Y-1
L5: X := X+1

Notice that a given program is deterministic: it follows a fixed
trajectory and either visits Ln or goes into an infinite loop
without ever reaching Ln.
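
The two example programs above can be traced with a small
simulator.  This is a minimal sketch: the tuple encoding
("inc", reg) and ("dec", reg, target), the function name run and
the max_steps bound are all assumptions made for illustration.
A state (pc, x, y) records the next instruction and the register
values.

```python
def run(program, max_steps=1000):
    """Simulate a two register machine for at most max_steps states.

    Returns the list of states (pc, x, y) visited, ending with the
    state at the last location Ln, if Ln is reached within the
    bound.  Returns None if the bound is exhausted first.
    """
    n = len(program) - 1
    pc, regs = 0, {"X": 0, "Y": 0}
    trace = []
    for _ in range(max_steps):
        trace.append((pc, regs["X"], regs["Y"]))
        if pc == n:
            return trace  # Ln is executed, so the machine halts
        instr = program[pc]
        if instr[0] == "inc":
            regs[instr[1]] += 1
            pc += 1
        else:
            _, reg, target = instr
            if regs[reg] == 0:
                pc = target  # register is 0: take the "go to"
            else:
                regs[reg] -= 1
                pc += 1
    return None


# First example: stuck at L1 forever because Y stays 0
looping_example = [("inc", "X"), ("dec", "Y", 1), ("inc", "Y")]

# Second example: jumps from L1 straight to L5
halting_example = [
    ("inc", "X"),
    ("dec", "Y", 5),
    ("inc", "Y"),
    ("dec", "Y", 5),
    ("dec", "Y", 2),
    ("inc", "X"),
]

looping_trace = run(looping_example)
halting_trace = run(halting_example)
```

A result of None only says that Ln was not reached within the
bound.  It is not a proof that the program loops, which is
exactly why a simulator does not decide the halting problem.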

Halting problem: Given a program of a two register machine, is
the instruction at Ln ever executed?

This problem is known to be undecidable (for a proof, see Marvin
Minsky's book Computation: Finite and Infinite Machines)

Reduction from two register halting problem to FOL validity
-----------------------------------------------------------

The state of a two register machine consists of the next
instruction to be executed (the "program counter") and the values
of registers X and Y.

Assume we have fixed a register machine program with instruction
locations L0, L1, ..., Ln.

- For i in {0,1,...,n}, our language has a binary predicate
  p_i(x,y).  For natural numbers a and b, p_i(a,b) denotes the
  state of the machine where the next instruction to be executed
  is at location Li and the current values of X and Y are a and
  b, respectively.

- We have a constant symbol 0 to denote the number zero and a
  unary function s() to denote the successor function

We translate the "effect" of each instruction as follows.  We
assume that the instruction is at a location Li and construct a
corresponding sentence Si.

Li: X := X + 1
Si: Ax Ay [ p_i(x,y) -> p_{i+1}(s(x),y) ]

Li: Y := Y + 1
Si: Ax Ay [ p_i(x,y) -> p_{i+1}(x,s(y)) ]

Li: if X == 0 then go to Lj else X := X-1
Si: Ay [ p_i(0,y) -> p_j(0,y) ] and
    Ax Ay [ p_i(s(x),y) -> p_{i+1}(x,y) ]

Li: if Y == 0 then go to Lj else Y := Y-1
Si: Ax [ p_i(x,0) -> p_j(x,0) ] and
    Ax Ay [ p_i(x,s(y)) -> p_{i+1}(x,y) ]

The first two formulas capture the (unconditional) update of the
state and instruction location for increment instructions.  The
third and fourth formulas have two parts to describe the effect
of a conditional decrement.  

Given a program with instruction locations L0,L1,...,Ln, we can
write a corresponding set of sentences S0,S1,...,Sn using the
translation above.

We then write an overall sentence SM describing the fact that the
machine has a halting computation.

  SM :: (p_0(0,0) and S0 and S1 and ... and Sn) -> Eu Ev p_n(u,v)
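
The translation is purely syntactic, so it can be mechanised.
The sketch below prints S0,...,Sn and SM in the ASCII notation
used in these notes.  The tuple encoding of instructions and the
function name translate are assumptions made for illustration.

```python
def translate(program):
    """Return ([S0, ..., Sn], SM) as strings for the given program."""
    sentences = []
    for i, instr in enumerate(program):
        if instr[0] == "inc":
            # unconditional increment of X or Y, control moves to L{i+1}
            args = "s(x),y" if instr[1] == "X" else "x,s(y)"
            sentences.append(f"Ax Ay [ p_{i}(x,y) -> p_{i + 1}({args}) ]")
        else:
            _, reg, j = instr
            if reg == "X":
                zero = f"Ay [ p_{i}(0,y) -> p_{j}(0,y) ]"
                succ = f"Ax Ay [ p_{i}(s(x),y) -> p_{i + 1}(x,y) ]"
            else:
                zero = f"Ax [ p_{i}(x,0) -> p_{j}(x,0) ]"
                succ = f"Ax Ay [ p_{i}(x,s(y)) -> p_{i + 1}(x,y) ]"
            sentences.append(f"{zero} and {succ}")
    n = len(program) - 1
    antecedent = " and ".join(["p_0(0,0)"] + [f"({s})" for s in sentences])
    sm = f"({antecedent}) -> Eu Ev p_{n}(u,v)"
    return sentences, sm


# The halting example program from earlier in these notes
halting_example = [
    ("inc", "X"),
    ("dec", "Y", 5),
    ("inc", "Y"),
    ("dec", "Y", 5),
    ("dec", "Y", 2),
    ("inc", "X"),
]

sentences, sm = translate(halting_example)
```

The point of the reduction is that translate always terminates
and its output has size linear in the program, so a decision
procedure for validity would yield one for halting.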

Theorem: Let M be the machine running the fixed program.  Then M
halts iff SM is valid

Proof:

(=>) If M halts, SM is valid

If M halts, there is a fixed halting computation that visits
states s0,s1,...,sm such that:

- At state s0, the instruction location is L0 and the values of X
  and Y are both 0.

- Each intermediate state si in the computation corresponds to an
  instruction location Lj with register values a_i,b_i for X and
  Y, respectively.

- At state sm, the machine is in a halting state.  In other
  words, at sm, M is at instruction location Ln, with register
  values a_m,b_m for X and Y, respectively.

We have to show that SM is true under all interpretations.

Let I be an interpretation where either p_0(0,0) or one of the
sentences Si is false.  Then SM is trivially true.  So we only
need to prove the result for I such that I |= p_0(0,0) and I |= Si
for every i in {0,1,...,n}.

For every state si, i in {0,1,...,m}, along the halting
computation, we show that I |= p_j(s^{a_i}(0),s^{b_i}(0)), where
Lj is the instruction location of M at si and the values of X and
Y at si are a_i and b_i, respectively.  For convenience, we just
use a_i and b_i to denote s^{a_i}(0) and s^{b_i}(0),
respectively.

We prove this by induction on i.

Base case: i = 0

This is the initial state and M is at location L0 with X and Y
both set to 0.  By assumption, I |= p_0(0,0), so the claim holds.

Induction step: i > 0

We consider the state s_{i-1}.  This corresponds to some
instruction Lj with register values X == a_{i-1} and Y ==
b_{i-1}.  By the induction hypothesis, I |= p_j(a_{i-1},b_{i-1}).
We do a case analysis based on Lj:

Case 1: Lj is of the form "X := X+1"

Then, at si, M should be at L{j+1}, with a_i = s(a_{i-1}) and b_i
= b_{i-1}.

By assumption, I |= Sj, where Sj represents the effect of "X :=
X+1" (see the translation for the four types of instructions,
above).  Hence, I |= Ax Ay [ p_j(x,y) -> p_{j+1}(s(x),y) ] 

Combining this with the induction hypothesis, I |=
p_j(a_{i-1},b_{i-1}), and using the semantics of FOL, one can check
that I |= p_{j+1}(s(a_{i-1}),b_{i-1}).  Since a_i = s(a_{i-1})
and b_i = b_{i-1}, we have I |= p_{j+1}(a_i,b_i), as required.

Case 2: Lj is of the form "Y := Y+1"

This is symmetric to Case 1.

Case 3: Lj is of the form "if X == 0 then go to Lk else X := X-1"

- If a_{i-1} is 0, it must be the case that at si, M is at Lk,
  with a_i = 0 and b_i = b_{i-1}.

  By the induction hypothesis, I |= p_j(0,b_{i-1}).  By
  assumption, from the translation of the conditional decrement
  instruction, I |= Ay [ p_j(0,y) -> p_k(0,y) ].  Combining
  these, using the semantics of FOL, we get I |= p_k(0,b_{i-1}).
  Since b_i = b_{i-1}, we have I |= p_k(0,b_i), as required.

- If a_{i-1} is not 0, it must be the case that at si, M is at L{j+1},
  with a_i = a_{i-1}-1 and b_i = b_{i-1}.

  By the induction hypothesis, I |= p_j(a_{i-1},b_{i-1}).  By
  assumption, from the translation of the conditional decrement
  instruction, I |= Ax Ay [ p_j(s(x),y) -> p_{j+1}(x,y) ].
  Recall that a_{i-1} is an abbreviation for s^{a_{i-1}}(0) and
  a_{i-1} is not 0, so this term has the form
  s(s^{a_{i-1}-1}(0)).  Combining these, using the semantics of
  FOL, we get I |= p_{j+1}(s^{a_{i-1}-1}(0),b_{i-1}).  Since a_i =
  a_{i-1}-1 is abbreviated by s^{a_{i-1}-1}(0) and b_i = b_{i-1},
  we have I |= p_{j+1}(a_i,b_i), as required.

Case 4: Lj is of the form "if Y == 0 then go to Lk else Y := Y-1"

This is symmetric to Case 3.

Thus, by induction, we have I |= p_n(a_m,b_m) at sm.  Hence, I |=
Eu Ev p_n(u,v), as required.
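
The induction can be followed concretely on the halting example
program from earlier in these notes.  The sketch below lists the
atoms p_j(a_i,b_i) in the order in which the proof establishes
them, writing each number k as the term s^k(0).  The tuple
encoding of instructions and the names numeral and derived_atoms
are assumptions made for illustration.

```python
def numeral(k):
    """Return the term s^k(0) as a string, e.g. s(s(0)) for k = 2."""
    return "s(" * k + "0" + ")" * k


def derived_atoms(program, max_steps=1000):
    """List the atoms p_j(a_i,b_i) along the run, in order.

    Each atom after the first follows from the previous one by
    instantiating the sentence Sj and applying modus ponens.
    Stops at the first atom for p_n, or after max_steps atoms.
    """
    n = len(program) - 1
    pc, x, y = 0, 0, 0
    atoms = []
    for _ in range(max_steps):
        atoms.append(f"p_{pc}({numeral(x)},{numeral(y)})")
        if pc == n:
            break
        op, reg, *target = program[pc]
        value = x if reg == "X" else y
        if op == "inc":
            value, pc = value + 1, pc + 1
        elif value == 0:
            pc = target[0]  # register is 0: take the "go to"
        else:
            value, pc = value - 1, pc + 1
        x, y = (value, y) if reg == "X" else (x, value)
    return atoms


halting_example = [
    ("inc", "X"),
    ("dec", "Y", 5),
    ("inc", "Y"),
    ("dec", "Y", 5),
    ("dec", "Y", 2),
    ("inc", "X"),
]

atoms = derived_atoms(halting_example)
# atoms == ["p_0(0,0)", "p_1(s(0),0)", "p_5(s(0),0)"]
```

The last atom is an instance of p_5(u,v), which witnesses
Eu Ev p_5(u,v) in any interpretation satisfying the antecedent.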

(<=) If SM is valid, M halts.

Since SM is valid, it is true over all interpretations.  We pick
an interpretation I where the underlying set is Nat =
{0,1,2,...}, the constant symbol 0 is mapped to the number 0 and
s() is mapped to the successor function.  

Let s0,s1,s2,... be the (unique) run of the machine M.  Define
the interpretation of each predicate p_j(x,y) to be precisely the
set of reachable states along this run: in other words, for
natural numbers a and b, I |= p_j(a,b) iff there is a state on
this run with instruction location Lj and register values X == a
and Y == b.

Since the chosen interpretation I respects the operational
semantics of M, we can check that I satisfies the antecedent
(p_0(0,0) and S0 and S1 and ... and Sn) of SM.  Since I |= SM, we
also have I |= Eu Ev p_n(u,v).  From the way I is defined, there
must be a concrete state sm visited by M where the instruction
location is Ln, with register values X == a_m and Y == b_m.
Hence M halts.
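
Read contrapositively, this direction says that a non-halting
machine yields an interpretation that falsifies SM.  The sketch
below builds that interpretation for the non-halting example
program from earlier in these notes, whose run visits only
finitely many states.  The tuple encoding of instructions and all
function names are assumptions made for illustration.

```python
def step(program, state):
    """Apply the instruction at the program counter to (pc, x, y)."""
    pc, x, y = state
    op, reg, *target = program[pc]
    value = x if reg == "X" else y
    if op == "inc":
        value, pc = value + 1, pc + 1
    elif value == 0:
        pc = target[0]  # register is 0: take the "go to"
    else:
        value, pc = value - 1, pc + 1
    return (pc, value, y) if reg == "X" else (pc, x, value)


def reachable_states(program, max_steps=10_000):
    """Return the set of states on the run, or None if the run neither
    reaches Ln nor repeats a state within max_steps."""
    n = len(program) - 1
    state, seen = (0, 0, 0), set()
    for _ in range(max_steps):
        if state in seen:
            return seen  # deterministic run has entered a cycle
        seen.add(state)
        if state[0] == n:
            return seen  # Ln reached
        state = step(program, state)
    return None


def satisfies_antecedent(program, states):
    """Check p_0(0,0) and each Si, reading p_j(a,b) as (j,a,b) in states.

    Each Si is an implication whose premise holds only at members of
    states, so it suffices to check the conclusion at those points.
    """
    return (0, 0, 0) in states and all(
        step(program, s) in states for s in states
    )


looping_example = [("inc", "X"), ("dec", "Y", 1), ("inc", "Y")]
n = len(looping_example) - 1

interp = reachable_states(looping_example)
antecedent_true = satisfies_antecedent(looping_example, interp)
consequent_true = any(pc == n for pc, _, _ in interp)  # Eu Ev p_n(u,v)
```

Here the antecedent of SM is true and the consequent is false, so
SM is not valid, matching the fact that this program never
reaches L2.  The check is exact only because the set of reachable
states is finite for this program.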

======================================================================