doc-src/Ref/simplifier.tex
author oheimb
Sat, 15 Feb 1997 16:10:00 +0100
changeset 2628 1fe7c9f599c2
parent 2613 cc4eb23d37b3
child 2632 1612b99395d4
permissions -rw-r--r--
description of del(eq)congs, safe and unsafe solver

%% $Id$
\chapter{Simplification} \label{simp-chap}
\index{simplification|(}

This chapter describes Isabelle's generic simplification package, which
provides a suite of simplification tactics.  It performs conditional and
unconditional rewriting and uses contextual information (`local
assumptions').  It provides a few general hooks, which can provide
automatic case splits during rewriting, for example.  The simplifier is set
up for many of Isabelle's logics: \FOL, \ZF, \HOL\ and \HOLCF.

The next section is a quick introduction to the simplifier, the later
sections explain advanced features.

\section{Simplification for dummies}
\label{sec:simp-for-dummies}

In some logics (\FOL, {\HOL} and \ZF), the simplifier is particularly easy to
use because it supports the concept of a {\em current
  simpset}\index{simpset!current}.  This is a default set of simplification
rules.  All commands are interpreted relative to the current simpset.  For
example, in the theory of arithmetic the goal
\begin{ttbox}
{\out  1. 0 + (x + 0) = x + 0 + 0}
\end{ttbox}
can be solved by a single
\begin{ttbox}
by (Simp_tac 1);
\end{ttbox}
The simplifier uses the current simpset, which in the case of arithmetic
contains the required theorems $\Var{n}+0 = \Var{n}$ and $0+\Var{n} =
\Var{n}$.

If assumptions of the subgoal are also needed in the simplification
process, as in
\begin{ttbox}
{\out  1. x = 0 ==> x + x = 0}
\end{ttbox}
then there is the more powerful
\begin{ttbox}
by (Asm_simp_tac 1);
\end{ttbox}
which solves the above goal.

There are four basic simplification tactics:
\begin{ttdescription}
\item[\ttindexbold{Simp_tac} $i$] simplifies subgoal~$i$ using the current
  simpset.  It may solve the subgoal completely if it has become trivial,
  using the solver.
  
\item[\ttindexbold{Asm_simp_tac}]\index{assumptions!in simplification}
  is like \verb$Simp_tac$, but extracts additional rewrite rules from the
  assumptions.

\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
  simplifies the assumptions (but without using the assumptions to simplify
  each other or the actual goal.)

\item[\ttindexbold{Asm_full_simp_tac}]
  is like \verb$Asm_simp_tac$, but also simplifies the assumptions one by
  one.  {\em Working from left to right, it uses each assumption in the
  simplification of those following.}
\end{ttdescription}

{\tt Asm_full_simp_tac} is the most powerful of this quartet but may also
loop where some of the others terminate. For example,
\begin{ttbox}
{\out  1. ALL x. f(x) = g(f(g(x))) ==> f(0) = f(0)+0}
\end{ttbox}
is solved by {\tt Simp_tac}, but {\tt Asm_simp_tac} and {\tt Asm_simp_tac}
loop because the rewrite rule $f(\Var{x}) = f(g(f(\Var{x})))$ extracted from
the assumption does not terminate. Isabelle notices certain simple forms of
nontermination, but not this one.
 
\begin{warn}
  Since \verb$Asm_full_simp_tac$ works from left to right, it sometimes
misses opportunities for simplification: given the subgoal
\begin{ttbox}
{\out [| P(f(a)); f(a) = t |] ==> ...}
\end{ttbox}
\verb$Asm_full_simp_tac$ will not simplify the first assumption using the
second but will leave the assumptions unchanged. See
\S\ref{sec:reordering-asms} for ways around this problem.
\end{warn}

Using the simplifier effectively may take a bit of experimentation.  The
tactics can be traced with the ML command \verb$trace_simp := true$.

There is not just one global current simpset, but one associated with each
theory as well. How are these simpset built up?
\begin{enumerate}
\item When loading {\tt T.thy}, the current simpset is initialized with the
  union of the simpsets associated with all the ancestors of {\tt T}. In
  addition, certain constructs in {\tt T} may add new rules to the simpset,
  e.g.\ \ttindex{datatype} and \ttindex{primrec} in \HOL. Definitions are not
  added automatically!
\item The script in {\tt T.ML} may manipulate the current simpset further by
  explicitly adding/deleting theorems to/from it (see below).
\item After {\tt T.ML} has been read, the current simpset is associated with
  theory {\tt T}.
\end{enumerate}
The current simpset is manipulated by
\begin{ttbox}
Addsimps, Delsimps: thm list -> unit
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Addsimps} $thms$] adds $thms$ to the current simpset
\item[\ttindexbold{Delsimps} $thms$] deletes $thms$ from the current simpset
\end{ttdescription}

Generally useful simplification rules $\Var{n}+0 = \Var{n}$ should be added
to the current simpset right after they have been proved. Those of a more
specific nature (e.g.\ the laws of de~Morgan, which alter the structure of a
formula) should only be added for specific proofs and deleted again
afterwards. Conversely, it may also happen that a generally useful rule needs
to be removed for a certain proof and is added again afterwards.  Well
designed simpsets need few temporary additions or deletions.

\begin{warn}
  The union of the ancestor simpsets (as described above) is not always a good
  simpset for the new theory.  If some ancestors have deleted simplification
  rules because they are no longer wanted, while others have left those rules
  in, then the union will contain the unwanted rules.  If the ancestor
  simpsets differ in other components (the subgoaler, solver, looper or rule
  preprocessor: see below), then you cannot be sure which version of that
  component will be inherited.  You may have to set the component explicitly
  for the new theory's simpset by an assignment of the form
 {\tt simpset := \dots}.
\end{warn}

\begin{warn}
  If you execute proofs interactively, or load them from an ML file without
  associated {\tt .thy} file, you must set the current simpset by calling
  \ttindex{set_current_thy}~{\tt"}$T${\tt"}, where $T$ is the name of the
  theory you want to work in. If you have just loaded $T$, this is not
  necessary.
\end{warn}


\section{Simplification sets}\index{simplification sets} 
The simplification tactics are controlled by {\bf simpsets}.  These consist
of five components: rewrite rules, congruence rules, the subgoaler, the
solver and the looper.  The simplifier should be set up with sensible
defaults so that most simplifier calls specify only rewrite rules.
Experienced users can exploit the other components to streamline proofs.

Logics like \HOL, which support a current simpset\index{simpset!current},
store its value in an ML reference variable usually called {\tt
  simpset}\index{simpset@{\tt simpset} ML value}. It can be accessed via
{\tt!simpset} and updated via {\tt simpset := \dots}.

\subsection{Rewrite rules}
\index{rewrite rules|(}
Rewrite rules are theorems expressing some form of equality:
\begin{eqnarray*}
  Suc(\Var{m}) + \Var{n} &=&      \Var{m} + Suc(\Var{n}) \\
  \Var{P}\conj\Var{P}    &\bimp&  \Var{P} \\
  \Var{A} \un \Var{B} &\equiv& \{x.x\in \Var{A} \disj x\in \Var{B}\}
\end{eqnarray*}
Conditional rewrites such as $\Var{m}<\Var{n} \Imp \Var{m}/\Var{n} =
0$ are permitted; the conditions can be arbitrary terms.

Internally, all rewrite rules are translated into meta-equalities, theorems
with conclusion $lhs \equiv rhs$.  Each simpset contains a function for
extracting equalities from arbitrary theorems.  For example,
$\neg(\Var{x}\in \{\})$ could be turned into $\Var{x}\in \{\} \equiv
False$.  This function can be installed using \ttindex{setmksimps} but only
the definer of a logic should need to do this; see \S\ref{sec:setmksimps}.
The function processes theorems added by \ttindex{addsimps} as well as
local assumptions.


\begin{warn}
The simplifier will accept all standard rewrite rules: those
where all unknowns are of base type.  Hence ${\Var{i}+(\Var{j}+\Var{k})} =
{(\Var{i}+\Var{j})+\Var{k}}$ is ok.

It will also deal gracefully with all rules whose left-hand sides are
so-called {\em higher-order patterns}~\cite{nipkow-patterns}. These are terms
in $\beta$-normal form (this will always be the case unless you have done
something strange) where each occurrence of an unknown is of the form
$\Var{F}(x@1,\dots,x@n)$, where the $x@i$ are distinct bound variables.
Hence $(\forall x.\Var{P}(x) \land \Var{Q}(x)) \bimp (\forall x.\Var{P}(x))
\land (\forall x.\Var{Q}(x))$ is also ok, in both directions.

In some rare cases the rewriter will even deal with quite general rules: for
example ${\Var{f}(\Var{x})\in range(\Var{f})} = True$ rewrites $g(a) \in
range(g)$ to $True$, but will fail to match $g(h(b)) \in range(\lambda
x.g(h(x)))$. However, you can replace the offending subterms (in our case
$\Var{f}(\Var{x})$, which is not a pattern) by adding new variables and
conditions: $\Var{y} = \Var{f}(\Var{x}) \Imp \Var{y}\in range(\Var{f})
= True$ is acceptable as a conditional rewrite rule since conditions can
be arbitrary terms.

There is no restriction on the form of the right-hand sides.
\end{warn}

\index{rewrite rules|)}

\subsection{*Congruence rules}\index{congruence rules}
Congruence rules are meta-equalities of the form
\[ \List{\dots} \Imp
   f(\Var{x@1},\ldots,\Var{x@n}) \equiv f(\Var{y@1},\ldots,\Var{y@n}).
\]
This governs the simplification of the arguments of~$f$.  For
example, some arguments can be simplified under additional assumptions:
\[ \List{\Var{P@1} \bimp \Var{Q@1};\; \Var{Q@1} \Imp \Var{P@2} \bimp \Var{Q@2}}
   \Imp (\Var{P@1} \imp \Var{P@2}) \equiv (\Var{Q@1} \imp \Var{Q@2})
\]
Given this rule, the simplifier assumes $Q@1$ and extracts rewrite rules
from it when simplifying~$P@2$.  Such local assumptions are effective for
rewriting formulae such as $x=0\imp y+x=y$.  The local assumptions are also
provided as theorems to the solver; see page~\pageref{sec:simp-solver} below.

Here are some more examples.  The congruence rule for bounded quantifiers
also supplies contextual information, this time about the bound variable:
\begin{eqnarray*}
  &&\List{\Var{A}=\Var{B};\; 
          \Forall x. x\in \Var{B} \Imp \Var{P}(x) = \Var{Q}(x)} \Imp{} \\
 &&\qquad\qquad
    (\forall x\in \Var{A}.\Var{P}(x)) = (\forall x\in \Var{B}.\Var{Q}(x))
\end{eqnarray*}
The congruence rule for conditional expressions can supply contextual
information for simplifying the arms:
\[ \List{\Var{p}=\Var{q};~ \Var{q} \Imp \Var{a}=\Var{c};~
         \neg\Var{q} \Imp \Var{b}=\Var{d}} \Imp
   if(\Var{p},\Var{a},\Var{b}) \equiv if(\Var{q},\Var{c},\Var{d})
\]
A congruence rule can also {\em prevent\/} simplification of some arguments.
Here is an alternative congruence rule for conditional expressions:
\[ \Var{p}=\Var{q} \Imp
   if(\Var{p},\Var{a},\Var{b}) \equiv if(\Var{q},\Var{a},\Var{b})
\]
Only the first argument is simplified; the others remain unchanged.
This can make simplification much faster, but may require an extra case split
to prove the goal.  

Congruence rules are added/deleted using 
\ttindexbold{addeqcongs}/\ttindex{deleqcongs}.  
Their conclusion must be a meta-equality, as in the examples above.  It is more
natural to derive the rules with object-logic equality, for example
\[ \List{\Var{P@1} \bimp \Var{Q@1};\; \Var{Q@1} \Imp \Var{P@2} \bimp \Var{Q@2}}
   \Imp (\Var{P@1} \imp \Var{P@2}) \bimp (\Var{Q@1} \imp \Var{Q@2}),
\]
Each object-logic should define operators called \ttindex{addcongs} and 
\ttindex{delcongs} that expects object-equalities and translates them into 
meta-equalities.

\subsection{*The subgoaler}
The subgoaler is the tactic used to solve subgoals arising out of
conditional rewrite rules or congruence rules.  The default should be
simplification itself.  Occasionally this strategy needs to be changed.  For
example, if the premise of a conditional rule is an instance of its
conclusion, as in $Suc(\Var{m}) < \Var{n} \Imp \Var{m} < \Var{n}$, the
default strategy could loop.

The subgoaler can be set explicitly with \ttindex{setsubgoaler}.  For
example, the subgoaler
\begin{ttbox}
fun subgoal_tac ss = assume_tac ORELSE'
                     resolve_tac (prems_of_ss ss) ORELSE' 
                     asm_simp_tac ss;
\end{ttbox}
tries to solve the subgoal by assumption or with one of the premises, calling
simplification only if that fails; here {\tt prems_of_ss} extracts the
current premises from a simpset.

\subsection{*The solver}\label{sec:simp-solver}
The solver is a pair of tactics that attempt to solve a subgoal after
simplification.  Typically it just proves trivial subgoals such as {\tt
  True} and $t=t$.  It could use sophisticated means such as {\tt
  fast_tac}, though that could make simplification expensive. 

Rewriting does not instantiate unknowns.  For example, rewriting cannot
prove $a\in \Var{A}$ since this requires instantiating~$\Var{A}$.  The
solver, however, is an arbitrary tactic and may instantiate unknowns as it
pleases.  This is the only way the simplifier can handle a conditional
rewrite rule whose condition contains extra variables. When a simplification 
tactic is to be combined with other provers, especially with the classical 
reasoner, it is important whether it can be considered safe or not. Therefore,
solver is split into a safe and anunsafe part. Both parts of the solver,
which are stored within the simpset, can be set independently using 
\ttindex{setSSolver} (with a safe tactic as argument) and \ttindex{setSolver}
(with an unsafe tactic) . Additional solvers, which are tried after the already
set solvers, may be added using \ttindex{addSSolver} and \ttindex{addSolver}. 

The standard simplification procedures uses solely the unsafe solver, which is
appropriate in most cases. But for special applications, where the simplification
process is not allowed to instantiate unknowns within the goal, the tactic 
\ttindexbold{safe_asm_full_simp_tac} is provided. Like {\tt asm_full_simp_tac}, 
it uses the unsafe solver for any embedded simplification steps 
(i.e. for solving subgoals created by the subgoaler), 
but for the current subgoal, it applies the safe solver.

\index{assumptions!in simplification}
The tactic is presented with the full goal, including the asssumptions.
Hence it can use those assumptions (say by calling {\tt assume_tac}) even
inside {\tt simp_tac}, which otherwise does not use assumptions.  The
solver is also supplied a list of theorems, namely assumptions that hold in
the local context.

The subgoaler is also used to solve the premises of congruence rules, which
are usually of the form $s = \Var{x}$, where $s$ needs to be simplified and
$\Var{x}$ needs to be instantiated with the result.  Hence the subgoaler
should call the simplifier at some point.  The simplifier will then call the
solver, which must therefore be prepared to solve goals of the form $t =
\Var{x}$, usually by reflexivity.  In particular, reflexivity should be
tried before any of the fancy tactics like {\tt fast_tac}.  

It may even happen that, due to simplification, the subgoal is no longer an
equality.  For example $False \bimp \Var{Q}$ could be rewritten to
$\neg\Var{Q}$.  To cover this case, the solver could try resolving with the
theorem $\neg False$.

\begin{warn}
  If the simplifier aborts with the message {\tt Failed congruence proof!},
  then the subgoaler or solver has failed to prove a premise of a
  congruence rule.  This should never occur; it indicates that some
  congruence rule, or possibly the subgoaler or solver, is faulty.
\end{warn}


\subsection{*The looper}
The looper is a tactic that is applied after simplification, in case the
solver failed to solve the simplified goal.  If the looper succeeds, the
simplification process is started all over again.  Each of the subgoals
generated by the looper is attacked in turn, in reverse order.  A
typical looper is case splitting: the expansion of a conditional.  Another
possibility is to apply an elimination rule on the assumptions.  More
adventurous loopers could start an induction.  The looper is set with 
\ttindex{setloop}. Additional loopers, which are tried after the already set
looper, may be added with \ttindex{addloop}.


\begin{figure}
\index{*simpset ML type}
\index{*empty_ss}
\index{*rep_ss}
\index{*prems_of_ss}
\index{*setloop}
\index{*addloop}
\index{*setSSolver}
\index{*addSSolver}
\index{*setSolver}
\index{*addSolver}
\index{*setsubgoaler}
\index{*setmksimps}
\index{*addsimps}
\index{*delsimps}
\index{*addeqcongs}
\index{*deleqcongs}
\index{*merge_ss}
\index{*simpset}
\index{*Addsimps}
\index{*Delsimps}
\index{*simp_tac}
\index{*asm_simp_tac}
\index{*full_simp_tac}
\index{*asm_full_simp_tac}
\index{*safe_asm_full_simp_tac}
\index{*Simp_tac}
\index{*Asm_simp_tac}
\index{*Full_simp_tac}
\index{*Asm_full_simp_tac}

\begin{ttbox}
infix 4 setsubgoaler setloop addloop setSSolver addSSolver setSolver addSolver 
        setmksimps addsimps delsimps addeqcongs deleqcongs;

signature SIMPLIFIER =
sig
  type simproc
  val mk_simproc: string -> cterm list -> (Sign.sg -> term -> thm option) -> simproc
  val name_of_simproc: simproc -> string
  val conv_prover: (term * term -> term) -> thm -> (thm -> thm)
    -> tactic -> (int -> tactic) -> Sign.sg -> term -> term -> thm      (* FIXME move?, rename? *)
  type simpset
  val empty_ss: simpset
  val rep_ss: simpset -> {simps: thm list, procs: string list, congs: thm list,
                          subgoal_tac:        simpset -> int -> tactic,
                          loop_tac:                      int -> tactic,
                                 finish_tac: thm list -> int -> tactic,
                          unsafe_finish_tac: thm list -> int -> tactic}
  val setsubgoaler: simpset *  (simpset -> int -> tactic) -> simpset
  val setloop:      simpset *             (int -> tactic) -> simpset
  val addloop:      simpset *             (int -> tactic) -> simpset
  val setSSolver:   simpset * (thm list -> int -> tactic) -> simpset
  val addSSolver:   simpset * (thm list -> int -> tactic) -> simpset
  val setSolver:    simpset * (thm list -> int -> tactic) -> simpset
  val addSolver:    simpset * (thm list -> int -> tactic) -> simpset
  val setmksimps:  simpset * (thm -> thm list) -> simpset
  val addsimps:    simpset * thm list -> simpset
  val delsimps:    simpset * thm list -> simpset
  val addeqcongs:  simpset * thm list -> simpset
  val deleqcongs:  simpset * thm list -> simpset
  val merge_ss:    simpset * simpset -> simpset
  val prems_of_ss: simpset -> thm list
  val simpset:     simpset ref
  val Addsimps: thm list -> unit
  val Delsimps: thm list -> unit
  val               simp_tac: simpset -> int -> tactic
  val           asm_simp_tac: simpset -> int -> tactic
  val          full_simp_tac: simpset -> int -> tactic
  val      asm_full_simp_tac: simpset -> int -> tactic
  val safe_asm_full_simp_tac: simpset -> int -> tactic
  val               Simp_tac:            int -> tactic
  val           Asm_simp_tac:            int -> tactic
  val          Full_simp_tac:            int -> tactic
  val      Asm_full_simp_tac:            int -> tactic
end;
\end{ttbox}
\caption{The simplifier primitives} \label{SIMPLIFIER}
\end{figure}


\section{The simplification tactics}
\label{simp-tactics}
\index{simplification!tactics}
\index{tactics!simplification}

The actual simplification work is performed by the following basic tactics:
\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
\ttindexbold{full_simp_tac} and \ttindexbold{asm_full_simp_tac}. They work
exactly like their namesakes in \S\ref{sec:simp-for-dummies}, except that
they are explicitly supplied with a simpset. This is because the ones in
\S\ref{sec:simp-for-dummies} are defined in terms of the ones above, e.g.
\begin{ttbox}
fun Simp_tac i = simp_tac (!simpset) i;
\end{ttbox}
where \ttindex{!simpset} is the current simpset\index{simpset!current}.

The rewriting strategy of all four tactics is strictly bottom up, except for
congruence rules, which are applied while descending into a term.  Conditions
in conditional rewrite rules are solved recursively before the rewrite rule
is applied.

The infix operations \ttindex{addsimps}/\ttindex{delsimps} add/delete rewrite
rules to/from a simpset. They are used to implement \ttindex{Addsimps} and
\ttindex{Delsimps}, e.g.
\begin{ttbox}
fun Addsimps thms = (simpset := (!simpset addsimps thms));
\end{ttbox}
and can also be used directly, even in the presence of a current simpset. For
example
\begin{ttbox}
Addsimps \(thms\);
by (Simp_tac \(i\));
Delsimps \(thms\);
\end{ttbox}
can be compressed into
\begin{ttbox}
by (simp_tac (!simpset addsimps \(thms\)) \(i\));
\end{ttbox}

The simpset associated with a particular theory can be retrieved via the name
of the theory using the function
\begin{ttbox}
simpset_of: string -> simpset
\end{ttbox}\index{*simpset_of}

To remind yourself of what is in a simpset, use the function \verb$rep_ss$ to
return its simplification and congruence rules.

\section{Examples using the simplifier}
\index{examples!of simplification}
Assume we are working within {\tt FOL} and that
\begin{ttdescription}
\item[Nat.thy] 
  is a theory including the constants $0$, $Suc$ and $+$,
\item[add_0]
  is the rewrite rule $0+\Var{n} = \Var{n}$,
\item[add_Suc]
  is the rewrite rule $Suc(\Var{m})+\Var{n} = Suc(\Var{m}+\Var{n})$,
\item[induct]
  is the induction rule $\List{\Var{P}(0);\; \Forall x. \Var{P}(x)\Imp
    \Var{P}(Suc(x))} \Imp \Var{P}(\Var{n})$.
\item[FOL_ss]
  is a basic simpset for {\tt FOL}.%
\footnote{These examples reside on the file {\tt FOL/ex/Nat.ML}.} 
\end{ttdescription}

We create a simpset for natural numbers by extending~{\tt FOL_ss}:
\begin{ttbox}
val add_ss = FOL_ss addsimps [add_0, add_Suc];
\end{ttbox}

\subsection{A trivial example}
Proofs by induction typically involve simplification.  Here is a proof
that~0 is a right identity:
\begin{ttbox}
goal Nat.thy "m+0 = m";
{\out Level 0}
{\out m + 0 = m}
{\out  1. m + 0 = m}
\end{ttbox}
The first step is to perform induction on the variable~$m$.  This returns a
base case and inductive step as two subgoals:
\begin{ttbox}
by (res_inst_tac [("n","m")] induct 1);
{\out Level 1}
{\out m + 0 = m}
{\out  1. 0 + 0 = 0}
{\out  2. !!x. x + 0 = x ==> Suc(x) + 0 = Suc(x)}
\end{ttbox}
Simplification solves the first subgoal trivially:
\begin{ttbox}
by (simp_tac add_ss 1);
{\out Level 2}
{\out m + 0 = m}
{\out  1. !!x. x + 0 = x ==> Suc(x) + 0 = Suc(x)}
\end{ttbox}
The remaining subgoal requires \ttindex{asm_simp_tac} in order to use the
induction hypothesis as a rewrite rule:
\begin{ttbox}
by (asm_simp_tac add_ss 1);
{\out Level 3}
{\out m + 0 = m}
{\out No subgoals!}
\end{ttbox}

\subsection{An example of tracing}
Let us prove a similar result involving more complex terms.  The two
equations together can be used to prove that addition is commutative.
\begin{ttbox}
goal Nat.thy "m+Suc(n) = Suc(m+n)";
{\out Level 0}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. m + Suc(n) = Suc(m + n)}
\end{ttbox}
We again perform induction on~$m$ and get two subgoals:
\begin{ttbox}
by (res_inst_tac [("n","m")] induct 1);
{\out Level 1}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. 0 + Suc(n) = Suc(0 + n)}
{\out  2. !!x. x + Suc(n) = Suc(x + n) ==>}
{\out          Suc(x) + Suc(n) = Suc(Suc(x) + n)}
\end{ttbox}
Simplification solves the first subgoal, this time rewriting two
occurrences of~0:
\begin{ttbox}
by (simp_tac add_ss 1);
{\out Level 2}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. !!x. x + Suc(n) = Suc(x + n) ==>}
{\out          Suc(x) + Suc(n) = Suc(Suc(x) + n)}
\end{ttbox}
Switching tracing on illustrates how the simplifier solves the remaining
subgoal: 
\begin{ttbox}
trace_simp := true;
by (asm_simp_tac add_ss 1);
\ttbreak
{\out Rewriting:}
{\out Suc(x) + Suc(n) == Suc(x + Suc(n))}
\ttbreak
{\out Rewriting:}
{\out x + Suc(n) == Suc(x + n)}
\ttbreak
{\out Rewriting:}
{\out Suc(x) + n == Suc(x + n)}
\ttbreak
{\out Rewriting:}
{\out Suc(Suc(x + n)) = Suc(Suc(x + n)) == True}
\ttbreak
{\out Level 3}
{\out m + Suc(n) = Suc(m + n)}
{\out No subgoals!}
\end{ttbox}
Many variations are possible.  At Level~1 (in either example) we could have
solved both subgoals at once using the tactical \ttindex{ALLGOALS}:
\begin{ttbox}
by (ALLGOALS (asm_simp_tac add_ss));
{\out Level 2}
{\out m + Suc(n) = Suc(m + n)}
{\out No subgoals!}
\end{ttbox}

\subsection{Free variables and simplification}
Here is a conjecture to be proved for an arbitrary function~$f$ satisfying
the law $f(Suc(\Var{n})) = Suc(f(\Var{n}))$:
\begin{ttbox}
val [prem] = goal Nat.thy
    "(!!n. f(Suc(n)) = Suc(f(n))) ==> f(i+j) = i+f(j)";
{\out Level 0}
{\out f(i + j) = i + f(j)}
{\out  1. f(i + j) = i + f(j)}
\ttbreak
{\out val prem = "f(Suc(?n)) = Suc(f(?n))}
{\out             [!!n. f(Suc(n)) = Suc(f(n))]" : thm}
\end{ttbox}
In the theorem~{\tt prem}, note that $f$ is a free variable while
$\Var{n}$ is a schematic variable.
\begin{ttbox}
by (res_inst_tac [("n","i")] induct 1);
{\out Level 1}
{\out f(i + j) = i + f(j)}
{\out  1. f(0 + j) = 0 + f(j)}
{\out  2. !!x. f(x + j) = x + f(j) ==> f(Suc(x) + j) = Suc(x) + f(j)}
\end{ttbox}
We simplify each subgoal in turn.  The first one is trivial:
\begin{ttbox}
by (simp_tac add_ss 1);
{\out Level 2}
{\out f(i + j) = i + f(j)}
{\out  1. !!x. f(x + j) = x + f(j) ==> f(Suc(x) + j) = Suc(x) + f(j)}
\end{ttbox}
The remaining subgoal requires rewriting by the premise, so we add it to
{\tt add_ss}:%
\footnote{The previous simplifier required congruence rules for function
  variables like~$f$ in order to simplify their arguments.  It was more
  general than the current simplifier, but harder to use and slower.  The
  present simplifier can be given congruence rules to realize non-standard
  simplification of a function's arguments, but this is seldom necessary.}
\begin{ttbox}
by (asm_simp_tac (add_ss addsimps [prem]) 1);
{\out Level 3}
{\out f(i + j) = i + f(j)}
{\out No subgoals!}
\end{ttbox}

\subsection{Reordering assumptions}
\label{sec:reordering-asms}
\index{assumptions!reordering}

As mentioned above, \ttindex{asm_full_simp_tac} may require the assumptions
to be permuted to be more effective. Given the subgoal
\begin{ttbox}
{\out 1. [| P(f(a)); Q; f(a) = t; R |] ==> S}
\end{ttbox}
we can rotate the assumptions two positions to the right
\begin{ttbox}
by (rotate_tac ~2 1);
\end{ttbox}
to obtain
\begin{ttbox}
{\out 1. [| f(a) = t; R; P(f(a)); Q |] ==> S}
\end{ttbox}
which enables \verb$asm_full_simp_tac$ to simplify \verb$P(f(a))$ to
\verb$P(t)$.

Since rotation alone cannot produce arbitrary permutations, you can also pick
out a particular assumption which needs to be rewritten and move it the the
right end of the assumptions. In the above case rotation can be replaced by
\begin{ttbox}
by (dres_inst_tac [("psi","P(f(a))")] asm_rl 1);
\end{ttbox}
which is more directed and leads to
\begin{ttbox}
{\out 1. [| Q; f(a) = t; R; P(f(a)) |] ==> S}
\end{ttbox}

Note that reordering assumptions usually leads to brittle proofs and should
therefore be avoided. Future versions of \verb$asm_full_simp_tac$ may remove
the need for such manipulations.

\section{Permutative rewrite rules}
\index{rewrite rules!permutative|(}

A rewrite rule is {\bf permutative} if the left-hand side and right-hand
side are the same up to renaming of variables.  The most common permutative
rule is commutativity: $x+y = y+x$.  Other examples include $(x-y)-z =
(x-z)-y$ in arithmetic and $insert(x,insert(y,A)) = insert(y,insert(x,A))$
for sets.  Such rules are common enough to merit special attention.

Because ordinary rewriting loops given such rules, the simplifier employs a
special strategy, called {\bf ordered rewriting}\index{rewriting!ordered}.
There is a built-in lexicographic ordering on terms.  A permutative rewrite
rule is applied only if it decreases the given term with respect to this
ordering.  For example, commutativity rewrites~$b+a$ to $a+b$, but then
stops because $a+b$ is strictly less than $b+a$.  The Boyer-Moore theorem
prover~\cite{bm88book} also employs ordered rewriting.

Permutative rewrite rules are added to simpsets just like other rewrite
rules; the simplifier recognizes their special status automatically.  They
are most effective in the case of associative-commutative operators.
(Associativity by itself is not permutative.)  When dealing with an
AC-operator~$f$, keep the following points in mind:
\begin{itemize}\index{associative-commutative operators}
\item The associative law must always be oriented from left to right, namely
  $f(f(x,y),z) = f(x,f(y,z))$.  The opposite orientation, if used with
  commutativity, leads to looping!  Future versions of Isabelle may remove
  this restriction.

\item To complete your set of rewrite rules, you must add not just
  associativity~(A) and commutativity~(C) but also a derived rule, {\bf
    left-commutativity} (LC): $f(x,f(y,z)) = f(y,f(x,z))$.
\end{itemize}
Ordered rewriting with the combination of A, C, and~LC sorts a term
lexicographically:
\[\def\maps#1{\stackrel{#1}{\longmapsto}}
 (b+c)+a \maps{A} b+(c+a) \maps{C} b+(a+c) \maps{LC} a+(b+c) \]
Martin and Nipkow~\cite{martin-nipkow} discuss the theory and give many
examples; other algebraic structures are amenable to ordered rewriting,
such as boolean rings.

\subsection{Example: sums of integers}
This example is set in Isabelle's higher-order logic.  Theory
\thydx{Arith} contains the theory of arithmetic.  The simpset {\tt
  arith_ss} contains many arithmetic laws including distributivity
of~$\times$ over~$+$, while {\tt add_ac} is a list consisting of the A, C
and LC laws for~$+$.  Let us prove the theorem 
\[ \sum@{i=1}^n i = n\times(n+1)/2. \]
%
A functional~{\tt sum} represents the summation operator under the
interpretation ${\tt sum}(f,n+1) = \sum@{i=0}^n f(i)$.  We extend {\tt
  Arith} using a theory file:
\begin{ttbox}
NatSum = Arith +
consts sum     :: [nat=>nat, nat] => nat
rules  sum_0      "sum(f,0) = 0"
       sum_Suc    "sum(f,Suc(n)) = f(n) + sum(f,n)"
end
\end{ttbox}
After declaring {\tt open NatSum}, we make the required simpset by adding
the AC-rules for~$+$ and the axioms for~{\tt sum}:
\begin{ttbox}
val natsum_ss = arith_ss addsimps ([sum_0,sum_Suc] \at add_ac);
{\out val natsum_ss = SS \{\ldots\} : simpset}
\end{ttbox}
Our desired theorem now reads ${\tt sum}(\lambda i.i,n+1) =
n\times(n+1)/2$.  The Isabelle goal has both sides multiplied by~$2$:
\begin{ttbox}
goal NatSum.thy "Suc(Suc(0))*sum(\%i.i,Suc(n)) = n*Suc(n)";
{\out Level 0}
{\out Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)}
{\out  1. Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)}
\end{ttbox}
Induction should not be applied until the goal is in the simplest form:
\begin{ttbox}
by (simp_tac natsum_ss 1);
{\out Level 1}
{\out Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)}
{\out  1. n + (n + (sum(\%i. i, n) + sum(\%i. i, n))) = n + n * n}
\end{ttbox}
Ordered rewriting has sorted the terms in the left-hand side.
The subgoal is now ready for induction:
\begin{ttbox}
by (nat_ind_tac "n" 1);
{\out Level 2}
{\out Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)}
{\out  1. 0 + (0 + (sum(\%i. i, 0) + sum(\%i. i, 0))) = 0 + 0 * 0}
\ttbreak
{\out  2. !!n1. n1 + (n1 + (sum(\%i. i, n1) + sum(\%i. i, n1))) =}
{\out           n1 + n1 * n1 ==>}
{\out           Suc(n1) +}
{\out           (Suc(n1) + (sum(\%i. i, Suc(n1)) + sum(\%i. i, Suc(n1)))) =}
{\out           Suc(n1) + Suc(n1) * Suc(n1)}
\end{ttbox}
Simplification proves both subgoals immediately:\index{*ALLGOALS}
\begin{ttbox}
by (ALLGOALS (asm_simp_tac natsum_ss));
{\out Level 3}
{\out Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)}
{\out No subgoals!}
\end{ttbox}
If we had omitted {\tt add_ac} from the simpset, simplification would have
failed to prove the induction step:
\begin{ttbox}
Suc(Suc(0)) * sum(\%i. i, Suc(n)) = n * Suc(n)
 1. !!n1. n1 + (n1 + (sum(\%i. i, n1) + sum(\%i. i, n1))) =
          n1 + n1 * n1 ==>
          n1 + (n1 + (n1 + sum(\%i. i, n1) + (n1 + sum(\%i. i, n1)))) =
          n1 + (n1 + (n1 + n1 * n1))
\end{ttbox}
Ordered rewriting proves this by sorting the left-hand side.  Proving
arithmetic theorems without ordered rewriting requires explicit use of
commutativity.  This is tedious; try it and see!

Ordered rewriting is equally successful in proving
$\sum@{i=1}^n i^3 = n^2\times(n+1)^2/4$.


\subsection{Re-orienting equalities}
Ordered rewriting with the derived rule {\tt symmetry} can reverse equality
signs:
\begin{ttbox}
val symmetry = prove_goal HOL.thy "(x=y) = (y=x)"
                 (fn _ => [fast_tac HOL_cs 1]);
\end{ttbox}
This is frequently useful.  Assumptions of the form $s=t$, where $t$ occurs
in the conclusion but not~$s$, can often be brought into the right form.
For example, ordered rewriting with {\tt symmetry} can prove the goal
\[ f(a)=b \conj f(a)=c \imp b=c. \]
Here {\tt symmetry} reverses both $f(a)=b$ and $f(a)=c$
because $f(a)$ is lexicographically greater than $b$ and~$c$.  These
re-oriented equations, as rewrite rules, replace $b$ and~$c$ in the
conclusion by~$f(a)$. 

Another example is the goal $\neg(t=u) \imp \neg(u=t)$.
The differing orientations make this appear difficult to prove.  Ordered
rewriting with {\tt symmetry} makes the equalities agree.  (Without
knowing more about~$t$ and~$u$ we cannot say whether they both go to $t=u$
or~$u=t$.)  Then the simplifier can prove the goal outright.

\index{rewrite rules!permutative|)}


\section{*Setting up the simplifier}\label{sec:setting-up-simp}
\index{simplification!setting up}

Setting up the simplifier for new logics is complicated.  This section
describes how the simplifier is installed for intuitionistic first-order
logic; the code is largely taken from {\tt FOL/simpdata.ML}.

The simplifier and the case splitting tactic, which reside on separate
files, are not part of Pure Isabelle.  They must be loaded explicitly:
\begin{ttbox}
use "../Provers/simplifier.ML";
use "../Provers/splitter.ML";
\end{ttbox}

Simplification works by reducing various object-equalities to
meta-equality.  It requires rules stating that equal terms and equivalent
formulae are also equal at the meta-level.  The rule declaration part of
the file {\tt FOL/ifol.thy} contains the two lines
\begin{ttbox}\index{*eq_reflection theorem}\index{*iff_reflection theorem}
eq_reflection   "(x=y)   ==> (x==y)"
iff_reflection  "(P<->Q) ==> (P==Q)"
\end{ttbox}
Of course, you should only assert such rules if they are true for your
particular logic.  In Constructive Type Theory, equality is a ternary
relation of the form $a=b\in A$; the type~$A$ determines the meaning of the
equality essentially as a partial equivalence relation.  The present
simplifier cannot be used.  Rewriting in {\tt CTT} uses another simplifier,
which resides in the file {\tt typedsimp.ML} and is not documented.  Even
this does not work for later variants of Constructive Type Theory that use
intensional equality~\cite{nordstrom90}.


\subsection{A collection of standard rewrite rules}
The file begins by proving lots of standard rewrite rules about the logical
connectives.  These include cancellation and associative laws.  To prove
them easily, it defines a function that echoes the desired law and then
supplies it the theorem prover for intuitionistic \FOL:
\begin{ttbox}
fun int_prove_fun s = 
 (writeln s;  
  prove_goal IFOL.thy s
   (fn prems => [ (cut_facts_tac prems 1), 
                  (Int.fast_tac 1) ]));
\end{ttbox}
The following rewrite rules about conjunction are a selection of those
proved on {\tt FOL/simpdata.ML}.  Later, these will be supplied to the
standard simpset.
\begin{ttbox}
val conj_rews = map int_prove_fun
 ["P & True <-> P",      "True & P <-> P",
  "P & False <-> False", "False & P <-> False",
  "P & P <-> P",
  "P & ~P <-> False",    "~P & P <-> False",
  "(P & Q) & R <-> P & (Q & R)"];
\end{ttbox}
The file also proves some distributive laws.  As they can cause exponential
blowup, they will not be included in the standard simpset.  Instead they
are merely bound to an \ML{} identifier, for user reference.
\begin{ttbox}
val distrib_rews  = map int_prove_fun
 ["P & (Q | R) <-> P&Q | P&R", 
  "(Q | R) & P <-> Q&P | R&P",
  "(P | Q --> R) <-> (P --> R) & (Q --> R)"];
\end{ttbox}


\subsection{Functions for preprocessing the rewrite rules}
\label{sec:setmksimps}
%
The next step is to define the function for preprocessing rewrite rules.
This will be installed by calling {\tt setmksimps} below.  Preprocessing
occurs whenever rewrite rules are added, whether by user command or
automatically.  Preprocessing involves extracting atomic rewrites at the
object-level, then reflecting them to the meta-level.

To start, the function {\tt gen_all} strips any meta-level
quantifiers from the front of the given theorem.  Usually there are none
anyway.
\begin{ttbox}
fun gen_all th = forall_elim_vars (#maxidx(rep_thm th)+1) th;
\end{ttbox}
The function {\tt atomize} analyses a theorem in order to extract
atomic rewrite rules.  The head of all the patterns, matched by the
wildcard~{\tt _}, is the coercion function {\tt Trueprop}.
\begin{ttbox}
fun atomize th = case concl_of th of 
    _ $ (Const("op &",_) $ _ $ _)   => atomize(th RS conjunct1) \at
                                       atomize(th RS conjunct2)
  | _ $ (Const("op -->",_) $ _ $ _) => atomize(th RS mp)
  | _ $ (Const("All",_) $ _)        => atomize(th RS spec)
  | _ $ (Const("True",_))           => []
  | _ $ (Const("False",_))          => []
  | _                               => [th];
\end{ttbox}
There are several cases, depending upon the form of the conclusion:
\begin{itemize}
\item Conjunction: extract rewrites from both conjuncts.

\item Implication: convert $P\imp Q$ to the meta-implication $P\Imp Q$ and
  extract rewrites from~$Q$; these will be conditional rewrites with the
  condition~$P$.

\item Universal quantification: remove the quantifier, replacing the bound
  variable by a schematic variable, and extract rewrites from the body.

\item {\tt True} and {\tt False} contain no useful rewrites.

\item Anything else: return the theorem in a singleton list.
\end{itemize}
The resulting theorems are not literally atomic --- they could be
disjunctive, for example --- but are broken down as much as possible.  See
the file {\tt ZF/simpdata.ML} for a sophisticated translation of
set-theoretic formulae into rewrite rules.

The simplified rewrites must now be converted into meta-equalities.  The
rule {\tt eq_reflection} converts equality rewrites, while {\tt
  iff_reflection} converts if-and-only-if rewrites.  The latter possibility
can arise in two other ways: the negative theorem~$\neg P$ is converted to
$P\equiv{\tt False}$, and any other theorem~$P$ is converted to
$P\equiv{\tt True}$.  The rules {\tt iff_reflection_F} and {\tt
  iff_reflection_T} accomplish this conversion.
\begin{ttbox}
val P_iff_F = int_prove_fun "~P ==> (P <-> False)";
val iff_reflection_F = P_iff_F RS iff_reflection;
\ttbreak
val P_iff_T = int_prove_fun "P ==> (P <-> True)";
val iff_reflection_T = P_iff_T RS iff_reflection;
\end{ttbox}
The function {\tt mk_meta_eq} converts a theorem to a meta-equality
using the case analysis described above.
\begin{ttbox}
fun mk_meta_eq th = case concl_of th of
    _ $ (Const("op =",_)$_$_)   => th RS eq_reflection
  | _ $ (Const("op <->",_)$_$_) => th RS iff_reflection
  | _ $ (Const("Not",_)$_)      => th RS iff_reflection_F
  | _                           => th RS iff_reflection_T;
\end{ttbox}
The three functions {\tt gen_all}, {\tt atomize} and {\tt mk_meta_eq} will
be composed together and supplied below to {\tt setmksimps}.


\subsection{Making the initial simpset}
It is time to assemble these items.  We open module {\tt Simplifier} to
gain access to its components.  We define the infix operator
\ttindexbold{addcongs} to insert congruence rules; given a list of theorems,
it converts their conclusions into meta-equalities and passes them to
\ttindex{addeqcongs}.
\begin{ttbox}
open Simplifier;
\ttbreak
infix addcongs;
fun ss addcongs congs =
    ss addeqcongs (congs RL [eq_reflection,iff_reflection]);
\end{ttbox}
The list {\tt IFOL_rews} contains the default rewrite rules for first-order
logic.  The first of these is the reflexive law expressed as the
equivalence $(a=a)\bimp{\tt True}$; the rewrite rule $a=a$ is clearly useless.
\begin{ttbox}
val IFOL_rews =
   [refl RS P_iff_T] \at conj_rews \at disj_rews \at not_rews \at 
    imp_rews \at iff_rews \at quant_rews;
\end{ttbox}
The list {\tt triv_rls} contains trivial theorems for the solver.  Any
subgoal that is simplified to one of these will be removed.
\begin{ttbox}
val notFalseI = int_prove_fun "~False";
val triv_rls = [TrueI,refl,iff_refl,notFalseI];
\end{ttbox}
%
The basic simpset for intuitionistic \FOL{} starts with \ttindex{empty_ss}.
It preprocess rewrites using {\tt gen_all}, {\tt atomize} and {\tt
  mk_meta_eq}.  It solves simplified subgoals using {\tt triv_rls} and
assumptions, and by detecting contradictions.  
It uses \ttindex{asm_simp_tac} to tackle subgoals of
conditional rewrites.  It takes {\tt IFOL_rews} as rewrite rules.  
Other simpsets built from {\tt IFOL_ss} will inherit these items.
In particular, {\tt FOL_ss} extends {\tt IFOL_ss} with classical rewrite
rules such as $\neg\neg P\bimp P$.
\index{*setmksimps}\index{*setSSolver}\index{*setSolver}\index{*setsubgoaler}
\index{*addsimps}\index{*addcongs}
\begin{ttbox}
fun unsafe_solver prems = FIRST'[resolve_tac (triv_rls@prems),
                                 atac, etac FalseE];
fun   safe_solver prems = FIRST'[match_tac (triv_rls@prems),
                                 eq_assume_tac, ematch_tac [FalseE]];
val IFOL_ss = empty_ss setsubgoaler asm_simp_tac
                       setSSolver   safe_solver
                       setSolver  unsafe_solver
                       setmksimps (map mk_meta_eq o atomize o gen_all)
                       addsimps IFOL_simps
                       addcongs [imp_cong];
\end{ttbox}
This simpset takes {\tt imp_cong} as a congruence rule in order to use
contextual information to simplify the conclusions of implications:
\[ \List{\Var{P}\bimp\Var{P'};\; \Var{P'} \Imp \Var{Q}\bimp\Var{Q'}} \Imp
   (\Var{P}\imp\Var{Q}) \bimp (\Var{P'}\imp\Var{Q'})
\]
By adding the congruence rule {\tt conj_cong}, we could obtain a similar
effect for conjunctions.


\subsection{Case splitting}
To set up case splitting, we must prove the theorem below and pass it to
\ttindexbold{mk_case_split_tac}.  The tactic \ttindexbold{split_tac} uses
{\tt mk_meta_eq}, defined above, to convert the splitting rules to
meta-equalities.
\begin{ttbox}
val meta_iffD = 
    prove_goal FOL.thy "[| P==Q; Q |] ==> P"
        (fn [prem1,prem2] => [rewtac prem1, rtac prem2 1])
\ttbreak
fun split_tac splits =
    mk_case_split_tac meta_iffD (map mk_meta_eq splits);
\end{ttbox}
%
The splitter replaces applications of a given function; the right-hand side
of the replacement can be anything.  For example, here is a splitting rule
for conditional expressions:
\[ \Var{P}(if(\Var{Q},\Var{x},\Var{y})) \bimp (\Var{Q} \imp \Var{P}(\Var{x}))
\conj (\lnot\Var{Q} \imp \Var{P}(\Var{y})) 
\] 
Another example is the elimination operator (which happens to be
called~$split$) for Cartesian products:
\[ \Var{P}(split(\Var{f},\Var{p})) \bimp (\forall a~b. \Var{p} =
\langle a,b\rangle \imp \Var{P}(\Var{f}(a,b))) 
\] 
Case splits should be allowed only when necessary; they are expensive
and hard to control.  Here is a typical example of use, where {\tt
  expand_if} is the first rule above:
\begin{ttbox}
by (simp_tac (prop_rec_ss setloop (split_tac [expand_if])) 1);
\end{ttbox}



\index{simplification|)}