doc-src/Ref/simplifier.tex
author wenzelm
Thu, 11 Mar 1999 12:34:10 +0100
changeset 6348 fdcbeaddd5fc
parent 5776 3bcc29d0c783
child 6569 66c941ea1f01
permissions -rw-r--r--
include 'README'; tuned;

%% $Id$
\chapter{Simplification}
\label{chap:simplification}
\index{simplification|(}

This chapter describes Isabelle's generic simplification package.  It
performs conditional and unconditional rewriting and uses contextual
information (`local assumptions').  It provides several general hooks,
which can provide automatic case splits during rewriting, for example.
The simplifier is already set up for many of Isabelle's logics: \FOL,
\ZF, \HOL, \HOLCF.

The first section is a quick introduction to the simplifier that
should be sufficient to get started.  The later sections explain more
advanced features.


\section{Simplification for dummies}
\label{sec:simp-for-dummies}

Basic use of the simplifier is particularly easy because each theory
is equipped with sensible default information controlling the rewrite
process --- namely the implicit {\em current
  simpset}\index{simpset!current}.  A suite of simple commands is
provided that refer to the implicit simpset of the current theory
context.

\begin{warn}
  Make sure that you are working within the correct theory context.
  Executing proofs interactively, or loading them from ML files
  without associated theories may require setting the current theory
  manually via the \ttindex{context} command.
\end{warn}

\subsection{Simplification tactics} \label{sec:simp-for-dummies-tacs}
\begin{ttbox}
Simp_tac          : int -> tactic
Asm_simp_tac      : int -> tactic
Full_simp_tac     : int -> tactic
Asm_full_simp_tac : int -> tactic
trace_simp        : bool ref \hfill{\bf initially false}
\end{ttbox}

\begin{ttdescription}
\item[\ttindexbold{Simp_tac} $i$] simplifies subgoal~$i$ using the
  current simpset.  It may solve the subgoal completely if it has
  become trivial, using the simpset's solver tactic.
  
\item[\ttindexbold{Asm_simp_tac}]\index{assumptions!in simplification}
  is like \verb$Simp_tac$, but extracts additional rewrite rules from
  the local assumptions.
  
\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
  simplifies the assumptions (without using the assumptions to
  simplify each other or the actual goal).
  
\item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
  but also simplifies the assumptions. In particular, assumptions can
  simplify each other.
\footnote{\texttt{Asm_full_simp_tac} used to process the assumptions from
  left to right. For backwards compatibilty reasons only there is now
  \texttt{Asm_lr_simp_tac} that behaves like the old \texttt{Asm_full_simp_tac}.}
\item[set \ttindexbold{trace_simp};] makes the simplifier output
  internal operations.  This includes rewrite steps, but also
  bookkeeping like modifications of the simpset.
\end{ttdescription}

\medskip

As an example, consider the theory of arithmetic in \HOL.  The (rather
trivial) goal $0 + (x + 0) = x + 0 + 0$ can be solved by a single call
of \texttt{Simp_tac} as follows:
\begin{ttbox}
context Arith.thy;
Goal "0 + (x + 0) = x + 0 + 0";
{\out  1. 0 + (x + 0) = x + 0 + 0}
by (Simp_tac 1);
{\out Level 1}
{\out 0 + (x + 0) = x + 0 + 0}
{\out No subgoals!}
\end{ttbox}

The simplifier uses the current simpset of \texttt{Arith.thy}, which
contains suitable theorems like $\Var{n}+0 = \Var{n}$ and $0+\Var{n} =
\Var{n}$.

\medskip In many cases, assumptions of a subgoal are also needed in
the simplification process.  For example, \texttt{x = 0 ==> x + x = 0}
is solved by \texttt{Asm_simp_tac} as follows:
\begin{ttbox}
{\out  1. x = 0 ==> x + x = 0}
by (Asm_simp_tac 1);
\end{ttbox}

\medskip \texttt{Asm_full_simp_tac} is the most powerful of this quartet
of tactics but may also loop where some of the others terminate.  For
example,
\begin{ttbox}
{\out  1. ALL x. f x = g (f (g x)) ==> f 0 = f 0 + 0}
\end{ttbox}
is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and {\tt
  Asm_simp_tac} loop because the rewrite rule $f\,\Var{x} =
g\,(f\,(g\,\Var{x}))$ extracted from the assumption does not
terminate.  Isabelle notices certain simple forms of nontermination,
but not this one. Because assumptions may simplify each other, there can be
very subtle cases of nontermination.
 
\begin{warn}
  \verb$Asm_full_simp_tac$ may miss opportunities to simplify an assumption
  $A@i$ using an assumption $A@j$ in case $A@j$ is to the right of $A@i$.
  For example, given the subgoal
\begin{ttbox}
{\out [| \dots f a \dots; P a; ALL x. P x --> f x = g x |] ==> \dots}
\end{ttbox}
\verb$Asm_full_simp_tac$ will not simplify the \texttt{f a} on the left.
This problem can be overcome by reordering assumptions (see
\S\ref{sec:reordering-asms}). Future versions of \verb$Asm_full_simp_tac$
will not suffer from this deficiency.
\end{warn}

\medskip

Using the simplifier effectively may take a bit of experimentation.
Set the \verb$trace_simp$\index{tracing!of simplification} flag to get
a better idea of what is going on.  The resulting output can be
enormous, especially since invocations of the simplifier are often
nested (e.g.\ when solving conditions of rewrite rules).


\subsection{Modifying the current simpset}
\begin{ttbox}
Addsimps    : thm list -> unit
Delsimps    : thm list -> unit
Addsimprocs : simproc list -> unit
Delsimprocs : simproc list -> unit
Addcongs    : thm list -> unit
Delcongs    : thm list -> unit
Addsplits   : thm list -> unit
Delsplits   : thm list -> unit
\end{ttbox}

Depending on the theory context, the \texttt{Add} and \texttt{Del}
functions manipulate basic components of the associated current
simpset.  Internally, all rewrite rules have to be expressed as
(conditional) meta-equalities.  This form is derived automatically
from object-level equations that are supplied by the user.  Another
source of rewrite rules are \emph{simplification procedures}, that is
\ML\ functions that produce suitable theorems on demand, depending on
the current redex.  Congruences are a more advanced feature; see
\S\ref{sec:simp-congs}.

\begin{ttdescription}

\item[\ttindexbold{Addsimps} $thms$;] adds rewrite rules derived from
  $thms$ to the current simpset.
  
\item[\ttindexbold{Delsimps} $thms$;] deletes rewrite rules derived
  from $thms$ from the current simpset.
  
\item[\ttindexbold{Addsimprocs} $procs$;] adds simplification
  procedures $procs$ to the current simpset.
  
\item[\ttindexbold{Delsimprocs} $procs$;] deletes simplification
  procedures $procs$ from the current simpset.
  
\item[\ttindexbold{Addcongs} $thms$;] adds congruence rules to the
  current simpset.
  
\item[\ttindexbold{Delcongs} $thms$;] deletes congruence rules from the
  current simpset.

\item[\ttindexbold{Addsplits} $thms$;] adds splitting rules to the
  current simpset.
  
\item[\ttindexbold{Delsplits} $thms$;] deletes splitting rules from the
  current simpset.

\end{ttdescription}

When a new theory is built, its implicit simpset is initialized by the
union of the respective simpsets of its parent theories.  In addition,
certain theory definition constructs (e.g.\ \ttindex{datatype} and
\ttindex{primrec} in \HOL) implicitly augment the current simpset.
Ordinary definitions are not added automatically!

It is up the user to manipulate the current simpset further by
explicitly adding or deleting theorems and simplification procedures.

\medskip

Good simpsets are hard to design.  Rules that obviously simplify,
like $\Var{n}+0 = \Var{n}$, should be added to the current simpset right after
they have been proved.  More specific ones (such as distributive laws, which
duplicate subterms) should be added only for specific proofs and deleted
afterwards.  Conversely, sometimes a rule needs
to be removed for a certain proof and restored afterwards.  The need of
frequent additions or deletions may indicate a badly designed
simpset.

\begin{warn}
  The union of the parent simpsets (as described above) is not always
  a good starting point for the new theory.  If some ancestors have
  deleted simplification rules because they are no longer wanted,
  while others have left those rules in, then the union will contain
  the unwanted rules.  After this union is formed, changes to 
  a parent simpset have no effect on the child simpset.
\end{warn}


\section{Simplification sets}\index{simplification sets} 

The simplifier is controlled by information contained in {\bf
  simpsets}.  These consist of several components, including rewrite
rules, simplification procedures, congruence rules, and the subgoaler,
solver and looper tactics.  The simplifier should be set up with
sensible defaults so that most simplifier calls specify only rewrite
rules or simplification procedures.  Experienced users can exploit the
other components to streamline proofs in more sophisticated manners.

\subsection{Inspecting simpsets}
\begin{ttbox}
print_ss : simpset -> unit
rep_ss   : simpset -> \{mss        : meta_simpset, 
                       subgoal_tac: simpset  -> int -> tactic,
                       loop_tac   :             int -> tactic,
                       finish_tac : thm list -> int -> tactic,
                unsafe_finish_tac : thm list -> int -> tactic\}
\end{ttbox}
\begin{ttdescription}
  
\item[\ttindexbold{print_ss} $ss$;] displays the printable contents of
  simpset $ss$.  This includes the rewrite rules and congruences in
  their internal form expressed as meta-equalities.  The names of the
  simplification procedures and the patterns they are invoked on are
  also shown.  The other parts, functions and tactics, are
  non-printable.

\item[\ttindexbold{rep_ss} $ss$;] decomposes $ss$ as a record of its internal 
  components, namely the meta_simpset, the subgoaler, the loop, and the safe
  and unsafe solvers.

\end{ttdescription}


\subsection{Building simpsets}
\begin{ttbox}
empty_ss : simpset
merge_ss : simpset * simpset -> simpset
\end{ttbox}
\begin{ttdescription}
  
\item[\ttindexbold{empty_ss}] is the empty simpset.  This is not very
  useful under normal circumstances because it doesn't contain
  suitable tactics (subgoaler etc.).  When setting up the simplifier
  for a particular object-logic, one will typically define a more
  appropriate ``almost empty'' simpset.  For example, in \HOL\ this is
  called \ttindexbold{HOL_basic_ss}.
  
\item[\ttindexbold{merge_ss} ($ss@1$, $ss@2$)] merges simpsets $ss@1$
  and $ss@2$ by building the union of their respective rewrite rules,
  simplification procedures and congruences.  The other components
  (tactics etc.) cannot be merged, though; they are taken from either
  simpset\footnote{Actually from $ss@1$, but it would unwise to count
    on that.}.

\end{ttdescription}


\subsection{Accessing the current simpset}
\label{sec:access-current-simpset}

\begin{ttbox}
simpset        : unit   -> simpset
simpset_ref    : unit   -> simpset ref
simpset_of     : theory -> simpset
simpset_ref_of : theory -> simpset ref
print_simpset  : theory -> unit
SIMPSET        :(simpset ->       tactic) ->       tactic
SIMPSET'       :(simpset -> 'a -> tactic) -> 'a -> tactic
\end{ttbox}

Each theory contains a current simpset\index{simpset!current} stored
within a private ML reference variable.  This can be retrieved and
modified as follows.

\begin{ttdescription}
  
\item[\ttindexbold{simpset}();] retrieves the simpset value from the
  current theory context.
  
\item[\ttindexbold{simpset_ref}();] retrieves the simpset reference
  variable from the current theory context.  This can be assigned to
  by using \texttt{:=} in ML.
  
\item[\ttindexbold{simpset_of} $thy$;] retrieves the simpset value
  from theory $thy$.
  
\item[\ttindexbold{simpset_ref_of} $thy$;] retrieves the simpset
  reference variable from theory $thy$.

\item[\ttindexbold{print_simpset} $thy$;] prints the current simpset
  of theory $thy$ in the same way as \texttt{print_ss}.

\item[\ttindexbold{SIMPSET} $tacf$, \ttindexbold{SIMPSET'} $tacf'$]
  are tacticals that make a tactic depend on the implicit current
  simpset of the theory associated with the proof state they are
  applied on.

\end{ttdescription}

\begin{warn}
  There is a subtle difference between \texttt{(SIMPSET'~$tacf$)} and
  \texttt{($tacf$~(simpset()))}.  For example \texttt{(SIMPSET'
    simp_tac)} would depend on the theory of the proof state it is
  applied to, while \texttt{(simp_tac (simpset()))} implicitly refers
  to the current theory context.  Both are usually the same in proof
  scripts, provided that goals are only stated within the current
  theory.  Robust programs would not count on that, of course.
\end{warn}


\subsection{Rewrite rules}
\begin{ttbox}
addsimps : simpset * thm list -> simpset \hfill{\bf infix 4}
delsimps : simpset * thm list -> simpset \hfill{\bf infix 4}
\end{ttbox}

\index{rewrite rules|(} Rewrite rules are theorems expressing some
form of equality, for example:
\begin{eqnarray*}
  Suc(\Var{m}) + \Var{n} &=&      \Var{m} + Suc(\Var{n}) \\
  \Var{P}\conj\Var{P}    &\bimp&  \Var{P} \\
  \Var{A} \un \Var{B} &\equiv& \{x.x\in \Var{A} \disj x\in \Var{B}\}
\end{eqnarray*}
Conditional rewrites such as $\Var{m}<\Var{n} \Imp \Var{m}/\Var{n} =
0$ are also permitted; the conditions can be arbitrary formulas.

Internally, all rewrite rules are translated into meta-equalities,
theorems with conclusion $lhs \equiv rhs$.  Each simpset contains a
function for extracting equalities from arbitrary theorems.  For
example, $\neg(\Var{x}\in \{\})$ could be turned into $\Var{x}\in \{\}
\equiv False$.  This function can be installed using
\ttindex{setmksimps} but only the definer of a logic should need to do
this; see \S\ref{sec:setmksimps}.  The function processes theorems
added by \texttt{addsimps} as well as local assumptions.

\begin{ttdescription}
  
\item[$ss$ \ttindexbold{addsimps} $thms$] adds rewrite rules derived
  from $thms$ to the simpset $ss$.
  
\item[$ss$ \ttindexbold{delsimps} $thms$] deletes rewrite rules
  derived from $thms$ from the simpset $ss$.

\end{ttdescription}

\begin{warn}
  The simplifier will accept all standard rewrite rules: those where
  all unknowns are of base type.  Hence ${\Var{i}+(\Var{j}+\Var{k})} =
  {(\Var{i}+\Var{j})+\Var{k}}$ is OK.
  
  It will also deal gracefully with all rules whose left-hand sides
  are so-called {\em higher-order patterns}~\cite{nipkow-patterns}.
  \indexbold{higher-order pattern}\indexbold{pattern, higher-order}
  These are terms in $\beta$-normal form (this will always be the case
  unless you have done something strange) where each occurrence of an
  unknown is of the form $\Var{F}(x@1,\dots,x@n)$, where the $x@i$ are
  distinct bound variables. Hence $(\forall x.\Var{P}(x) \land
  \Var{Q}(x)) \bimp (\forall x.\Var{P}(x)) \land (\forall
  x.\Var{Q}(x))$ is also OK, in both directions.
  
  In some rare cases the rewriter will even deal with quite general
  rules: for example ${\Var{f}(\Var{x})\in range(\Var{f})} = True$
  rewrites $g(a) \in range(g)$ to $True$, but will fail to match
  $g(h(b)) \in range(\lambda x.g(h(x)))$.  However, you can replace
  the offending subterms (in our case $\Var{f}(\Var{x})$, which is not
  a pattern) by adding new variables and conditions: $\Var{y} =
  \Var{f}(\Var{x}) \Imp \Var{y}\in range(\Var{f}) = True$ is
  acceptable as a conditional rewrite rule since conditions can be
  arbitrary terms.
  
  There is basically no restriction on the form of the right-hand
  sides.  They may not contain extraneous term or type variables,
  though.
\end{warn}
\index{rewrite rules|)}


\subsection{*Simplification procedures}
\begin{ttbox}
addsimprocs : simpset * simproc list -> simpset
delsimprocs : simpset * simproc list -> simpset
\end{ttbox}

Simplification procedures are {\ML} objects of abstract type
\texttt{simproc}.  Basically they are just functions that may produce
\emph{proven} rewrite rules on demand.  They are associated with
certain patterns that conceptually represent left-hand sides of
equations; these are shown by \texttt{print_ss}.  During its
operation, the simplifier may offer a simplification procedure the
current redex and ask for a suitable rewrite rule.  Thus rules may be
specifically fashioned for particular situations, resulting in a more
powerful mechanism than term rewriting by a fixed set of rules.


\begin{ttdescription}
  
\item[$ss$ \ttindexbold{addsimprocs} $procs$] adds the simplification
  procedures $procs$ to the current simpset.
  
\item[$ss$ \ttindexbold{delsimprocs} $procs$] deletes the simplification
  procedures $procs$ from the current simpset.

\end{ttdescription}

For example, simplification procedures \ttindexbold{nat_cancel} of
\texttt{HOL/Arith} cancel common summands and constant factors out of
several relations of sums over natural numbers.

Consider the following goal, which after cancelling $a$ on both sides
contains a factor of $2$.  Simplifying with the simpset of
\texttt{Arith.thy} will do the cancellation automatically:
\begin{ttbox}
{\out 1. x + a + x < y + y + 2 + a + a + a + a + a}
by (Simp_tac 1);
{\out 1. x < Suc (a + (a + y))}
\end{ttbox}


\subsection{*Congruence rules}\index{congruence rules}\label{sec:simp-congs}
\begin{ttbox}
addcongs   : simpset * thm list -> simpset \hfill{\bf infix 4}
delcongs   : simpset * thm list -> simpset \hfill{\bf infix 4}
addeqcongs : simpset * thm list -> simpset \hfill{\bf infix 4}
deleqcongs : simpset * thm list -> simpset \hfill{\bf infix 4}
\end{ttbox}

Congruence rules are meta-equalities of the form
\[ \dots \Imp
   f(\Var{x@1},\ldots,\Var{x@n}) \equiv f(\Var{y@1},\ldots,\Var{y@n}).
\]
This governs the simplification of the arguments of~$f$.  For
example, some arguments can be simplified under additional assumptions:
\[ \List{\Var{P@1} \bimp \Var{Q@1};\; \Var{Q@1} \Imp \Var{P@2} \bimp \Var{Q@2}}
   \Imp (\Var{P@1} \imp \Var{P@2}) \equiv (\Var{Q@1} \imp \Var{Q@2})
\]
Given this rule, the simplifier assumes $Q@1$ and extracts rewrite
rules from it when simplifying~$P@2$.  Such local assumptions are
effective for rewriting formulae such as $x=0\imp y+x=y$.  The local
assumptions are also provided as theorems to the solver; see
\S~\ref{sec:simp-solver} below.

\begin{ttdescription}
  
\item[$ss$ \ttindexbold{addcongs} $thms$] adds congruence rules to the
  simpset $ss$.  These are derived from $thms$ in an appropriate way,
  depending on the underlying object-logic.
  
\item[$ss$ \ttindexbold{delcongs} $thms$] deletes congruence rules
  derived from $thms$.
  
\item[$ss$ \ttindexbold{addeqcongs} $thms$] adds congruence rules in
  their internal form (conclusions using meta-equality) to simpset
  $ss$.  This is the basic mechanism that \texttt{addcongs} is built
  on.  It should be rarely used directly.
  
\item[$ss$ \ttindexbold{deleqcongs} $thms$] deletes congruence rules
  in internal form from simpset $ss$.
  
\end{ttdescription}

\medskip

Here are some more examples.  The congruence rule for bounded
quantifiers also supplies contextual information, this time about the
bound variable:
\begin{eqnarray*}
  &&\List{\Var{A}=\Var{B};\; 
          \Forall x. x\in \Var{B} \Imp \Var{P}(x) = \Var{Q}(x)} \Imp{} \\
 &&\qquad\qquad
    (\forall x\in \Var{A}.\Var{P}(x)) = (\forall x\in \Var{B}.\Var{Q}(x))
\end{eqnarray*}
The congruence rule for conditional expressions can supply contextual
information for simplifying the arms:
\[ \List{\Var{p}=\Var{q};~ \Var{q} \Imp \Var{a}=\Var{c};~
         \neg\Var{q} \Imp \Var{b}=\Var{d}} \Imp
   if(\Var{p},\Var{a},\Var{b}) \equiv if(\Var{q},\Var{c},\Var{d})
\]
A congruence rule can also {\em prevent\/} simplification of some arguments.
Here is an alternative congruence rule for conditional expressions:
\[ \Var{p}=\Var{q} \Imp
   if(\Var{p},\Var{a},\Var{b}) \equiv if(\Var{q},\Var{a},\Var{b})
\]
Only the first argument is simplified; the others remain unchanged.
This can make simplification much faster, but may require an extra case split
to prove the goal.  


\subsection{*The subgoaler}\label{sec:simp-subgoaler}
\begin{ttbox}
setsubgoaler : simpset *  (simpset -> int -> tactic) -> simpset \hfill{\bf infix 4}
prems_of_ss  : simpset -> thm list
\end{ttbox}

The subgoaler is the tactic used to solve subgoals arising out of
conditional rewrite rules or congruence rules.  The default should be
simplification itself.  Occasionally this strategy needs to be
changed.  For example, if the premise of a conditional rule is an
instance of its conclusion, as in $Suc(\Var{m}) < \Var{n} \Imp \Var{m}
< \Var{n}$, the default strategy could loop.

\begin{ttdescription}
  
\item[$ss$ \ttindexbold{setsubgoaler} $tacf$] sets the subgoaler of
  $ss$ to $tacf$.  The function $tacf$ will be applied to the current
  simplifier context expressed as a simpset.
  
\item[\ttindexbold{prems_of_ss} $ss$] retrieves the current set of
  premises from simplifier context $ss$.  This may be non-empty only
  if the simplifier has been told to utilize local assumptions in the
  first place, e.g.\ if invoked via \texttt{asm_simp_tac}.

\end{ttdescription}

As an example, consider the following subgoaler:
\begin{ttbox}
fun subgoaler ss =
    assume_tac ORELSE'
    resolve_tac (prems_of_ss ss) ORELSE'
    asm_simp_tac ss;
\end{ttbox}
This tactic first tries to solve the subgoal by assumption or by
resolving with with one of the premises, calling simplification only
if that fails.


\subsection{*The solver}\label{sec:simp-solver}
\begin{ttbox}
setSolver  : simpset * (thm list -> int -> tactic) -> simpset \hfill{\bf infix 4}
addSolver  : simpset * (thm list -> int -> tactic) -> simpset \hfill{\bf infix 4}
setSSolver : simpset * (thm list -> int -> tactic) -> simpset \hfill{\bf infix 4}
addSSolver : simpset * (thm list -> int -> tactic) -> simpset \hfill{\bf infix 4}
\end{ttbox}

The solver is a pair of tactics that attempt to solve a subgoal after
simplification.  Typically it just proves trivial subgoals such as
\texttt{True} and $t=t$.  It could use sophisticated means such as {\tt
  blast_tac}, though that could make simplification expensive.

Rewriting does not instantiate unknowns.  For example, rewriting
cannot prove $a\in \Var{A}$ since this requires
instantiating~$\Var{A}$.  The solver, however, is an arbitrary tactic
and may instantiate unknowns as it pleases.  This is the only way the
simplifier can handle a conditional rewrite rule whose condition
contains extra variables.  When a simplification tactic is to be
combined with other provers, especially with the classical reasoner,
it is important whether it can be considered safe or not.  For this
reason the solver is split into a safe and an unsafe part.

The standard simplification strategy solely uses the unsafe solver,
which is appropriate in most cases.  For special applications where
the simplification process is not allowed to instantiate unknowns
within the goal, simplification starts with the safe solver, but may
still apply the ordinary unsafe one in nested simplifications for
conditional rules or congruences.

\begin{ttdescription}
  
\item[$ss$ \ttindexbold{setSSolver} $tacf$] installs $tacf$ as the
  \emph{safe} solver of $ss$.
  
\item[$ss$ \ttindexbold{addSSolver} $tacf$] adds $tacf$ as an
  additional \emph{safe} solver; it will be tried after the solvers
  which had already been present in $ss$.
  
\item[$ss$ \ttindexbold{setSolver} $tacf$] installs $tacf$ as the
  unsafe solver of $ss$.
  
\item[$ss$ \ttindexbold{addSolver} $tacf$] adds $tacf$ as an
  additional unsafe solver; it will be tried after the solvers which
  had already been present in $ss$.

\end{ttdescription}

\medskip

\index{assumptions!in simplification} The solver tactic is invoked
with a list of theorems, namely assumptions that hold in the local
context.  This may be non-empty only if the simplifier has been told
to utilize local assumptions in the first place, e.g.\ if invoked via
\texttt{asm_simp_tac}.  The solver is also presented the full goal
including its assumptions in any case.  Thus it can use these (e.g.\ 
by calling \texttt{assume_tac}), even if the list of premises is not
passed.

\medskip

As explained in \S\ref{sec:simp-subgoaler}, the subgoaler is also used
to solve the premises of congruence rules.  These are usually of the
form $s = \Var{x}$, where $s$ needs to be simplified and $\Var{x}$
needs to be instantiated with the result.  Typically, the subgoaler
will invoke the simplifier at some point, which will eventually call
the solver.  For this reason, solver tactics must be prepared to solve
goals of the form $t = \Var{x}$, usually by reflexivity.  In
particular, reflexivity should be tried before any of the fancy
tactics like \texttt{blast_tac}.

It may even happen that due to simplification the subgoal is no longer
an equality.  For example $False \bimp \Var{Q}$ could be rewritten to
$\neg\Var{Q}$.  To cover this case, the solver could try resolving
with the theorem $\neg False$.

\medskip

\begin{warn}
  If the simplifier aborts with the message \texttt{Failed congruence
    proof!}, then the subgoaler or solver has failed to prove a
  premise of a congruence rule.  This should never occur under normal
  circumstances; it indicates that some congruence rule, or possibly
  the subgoaler or solver, is faulty.
\end{warn}


\subsection{*The looper}\label{sec:simp-looper}
\begin{ttbox}
setloop   : simpset *           (int -> tactic)  -> simpset \hfill{\bf infix 4}
addloop   : simpset * (string * (int -> tactic)) -> simpset \hfill{\bf infix 4}
delloop   : simpset *  string                    -> simpset \hfill{\bf infix 4}
addsplits : simpset * thm list -> simpset \hfill{\bf infix 4}
delsplits : simpset * thm list -> simpset \hfill{\bf infix 4}
\end{ttbox}

The looper is a list of tactics that are applied after simplification, in case
the solver failed to solve the simplified goal.  If the looper
succeeds, the simplification process is started all over again.  Each
of the subgoals generated by the looper is attacked in turn, in
reverse order.

A typical looper is case splitting: the expansion of a conditional.
Another possibility is to apply an elimination rule on the
assumptions.  More adventurous loopers could start an induction.

\begin{ttdescription}
  
\item[$ss$ \ttindexbold{setloop} $tacf$] installs $tacf$ as the only looper
  tactic of $ss$.
  
\item[$ss$ \ttindexbold{addloop} $(name,tacf)$] adds $tacf$ as an additional
  looper tactic with name $name$; it will be tried after the looper tactics
  that had already been present in $ss$.
  
\item[$ss$ \ttindexbold{delloop} $name$] deletes the looper tactic $name$
  from $ss$.
  
\item[$ss$ \ttindexbold{addsplits} $thms$] adds
  split tactics for $thms$ as additional looper tactics of $ss$.

\item[$ss$ \ttindexbold{addsplits} $thms$] deletes the
  split tactics for $thms$ from the looper tactics of $ss$.

\end{ttdescription}

The splitter replaces applications of a given function; the right-hand side
of the replacement can be anything.  For example, here is a splitting rule
for conditional expressions:
\[ \Var{P}(if(\Var{Q},\Var{x},\Var{y})) \bimp (\Var{Q} \imp \Var{P}(\Var{x}))
\conj (\lnot\Var{Q} \imp \Var{P}(\Var{y})) 
\] 
Another example is the elimination operator (which happens to be
called~$split$) for Cartesian products:
\[ \Var{P}(split(\Var{f},\Var{p})) \bimp (\forall a~b. \Var{p} =
\langle a,b\rangle \imp \Var{P}(\Var{f}(a,b))) 
\] 

For technical reasons, there is a distinction between case splitting in the 
conclusion and in the premises of a subgoal. The former is done by
\texttt{split_tac} with rules like \texttt{split_if}, 
which does not split the subgoal, while the latter is done by 
\texttt{split_asm_tac} with rules like \texttt{split_if_asm}, 
which splits the subgoal.
The operator \texttt{addsplits} automatically takes care of which tactic to
call, analyzing the form of the rules given as argument.
\begin{warn}
Due to \texttt{split_asm_tac}, the simplifier may split subgoals!
\end{warn}

Case splits should be allowed only when necessary; they are expensive
and hard to control.  Here is an example of use, where \texttt{split_if}
is the first rule above:
\begin{ttbox}
by (simp_tac (simpset() addloop ("split if",split_tac [split_if])) 1);
\end{ttbox}
Users would usually prefer the following shortcut using \texttt{addsplits}:
\begin{ttbox}
by (simp_tac (simpset() addsplits [split_if]) 1);
\end{ttbox}


\section{The simplification tactics}\label{simp-tactics}
\index{simplification!tactics}\index{tactics!simplification}
\begin{ttbox}
simp_tac               : simpset -> int -> tactic
asm_simp_tac           : simpset -> int -> tactic
full_simp_tac          : simpset -> int -> tactic
asm_full_simp_tac      : simpset -> int -> tactic
safe_asm_full_simp_tac : simpset -> int -> tactic
\end{ttbox}

These are the basic tactics that are underlying any actual
simplification work.  The rewriting strategy is always strictly bottom
up, except for congruence rules, which are applied while descending
into a term.  Conditions in conditional rewrite rules are solved
recursively before the rewrite rule is applied.

\begin{ttdescription}
  
\item[\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
  \ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}] are
  the basic simplification tactics that work exactly like their
  namesakes in \S\ref{sec:simp-for-dummies}, except that they are
  explicitly supplied with a simpset.
  
\item[\ttindexbold{safe_asm_full_simp_tac}] is like
  \texttt{asm_full_simp_tac}, but uses the safe solver as explained in
  \S\ref{sec:simp-solver}.  This tactic is mainly intended for
  building special tools, e.g.\ for combining the simplifier with the
  classical reasoner.  It is rarely used directly.
  
\end{ttdescription}

\medskip

Local modifications of simpsets within a proof are often much cleaner
by using above tactics in conjunction with explicit simpsets, rather
than their capitalized counterparts.  For example
\begin{ttbox}
Addsimps \(thms\);
by (Simp_tac \(i\));
Delsimps \(thms\);
\end{ttbox}
can be expressed more appropriately as
\begin{ttbox}
by (simp_tac (simpset() addsimps \(thms\)) \(i\));
\end{ttbox}

\medskip

Also note that functions depending implicitly on the current theory
context (like capital \texttt{Simp_tac} and the other commands of
\S\ref{sec:simp-for-dummies}) should be considered harmful outside of
actual proof scripts.  In particular, ML programs like theory
definition packages or special tactics should refer to simpsets only
explicitly, via the above tactics used in conjunction with
\texttt{simpset_of} or the \texttt{SIMPSET} tacticals.


\section{Forward rules and conversions}
\index{simplification!forward rules}\index{simplification!conversions}
\begin{ttbox}\index{*simplify}\index{*asm_simplify}\index{*full_simplify}\index{*asm_full_simplify}\index{*Simplifier.rewrite}\index{*Simplifier.asm_rewrite}\index{*Simplifier.full_rewrite}\index{*Simplifier.asm_full_rewrite}
simplify          : simpset -> thm -> thm
asm_simplify      : simpset -> thm -> thm
full_simplify     : simpset -> thm -> thm
asm_full_simplify : simpset -> thm -> thm\medskip
Simplifier.rewrite           : simpset -> cterm -> thm
Simplifier.asm_rewrite       : simpset -> cterm -> thm
Simplifier.full_rewrite      : simpset -> cterm -> thm
Simplifier.asm_full_rewrite  : simpset -> cterm -> thm
\end{ttbox}

The first four of these functions provide \emph{forward} rules for
simplification.  Their effect is analogous to the corresponding
tactics described in \S\ref{simp-tactics}, but affect the whole
theorem instead of just a certain subgoal.  Also note that the
looper~/ solver process as described in \S\ref{sec:simp-looper} and
\S\ref{sec:simp-solver} is omitted in forward simplification.

The latter four are \emph{conversions}, establishing proven equations
of the form $t \equiv u$ where the l.h.s.\ $t$ has been given as
argument.

\begin{warn}
  Forward simplification rules and conversions should be used rarely
  in ordinary proof scripts.  The main intention is to provide an
  internal interface to the simplifier for special utilities.
\end{warn}


\section{Examples of using the simplifier}
\index{examples!of simplification} Assume we are working within {\tt
  FOL} (see the file \texttt{FOL/ex/Nat}) and that
\begin{ttdescription}
\item[Nat.thy] 
  is a theory including the constants $0$, $Suc$ and $+$,
\item[add_0]
  is the rewrite rule $0+\Var{n} = \Var{n}$,
\item[add_Suc]
  is the rewrite rule $Suc(\Var{m})+\Var{n} = Suc(\Var{m}+\Var{n})$,
\item[induct]
  is the induction rule $\List{\Var{P}(0);\; \Forall x. \Var{P}(x)\Imp
    \Var{P}(Suc(x))} \Imp \Var{P}(\Var{n})$.
\end{ttdescription}
We augment the implicit simpset inherited from \texttt{Nat} with the
basic rewrite rules for addition of natural numbers:
\begin{ttbox}
Addsimps [add_0, add_Suc];
\end{ttbox}

\subsection{A trivial example}
Proofs by induction typically involve simplification.  Here is a proof
that~0 is a right identity:
\begin{ttbox}
Goal "m+0 = m";
{\out Level 0}
{\out m + 0 = m}
{\out  1. m + 0 = m}
\end{ttbox}
The first step is to perform induction on the variable~$m$.  This returns a
base case and inductive step as two subgoals:
\begin{ttbox}
by (res_inst_tac [("n","m")] induct 1);
{\out Level 1}
{\out m + 0 = m}
{\out  1. 0 + 0 = 0}
{\out  2. !!x. x + 0 = x ==> Suc(x) + 0 = Suc(x)}
\end{ttbox}
Simplification solves the first subgoal trivially:
\begin{ttbox}
by (Simp_tac 1);
{\out Level 2}
{\out m + 0 = m}
{\out  1. !!x. x + 0 = x ==> Suc(x) + 0 = Suc(x)}
\end{ttbox}
The remaining subgoal requires \ttindex{Asm_simp_tac} in order to use the
induction hypothesis as a rewrite rule:
\begin{ttbox}
by (Asm_simp_tac 1);
{\out Level 3}
{\out m + 0 = m}
{\out No subgoals!}
\end{ttbox}

\subsection{An example of tracing}
\index{tracing!of simplification|(}\index{*trace_simp}

Let us prove a similar result involving more complex terms.  We prove
that addition is commutative.
\begin{ttbox}
Goal "m+Suc(n) = Suc(m+n)";
{\out Level 0}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. m + Suc(n) = Suc(m + n)}
\end{ttbox}
Performing induction on~$m$ yields two subgoals:
\begin{ttbox}
by (res_inst_tac [("n","m")] induct 1);
{\out Level 1}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. 0 + Suc(n) = Suc(0 + n)}
{\out  2. !!x. x + Suc(n) = Suc(x + n) ==>}
{\out          Suc(x) + Suc(n) = Suc(Suc(x) + n)}
\end{ttbox}
Simplification solves the first subgoal, this time rewriting two
occurrences of~0:
\begin{ttbox}
by (Simp_tac 1);
{\out Level 2}
{\out m + Suc(n) = Suc(m + n)}
{\out  1. !!x. x + Suc(n) = Suc(x + n) ==>}
{\out          Suc(x) + Suc(n) = Suc(Suc(x) + n)}
\end{ttbox}
Switching tracing on illustrates how the simplifier solves the remaining
subgoal: 
\begin{ttbox}
set trace_simp;
by (Asm_simp_tac 1);
\ttbreak
{\out Adding rewrite rule:}
{\out .x + Suc n == Suc (.x + n)}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out ?m + Suc ?n == Suc (?m + ?n)}
{\out Rewriting:}
{\out Suc .x + Suc n == Suc (Suc .x + n)}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out Suc ?m + ?n == Suc (?m + ?n)}
{\out Rewriting:}
{\out Suc .x + n == Suc (.x + n)}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out Suc ?m + ?n == Suc (?m + ?n)}
{\out Rewriting:}
{\out Suc .x + n == Suc (.x + n)}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out ?x = ?x == True}
{\out Rewriting:}
{\out Suc (Suc (.x + n)) = Suc (Suc (.x + n)) == True}
\ttbreak
{\out Level 3}
{\out m + Suc(n) = Suc(m + n)}
{\out No subgoals!}
\end{ttbox}
Many variations are possible.  At Level~1 (in either example) we could have
solved both subgoals at once using the tactical \ttindex{ALLGOALS}:
\begin{ttbox}
by (ALLGOALS Asm_simp_tac);
{\out Level 2}
{\out m + Suc(n) = Suc(m + n)}
{\out No subgoals!}
\end{ttbox}
\index{tracing!of simplification|)}


\subsection{Free variables and simplification}

Here is a conjecture to be proved for an arbitrary function~$f$
satisfying the law $f(Suc(\Var{n})) = Suc(f(\Var{n}))$:
\begin{ttbox}
val [prem] = Goal "(!!n. f(Suc(n)) = Suc(f(n))) ==> f(i+j) = i+f(j)";
{\out Level 0}
{\out f(i + j) = i + f(j)}
{\out  1. f(i + j) = i + f(j)}
\ttbreak
{\out val prem = "f(Suc(?n)) = Suc(f(?n))}
{\out             [!!n. f(Suc(n)) = Suc(f(n))]" : thm}
\end{ttbox}
In the theorem~\texttt{prem}, note that $f$ is a free variable while
$\Var{n}$ is a schematic variable.
\begin{ttbox}
by (res_inst_tac [("n","i")] induct 1);
{\out Level 1}
{\out f(i + j) = i + f(j)}
{\out  1. f(0 + j) = 0 + f(j)}
{\out  2. !!x. f(x + j) = x + f(j) ==> f(Suc(x) + j) = Suc(x) + f(j)}
\end{ttbox}
We simplify each subgoal in turn.  The first one is trivial:
\begin{ttbox}
by (Simp_tac 1);
{\out Level 2}
{\out f(i + j) = i + f(j)}
{\out  1. !!x. f(x + j) = x + f(j) ==> f(Suc(x) + j) = Suc(x) + f(j)}
\end{ttbox}
The remaining subgoal requires rewriting by the premise, so we add it
to the current simpset:
\begin{ttbox}
by (asm_simp_tac (simpset() addsimps [prem]) 1);
{\out Level 3}
{\out f(i + j) = i + f(j)}
{\out No subgoals!}
\end{ttbox}

\subsection{Reordering assumptions}
\label{sec:reordering-asms}
\index{assumptions!reordering}

As mentioned in \S\ref{sec:simp-for-dummies-tacs},
\ttindex{asm_full_simp_tac} may require the assumptions to be permuted
to be more effective.  Given the subgoal
\begin{ttbox}
{\out 1. [| ALL x. P x --> f x = g x; Q(f a); P a; R |] ==> S}
\end{ttbox}
we can rotate the assumptions two positions to the right
\begin{ttbox}
by (rotate_tac ~2 1);
\end{ttbox}
to obtain
\begin{ttbox}
{\out 1. [| P a; R; ALL x. P x --> f x = g x; Q(f a) |] ==> S}
\end{ttbox}
which enables \verb$asm_full_simp_tac$ to simplify \verb$Q(f a)$ to
\verb$Q(g a)$ because now all required assumptions are to the left of
\verb$Q(f a)$.

Since rotation alone cannot produce arbitrary permutations, you can also pick
out a particular assumption which needs to be rewritten and move it the the
right end of the assumptions.  In the above case rotation can be replaced by
\begin{ttbox}
by (dres_inst_tac [("psi","Q(f a)")] asm_rl 1);
\end{ttbox}
which is more directed and leads to
\begin{ttbox}
{\out 1. [| ALL x. P x --> f x = g x; P a; R; Q(f a) |] ==> S}
\end{ttbox}

\begin{warn}
  Reordering assumptions usually leads to brittle proofs and should be
  avoided.  Future versions of \verb$asm_full_simp_tac$ will completely
  remove the need for such manipulations.
\end{warn}


\section{Permutative rewrite rules}
\index{rewrite rules!permutative|(}

A rewrite rule is {\bf permutative} if the left-hand side and right-hand
side are the same up to renaming of variables.  The most common permutative
rule is commutativity: $x+y = y+x$.  Other examples include $(x-y)-z =
(x-z)-y$ in arithmetic and $insert(x,insert(y,A)) = insert(y,insert(x,A))$
for sets.  Such rules are common enough to merit special attention.

Because ordinary rewriting loops given such rules, the simplifier
employs a special strategy, called {\bf ordered
  rewriting}\index{rewriting!ordered}.  There is a standard
lexicographic ordering on terms.  This should be perfectly OK in most
cases, but can be changed for special applications.

\begin{ttbox}
settermless : simpset * (term * term -> bool) -> simpset \hfill{\bf infix 4}
\end{ttbox}
\begin{ttdescription}
  
\item[$ss$ \ttindexbold{settermless} $rel$] installs relation $rel$ as
  term order in simpset $ss$.

\end{ttdescription}

\medskip

A permutative rewrite rule is applied only if it decreases the given
term with respect to this ordering.  For example, commutativity
rewrites~$b+a$ to $a+b$, but then stops because $a+b$ is strictly less
than $b+a$.  The Boyer-Moore theorem prover~\cite{bm88book} also
employs ordered rewriting.

Permutative rewrite rules are added to simpsets just like other
rewrite rules; the simplifier recognizes their special status
automatically.  They are most effective in the case of
associative-commutative operators.  (Associativity by itself is not
permutative.)  When dealing with an AC-operator~$f$, keep the
following points in mind:
\begin{itemize}\index{associative-commutative operators}
  
\item The associative law must always be oriented from left to right,
  namely $f(f(x,y),z) = f(x,f(y,z))$.  The opposite orientation, if
  used with commutativity, leads to looping in conjunction with the
  standard term order.

\item To complete your set of rewrite rules, you must add not just
  associativity~(A) and commutativity~(C) but also a derived rule, {\bf
    left-com\-mut\-ativ\-ity} (LC): $f(x,f(y,z)) = f(y,f(x,z))$.
\end{itemize}
Ordered rewriting with the combination of A, C, and~LC sorts a term
lexicographically:
\[\def\maps#1{\stackrel{#1}{\longmapsto}}
 (b+c)+a \maps{A} b+(c+a) \maps{C} b+(a+c) \maps{LC} a+(b+c) \]
Martin and Nipkow~\cite{martin-nipkow} discuss the theory and give many
examples; other algebraic structures are amenable to ordered rewriting,
such as boolean rings.

\subsection{Example: sums of natural numbers}

This example is again set in \HOL\ (see \texttt{HOL/ex/NatSum}).
Theory \thydx{Arith} contains natural numbers arithmetic.  Its
associated simpset contains many arithmetic laws including
distributivity of~$\times$ over~$+$, while \texttt{add_ac} is a list
consisting of the A, C and LC laws for~$+$ on type \texttt{nat}.  Let
us prove the theorem
\[ \sum@{i=1}^n i = n\times(n+1)/2. \]
%
A functional~\texttt{sum} represents the summation operator under the
interpretation $\texttt{sum} \, f \, (n + 1) = \sum@{i=0}^n f\,i$.  We
extend \texttt{Arith} as follows:
\begin{ttbox}
NatSum = Arith +
consts sum     :: [nat=>nat, nat] => nat
primrec "sum" nat 
  "sum f 0 = 0"
  "sum f (Suc n) = f(n) + sum f n"
end
\end{ttbox}
The \texttt{primrec} declaration automatically adds rewrite rules for
\texttt{sum} to the default simpset.  We now remove the
\texttt{nat_cancel} simplification procedures (in order not to spoil
the example) and insert the AC-rules for~$+$:
\begin{ttbox}
Delsimprocs nat_cancel;
Addsimps add_ac;
\end{ttbox}
Our desired theorem now reads $\texttt{sum} \, (\lambda i.i) \, (n+1) =
n\times(n+1)/2$.  The Isabelle goal has both sides multiplied by~$2$:
\begin{ttbox}
Goal "2 * sum (\%i.i) (Suc n) = n * Suc n";
{\out Level 0}
{\out 2 * sum (\%i. i) (Suc n) = n * Suc n}
{\out  1. 2 * sum (\%i. i) (Suc n) = n * Suc n}
\end{ttbox}
Induction should not be applied until the goal is in the simplest
form:
\begin{ttbox}
by (Simp_tac 1);
{\out Level 1}
{\out 2 * sum (\%i. i) (Suc n) = n * Suc n}
{\out  1. n + (sum (\%i. i) n + sum (\%i. i) n) = n * n}
\end{ttbox}
Ordered rewriting has sorted the terms in the left-hand side.  The
subgoal is now ready for induction:
\begin{ttbox}
by (induct_tac "n" 1);
{\out Level 2}
{\out 2 * sum (\%i. i) (Suc n) = n * Suc n}
{\out  1. 0 + (sum (\%i. i) 0 + sum (\%i. i) 0) = 0 * 0}
\ttbreak
{\out  2. !!n. n + (sum (\%i. i) n + sum (\%i. i) n) = n * n}
{\out           ==> Suc n + (sum (\%i. i) (Suc n) + sum (\%i. i) (Suc n)) =}
{\out               Suc n * Suc n}
\end{ttbox}
Simplification proves both subgoals immediately:\index{*ALLGOALS}
\begin{ttbox}
by (ALLGOALS Asm_simp_tac);
{\out Level 3}
{\out 2 * sum (\%i. i) (Suc n) = n * Suc n}
{\out No subgoals!}
\end{ttbox}
Simplification cannot prove the induction step if we omit \texttt{add_ac} from
the simpset.  Observe that like terms have not been collected:
\begin{ttbox}
{\out Level 3}
{\out 2 * sum (\%i. i) (Suc n) = n * Suc n}
{\out  1. !!n. n + sum (\%i. i) n + (n + sum (\%i. i) n) = n + n * n}
{\out           ==> n + (n + sum (\%i. i) n) + (n + (n + sum (\%i. i) n)) =}
{\out               n + (n + (n + n * n))}
\end{ttbox}
Ordered rewriting proves this by sorting the left-hand side.  Proving
arithmetic theorems without ordered rewriting requires explicit use of
commutativity.  This is tedious; try it and see!

Ordered rewriting is equally successful in proving
$\sum@{i=1}^n i^3 = n^2\times(n+1)^2/4$.


\subsection{Re-orienting equalities}
Ordered rewriting with the derived rule \texttt{symmetry} can reverse
equations:
\begin{ttbox}
val symmetry = prove_goal HOL.thy "(x=y) = (y=x)"
                 (fn _ => [Blast_tac 1]);
\end{ttbox}
This is frequently useful.  Assumptions of the form $s=t$, where $t$ occurs
in the conclusion but not~$s$, can often be brought into the right form.
For example, ordered rewriting with \texttt{symmetry} can prove the goal
\[ f(a)=b \conj f(a)=c \imp b=c. \]
Here \texttt{symmetry} reverses both $f(a)=b$ and $f(a)=c$
because $f(a)$ is lexicographically greater than $b$ and~$c$.  These
re-oriented equations, as rewrite rules, replace $b$ and~$c$ in the
conclusion by~$f(a)$. 

Another example is the goal $\neg(t=u) \imp \neg(u=t)$.
The differing orientations make this appear difficult to prove.  Ordered
rewriting with \texttt{symmetry} makes the equalities agree.  (Without
knowing more about~$t$ and~$u$ we cannot say whether they both go to $t=u$
or~$u=t$.)  Then the simplifier can prove the goal outright.

\index{rewrite rules!permutative|)}


\section{*Coding simplification procedures}
\begin{ttbox}
mk_simproc: string -> cterm list ->
              (Sign.sg -> thm list -> term -> thm option) -> simproc
\end{ttbox}

\begin{ttdescription}
\item[\ttindexbold{mk_simproc}~$name$~$lhss$~$proc$] makes $proc$ a
  simplification procedure for left-hand side patterns $lhss$.  The
  name just serves as a comment.  The function $proc$ may be invoked
  by the simplifier for redex positions matched by one of $lhss$ as
  described below.
\end{ttdescription}

Simplification procedures are applied in a two-stage process as
follows: The simplifier tries to match the current redex position
against any one of the $lhs$ patterns of any simplification procedure.
If this succeeds, it invokes the corresponding {\ML} function, passing
with the current signature, local assumptions and the (potential)
redex.  The result may be either \texttt{None} (indicating failure) or
\texttt{Some~$thm$}.

Any successful result is supposed to be a (possibly conditional)
rewrite rule $t \equiv u$ that is applicable to the current redex.
The rule will be applied just as any ordinary rewrite rule.  It is
expected to be already in \emph{internal form}, though, bypassing the
automatic preprocessing of object-level equivalences.

\medskip

As an example of how to write your own simplification procedures,
consider eta-expansion of pair abstraction (see also
\texttt{HOL/Modelcheck/MCSyn} where this is used to provide external
model checker syntax).
  
The {\HOL} theory of tuples (see \texttt{HOL/Prod}) provides an
operator \texttt{split} together with some concrete syntax supporting
$\lambda\,(x,y).b$ abstractions.  Assume that we would like to offer a
tactic that rewrites any function $\lambda\,p.f\,p$ (where $p$ is of
some pair type) to $\lambda\,(x,y).f\,(x,y)$.  The corresponding rule
is:
\begin{ttbox}
pair_eta_expand:  (f::'a*'b=>'c) = (\%(x, y). f (x, y))
\end{ttbox}
Unfortunately, term rewriting using this rule directly would not
terminate!  We now use the simplification procedure mechanism in order
to stop the simplifier from applying this rule over and over again,
making it rewrite only actual abstractions.  The simplification
procedure \texttt{pair_eta_expand_proc} is defined as follows:
\begin{ttbox}
local
  val lhss =
    [read_cterm (sign_of Prod.thy) 
                ("f::'a*'b=>'c", TVar (("'z", 0), []))];
  val rew = mk_meta_eq pair_eta_expand; \medskip
  fun proc _ _ (Abs _) = Some rew
    | proc _ _ _ = None;
in
  val pair_eta_expand_proc = mk_simproc "pair_eta_expand" lhss proc;
end;
\end{ttbox}
This is an example of using \texttt{pair_eta_expand_proc}:
\begin{ttbox}
{\out 1. P (\%p::'a * 'a. fst p + snd p + z)}
by (simp_tac (simpset() addsimprocs [pair_eta_expand_proc]) 1);
{\out 1. P (\%(x::'a,y::'a). x + y + z)}
\end{ttbox}

\medskip

In the above example the simplification procedure just did fine
grained control over rule application, beyond higher-order pattern
matching.  Usually, procedures would do some more work, in particular
prove particular theorems depending on the current redex.


\section{*Setting up the simplifier}\label{sec:setting-up-simp}
\index{simplification!setting up}

Setting up the simplifier for new logics is complicated.  This section
describes how the simplifier is installed for intuitionistic
first-order logic; the code is largely taken from {\tt
  FOL/simpdata.ML} of the Isabelle sources.

The simplifier and the case splitting tactic, which reside on separate
files, are not part of Pure Isabelle.  They must be loaded explicitly
by the object-logic as follows:
\begin{ttbox}
use "$ISABELLE_HOME/src/Provers/simplifier.ML";
use "$ISABELLE_HOME/src/Provers/splitter.ML";
\end{ttbox}

Simplification requires converting object-equalities to meta-level rewrite
rules.  This demands rules stating that equal terms and equivalent formulae
are also equal at the meta-level.  The rule declaration part of the file
\texttt{FOL/IFOL.thy} contains the two lines
\begin{ttbox}\index{*eq_reflection theorem}\index{*iff_reflection theorem}
eq_reflection   "(x=y)   ==> (x==y)"
iff_reflection  "(P<->Q) ==> (P==Q)"
\end{ttbox}
Of course, you should only assert such rules if they are true for your
particular logic.  In Constructive Type Theory, equality is a ternary
relation of the form $a=b\in A$; the type~$A$ determines the meaning
of the equality essentially as a partial equivalence relation.  The
present simplifier cannot be used.  Rewriting in \texttt{CTT} uses
another simplifier, which resides in the file {\tt
  Provers/typedsimp.ML} and is not documented.  Even this does not
work for later variants of Constructive Type Theory that use
intensional equality~\cite{nordstrom90}.


\subsection{A collection of standard rewrite rules}

We first prove lots of standard rewrite rules about the logical
connectives.  These include cancellation and associative laws.  We
define a function that echoes the desired law and then supplies it the
prover for intuitionistic \FOL:
\begin{ttbox}
fun int_prove_fun s = 
 (writeln s;  
  prove_goal IFOL.thy s
   (fn prems => [ (cut_facts_tac prems 1), 
                  (IntPr.fast_tac 1) ]));
\end{ttbox}
The following rewrite rules about conjunction are a selection of those
proved on \texttt{FOL/simpdata.ML}.  Later, these will be supplied to the
standard simpset.
\begin{ttbox}
val conj_simps = map int_prove_fun
 ["P & True <-> P",      "True & P <-> P",
  "P & False <-> False", "False & P <-> False",
  "P & P <-> P",
  "P & ~P <-> False",    "~P & P <-> False",
  "(P & Q) & R <-> P & (Q & R)"];
\end{ttbox}
The file also proves some distributive laws.  As they can cause exponential
blowup, they will not be included in the standard simpset.  Instead they
are merely bound to an \ML{} identifier, for user reference.
\begin{ttbox}
val distrib_simps  = map int_prove_fun
 ["P & (Q | R) <-> P&Q | P&R", 
  "(Q | R) & P <-> Q&P | R&P",
  "(P | Q --> R) <-> (P --> R) & (Q --> R)"];
\end{ttbox}


\subsection{Functions for preprocessing the rewrite rules}
\label{sec:setmksimps}
\begin{ttbox}\indexbold{*setmksimps}
setmksimps : simpset * (thm -> thm list) -> simpset \hfill{\bf infix 4}
\end{ttbox}
The next step is to define the function for preprocessing rewrite rules.
This will be installed by calling \texttt{setmksimps} below.  Preprocessing
occurs whenever rewrite rules are added, whether by user command or
automatically.  Preprocessing involves extracting atomic rewrites at the
object-level, then reflecting them to the meta-level.

To start, the function \texttt{gen_all} strips any meta-level
quantifiers from the front of the given theorem.  Usually there are none
anyway.
\begin{ttbox}
fun gen_all th = forall_elim_vars (#maxidx(rep_thm th)+1) th;
\end{ttbox}

The function \texttt{atomize} analyses a theorem in order to extract
atomic rewrite rules.  The head of all the patterns, matched by the
wildcard~\texttt{_}, is the coercion function \texttt{Trueprop}.
\begin{ttbox}
fun atomize th = case concl_of th of 
    _ $ (Const("op &",_) $ _ $ _)   => atomize(th RS conjunct1) \at
                                       atomize(th RS conjunct2)
  | _ $ (Const("op -->",_) $ _ $ _) => atomize(th RS mp)
  | _ $ (Const("All",_) $ _)        => atomize(th RS spec)
  | _ $ (Const("True",_))           => []
  | _ $ (Const("False",_))          => []
  | _                               => [th];
\end{ttbox}
There are several cases, depending upon the form of the conclusion:
\begin{itemize}
\item Conjunction: extract rewrites from both conjuncts.
\item Implication: convert $P\imp Q$ to the meta-implication $P\Imp Q$ and
  extract rewrites from~$Q$; these will be conditional rewrites with the
  condition~$P$.
\item Universal quantification: remove the quantifier, replacing the bound
  variable by a schematic variable, and extract rewrites from the body.
\item \texttt{True} and \texttt{False} contain no useful rewrites.
\item Anything else: return the theorem in a singleton list.
\end{itemize}
The resulting theorems are not literally atomic --- they could be
disjunctive, for example --- but are broken down as much as possible. 
See the file \texttt{ZF/simpdata.ML} for a sophisticated translation of
set-theoretic formulae into rewrite rules. 

For standard situations like the above,
there is a generic auxiliary function \ttindexbold{mk_atomize} that takes a 
list of pairs $(name, thms)$, where $name$ is an operator name and
$thms$ is a list of theorems to resolve with in case the pattern matches, 
and returns a suitable \texttt{atomize} function.


The simplified rewrites must now be converted into meta-equalities.  The
rule \texttt{eq_reflection} converts equality rewrites, while {\tt
  iff_reflection} converts if-and-only-if rewrites.  The latter possibility
can arise in two other ways: the negative theorem~$\neg P$ is converted to
$P\equiv\texttt{False}$, and any other theorem~$P$ is converted to
$P\equiv\texttt{True}$.  The rules \texttt{iff_reflection_F} and {\tt
  iff_reflection_T} accomplish this conversion.
\begin{ttbox}
val P_iff_F = int_prove_fun "~P ==> (P <-> False)";
val iff_reflection_F = P_iff_F RS iff_reflection;
\ttbreak
val P_iff_T = int_prove_fun "P ==> (P <-> True)";
val iff_reflection_T = P_iff_T RS iff_reflection;
\end{ttbox}
The function \texttt{mk_eq} converts a theorem to a meta-equality
using the case analysis described above.
\begin{ttbox}
fun mk_eq th = case concl_of th of
    _ $ (Const("op =",_)$_$_)   => th RS eq_reflection
  | _ $ (Const("op <->",_)$_$_) => th RS iff_reflection
  | _ $ (Const("Not",_)$_)      => th RS iff_reflection_F
  | _                           => th RS iff_reflection_T;
\end{ttbox}
The three functions \texttt{gen_all}, \texttt{atomize} and \texttt{mk_eq} 
will be composed together and supplied below to \texttt{setmksimps}.


\subsection{Making the initial simpset}

It is time to assemble these items.  We define the infix operator
\ttindex{addcongs} to insert congruence rules; given a list of
theorems, it converts their conclusions into meta-equalities and
passes them to \ttindex{addeqcongs}.
\begin{ttbox}
infix 4 addcongs;
fun ss addcongs congs =
    ss addeqcongs (congs RL [eq_reflection,iff_reflection]);
\end{ttbox}

The list \texttt{IFOL_simps} contains the default rewrite rules for
intuitionistic first-order logic.  The first of these is the reflexive law
expressed as the equivalence $(a=a)\bimp\texttt{True}$; the rewrite rule $a=a$ is
clearly useless.
\begin{ttbox}
val IFOL_simps =
   [refl RS P_iff_T] \at conj_simps \at disj_simps \at not_simps \at 
    imp_simps \at iff_simps \at quant_simps;
\end{ttbox}
The list \texttt{triv_rls} contains trivial theorems for the solver.  Any
subgoal that is simplified to one of these will be removed.
\begin{ttbox}
val notFalseI = int_prove_fun "~False";
val triv_rls = [TrueI,refl,iff_refl,notFalseI];
\end{ttbox}
%
The basic simpset for intuitionistic \FOL{} is
\ttindexbold{FOL_basic_ss}.  It preprocess rewrites using {\tt
  gen_all}, \texttt{atomize} and \texttt{mk_eq}.  It solves simplified
subgoals using \texttt{triv_rls} and assumptions, and by detecting
contradictions.  It uses \ttindex{asm_simp_tac} to tackle subgoals of
conditional rewrites.

Other simpsets built from \texttt{FOL_basic_ss} will inherit these items.
In particular, \ttindexbold{IFOL_ss}, which introduces {\tt
  IFOL_simps} as rewrite rules.  \ttindexbold{FOL_ss} will later
extend \texttt{IFOL_ss} with classical rewrite rules such as $\neg\neg
P\bimp P$.
\index{*setmksimps}\index{*setSSolver}\index{*setSolver}\index{*setsubgoaler}
\index{*addsimps}\index{*addcongs}
\begin{ttbox}
fun unsafe_solver prems = FIRST'[resolve_tac (triv_rls {\at} prems),
                                 atac, etac FalseE];

fun   safe_solver prems = FIRST'[match_tac (triv_rls {\at} prems),
                                 eq_assume_tac, ematch_tac [FalseE]];

val FOL_basic_ss = empty_ss setsubgoaler asm_simp_tac
                            addsimprocs [defALL_regroup, defEX_regroup]
                            setSSolver   safe_solver
                            setSolver  unsafe_solver
                            setmksimps (map mk_eq o atomize o gen_all);

val IFOL_ss = FOL_basic_ss addsimps (IFOL_simps {\at} 
                                     int_ex_simps {\at} int_all_simps)
                           addcongs [imp_cong];
\end{ttbox}
This simpset takes \texttt{imp_cong} as a congruence rule in order to use
contextual information to simplify the conclusions of implications:
\[ \List{\Var{P}\bimp\Var{P'};\; \Var{P'} \Imp \Var{Q}\bimp\Var{Q'}} \Imp
   (\Var{P}\imp\Var{Q}) \bimp (\Var{P'}\imp\Var{Q'})
\]
By adding the congruence rule \texttt{conj_cong}, we could obtain a similar
effect for conjunctions.


\subsection{Splitter setup}\index{simplification!setting up the splitter}

To set up case splitting, we have to call the \ML{} functor \ttindex{
SplitterFun}, which takes the argument signature \texttt{SPLITTER_DATA}. 
So we prove the theorem \texttt{meta_eq_to_iff} below and store it, together
with the \texttt{mk_eq} function described above and several standard
theorems, in the structure \texttt{SplitterData}. Calling the functor with
this data yields a new instantiation of the splitter for our logic.
\begin{ttbox}
val meta_eq_to_iff = prove_goal IFOL.thy "x==y ==> x<->y"
  (fn [prem] => [rewtac prem, rtac iffI 1, atac 1, atac 1]);
\ttbreak
structure SplitterData =
  struct
  structure Simplifier = Simplifier
  val mk_eq          = mk_eq
  val meta_eq_to_iff = meta_eq_to_iff
  val iffD           = iffD2
  val disjE          = disjE
  val conjE          = conjE
  val exE            = exE
  val contrapos      = contrapos
  val contrapos2     = contrapos2
  val notnotD        = notnotD
  end;
\ttbreak
structure Splitter = SplitterFun(SplitterData);
\end{ttbox}


\subsection{Theory setup}\index{simplification!setting up the theory}
\begin{ttbox}\indexbold{*Simplifier.setup}\index{*setup!simplifier}
Simplifier.setup: (theory -> theory) list
\end{ttbox}

Advanced theory related features of the simplifier (e.g.\ implicit
simpset support) have to be set up explicitly.  The simplifier already
provides a suitable setup function definition.  This has to be
installed into the base theory of any new object-logic via a
\texttt{setup} declaration.

For example, this is done in \texttt{FOL/IFOL.thy} as follows:
\begin{ttbox}
setup Simplifier.setup
\end{ttbox}


\index{simplification|)}


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ref"
%%% End: