doc-src/Ref/theories.tex
author wenzelm
Sun, 11 Nov 2001 21:34:09 +0100
changeset 12143 dc42d17c5b53
parent 11623 9c95b6a76e15
child 19627 b07c46e67e2d
permissions -rw-r--r--
added add_thmss;


%% $Id$

\chapter{Theories, Terms and Types} \label{theories}
\index{theories|(}\index{signatures|bold}
\index{reading!axioms|see{\texttt{assume_ax}}} Theories organize the syntax,
declarations and axioms of a mathematical development.  They are built,
starting from the Pure or CPure theory, by extending and merging existing
theories.  They have the \ML\ type \mltydx{theory}.  Theory operations signal
errors by raising exception \xdx{THEORY}, returning a message and a list of
theories.

Signatures, which contain information about sorts, types, constants and
syntax, have the \ML\ type~\mltydx{Sign.sg}.  For identification, each
signature carries a unique list of \bfindex{stamps}, which are \ML\
references to strings.  The strings serve as human-readable names; the
references serve as unique identifiers.  Each primitive signature has a
single stamp.  When two signatures are merged, their lists of stamps are
also merged.  Every theory carries a unique signature.

Terms and types are the underlying representation of logical syntax.  Their
\ML\ definitions are irrelevant to naive Isabelle users.  Programmers who
wish to extend Isabelle may need to know such details, say to code a tactic
that looks for subgoals of a particular form.  Terms and types may be
`certified' to be well-formed with respect to a given signature.


\section{Defining theories}\label{sec:ref-defining-theories}

Theories are defined via theory files $name$\texttt{.thy} (there are also
\ML-level interfaces which are only intended for people building advanced
theory definition packages).  Appendix~\ref{app:TheorySyntax} presents the
concrete syntax for theory files; here follows an explanation of the
constituent parts.
\begin{description}
\item[{\it theoryDef}] is the full definition.  The new theory is called $id$.
  It is the union of the named \textbf{parent
    theories}\indexbold{theories!parent}, possibly extended with new
  components.  \thydx{Pure} and \thydx{CPure} are the basic theories, which
  contain only the meta-logic.  They differ just in their concrete syntax for
  function applications.
  
  The new theory begins as a merge of its parents.
  \begin{ttbox}
    Attempt to merge different versions of theories: "\(T@1\)", \(\ldots\), "\(T@n\)"
  \end{ttbox}
  This error may especially occur when a theory is redeclared --- say to
  change an inappropriate definition --- and bindings to old versions persist.
  Isabelle ensures that old and new theories of the same name are not involved
  in a proof.

\item[$classes$]
  is a series of class declarations.  Declaring {\tt$id$ < $id@1$ \dots\
    $id@n$} makes $id$ a subclass of the existing classes $id@1\dots
  id@n$.  This rules out cyclic class structures.  Isabelle automatically
  computes the transitive closure of subclass hierarchies; it is not
  necessary to declare \texttt{c < e} in addition to \texttt{c < d} and \texttt{d <
    e}.

\item[$default$]
  introduces $sort$ as the new default sort for type variables.  This applies
  to unconstrained type variables in an input string but not to type
  variables created internally.  If omitted, the default sort is the listwise
  union of the default sorts of the parent theories (i.e.\ their logical
  intersection).
  
\item[$sort$] is a finite set of classes.  A single class $id$ abbreviates the
  sort $\{id\}$.

\item[$types$]
  is a series of type declarations.  Each declares a new type constructor
  or type synonym.  An $n$-place type constructor is specified by
  $(\alpha@1,\dots,\alpha@n)name$, where the type variables serve only to
  indicate the number~$n$.

  A \textbf{type synonym}\indexbold{type synonyms} is an abbreviation
  $(\alpha@1,\dots,\alpha@n)name = \tau$, where $name$ and $\tau$ can
  be strings.

\item[$infix$]
  declares a type or constant to be an infix operator having priority $nat$
  and associating to the left (\texttt{infixl}) or right (\texttt{infixr}).
  Only 2-place type constructors can have infix status; an example is {\tt
  ('a,'b)~"*"~(infixr~20)}, which may express binary product types.

\item[$arities$] is a series of type arity declarations.  Each assigns
  arities to type constructors.  The $name$ must be an existing type
  constructor, which is given the additional arity $arity$.
  
\item[$nonterminals$]\index{*nonterminal symbols} declares purely
  syntactic types to be used as nonterminal symbols of the context
  free grammar.

\item[$consts$] is a series of constant declarations.  Each new
  constant $name$ is given the specified type.  The optional $mixfix$
  annotations may attach concrete syntax to the constant.
  
\item[$syntax$] \index{*syntax section}\index{print mode} is a variant
  of $consts$ which adds just syntax without actually declaring
  logical constants.  This gives full control over a theory's context
  free grammar.  The optional $mode$ specifies the print mode where the
  mixfix productions should be added.  If there is no \texttt{output}
  option given, all productions are also added to the input syntax
  (regardless of the print mode).

\item[$mixfix$] \index{mixfix declarations}
  annotations can take three forms:
  \begin{itemize}
  \item A mixfix template given as a $string$ of the form
    {\tt"}\dots{\tt\_}\dots{\tt\_}\dots{\tt"} where the $i$-th underscore
    indicates the position where the $i$-th argument should go.  The list
    of numbers gives the priority of each argument.  The final number gives
    the priority of the whole construct.

  \item A constant $f$ of type $\tau@1\To(\tau@2\To\tau)$ can be given {\bf
    infix} status.

  \item A constant $f$ of type $(\tau@1\To\tau@2)\To\tau$ can be given {\bf
    binder} status.  The declaration \texttt{binder} $\cal Q$ $p$ causes
  ${\cal Q}\,x.F(x)$ to be treated
  like $f(F)$, where $p$ is the priority.
  \end{itemize}

\item[$trans$]
  specifies syntactic translation rules (macros).  There are three forms:
  parse rules (\texttt{=>}), print rules (\texttt{<=}), and parse/print rules ({\tt
  ==}).

\item[$rules$]
  is a series of rule declarations.  Each has a name $id$ and the formula is
  given by the $string$.  Rule names must be distinct within any single
  theory.

\item[$defs$] is a series of definitions.  They are just like $rules$, except
  that every $string$ must be a definition (see below for details).

\item[$constdefs$] combines the declaration of constants and their
  definition.  The first $string$ is the type, the second the definition.
  
\item[$axclass$] \index{*axclass section} defines an \rmindex{axiomatic type
    class} \cite{Wenzel:1997:TPHOL} as the intersection of existing classes,
  with additional axioms holding.  Class axioms may not contain more than one
  type variable.  The class axioms (with implicit sort constraints added) are
  bound to the given names.  Furthermore a class introduction rule is
  generated, which is automatically employed by $instance$ to prove
  instantiations of this class.
  
\item[$instance$] \index{*instance section} proves class inclusions or
  type arities at the logical level and then transfers these to the
  type signature.  The instantiation is proven and checked properly.
  The user has to supply sufficient witness information: theorems
  ($longident$), axioms ($string$), or even arbitrary \ML{} tactic
  code $verbatim$.

\item[$oracle$] links the theory to a trusted external reasoner.  It is
  allowed to create theorems, but each theorem carries a proof object
  describing the oracle invocation.  See \S\ref{sec:oracles} for details.
  
\item[$local$, $global$] change the current name declaration mode.
  Initially, theories start in $local$ mode, causing all names of
  types, constants, axioms etc.\ to be automatically qualified by the
  theory name.  Changing this to $global$ causes all names to be
  declared as short base names only.
  
  The $local$ and $global$ declarations act like switches, affecting
  all following theory sections until changed again explicitly.  Also
  note that the final state at the end of the theory will persist.  In
  particular, this determines how the names of theorems stored later
  on are handled.
  
\item[$setup$]\index{*setup!theory} applies a list of ML functions to
  the theory.  The argument should denote a value of type
  \texttt{(theory -> theory) list}.  Typically, ML packages are
  initialized in this way.

\item[$ml$] \index{*ML section}
  consists of \ML\ code, typically for parse and print translation functions.
\end{description}
%
Chapters~\ref{Defining-Logics} and \ref{chap:syntax} explain mixfix
declarations, translation rules and the \texttt{ML} section in more detail.


\subsection{Definitions}\indexbold{definitions}

\textbf{Definitions} are intended to express abbreviations.  The simplest
form of a definition is $f \equiv t$, where $f$ is a constant.
Isabelle also allows a derived forms where the arguments of~$f$ appear
on the left, abbreviating a string of $\lambda$-abstractions.

Isabelle makes the following checks on definitions:
\begin{itemize}
\item Arguments (on the left-hand side) must be distinct variables.
\item All variables on the right-hand side must also appear on the left-hand
  side. 
\item All type variables on the right-hand side must also appear on
  the left-hand side; this prohibits definitions such as {\tt
    (zero::nat) == length ([]::'a list)}.
\item The definition must not be recursive.  Most object-logics provide
  definitional principles that can be used to express recursion safely.
\end{itemize}
These checks are intended to catch the sort of errors that might be made
accidentally.  Misspellings, for instance, might result in additional
variables appearing on the right-hand side.  More elaborate checks could be
made, but the cost might be overly strict rules on declaration order, etc.


\subsection{*Classes and arities}
\index{classes!context conditions}\index{arities!context conditions}

In order to guarantee principal types~\cite{nipkow-prehofer},
arity declarations must obey two conditions:
\begin{itemize}
\item There must not be any two declarations $ty :: (\vec{r})c$ and
  $ty :: (\vec{s})c$ with $\vec{r} \neq \vec{s}$.  For example, this
  excludes the following:
\begin{ttbox}
arities
  foo :: (\{logic{\}}) logic
  foo :: (\{{\}})logic
\end{ttbox}

\item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::
  (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold
  for $i=1,\dots,n$.  The relationship $\preceq$, defined as
\[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]
expresses that the set of types represented by $s'$ is a subset of the
set of types represented by $s$.  Assuming $term \preceq logic$, the
following is forbidden:
\begin{ttbox}
arities
  foo :: (\{logic{\}})logic
  foo :: (\{{\}})term
\end{ttbox}

\end{itemize}


\section{The theory loader}\label{sec:more-theories}
\index{theories!reading}\index{files!reading}

Isabelle's theory loader manages dependencies of the internal graph of theory
nodes (the \emph{theory database}) and the external view of the file system.
See \S\ref{sec:intro-theories} for its most basic commands, such as
\texttt{use_thy}.  There are a few more operations available.

\begin{ttbox}
use_thy_only    : string -> unit
update_thy_only : string -> unit
touch_thy       : string -> unit
remove_thy      : string -> unit
delete_tmpfiles : bool ref \hfill\textbf{initially true}
\end{ttbox}

\begin{ttdescription}
\item[\ttindexbold{use_thy_only} "$name$";] is similar to \texttt{use_thy},
  but processes the actual theory file $name$\texttt{.thy} only, ignoring
  $name$\texttt{.ML}.  This might be useful in replaying proof scripts by hand
  from the very beginning, starting with the fresh theory.
  
\item[\ttindexbold{update_thy_only} "$name$";] is similar to
  \texttt{update_thy}, but processes the actual theory file
  $name$\texttt{.thy} only, ignoring $name$\texttt{.ML}.

\item[\ttindexbold{touch_thy} "$name$";] marks theory node $name$ of the
  internal graph as outdated.  While the theory remains usable, subsequent
  operations such as \texttt{use_thy} may cause a reload.
  
\item[\ttindexbold{remove_thy} "$name$";] deletes theory node $name$,
  including \emph{all} of its descendants.  Beware!  This is a quick way to
  dispose a large number of theories at once.  Note that {\ML} bindings to
  theorems etc.\ of removed theories may still persist.
  
\item[reset \ttindexbold{delete_tmpfiles};] processing theory files usually
  involves temporary {\ML} files to be created.  By default, these are deleted
  afterwards.  Resetting the \texttt{delete_tmpfiles} flag inhibits this,
  leaving the generated code for debugging purposes.  The basic location for
  temporary files is determined by the \texttt{ISABELLE_TMP} environment
  variable (which is private to the running Isabelle process and may be
  retrieved by \ttindex{getenv} from {\ML}).
\end{ttdescription}

\medskip Theory and {\ML} files are located by skimming through the
directories listed in Isabelle's internal load path, which merely contains the
current directory ``\texttt{.}'' by default.  The load path may be accessed by
the following operations.

\begin{ttbox}
show_path: unit -> string list
add_path: string -> unit
del_path: string -> unit
reset_path: unit -> unit
with_path: string -> ('a -> 'b) -> 'a -> 'b
no_document: ('a -> 'b) -> 'a -> 'b
\end{ttbox}

\begin{ttdescription}
\item[\ttindexbold{show_path}();] displays the load path components in
  canonical string representation (which is always according to Unix rules).
  
\item[\ttindexbold{add_path} "$dir$";] adds component $dir$ to the beginning
  of the load path.
  
\item[\ttindexbold{del_path} "$dir$";] removes any occurrences of component
  $dir$ from the load path.
  
\item[\ttindexbold{reset_path}();] resets the load path to ``\texttt{.}''
  (current directory) only.
  
\item[\ttindexbold{with_path} "$dir$" $f$ $x$;] temporarily adds component
  $dir$ to the beginning of the load path while executing $(f~x)$.
  
\item[\ttindexbold{no_document} $f$ $x$;] temporarily disables {\LaTeX}
  document generation while executing $(f~x)$.

\end{ttdescription}

Furthermore, in operations referring indirectly to some file (e.g.\ 
\texttt{use_dir}) the argument may be prefixed by a directory that will be
temporarily appended to the load path, too.


\section{Locales}
\label{Locales}

Locales \cite{kammueller-locales} are a concept of local proof contexts.  They
are introduced as named syntactic objects within theories and can be
opened in any descendant theory.

\subsection{Declaring Locales}

A locale is declared in a theory section that starts with the
keyword \texttt{locale}.  It consists typically of three parts, the
\texttt{fixes} part, the \texttt{assumes} part, and the \texttt{defines} part.
Appendix \ref{app:TheorySyntax} presents the full syntax.

\subsubsection{Parts of Locales}

The subsection introduced by the keyword \texttt{fixes} declares the locale
constants in a way that closely resembles a global \texttt{consts}
declaration.  In particular, there may be an optional pretty printing syntax
for the locale constants.

The subsequent \texttt{assumes} part specifies the locale rules.  They are
defined like \texttt{rules}: by an identifier followed by the rule
given as a string.  Locale rules admit the statement of local assumptions
about the locale constants.  The \texttt{assumes} part is optional.  Non-fixed
variables in locale rules are automatically bound by the universal quantifier
\texttt{!!} of the meta-logic.

Finally, the \texttt{defines} part introduces the definitions that are
available in the locale.  Locale constants declared in the \texttt{fixes}
section are defined using the meta-equality \texttt{==}.  If the
locale constant is a functiond then its definition can (as usual) have
variables on the left-hand side acting as formal parameters; they are
considered as schematic variables and are automatically generalized by
universal quantification of the meta-logic.  The right hand side of a
definition must not contain variables that are not already on the left hand
side.  In so far locale definitions behave like theory level definitions.
However, the locale concept realizes \emph{dependent definitions}: any variable
that is fixed as a locale constant can occur on the right hand side of
definitions.  For an illustration of these dependent definitions see the
occurrence of the locale constant \texttt{G} on the right hand side of the
definitions of the locale \texttt{group} below.  Naturally, definitions can
already use the syntax of the locale constants in the \texttt{fixes}
subsection.  The \texttt{defines} part is, as the \texttt{assumes} part,
optional.

\subsubsection{Example for Definition}
The concrete syntax of locale definitions is demonstrated by example below.

Locale \texttt{group} assumes the definition of groups in a theory
file\footnote{This and other examples are from \texttt{HOL/ex}.}.  A locale
defining a convenient proof environment for group related proofs may be
added to the theory as follows:
\begin{ttbox}
  locale group =
    fixes 
      G         :: "'a grouptype"
      e         :: "'a"
      binop     :: "'a => 'a => 'a"        (infixr "#" 80)
      inv       :: "'a => 'a"              ("i(_)" [90] 91)
    assumes
      Group_G   "G: Group"
    defines
      e_def     "e == unit G"
      binop_def "x # y == bin_op G x y"
      inv_def   "i(x) == inverse G x"
\end{ttbox}

\subsubsection{Polymorphism}

In contrast to polymorphic definitions in theories, the use of the
same type variable for the declaration of different locale constants in the
fixes part means \emph{the same} type.  In other words, the scope of the
polymorphic variables is extended over all constant declarations of a locale.
In the above example \texttt{'a} refers to the same type which is fixed inside
the locale.  In an exported theorem (see \S\ref{sec:locale-export}) the
constructors of locale \texttt{group} are polymorphic, yet only simultaneously
instantiatable.

\subsubsection{Nested Locales}

A locale can be defined as the extension of a previously defined
locale.  This operation of extension is optional and is syntactically
expressed as 
\begin{ttbox}
locale foo = bar + ...
\end{ttbox}
The locale \texttt{foo} builds on the constants and syntax of the locale {\tt
bar}.  That is, all contents of the locale \texttt{bar} can be used in
definitions and rules of the corresponding parts of the locale {\tt
foo}.  Although locale \texttt{foo} assumes the \texttt{fixes} part of \texttt{bar} it
does not automatically subsume its rules and definitions.  Normally, one
expects to use locale \texttt{foo} only if locale \texttt{bar} is already
active.  These aspects of use and activation of locales are considered in the
subsequent section.


\subsection{Locale Scope}

Locales are by default inactive, but they can be invoked.  The list of
currently active locales is called \emph{scope}.  The process of activating
them is called \emph{opening}; the reverse is \emph{closing}.

\subsubsection{Scope}
The locale scope is part of each theory.  It is a dynamic stack containing
all active locales at a certain point in an interactive session.
The scope lives until all locales are explicitly closed.  At one time there
can be more than one locale open.  The contents of these various active
locales are all visible in the scope.  In case of nested locales for example,
the nesting is actually reflected to the scope, which contains the nested
locales as layers.  To check the state of the scope during a development the
function \texttt{Print\_scope} may be used.  It displays the names of all open
locales on the scope.  The function \texttt{print\_locales} applied to a theory
displays all locales contained in that theory and in addition also the
current scope.

The scope is manipulated by the commands for opening and closing of locales. 

\subsubsection{Opening}
Locales can be \emph{opened} at any point during a session where
we want to prove theorems concerning the locale.  Opening a locale means
making its contents visible by pushing it onto the scope of the current
theory.  Inside a scope of opened locales, theorems can use all definitions and
rules contained in the locales on the scope.  The rules and definitions may
be accessed individually using the function \ttindex{thm}.  This function is
applied to the names assigned to locale rules and definitions as
strings.  The opening command is called \texttt{Open\_locale} and takes the 
name of the locale to be opened as its argument.

If one opens a locale \texttt{foo} that is defined by extension from locale
\texttt{bar}, the function \texttt{Open\_locale} checks if locale \texttt{bar}
is open.  If so, then it just opens \texttt{foo}, if not, then it prints a
message and opens \texttt{bar} before opening \texttt{foo}.  Naturally, this
carries on, if \texttt{bar} is again an extension.

\subsubsection{Closing}

\emph{Closing} means to cancel the last opened locale, pushing it out of the
scope.  Theorems proved during the life cycle of this locale will be disabled,
unless they have been explicitly exported, as described below.  However, when
the same locale is opened again these theorems may be used again as well,
provided that they were saved as theorems in the first place, using
\texttt{qed} or ML assignment.  The command \texttt{Close\_locale} takes a
locale name as a string and checks if this locale is actually the topmost
locale on the scope.  If this is the case, it removes this locale, otherwise
it prints a warning message and does not change the scope.

\subsubsection{Export of Theorems}
\label{sec:locale-export}

Export of theorems transports theorems out of the scope of locales.  Locale
rules that have been used in the proof of an exported theorem inside the
locale are carried by the exported form of the theorem as its individual
meta-assumptions.  The locale constants are universally quantified variables
in these theorems, hence such theorems can be instantiated individually.
Definitions become unfolded; locale constants that were merely used for
definitions vanish.  Logically, exporting corresponds to a combined
application of introduction rules for implication and universal
quantification.  Exporting forms a kind of normalization of theorems in a
locale scope.

According to the possibility of nested locales there are two different forms
of export.  The first one is realized by the function \texttt{export} that
exports theorems through all layers of opened locales of the scope.  Hence,
the application of export to a theorem yields a theorem of the global level,
that is, the current theory context without any local assumptions or
definitions.

When locales are nested we might want to export a theorem, not to the global
level of the current theory but just to the previous level.  The other export
function, \texttt{Export}, transports theorems one level up in the scope; the
theorem still uses locale constants, definitions and rules of the locales
underneath.

\subsection{Functions for Locales}
\label{Syntax}
\index{locales!functions}

Here is a quick reference list of locale functions.
\begin{ttbox}
  Open_locale  : xstring -> unit 
  Close_locale : xstring -> unit
  export       :     thm -> thm
  Export       :     thm -> thm
  thm          : xstring -> thm
  Print_scope  :    unit -> unit
  print_locales:  theory -> unit
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Open_locale} $xstring$] 
    opens the locale {\it xstring}, adding it to the scope of the theory of the
  current context.  If the opened locale is built by extension, the ancestors
  are opened automatically.
  
\item[\ttindexbold{Close_locale} $xstring$] eliminates the locale {\it
    xstring} from the scope if it is the topmost item on it, otherwise it does
  not change the scope and produces a warning.

\item[\ttindexbold{export} $thm$] locale definitions become expanded in {\it
    thm} and locale rules that were used in the proof of {\it thm} become part
  of its individual assumptions.  This normalization happens with respect to
  \emph{all open locales} on the scope.
  
\item[\ttindexbold{Export} $thm$] works like \texttt{export} but normalizes
  theorems only up to the previous level of locales on the scope.
  
\item[\ttindexbold{thm} $xstring$] applied to the name of a locale definition
  or rule it returns the definition as a theorem.
  
\item[\ttindexbold{Print_scope}()] prints the names of the locales in the
  current scope of the current theory context.
  
\item[\ttindexbold{print_locale} $theory$] prints all locales that are
  contained in {\it theory} directly or indirectly.  It also displays the
  current scope similar to \texttt{Print\_scope}.
\end{ttdescription}


\section{Basic operations on theories}\label{BasicOperationsOnTheories}

\subsection{*Theory inclusion}
\begin{ttbox}
subthy      : theory * theory -> bool
eq_thy      : theory * theory -> bool
transfer    : theory -> thm -> thm
transfer_sg : Sign.sg -> thm -> thm
\end{ttbox}

Inclusion and equality of theories is determined by unique
identification stamps that are created when declaring new components.
Theorems contain a reference to the theory (actually to its signature)
they have been derived in.  Transferring theorems to super theories
has no logical significance, but may affect some operations in subtle
ways (e.g.\ implicit merges of signatures when applying rules, or
pretty printing of theorems).

\begin{ttdescription}

\item[\ttindexbold{subthy} ($thy@1$, $thy@2$)] determines if $thy@1$
  is included in $thy@2$ wrt.\ identification stamps.

\item[\ttindexbold{eq_thy} ($thy@1$, $thy@2$)] determines if $thy@1$
  is exactly the same as $thy@2$.

\item[\ttindexbold{transfer} $thy$ $thm$] transfers theorem $thm$ to
  theory $thy$, provided the latter includes the theory of $thm$.
  
\item[\ttindexbold{transfer_sg} $sign$ $thm$] is similar to
  \texttt{transfer}, but identifies the super theory via its
  signature.

\end{ttdescription}


\subsection{*Primitive theories}
\begin{ttbox}
ProtoPure.thy  : theory
Pure.thy       : theory
CPure.thy      : theory
\end{ttbox}
\begin{description}
\item[\ttindexbold{ProtoPure.thy}, \ttindexbold{Pure.thy},
  \ttindexbold{CPure.thy}] contain the syntax and signature of the
  meta-logic.  There are basically no axioms: meta-level inferences
  are carried out by \ML\ functions.  \texttt{Pure} and \texttt{CPure}
  just differ in their concrete syntax of prefix function application:
  $t(u@1, \ldots, u@n)$ in \texttt{Pure} vs.\ $t\,u@1,\ldots\,u@n$ in
  \texttt{CPure}.  \texttt{ProtoPure} is their common parent,
  containing no syntax for printing prefix applications at all!

%% FIXME
%\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"} $\cdots$] extends
%  the theory $thy$ with new types, constants, etc.  $T$ identifies the theory
%  internally.  When a theory is redeclared, say to change an incorrect axiom,
%  bindings to the old axiom may persist.  Isabelle ensures that the old and
%  new theories are not involved in the same proof.  Attempting to combine
%  different theories having the same name $T$ yields the fatal error
%extend_theory  : theory -> string -> \(\cdots\) -> theory
%\begin{ttbox}
%Attempt to merge different versions of theory: \(T\)
%\end{ttbox}
\end{description}

%% FIXME
%\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"}
%      ($classes$, $default$, $types$, $arities$, $consts$, $sextopt$) $rules$]
%\hfill\break   %%% include if line is just too short
%is the \ML{} equivalent of the following theory definition:
%\begin{ttbox}
%\(T\) = \(thy\) +
%classes \(c\) < \(c@1\),\(\dots\),\(c@m\)
%        \dots
%default {\(d@1,\dots,d@r\)}
%types   \(tycon@1\),\dots,\(tycon@i\) \(n\)
%        \dots
%arities \(tycon@1'\),\dots,\(tycon@j'\) :: (\(s@1\),\dots,\(s@n\))\(c\)
%        \dots
%consts  \(b@1\),\dots,\(b@k\) :: \(\tau\)
%        \dots
%rules   \(name\) \(rule\)
%        \dots
%end
%\end{ttbox}
%where
%\begin{tabular}[t]{l@{~=~}l}
%$classes$ & \tt[("$c$",["$c@1$",\dots,"$c@m$"]),\dots] \\
%$default$ & \tt["$d@1$",\dots,"$d@r$"]\\
%$types$   & \tt[([$tycon@1$,\dots,$tycon@i$], $n$),\dots] \\
%$arities$ & \tt[([$tycon'@1$,\dots,$tycon'@j$], ([$s@1$,\dots,$s@n$],$c$)),\dots]
%\\
%$consts$  & \tt[([$b@1$,\dots,$b@k$],$\tau$),\dots] \\
%$rules$   & \tt[("$name$",$rule$),\dots]
%\end{tabular}


\subsection{Inspecting a theory}\label{sec:inspct-thy}
\index{theories!inspecting|bold}
\begin{ttbox}
print_syntax        : theory -> unit
print_theory        : theory -> unit
parents_of          : theory -> theory list
ancestors_of        : theory -> theory list
sign_of             : theory -> Sign.sg
Sign.stamp_names_of : Sign.sg -> string list
\end{ttbox}
These provide means of viewing a theory's components.
\begin{ttdescription}
\item[\ttindexbold{print_syntax} $thy$] prints the syntax of $thy$
  (grammar, macros, translation functions etc., see
  page~\pageref{pg:print_syn} for more details).
  
\item[\ttindexbold{print_theory} $thy$] prints the logical parts of
  $thy$, excluding the syntax.
  
\item[\ttindexbold{parents_of} $thy$] returns the direct ancestors
  of~$thy$.
  
\item[\ttindexbold{ancestors_of} $thy$] returns all ancestors of~$thy$
  (not including $thy$ itself).
  
\item[\ttindexbold{sign_of} $thy$] returns the signature associated
  with~$thy$.  It is useful with functions like {\tt
    read_instantiate_sg}, which take a signature as an argument.
  
\item[\ttindexbold{Sign.stamp_names_of} $sg$]\index{signatures}
  returns the names of the identification \rmindex{stamps} of ax
  signature.  These coincide with the names of its full ancestry
  including that of $sg$ itself.

\end{ttdescription}


\section{Terms}\label{sec:terms}
\index{terms|bold}
Terms belong to the \ML\ type \mltydx{term}, which is a concrete datatype
with six constructors:
\begin{ttbox}
type indexname = string * int;
infix 9 $;
datatype term = Const of string * typ
              | Free  of string * typ
              | Var   of indexname * typ
              | Bound of int
              | Abs   of string * typ * term
              | op $  of term * term;
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Const} ($a$, $T$)] \index{constants|bold}
  is the \textbf{constant} with name~$a$ and type~$T$.  Constants include
  connectives like $\land$ and $\forall$ as well as constants like~0
  and~$Suc$.  Other constants may be required to define a logic's concrete
  syntax.

\item[\ttindexbold{Free} ($a$, $T$)] \index{variables!free|bold}
  is the \textbf{free variable} with name~$a$ and type~$T$.

\item[\ttindexbold{Var} ($v$, $T$)] \index{unknowns|bold}
  is the \textbf{scheme variable} with indexname~$v$ and type~$T$.  An
  \mltydx{indexname} is a string paired with a non-negative index, or
  subscript; a term's scheme variables can be systematically renamed by
  incrementing their subscripts.  Scheme variables are essentially free
  variables, but may be instantiated during unification.

\item[\ttindexbold{Bound} $i$] \index{variables!bound|bold}
  is the \textbf{bound variable} with de Bruijn index~$i$, which counts the
  number of lambdas, starting from zero, between a variable's occurrence
  and its binding.  The representation prevents capture of variables.  For
  more information see de Bruijn \cite{debruijn72} or
  Paulson~\cite[page~376]{paulson-ml2}.

\item[\ttindexbold{Abs} ($a$, $T$, $u$)]
  \index{lambda abs@$\lambda$-abstractions|bold}
  is the $\lambda$-\textbf{abstraction} with body~$u$, and whose bound
  variable has name~$a$ and type~$T$.  The name is used only for parsing
  and printing; it has no logical significance.

\item[$t$ \$ $u$] \index{$@{\tt\$}|bold} \index{function applications|bold}
is the \textbf{application} of~$t$ to~$u$.
\end{ttdescription}
Application is written as an infix operator to aid readability.  Here is an
\ML\ pattern to recognize FOL formulae of the form~$A\imp B$, binding the
subformulae to~$A$ and~$B$:
\begin{ttbox}
Const("Trueprop",_) $ (Const("op -->",_) $ A $ B)
\end{ttbox}


\section{*Variable binding}
\begin{ttbox}
loose_bnos     : term -> int list
incr_boundvars : int -> term -> term
abstract_over  : term*term -> term
variant_abs    : string * typ * term -> string * term
aconv          : term * term -> bool\hfill\textbf{infix}
\end{ttbox}
These functions are all concerned with the de Bruijn representation of
bound variables.
\begin{ttdescription}
\item[\ttindexbold{loose_bnos} $t$]
  returns the list of all dangling bound variable references.  In
  particular, \texttt{Bound~0} is loose unless it is enclosed in an
  abstraction.  Similarly \texttt{Bound~1} is loose unless it is enclosed in
  at least two abstractions; if enclosed in just one, the list will contain
  the number 0.  A well-formed term does not contain any loose variables.

\item[\ttindexbold{incr_boundvars} $j$]
  increases a term's dangling bound variables by the offset~$j$.  This is
  required when moving a subterm into a context where it is enclosed by a
  different number of abstractions.  Bound variables with a matching
  abstraction are unaffected.

\item[\ttindexbold{abstract_over} $(v,t)$]
  forms the abstraction of~$t$ over~$v$, which may be any well-formed term.
  It replaces every occurrence of \(v\) by a \texttt{Bound} variable with the
  correct index.

\item[\ttindexbold{variant_abs} $(a,T,u)$]
  substitutes into $u$, which should be the body of an abstraction.
  It replaces each occurrence of the outermost bound variable by a free
  variable.  The free variable has type~$T$ and its name is a variant
  of~$a$ chosen to be distinct from all constants and from all variables
  free in~$u$.

\item[$t$ \ttindexbold{aconv} $u$]
  tests whether terms~$t$ and~$u$ are \(\alpha\)-convertible: identical up
  to renaming of bound variables.
\begin{itemize}
  \item
    Two constants, \texttt{Free}s, or \texttt{Var}s are \(\alpha\)-convertible
    if their names and types are equal.
    (Variables having the same name but different types are thus distinct.
    This confusing situation should be avoided!)
  \item
    Two bound variables are \(\alpha\)-convertible
    if they have the same number.
  \item
    Two abstractions are \(\alpha\)-convertible
    if their bodies are, and their bound variables have the same type.
  \item
    Two applications are \(\alpha\)-convertible
    if the corresponding subterms are.
\end{itemize}

\end{ttdescription}

\section{Certified terms}\index{terms!certified|bold}\index{signatures}
A term $t$ can be \textbf{certified} under a signature to ensure that every type
in~$t$ is well-formed and every constant in~$t$ is a type instance of a
constant declared in the signature.  The term must be well-typed and its use
of bound variables must be well-formed.  Meta-rules such as \texttt{forall_elim}
take certified terms as arguments.

Certified terms belong to the abstract type \mltydx{cterm}.
Elements of the type can only be created through the certification process.
In case of error, Isabelle raises exception~\ttindex{TERM}\@.

\subsection{Printing terms}
\index{terms!printing of}
\begin{ttbox}
     string_of_cterm :           cterm -> string
Sign.string_of_term  : Sign.sg -> term -> string
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{string_of_cterm} $ct$]
displays $ct$ as a string.

\item[\ttindexbold{Sign.string_of_term} $sign$ $t$]
displays $t$ as a string, using the syntax of~$sign$.
\end{ttdescription}

\subsection{Making and inspecting certified terms}
\begin{ttbox}
cterm_of       : Sign.sg -> term -> cterm
read_cterm     : Sign.sg -> string * typ -> cterm
cert_axm       : Sign.sg -> string * term -> string * term
read_axm       : Sign.sg -> string * string -> string * term
rep_cterm      : cterm -> \{T:typ, t:term, sign:Sign.sg, maxidx:int\}
Sign.certify_term : Sign.sg -> term -> term * typ * int
\end{ttbox}
\begin{ttdescription}
  
\item[\ttindexbold{cterm_of} $sign$ $t$] \index{signatures} certifies
  $t$ with respect to signature~$sign$.
  
\item[\ttindexbold{read_cterm} $sign$ ($s$, $T$)] reads the string~$s$
  using the syntax of~$sign$, creating a certified term.  The term is
  checked to have type~$T$; this type also tells the parser what kind
  of phrase to parse.
  
\item[\ttindexbold{cert_axm} $sign$ ($name$, $t$)] certifies $t$ with
  respect to $sign$ as a meta-proposition and converts all exceptions
  to an error, including the final message
\begin{ttbox}
The error(s) above occurred in axiom "\(name\)"
\end{ttbox}

\item[\ttindexbold{read_axm} $sign$ ($name$, $s$)] similar to {\tt
    cert_axm}, but first reads the string $s$ using the syntax of
  $sign$.
  
\item[\ttindexbold{rep_cterm} $ct$] decomposes $ct$ as a record
  containing its type, the term itself, its signature, and the maximum
  subscript of its unknowns.  The type and maximum subscript are
  computed during certification.
  
\item[\ttindexbold{Sign.certify_term}] is a more primitive version of
  \texttt{cterm_of}, returning the internal representation instead of
  an abstract \texttt{cterm}.

\end{ttdescription}


\section{Types}\index{types|bold}
Types belong to the \ML\ type \mltydx{typ}, which is a concrete datatype with
three constructor functions.  These correspond to type constructors, free
type variables and schematic type variables.  Types are classified by sorts,
which are lists of classes (representing an intersection).  A class is
represented by a string.
\begin{ttbox}
type class = string;
type sort  = class list;

datatype typ = Type  of string * typ list
             | TFree of string * sort
             | TVar  of indexname * sort;

infixr 5 -->;
fun S --> T = Type ("fun", [S, T]);
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Type} ($a$, $Ts$)] \index{type constructors|bold}
  applies the \textbf{type constructor} named~$a$ to the type operand list~$Ts$.
  Type constructors include~\tydx{fun}, the binary function space
  constructor, as well as nullary type constructors such as~\tydx{prop}.
  Other type constructors may be introduced.  In expressions, but not in
  patterns, \hbox{\tt$S$-->$T$} is a convenient shorthand for function
  types.

\item[\ttindexbold{TFree} ($a$, $s$)] \index{type variables|bold}
  is the \textbf{type variable} with name~$a$ and sort~$s$.

\item[\ttindexbold{TVar} ($v$, $s$)] \index{type unknowns|bold}
  is the \textbf{type unknown} with indexname~$v$ and sort~$s$.
  Type unknowns are essentially free type variables, but may be
  instantiated during unification.
\end{ttdescription}


\section{Certified types}
\index{types!certified|bold}
Certified types, which are analogous to certified terms, have type
\ttindexbold{ctyp}.

\subsection{Printing types}
\index{types!printing of}
\begin{ttbox}
     string_of_ctyp :           ctyp -> string
Sign.string_of_typ  : Sign.sg -> typ -> string
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{string_of_ctyp} $cT$]
displays $cT$ as a string.

\item[\ttindexbold{Sign.string_of_typ} $sign$ $T$]
displays $T$ as a string, using the syntax of~$sign$.
\end{ttdescription}


\subsection{Making and inspecting certified types}
\begin{ttbox}
ctyp_of          : Sign.sg -> typ -> ctyp
rep_ctyp         : ctyp -> \{T: typ, sign: Sign.sg\}
Sign.certify_typ : Sign.sg -> typ -> typ
\end{ttbox}
\begin{ttdescription}
  
\item[\ttindexbold{ctyp_of} $sign$ $T$] \index{signatures} certifies
  $T$ with respect to signature~$sign$.
  
\item[\ttindexbold{rep_ctyp} $cT$] decomposes $cT$ as a record
  containing the type itself and its signature.
  
\item[\ttindexbold{Sign.certify_typ}] is a more primitive version of
  \texttt{ctyp_of}, returning the internal representation instead of
  an abstract \texttt{ctyp}.

\end{ttdescription}


\section{Oracles: calling trusted external reasoners}
\label{sec:oracles}
\index{oracles|(}

Oracles allow Isabelle to take advantage of external reasoners such as
arithmetic decision procedures, model checkers, fast tautology checkers or
computer algebra systems.  Invoked as an oracle, an external reasoner can
create arbitrary Isabelle theorems.  It is your responsibility to ensure that
the external reasoner is as trustworthy as your application requires.
Isabelle's proof objects~(\S\ref{sec:proofObjects}) record how each theorem
depends upon oracle calls.

\begin{ttbox}
invoke_oracle     : theory -> xstring -> Sign.sg * object -> thm
Theory.add_oracle : bstring * (Sign.sg * object -> term) -> theory 
                    -> theory
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{invoke_oracle} $thy$ $name$ ($sign$, $data$)]
  invokes the oracle $name$ of theory $thy$ passing the information
  contained in the exception value $data$ and creating a theorem
  having signature $sign$.  Note that type \ttindex{object} is just an
  abbreviation for \texttt{exn}.  Errors arise if $thy$ does not have
  an oracle called $name$, if the oracle rejects its arguments or if
  its result is ill-typed.
  
\item[\ttindexbold{Theory.add_oracle} $name$ $fun$ $thy$] extends
  $thy$ by oracle $fun$ called $name$.  It is seldom called
  explicitly, as there is concrete syntax for oracles in theory files.
\end{ttdescription}

A curious feature of {\ML} exceptions is that they are ordinary constructors.
The {\ML} type \texttt{exn} is a datatype that can be extended at any time.  (See
my {\em {ML} for the Working Programmer}~\cite{paulson-ml2}, especially
page~136.)  The oracle mechanism takes advantage of this to allow an oracle to
take any information whatever.

There must be some way of invoking the external reasoner from \ML, either
because it is coded in {\ML} or via an operating system interface.  Isabelle
expects the {\ML} function to take two arguments: a signature and an
exception object.
\begin{itemize}
\item The signature will typically be that of a desendant of the theory
  declaring the oracle.  The oracle will use it to distinguish constants from
  variables, etc., and it will be attached to the generated theorems.

\item The exception is used to pass arbitrary information to the oracle.  This
  information must contain a full description of the problem to be solved by
  the external reasoner, including any additional information that might be
  required.  The oracle may raise the exception to indicate that it cannot
  solve the specified problem.
\end{itemize}

A trivial example is provided in theory \texttt{FOL/ex/IffOracle}.  This
oracle generates tautologies of the form $P\bimp\cdots\bimp P$, with
an even number of $P$s.

The \texttt{ML} section of \texttt{IffOracle.thy} begins by declaring
a few auxiliary functions (suppressed below) for creating the
tautologies.  Then it declares a new exception constructor for the
information required by the oracle: here, just an integer. It finally
defines the oracle function itself.
\begin{ttbox}
exception IffOracleExn of int;\medskip
fun mk_iff_oracle (sign, IffOracleExn n) =
  if n > 0 andalso n mod 2 = 0
  then Trueprop \$ mk_iff n
  else raise IffOracleExn n;
\end{ttbox}
Observe the function's two arguments, the signature \texttt{sign} and the
exception given as a pattern.  The function checks its argument for
validity.  If $n$ is positive and even then it creates a tautology
containing $n$ occurrences of~$P$.  Otherwise it signals error by
raising its own exception (just by happy coincidence).  Errors may be
signalled by other means, such as returning the theorem \texttt{True}.
Please ensure that the oracle's result is correctly typed; Isabelle
will reject ill-typed theorems by raising a cryptic exception at top
level.

The \texttt{oracle} section of \texttt{IffOracle.thy} installs above
\texttt{ML} function as follows:
\begin{ttbox}
IffOracle = FOL +\medskip
oracle
  iff = mk_iff_oracle\medskip
end
\end{ttbox}

Now in \texttt{IffOracle.ML} we first define a wrapper for invoking
the oracle:
\begin{ttbox}
fun iff_oracle n = invoke_oracle IffOracle.thy "iff"
                      (sign_of IffOracle.thy, IffOracleExn n);
\end{ttbox}

Here are some example applications of the \texttt{iff} oracle.  An
argument of 10 is allowed, but one of 5 is forbidden:
\begin{ttbox}
iff_oracle 10;
{\out  "P <-> P <-> P <-> P <-> P <-> P <-> P <-> P <-> P <-> P" : thm}
iff_oracle 5;
{\out Exception- IffOracleExn 5 raised}
\end{ttbox}

\index{oracles|)}
\index{theories|)}


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ref"
%%% End: