doc-src/Logics/old.defining.tex
 changeset 104 d8205bb279a7
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/Logics/old.defining.tex	Wed Nov 10 05:00:57 1993 +0100
@@ -0,0 +1,871 @@
+\chapter{Defining Logics} \label{Defining-Logics}
+This chapter is intended for Isabelle experts.  It explains how to define new
+logical systems, Isabelle's {\it raison d'\^etre}.  Isabelle logics are
+hierarchies of theories.  A number of simple examples are contained in the
+introductory manual; the full syntax of theory definitions is shown in the
+{\em Reference Manual}.  The purpose of this chapter is to explain the
+remaining subtleties, especially some context conditions on the class
+structure and the definition of new mixfix syntax.  A full understanding of
+the material requires knowledge of the internal representation of terms (data
+type {\tt term}) as detailed in the {\em Reference Manual}.  Sections marked
+with a * can be skipped on first reading.
+
+
+\section{Classes and Types *}
+\index{*arities!context conditions}
+
+Type declarations are subject to the following two well-formedness
+conditions:
+\begin{itemize}
+\item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$
+  with $\vec{r} \neq \vec{s}$.  For example
+\begin{ttbox}
+types ty 1
+arities ty :: (\{logic\}) logic
+        ty :: (\{\})logic
+\end{ttbox}
+leads to an error message and fails.
+\item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty :: + (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold
+  for $i=1,\dots,n$.  The relationship $\preceq$, defined as
+$s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c,$
+expresses that the set of types represented by $s'$ is a subset of the set of
+types represented by $s$.  For example
+\begin{ttbox}
+classes term < logic
+types ty 1
+arities ty :: (\{logic\})logic
+        ty :: (\{\})term
+\end{ttbox}
+leads to an error message and fails.
+\end{itemize}
+These conditions guarantee principal types~\cite{nipkow-prehofer}.
+
+\section{Precedence Grammars}
+\label{PrecedenceGrammars}
+\index{precedence grammar|(}
+
+The precise syntax of a logic is best defined by a context-free grammar.
+These grammars obey the following conventions: identifiers denote
+nonterminals, {\tt typewriter} fount denotes terminals, repetition is
+indicated by \dots, and alternatives are separated by $|$.
+
+In order to simplify the description of mathematical languages, we introduce
+an extended format which permits {\bf precedences}\index{precedence}.  This
+scheme generalizes precedence declarations in \ML\ and {\sc prolog}.  In this
+extended grammar format, nonterminals are decorated by integers, their
+precedence.  In the sequel, precedences are shown as subscripts.  A nonterminal
+$A@p$ on the right-hand side of a production may only be replaced using a
+production $A@q = \gamma$ where $p \le q$.
+
+Formally, a set of context free productions $G$ induces a derivation
+relation $\rew@G$ on strings as follows:
+$\alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~ + \exists q \ge p.~(A@q=\gamma) \in G +$
+Any extended grammar of this kind can be translated into a normal context
+free grammar.  However, this translation may require the introduction of a
+large number of new nonterminals and productions.
+
+\begin{example}
+\label{PrecedenceEx}
+The following simple grammar for arithmetic expressions demonstrates how
+binding power and associativity of operators can be enforced by precedences.
+\begin{center}
+\begin{tabular}{rclr}
+$A@9$ & = & {\tt0} \\
+$A@9$ & = & {\tt(} $A@0$ {\tt)} \\
+$A@0$ & = & $A@0$ {\tt+} $A@1$ \\
+$A@2$ & = & $A@3$ {\tt*} $A@2$ \\
+$A@3$ & = & {\tt-} $A@3$
+\end{tabular}
+\end{center}
+The choice of precedences determines that \verb$-$ binds tighter than
+\verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$
+associate to the left and right, respectively.
+\end{example}
+
+To minimize the number of subscripts, we adopt the following conventions:
+\begin{itemize}
+\item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for
+  some fixed $max_pri$.
+\item precedence $0$ on the right-hand side and precedence $max_pri$ on the
+  left-hand side may be omitted.
+\end{itemize}
+In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.
+
+Using these conventions and assuming $max_pri=9$, the grammar in
+Example~\ref{PrecedenceEx} becomes
+\begin{center}
+\begin{tabular}{rclc}
+$A$ & = & {\tt0} & \hspace*{4em} \\
+ & $|$ & {\tt(} $A$ {\tt)} \\
+ & $|$ & $A$ {\tt+} $A@1$ & (0) \\
+ & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\
+ & $|$ & {\tt-} $A@3$ & (3)
+\end{tabular}
+\end{center}
+
+\index{precedence grammar|)}
+
+\section{Basic syntax *}
+
+An informal account of most of Isabelle's syntax (meta-logic, types etc) is
+contained in {\em Introduction to Isabelle}.  A precise description using a
+precedence grammar is shown in Figure~\ref{MetaLogicSyntax}.  This description
+is the basis of all extensions by object-logics.
+\begin{figure}[htb]
+\begin{center}
+\begin{tabular}{rclc}
+$prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\
+     &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\
+     &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\
+     &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\
+     &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\
+$logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\
+$aprop$ &=& $id$ ~~$|$~~ $var$
+    ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\
+$fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\
+    &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\
+$idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\
+$idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
+    &$|$& $id$ \ttindex{::} $type$ & (0) \\\\
+$type$ &=& $tfree$ ~~$|$~~ $tvar$ \\
+     &$|$& $tfree$ {\tt::} $sort$ ~~$|$~~ $tvar$ {\tt::} $sort$ \\
+     &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$
+                ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
+     &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\
+     &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\
+     &$|$& {\tt(} $type$ {\tt)} \\\\
+$sort$ &=& $id$ ~~$|$~~ {\tt\{\}}
+                ~~$|$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}}
+\end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}
+\indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}
+\indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}
+\end{center}
+\caption{Meta-Logic Syntax}
+\label{MetaLogicSyntax}
+\end{figure}
+The following main categories are defined:
+\begin{description}
+\item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.
+\item[$aprop$] Atomic propositions.
+\item[$logic$] Terms of types in class $logic$.  Initially, $logic$ contains
+  merely $prop$.  As the syntax is extended by new object-logics, more
+  productions for $logic$ are added (see below).
+\item[$fun$] Terms potentially of function type.
+\item[$type$] Types.
+\item[$idts$] a list of identifiers, possibly constrained by types.  Note
+  that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a
+  type constructor applied to $nat$.
+\end{description}
+
+The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers
+({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns
+({\tt ?'a}) respectively.  If we think of them as nonterminals with
+predefined syntax, we may assume that all their productions have precedence
+$max_pri$.
+
+\subsection{Logical types and default syntax}
+
+Isabelle is concerned with mathematical languages which have a certain
+minimal vocabulary: identifiers, variables, parentheses, and the lambda
+calculus.  Logical types, i.e.\ those of class $logic$, are automatically
+equipped with this basic syntax.  More precisely, for any type constructor
+$ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following
+\begin{center}
+\begin{tabular}{rclc}
+$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\
+  &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\
+  &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\
+$logic$ &=& $ty$
+\end{tabular}
+\end{center}
+
+
+\section{Mixfix syntax}
+\index{mixfix|(}
+
+We distinguish between abstract and concrete syntax.  The {\em abstract}
+syntax is given by the typed constants of a theory.  Abstract syntax trees are
+well-typed terms, i.e.\ values of \ML\ type {\tt term}.  If none of the
+constants are introduced with mixfix annotations, there is no concrete syntax
+to speak of: terms can only be abstractions or applications of the form
+$f(t@1,\dots,t@n)$, where $f$ is a constant or variable.  Since this notation
+quickly becomes unreadable, Isabelle supports syntax definitions in the form
+of unrestricted context-free grammars using mixfix annotations.
+
+Mixfix annotations describe the {\em concrete} syntax, its translation into
+the abstract syntax, and a pretty-printing scheme, all in one.  Isabelle
+syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.
+Each mixfix annotation defines a precedence grammar production and associates
+an Isabelle constant with it.
+
+A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is
+interpreted as a grammar pro\-duction as follows:
+\begin{itemize}
+\item $sy$ is the right-hand side of this production, specified as a {\em
+    mixfix annotation}.  In general, $sy$ is of the form
+  $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$, where each occurrence of
+  \ttindex{_}'' denotes an argument/nonterminal and the strings
+  $\alpha@i$ do not contain {\tt_}''.
+\item $\tau$ specifies the types of the nonterminals on the left and right
+  hand side. If $sy$ is of the form above, $\tau$ must be of the form
+  $[\tau@1,\dots,\tau@n] \To \tau'$.  Then argument $i$ is of type $\tau@i$
+  and the result, i.e.\ the left-hand side of the production, is of type
+  $\tau'$.  Both the $\tau@i$ and $\tau'$ may be function types.
+\item $c$ is the name of the Isabelle constant associated with this production.
+  Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt
+    Const($c$,dummyT\footnote{Proper types are inserted later on.  See
+      \S\ref{Typing}})\$$a@1\$$\dots$\$$a@n}\index{*dummyT}, where a@i is + the term generated by parsing the i^{th} argument. +\item ps must be of the form [p@1,\dots,p@n], where p@i is the + minimal precedence\index{precedence} required of any phrase that may appear + as the i^{th} argument. The null list is interpreted as a list of 0's of + the appropriate length. +\item p is the precedence of this production. +\end{itemize} +Notice that there is a close connection between abstract and concrete syntax: +each production has an associated constant, and types act as {\bf syntactic + categories} in the concrete syntax. To emphasize this connection, we +sometimes refer to the nonterminals on the right-hand side of a production as +its arguments and to the nonterminal on the left-hand side as its result. + +The maximal legal precedence is called \ttindexbold{max_pri}, which is +currently 1000. If you want to ignore precedences, the safest way to do so is +to use the annotation {\tt(sy)}: this production puts no precedence +constraints on any of its arguments and has maximal precedence itself, i.e.\ +it is always applicable and does not exclude any productions of its +arguments. + +\begin{example} +In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written +as follows: +\begin{ttbox} +types exp 0 +consts "0" :: "exp" ("0" 9) + "+" :: "[exp,exp] => exp" ("_ + _" [0,1] 0) + "*" :: "[exp,exp] => exp" ("_ * _" [3,2] 2) + "-" :: "exp => exp" ("- _" [3] 3) +\end{ttbox} +Parsing the string \verb!"0 + - 0 + 0"! produces the term {\tt + p\(p\(m\$$z$)\$$z)\$$z$} where {\tt$p =$Const("+",dummyT)}, +{\tt$m =$Const("-",dummyT)}, and {\tt$z =$Const("0",dummyT)}. +\end{example} + +The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf + meta-character}\index{meta-character} which does not represent itself but +an argument position. The following characters are also meta-characters: +\begin{ttbox} +' ( ) / +\end{ttbox} +Preceding any character with a quote (\verb$'$) turns it into an ordinary +character. Thus you can write \verb!''! if you really want a single quote. +The purpose of the other meta-characters is explained in +\S\ref{PrettyPrinting}. Remember that in \ML\ strings \verb$\$is already a +(different kind of) meta-character. + + +\subsection{Types and syntactic categories *} + +The precise mapping from types to syntactic categories is defined by the +following function: +\begin{eqnarray*} +N(\tau@1\To\tau@2) &=& fun \\ +N((\tau@1,\dots,\tau@n)ty) &=& ty \\ +N(\alpha) &=& logic +\end{eqnarray*} +Only the outermost type constructor is taken into account and type variables +can range over all logical types. This catches some ill-typed terms (like +$Cons(x,0)$, where$Cons :: [\alpha,\alpha list] \To \alpha list$and$0 ::
+nat$) but leaves the real work to the type checker. + +In terms of the precedence grammar format introduced in +\S\ref{PrecedenceGrammars}, the declaration +\begin{ttbox} +consts $$c$$ :: "[$$\tau@1$$,\dots,$$\tau@n$$]$$\To\tau$$" ("$$\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$$") [$$p@1$$,\dots,$$p@n$$] $$p$$) +\end{ttbox} +defines the production +$N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots + ~\alpha@{n-1} ~N(\tau@n)@{p@n}~ \alpha@n +$ + +\subsection{Copy productions *} + +Productions which do not create a new node in the abstract syntax tree are +called {\bf copy productions}. They must have exactly one nonterminal on +the right hand side. The term generated when parsing that nonterminal is +simply passed up as the result of parsing the whole copy production. In +Isabelle a copy production is indicated by an empty constant name, i.e.\ by +\begin{ttbox} +consts "" :: $$\tau$$ ($$sy$$ $$ps$$ $$p$$) +\end{ttbox} + +A special kind of copy production is one where, modulo white space,$sy$is +{\tt"_"}. It is called a {\bf chain production}. Chain productions should be +seen as an abbreviation mechanism. Conceptually, they are removed from the +grammar by adding appropriate new rules. Precedence information attached to +chain productions is ignored. The following example demonstrates the effect: +the grammar defined by +\begin{ttbox} +types A,B,C 0 +consts AB :: "B => A" ("A _" [10] 517) + "" :: "C => B" ("_" [0] 100) + x :: "C" ("x" 5) + y :: "C" ("y" 15) +\end{ttbox} +admits {\tt"A y"} but not {\tt"A x"}. Had the constant in the second +production been some non-empty string, both {\tt"A y"} and {\tt"A x"} would +be legal. + +\index{mixfix|)} + +\section{Lexical conventions} + +The lexical analyzer distinguishes the following kinds of tokens: delimiters, +identifiers, unknowns, type variables and type unknowns. + +Delimiters are user-defined, i.e.\ they are extracted from the syntax +definition. If$\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$is a mixfix +annotation, each$\alpha@i$is decomposed into substrings +$\beta@1~\dots~\beta@k$which are separated by and do not contain +\bfindex{white space} ( = blanks, tabs, newlines). Each$\beta@j$becomes a +delimiter. Thus a delimiter can be an arbitrary string not containing white +space. + +The lexical syntax of identifiers and variables ( = unknowns) is defined in +the introductory manual. Parsing an identifier$f$generates {\tt + Free($f$,dummyT)}\index{*dummyT}. Parsing a variable {\tt?}$v$generates +{\tt Var(($u$,$i$),dummyT)} where$i$is the integer value of the longest +numeric suffix of$v$(possibly$0$), and$u$is the remaining prefix. +Parsing a variable {\tt?}$v{.}i$generates {\tt Var(($v$,$i$),dummyT)}. The +following table covers the four different cases that can arise: +\begin{center}\tt +\begin{tabular}{cccc} +"?v" & "?v.7" & "?v5" & "?v7.5" \\ +Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$) +\end{tabular} +\end{center} +where$d = {\tt dummyT}$. + +In mixfix annotations, \ttindexbold{id}, \ttindexbold{var}, +\ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of +identifiers, unknowns, type variables and type unknowns, respectively. + + +The lexical analyzer translates input strings to token lists by repeatedly +taking the maximal prefix of the input string that forms a valid token. A +maximal prefix that is both a delimiter and an identifier or variable (like +{\tt ALL}) is treated as a delimiter. White spaces are separators. + +An important consequence of this translation scheme is that delimiters need +not be separated by white space to be recognized as separate. If \verb$"-"$+is a delimiter but \verb$"--"$is not, the string \verb$"--"$is treated as +two consecutive occurrences of \verb$"-"$. This is in contrast to \ML\ which +would treat \verb$"--"$as a single (undeclared) identifier. The +consequence of Isabelle's more liberal scheme is that the same string may be +parsed in a different way after extending the syntax: after adding +\verb$"--"$as a delimiter, the input \verb$"--"$is treated as +a single occurrence of \verb$"--"$. + +\section{Infix operators} + +{\tt Infixl} and {\tt infixr} declare infix operators which associate to the +left and right respectively. As in \ML, prefixing infix operators with +\ttindexbold{op} turns them into curried functions. Infix declarations can +be reduced to mixfix ones as follows: +\begin{center}\tt +\begin{tabular}{l@{~~$\Longrightarrow$~~}l} +"$c$" ::~$\tau$(\ttindexbold{infixl}$p$) & +"op$c$" ::~$\tau$("_$c$_" [$p$,$p+1$]$p$) \\ +"$c$" ::~$\tau$(\ttindexbold{infixr}$p$) & +"op$c$" ::~$\tau$("_$c$_" [$p+1$,$p$]$p$) +\end{tabular} +\end{center} + + +\section{Binders} +A {\bf binder} is a variable-binding constant, such as a quantifier. +The declaration +\begin{ttbox} +consts $$c$$ :: $$\tau$$ (binder $$Q$$ $$p$$) +\end{ttbox}\indexbold{*binder} +introduces a binder$c$of type$\tau$, +which must have the form$(\tau@1\To\tau@2)\To\tau@3$. Its concrete syntax +is$Q~x.t$. A binder is like a generalized quantifier where$\tau@1$is the +type of the bound variable$x$,$\tau@2$the type of the body$t$, and +$\tau@3$the type of the whole term. For example$\forall$can be declared +like this: +\begin{ttbox} +consts All :: "('a => o) => o" (binder "ALL " 10) +\end{ttbox} +This allows us to write$\forall x.P$either as {\tt ALL$x$.$P$} or {\tt + All(\%$x$.$P$)}; the latter form is for purists only. + +In case$\tau@2 = \tau@3$, nested quantifications can be written as$Q~x@1
+\dots x@n.t$. From a syntactic point of view, +\begin{ttbox} +consts $$c$$ :: "$$(\tau@1\To\tau@2)\To\tau@3$$" (binder "$$Q$$" $$p$$) +\end{ttbox} +is equivalent to +\begin{ttbox} +consts $$c$$ :: "$$(\tau@1\To\tau@2)\To\tau@3$$" + "$$Q$$" :: "[idts,$$\tau@2$$] => $$\tau@3$$" ("$$Q$$_. _" $$p$$) +\end{ttbox} +where {\tt idts} is the syntactic category$idts$defined in +Figure~\ref{MetaLogicSyntax}. + +However, there is more to binders than concrete syntax: behind the scenes the +body of the quantified expression has to be converted into a +$\lambda$-abstraction (when parsing) and back again (when printing). This +is performed by the translation mechanism, which is discussed below. For +binders, the definition of the required translation functions has been +automated. Many other syntactic forms, such as set comprehension, require +special treatment. + + +\section{Parse translations *} +\label{Parse-translations} +\index{parse translation|(} + +So far we have pretended that there is a close enough relationship between +concrete and abstract syntax to allow an automatic translation from one to +the other using the constant name supplied with each production. In many +cases this scheme is not powerful enough, especially for constructs involving +variable bindings. Therefore the$ML$-section of a theory definition can +associate constant names with user-defined translation functions by including +a line +\begin{ttbox} +val parse_translation = \dots +\end{ttbox} +where the right-hand side of this binding must be an \ML-expression of type +\verb$(string * (term list -> term))list$. + +After the input string has been translated into a term according to the +syntax definition, there is a second phase in which the term is translated +using the user-supplied functions in a bottom-up manner. Given a list$tab$+of the above type, a term$t$is translated as follows. If$t$is not of the +form {\tt Const($c$,$\tau$)\$$t@1\\dots\$$t@n$}, then $t$ is returned
+unchanged.  Otherwise all $t@i$ are translated into $t@i'$.  Let {\tt $t' =$
+  Const($c$,$\tau$)\$$t@1'\\dots\$$t@n'$}. If there is no pair$(c,f)$in +$tab$, return$t'$. Otherwise apply$f$to$[t@1',\dots,t@n']$. If that +raises an exception, return$t'$, otherwise return the result. +\begin{example}\label{list-enum} +\ML-lists are constructed by {\tt[]} and {\tt::}. For readability the +list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}. +In Isabelle the two forms of lists are declared as follows: +\begin{ttbox} +types list 1 + enum 0 +arities list :: (term)term +consts "[]" :: "'a list" ("[]") + ":" :: "['a, 'a list] => 'a list" (infixr 50) + enum :: "enum => 'a list" ("[_]") + sing :: "'a => enum" ("_") + cons :: "['a,enum] => enum" ("_, _") +end +\end{ttbox} +Because \verb$::$is already used for type constraints, it is replaced by +\verb$:$as the infix list constructor. + +In order to allow list enumeration, the new type {\tt enum} is introduced. +Its only purpose is syntactic and hence it does not need an arity, in +contrast to the logical type {\tt list}. Although \hbox{\tt[$x$,$y$,$z$]} is +syntactically legal, it needs to be translated into a term built up from +\verb$[]$and \verb$:$. This is what \verb$make_list$accomplishes: +\begin{ttbox} +val cons = Const("op :", dummyT); + +fun make_list (Const("sing",_)$e) = cons $e$ Const("[]", dummyT)
+  | make_list (Const("cons",_)$e$es) = cons $e$ make_list es;
+\end{ttbox}
+To hook this translation up to Isabelle's parser, the theory definition needs
+to contain the following $ML$-section:
+\begin{ttbox}
+ML
+fun enum_tr[enum] = make_list enum;
+val parse_translation = [("enum",enum_tr)]
+\end{ttbox}
+This causes \verb!Const("enum",_)$!$t$to be replaced by +\verb$enum_tr[$$t\verb]. + +Of course the definition of \verbmake_list should be included in the +ML-section. +\end{example} +\begin{example}\label{SET} + Isabelle represents the set \{ x \mid P(x) \} internally by Set(\lambda + x.P(x)). The internal and external forms need separate +constants:\footnote{In practice, the external form typically has a name +beginning with an {\at} sign, such as {\tt {\at}SET}. This emphasizes that +the constant should be used only for parsing/printing.} +\begin{ttbox} +types set 1 +arities set :: (term)term +consts Set :: "('a => o) => 'a set" + SET :: "[id,o] => 'a set" ("\{_ | _\}") +\end{ttbox} +Parsing {\tt"\{x | P\}"} according to this syntax yields the term {\tt + Const("SET",dummyT) \ Free("$$x$$",dummyT) \ $$p$$}, where p is the +result of parsing P. What we need is the term {\tt + Const("Set",dummyT)\Abs("x",dummyT,p')}, where p' is some +abstracted'' version of p. Therefore we define a function +\begin{ttbox} +fun set_tr[Free(s,T), p] = Const("Set", dummyT)  + Abs(s, T, abstract_over(Free(s,T), p)); +\end{ttbox} +where \verbabstract_over: term*term -> term is a predefined function such +that {\tt abstract_over(u,t)} replaces every occurrence of u in t by +a {\tt Bound} variable of the correct index (i.e.\ 0 at top level). Remember +that {\tt dummyT} is replaced by the correct types at a later stage (see +\S\ref{Typing}). Function {\tt set_tr} is associated with {\tt SET} by +including the \ML-text +\begin{ttbox} +val parse_translation = [("SET", set_tr)]; +\end{ttbox} +\end{example} + +If you want to run the above examples in Isabelle, you should note that an +ML-section needs to contain not just a definition of +\verbparse_translation but also of a variable \verbprint_translation. The +purpose of the latter is to reverse the effect of the former during printing; +details are found in \S\ref{Print-translations}. Hence you need to include +the line +\begin{ttbox} +val print_translation = []; +\end{ttbox} +This is instructive because the terms are then printed out in their internal +form. For example the input \hbox{\tt[x,y,z]} is echoed as +\hbox{\ttx:y:z:[]}. This helps to check that your parse translation is +working correctly. + +%\begin{warn} +%Explicit type constraints disappear with type checking but are still +%visible to the parse translation functions. +%\end{warn} + +\index{parse translation|)} + +\section{Printing} + +Syntax definitions provide printing information in three distinct ways: +through +\begin{itemize} +\item the syntax of the language (as used for parsing), +\item pretty printing information, and +\item print translation functions. +\end{itemize} +The bare mixfix declarations enable Isabelle to print terms, but the result +will not necessarily be pretty and may look different from what you expected. +To produce a pleasing layout, you need to read the following sections. + +\subsection{Printing with mixfix declarations} + +Let {\ttt = Const(c,_)\$$t@1$\$\dots\$$t@n} be a term and let +\begin{ttbox} +consts $$c$$ :: $$\tau$$ ($$sy$$) +\end{ttbox} +be a mixfix declaration where sy is of the form +\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n. Printing t according to +sy means printing the string +\alpha@0\beta@1\alpha@1\ldots\alpha@{n-1}\beta@n\alpha@n, where \beta@i +is the result of printing t@i. + +Note that the system does {\em not\/} insert blanks. They should be part of +the mixfix syntax if they are required to separate tokens or achieve a +certain layout. + +\subsection{Pretty printing} +\label{PrettyPrinting} +\index{pretty printing} + +In order to format the output, it is possible to embed pretty printing +directives in mixfix annotations. These directives are ignored during parsing +and affect only printing. The characters {\tt(}, {\tt)} and {\tt/} are +interpreted as meta-characters\index{meta-character} when found in a mixfix +annotation. Their meaning is +\begin{description} +\item[~{\tt(}~] Open a block. A sequence of digits following it is + interpreted as the \bfindex{indentation} of this block. It causes the + output to be indented by n positions if a line break occurs within the + block. If {\tt(} is not followed by a digit, the indentation defaults to + 0. +\item[~{\tt)}~] Close a block. +\item[~\ttindex{/}~] Allow a \bfindex{line break}. White space immediately + following {\tt/} is not printed if the line is broken at this point. +\end{description} + +\subsection{Print translations *} +\label{Print-translations} +\index{print translation|(} + +Since terms are translated after parsing (see \S\ref{Parse-translations}), +there is a similar mechanism to translate them back before printing. +Therefore the ML-section of a theory definition can associate constant +names with user-defined translation functions by including a line +\begin{ttbox} +val print_translation = \dots +\end{ttbox} +where the right-hand side of this binding is again an \ML-expression of type +\verb(string * (term list -> term))list. +Including a pair (c,f) in this list causes the printer to print +f[t@1,\dots,t@n] whenever it finds {\tt Const(c,_)\$$t@1$\$\dots\$$t@n}. +\begin{example} +Reversing the effect of the parse translation in Example~\ref{list-enum} is +accomplished by the following function: +\begin{ttbox} +fun make_enum (Const("op :",_)  e  es) = case es of + Const("[]",_) => Const("sing",dummyT)  e + | _ => Const("enum",dummyT)  e  make_enum es; +\end{ttbox} +It translates \hbox{\ttx:y:z:[]} to \hbox{\tt[x,y,z]}. However, +if the input does not terminate with an empty list, e.g.\ \hbox{\ttx:xs}, +\verbmake_enum raises exception {\tt Match}. This signals that the +attempted translation has failed and the term should be printed as is. +The connection with Isabelle's pretty printer is established as follows: +\begin{ttbox} +fun enum_tr'[x,xs] = Const("enum",dummyT)  + make_enum(Const("op :",dummyT)xxs); +val print_translation = [("op :", enum_tr')]; +\end{ttbox} +This declaration causes the printer to print \hbox{\tt enum_tr'[x,y]} +whenever it finds \verb!Const("op :",_)!x\verb!!y. +\end{example} +\begin{example} + In Example~\ref{SET} we showed how to translate the concrete syntax for set + comprehension into the proper internal form. The string {\tt"\{x | + P\}"} now becomes {\tt Const("Set",_)\Abs("x",_,p)}. If, however, + the latter term were printed without translating it back, it would result + in {\tt"Set(\%x.P)"}. Therefore the abstraction has to be turned back + into a term that matches the concrete mixfix syntax: +\begin{ttbox} +fun set_tr'[Abs(x,T,P)] = + let val (x',P') = variant_abs(x,T,P) + in Const("SET",dummyT)  Free(x',T)  P' end; +\end{ttbox} +The function \verbvariant_abs, a basic term manipulation function, replaces +the bound variable x by a {\tt Free} variable x' having a unique name. A +term produced by {\tt set_tr'} can now be printed according to the concrete +syntax defined in Example~\ref{SET} above. + +Notice that the application of {\tt set_tr'} fails if the second component of +the argument is not an abstraction, but for example just a {\tt Free} +variable. This is intentional because it signals to the caller that the +translation is inapplicable. As a result {\tt Const("Set",_)\Free("P",_)} +prints as {\tt"Set(P)"}. + +The full theory extension, including concrete syntax and both translation +functions, has the following form: +\begin{ttbox} +types set 1 +arities set :: (term)term +consts Set :: "('a => o) => 'a set" + SET :: "[id,o] => 'a set" ("\{_ | _\}") +end +ML +fun set_tr[Free(s,T), p] = \dots; +val parse_translation = [("SET", set_tr)]; +fun set_tr'[Abs(x,T,P)] = \dots; +val print_translation = [("Set", set_tr')]; +\end{ttbox} +\end{example} +As during the parse translation process, the types attached to constants +during print translation are ignored. Thus {\tt Const("SET",dummyT)} in +{\tt set_tr'} above is acceptable. The types of {\tt Free}s and {\tt Var}s +however must be preserved because they may get printed (see {\tt +show_types}). + +\index{print translation|)} + +\subsection{Printing a term} + +Let tab be the set of all string-function pairs of print translations in the +current syntax. + +Terms are printed recursively; print translations are applied top down: +\begin{itemize} +\item {\tt Free(x,_)} is printed as x. +\item {\tt Var((x,i),_)} is printed as x, if i = 0 and x does not + end with a digit, as x followed by i, if i \neq 0 and x does not + end with a digit, and as {\tt x.i}, if x ends with a digit. Thus the + following cases can arise: +\begin{center} +{\tt\begin{tabular}{cccc} +\verbVar(("v",0),_) & \verbVar(("v",7),_) & \verbVar(("v5",0),_) \\ +"?v" & "?v7" & "?v5.0" +\end{tabular}} +\end{center} +\item {\tt Abs(x@1,_,Abs(x@2,_,\dots Abs(x@n,_,p)\dots))}, where p + is not an abstraction, is printed as {\tt \%y@1\dots y@n.P}, where P + is the result of printing p, and the x@i are replaced by y@i. The + latter are (possibly new) unique names. +\item {\tt Bound(i)} is printed as {\tt B.i} \footnote{The occurrence of + such loose'' bound variables indicates that either you are trying to + print a subterm of an abstraction, or there is something wrong with your + print translations.}. +\item The application {\ttt = Const(c,_)\$$t@1$\$\dots\$$t@n} (where + n may be 0!) is printed as follows: + + If there is a pair (c,f) in tab, print f[t@1,\dots,t@n]. If this + application raises exception {\tt Match} or there is no pair (c,f) in + tab, let sy be the mixfix annotation associated with c. If there is + no such sy, or if sy does not have exactly n argument positions, t + is printed as an application; otherwise t is printed according to sy. + + All other applications are printed as applications. +\end{itemize} +Printing a term {\tt c\$$t@1$\$\dots\t@n$} as an application means +printing it as {\tt$s$($s@1$,\dots,$s@n$)}, where$s@i$is the result of +printing$t@i$. If$c$is a {\tt Const},$s$is its first argument; +otherwise$s$is the result of printing$c$as described above. +\medskip + +The printer also inserts parentheses where they are necessary for reasons +of precedence. + +\section{Identifiers, constants, and type inference *} +\label{Typing} + +There is one final step in the translation from strings to terms that we have +not covered yet. It explains how constants are distinguished from {\tt Free}s +and how {\tt Free}s and {\tt Var}s are typed. Both issues arise because {\tt + Free}s and {\tt Var}s are not declared. + +An identifier$f$that does not appear as a delimiter in the concrete syntax +can be either a free variable or a constant. Since the parser knows only +about those constants which appear in mixfix annotations, it parses$f$as +{\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt + typ}. Although the parser produces these very raw terms, most user +interface level functions like {\tt goal} type terms according to the given +theory, say$T$. In a first step, every occurrence of {\tt Free($f$,_)} or +{\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is +a constant$f$of {\tt typ}$\tau$in$T$. This means that identifiers are +treated as {\tt Free}s iff they are not declared in the theory. The types of +the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML. Type +constraints can be used to remove ambiguities. + +One peculiarity of the current type inference algorithm is that variables +with the same name must have the same type, irrespective of whether they are +schematic, free or bound. For example, take the first-order formula$f(x) = x
+\land (\forall f.~ f=f)$where${=} :: [\alpha{::}term,\alpha]\To o$and +$\forall :: (\alpha{::}term\To o)\To o$. The first conjunct forces +$x::\alpha{::}term$and$f::\alpha\To\alpha$, the second one +$f::\beta{::}term$. Although the two$f$'s are distinct, they are required to +have the same type. Unifying$\alpha\To\alpha$and$\beta{::}term$fails +because, in first-order logic, function types are not in class$term$. + + +\section{Putting it all together} + +Having discussed the individual building blocks of a logic definition, it +remains to be shown how they fit together. In particular we need to say how +an object-logic syntax is hooked up to the meta-logic. Since all theorems +must conform to the syntax for$prop$(see Figure~\ref{MetaLogicSyntax}), +that syntax has to be extended with the object-level syntax. Assume that the +syntax of your object-logic defines a category$o$of formulae. These +formulae can now appear in axioms and theorems wherever$prop$does if you +add the production +$prop ~=~ form.$ +More precisely, you need a coercion from formulae to propositions: +\begin{ttbox} +Base = Pure + +types o 0 +arities o :: logic +consts Trueprop :: "o => prop" ("_" 5) +end +\end{ttbox} +The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible +coercion function. Assuming this definition resides in a file {\tt base.thy}, +you have to load it with the command {\tt use_thy"base"}. + +One of the simplest nontrivial logics is {\em minimal logic} of +implication. Its definition in Isabelle needs no advanced features but +illustrates the overall mechanism quite nicely: +\begin{ttbox} +Hilbert = Base + +consts "-->" :: "[o,o] => o" (infixr 10) +rules +K "P --> Q --> P" +S "(P --> Q --> R) --> (P --> Q) --> P --> R" +MP "[| P --> Q; P |] ==> Q" +end +\end{ttbox} +After loading this definition you can start to prove theorems in this logic: +\begin{ttbox} +goal Hilbert.thy "P --> P"; +{\out Level 0} +{\out P --> P} +{\out 1. P --> P} +by (resolve_tac [Hilbert.MP] 1); +{\out Level 1} +{\out P --> P} +{\out 1. ?P --> P --> P} +{\out 2. ?P} +by (resolve_tac [Hilbert.MP] 1); +{\out Level 2} +{\out P --> P} +{\out 1. ?P1 --> ?P --> P --> P} +{\out 2. ?P1} +{\out 3. ?P} +by (resolve_tac [Hilbert.S] 1); +{\out Level 3} +{\out P --> P} +{\out 1. P --> ?Q2 --> P} +{\out 2. P --> ?Q2} +by (resolve_tac [Hilbert.K] 1); +{\out Level 4} +{\out P --> P} +{\out 1. P --> ?Q2} +by (resolve_tac [Hilbert.K] 1); +{\out Level 5} +{\out P --> P} +{\out No subgoals!} +\end{ttbox} +As you can see, this Hilbert-style formulation of minimal logic is easy to +define but difficult to use. The following natural deduction formulation is +far preferable: +\begin{ttbox} +MinI = Base + +consts "-->" :: "[o,o] => o" (infixr 10) +rules +impI "(P ==> Q) ==> P --> Q" +impE "[| P --> Q; P |] ==> Q" +end +\end{ttbox} +Note, however, that although the two systems are equivalent, this fact cannot +be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI\$
+(exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!.  The reason
+is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,
+something that can only be shown by induction over all possible proofs in
+\verb!Hilbert!.
+
+It is a very simple matter to extend minimal logic with falsity:
+\begin{ttbox}
+MinIF = MinI +
+consts False :: "o"
+rules
+FalseE  "False ==> P"
+end
+\end{ttbox}
+On the other hand, we may wish to introduce conjunction only:
+\begin{ttbox}
+MinC = Base +
+consts "&" :: "[o,o] => o"  (infixr 30)
+rules
+conjI  "[| P; Q |] ==> P & Q"
+conjE1 "P & Q ==> P"
+conjE2 "P & Q ==> Q"
+end
+\end{ttbox}
+And if we want to have all three connectives together, we define:
+\begin{ttbox}
+MinIFC = MinIF + MinC
+\end{ttbox}
+Now we can prove mixed theorems like
+\begin{ttbox}
+goal MinIFC.thy "P & False --> Q";
+by (resolve_tac [MinI.impI] 1);
+by (dresolve_tac [MinC.conjE2] 1);
+by (eresolve_tac [MinIF.FalseE] 1);
+\end{ttbox}
+Try this as an exercise!