doc-src/Logics/old.defining.tex
 changeset 104 d8205bb279a7
equal inserted replaced
103:30bd42401ab2 104:d8205bb279a7

     1 \chapter{Defining Logics} \label{Defining-Logics}

     2 This chapter is intended for Isabelle experts.  It explains how to define new

     3 logical systems, Isabelle's {\it raison d'\^etre}.  Isabelle logics are

     4 hierarchies of theories.  A number of simple examples are contained in the

     5 introductory manual; the full syntax of theory definitions is shown in the

     6 {\em Reference Manual}.  The purpose of this chapter is to explain the

     7 remaining subtleties, especially some context conditions on the class

     8 structure and the definition of new mixfix syntax.  A full understanding of

     9 the material requires knowledge of the internal representation of terms (data

    10 type {\tt term}) as detailed in the {\em Reference Manual}.  Sections marked

    11 with a * can be skipped on first reading.

    12

    13

    14 \section{Classes and Types *}

    15 \index{*arities!context conditions}

    16

    17 Type declarations are subject to the following two well-formedness

    18 conditions:

    19 \begin{itemize}

    20 \item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$

    21   with $\vec{r} \neq \vec{s}$.  For example

    22 \begin{ttbox}

    23 types ty 1

    24 arities ty :: (\{logic\}) logic

    25         ty :: (\{\})logic

    26 \end{ttbox}

    27 leads to an error message and fails.

    28 \item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::    29 (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold

    30   for $i=1,\dots,n$.  The relationship $\preceq$, defined as

    31 $s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c,$

    32 expresses that the set of types represented by $s'$ is a subset of the set of

    33 types represented by $s$.  For example

    34 \begin{ttbox}

    35 classes term < logic

    36 types ty 1

    37 arities ty :: (\{logic\})logic

    38         ty :: (\{\})term

    39 \end{ttbox}

    40 leads to an error message and fails.

    41 \end{itemize}

    42 These conditions guarantee principal types~\cite{nipkow-prehofer}.

    43

    44 \section{Precedence Grammars}

    45 \label{PrecedenceGrammars}

    46 \index{precedence grammar|(}

    47

    48 The precise syntax of a logic is best defined by a context-free grammar.

    49 These grammars obey the following conventions: identifiers denote

    50 nonterminals, {\tt typewriter} fount denotes terminals, repetition is

    51 indicated by \dots, and alternatives are separated by $|$.

    52

    53 In order to simplify the description of mathematical languages, we introduce

    54 an extended format which permits {\bf precedences}\index{precedence}.  This

    55 scheme generalizes precedence declarations in \ML\ and {\sc prolog}.  In this

    56 extended grammar format, nonterminals are decorated by integers, their

    57 precedence.  In the sequel, precedences are shown as subscripts.  A nonterminal

    58 $A@p$ on the right-hand side of a production may only be replaced using a

    59 production $A@q = \gamma$ where $p \le q$.

    60

    61 Formally, a set of context free productions $G$ induces a derivation

    62 relation $\rew@G$ on strings as follows:

    63 $\alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~    64 \exists q \ge p.~(A@q=\gamma) \in G    65$

    66 Any extended grammar of this kind can be translated into a normal context

    67 free grammar.  However, this translation may require the introduction of a

    68 large number of new nonterminals and productions.

    69

    70 \begin{example}

    71 \label{PrecedenceEx}

    72 The following simple grammar for arithmetic expressions demonstrates how

    73 binding power and associativity of operators can be enforced by precedences.

    74 \begin{center}

    75 \begin{tabular}{rclr}

    76 $A@9$ & = & {\tt0} \\

    77 $A@9$ & = & {\tt(} $A@0$ {\tt)} \\

    78 $A@0$ & = & $A@0$ {\tt+} $A@1$ \\

    79 $A@2$ & = & $A@3$ {\tt*} $A@2$ \\

    80 $A@3$ & = & {\tt-} $A@3$

    81 \end{tabular}

    82 \end{center}

    83 The choice of precedences determines that \verb$-$ binds tighter than

    84 \verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$

    85 associate to the left and right, respectively.

    86 \end{example}

    87

    88 To minimize the number of subscripts, we adopt the following conventions:

    89 \begin{itemize}

    90 \item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for

    91   some fixed $max_pri$.

    92 \item precedence $0$ on the right-hand side and precedence $max_pri$ on the

    93   left-hand side may be omitted.

    94 \end{itemize}

    95 In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.

    96

    97 Using these conventions and assuming $max_pri=9$, the grammar in

    98 Example~\ref{PrecedenceEx} becomes

    99 \begin{center}

   100 \begin{tabular}{rclc}

   101 $A$ & = & {\tt0} & \hspace*{4em} \\

   102  & $|$ & {\tt(} $A$ {\tt)} \\

   103  & $|$ & $A$ {\tt+} $A@1$ & (0) \\

   104  & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\

   105  & $|$ & {\tt-} $A@3$ & (3)

   106 \end{tabular}

   107 \end{center}

   108

   109 \index{precedence grammar|)}

   110

   111 \section{Basic syntax *}

   112

   113 An informal account of most of Isabelle's syntax (meta-logic, types etc) is

   114 contained in {\em Introduction to Isabelle}.  A precise description using a

   115 precedence grammar is shown in Figure~\ref{MetaLogicSyntax}.  This description

   116 is the basis of all extensions by object-logics.

   117 \begin{figure}[htb]

   118 \begin{center}

   119 \begin{tabular}{rclc}

   120 $prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\

   121      &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\

   122      &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\

   123      &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\

   124      &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\

   125 $logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\

   126 $aprop$ &=& $id$ ~~$|$~~ $var$

   127     ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\

   128 $fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\

   129     &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\

   130 $idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\

   131 $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\

   132     &$|$& $id$ \ttindex{::} $type$ & (0) \\\\

   133 $type$ &=& $tfree$ ~~$|$~~ $tvar$ \\

   134      &$|$& $tfree$ {\tt::} $sort$ ~~$|$~~ $tvar$ {\tt::} $sort$ \\

   135      &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$

   136                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\

   137      &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\

   138      &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\

   139      &$|$& {\tt(} $type$ {\tt)} \\\\

   140 $sort$ &=& $id$ ~~$|$~~ {\tt\{\}}

   141                 ~~$|$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}}

   142 \end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}

   143 \indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}

   144 \indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}

   145 \end{center}

   146 \caption{Meta-Logic Syntax}

   147 \label{MetaLogicSyntax}

   148 \end{figure}

   149 The following main categories are defined:

   150 \begin{description}

   151 \item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.

   152 \item[$aprop$] Atomic propositions.

   153 \item[$logic$] Terms of types in class $logic$.  Initially, $logic$ contains

   154   merely $prop$.  As the syntax is extended by new object-logics, more

   155   productions for $logic$ are added (see below).

   156 \item[$fun$] Terms potentially of function type.

   157 \item[$type$] Types.

   158 \item[$idts$] a list of identifiers, possibly constrained by types.  Note

   159   that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a

   160   type constructor applied to $nat$.

   161 \end{description}

   162

   163 The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers

   164 ({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns

   165 ({\tt ?'a}) respectively.  If we think of them as nonterminals with

   166 predefined syntax, we may assume that all their productions have precedence

   167 $max_pri$.

   168

   169 \subsection{Logical types and default syntax}

   170

   171 Isabelle is concerned with mathematical languages which have a certain

   172 minimal vocabulary: identifiers, variables, parentheses, and the lambda

   173 calculus.  Logical types, i.e.\ those of class $logic$, are automatically

   174 equipped with this basic syntax.  More precisely, for any type constructor

   175 $ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following

   176 productions are added:

   177 \begin{center}

   178 \begin{tabular}{rclc}

   179 $ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\

   180   &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\

   181   &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\

   182 $logic$ &=& $ty$

   183 \end{tabular}

   184 \end{center}

   185

   186

   187 \section{Mixfix syntax}

   188 \index{mixfix|(}

   189

   190 We distinguish between abstract and concrete syntax.  The {\em abstract}

   191 syntax is given by the typed constants of a theory.  Abstract syntax trees are

   192 well-typed terms, i.e.\ values of \ML\ type {\tt term}.  If none of the

   193 constants are introduced with mixfix annotations, there is no concrete syntax

   194 to speak of: terms can only be abstractions or applications of the form

   195 $f(t@1,\dots,t@n)$, where $f$ is a constant or variable.  Since this notation

   196 quickly becomes unreadable, Isabelle supports syntax definitions in the form

   197 of unrestricted context-free grammars using mixfix annotations.

   198

   199 Mixfix annotations describe the {\em concrete} syntax, its translation into

   200 the abstract syntax, and a pretty-printing scheme, all in one.  Isabelle

   201 syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.

   202 Each mixfix annotation defines a precedence grammar production and associates

   203 an Isabelle constant with it.

   204

   205 A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is

   206 interpreted as a grammar pro\-duction as follows:

   207 \begin{itemize}

   208 \item $sy$ is the right-hand side of this production, specified as a {\em

   209     mixfix annotation}.  In general, $sy$ is of the form

   210   $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$, where each occurrence of

   211   \ttindex{_}'' denotes an argument/nonterminal and the strings

   212   $\alpha@i$ do not contain {\tt_}''.

   213 \item $\tau$ specifies the types of the nonterminals on the left and right

   214   hand side. If $sy$ is of the form above, $\tau$ must be of the form

   215   $[\tau@1,\dots,\tau@n] \To \tau'$.  Then argument $i$ is of type $\tau@i$

   216   and the result, i.e.\ the left-hand side of the production, is of type

   217   $\tau'$.  Both the $\tau@i$ and $\tau'$ may be function types.

   218 \item $c$ is the name of the Isabelle constant associated with this production.

   219   Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt

   220     Const($c$,dummyT\footnote{Proper types are inserted later on.  See

   221       \S\ref{Typing}})\$$a@1\$$\dots$\$$a@n}\index{*dummyT}, where a@i is    222 the term generated by parsing the i^{th} argument.    223 \item ps must be of the form [p@1,\dots,p@n], where p@i is the    224 minimal precedence\index{precedence} required of any phrase that may appear    225 as the i^{th} argument. The null list is interpreted as a list of 0's of    226 the appropriate length.    227 \item p is the precedence of this production.    228 \end{itemize}    229 Notice that there is a close connection between abstract and concrete syntax:    230 each production has an associated constant, and types act as {\bf syntactic    231 categories} in the concrete syntax. To emphasize this connection, we    232 sometimes refer to the nonterminals on the right-hand side of a production as    233 its arguments and to the nonterminal on the left-hand side as its result.    234     235 The maximal legal precedence is called \ttindexbold{max_pri}, which is    236 currently 1000. If you want to ignore precedences, the safest way to do so is    237 to use the annotation {\tt(sy)}: this production puts no precedence    238 constraints on any of its arguments and has maximal precedence itself, i.e.\     239 it is always applicable and does not exclude any productions of its    240 arguments.    241     242 \begin{example}    243 In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written    244 as follows:    245 \begin{ttbox}    246 types exp 0    247 consts "0" :: "exp" ("0" 9)    248 "+" :: "[exp,exp] => exp" ("_ + _" [0,1] 0)    249 "*" :: "[exp,exp] => exp" ("_ * _" [3,2] 2)    250 "-" :: "exp => exp" ("- _"  3)    251 \end{ttbox}    252 Parsing the string \verb!"0 + - 0 + 0"! produces the term {\tt    253 p\(p\(m\$$z$)\$$z)\$$z$} where {\tt$p =$Const("+",dummyT)},    254 {\tt$m =$Const("-",dummyT)}, and {\tt$z =$Const("0",dummyT)}.    255 \end{example}    256     257 The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf    258 meta-character}\index{meta-character} which does not represent itself but    259 an argument position. The following characters are also meta-characters:    260 \begin{ttbox}    261 ' ( ) /    262 \end{ttbox}    263 Preceding any character with a quote (\verb$'$) turns it into an ordinary    264 character. Thus you can write \verb!''! if you really want a single quote.    265 The purpose of the other meta-characters is explained in    266 \S\ref{PrettyPrinting}. Remember that in \ML\ strings \verb$\$is already a    267 (different kind of) meta-character.    268     269     270 \subsection{Types and syntactic categories *}    271     272 The precise mapping from types to syntactic categories is defined by the    273 following function:    274 \begin{eqnarray*}    275 N(\tau@1\To\tau@2) &=& fun \\    276 N((\tau@1,\dots,\tau@n)ty) &=& ty \\    277 N(\alpha) &=& logic    278 \end{eqnarray*}    279 Only the outermost type constructor is taken into account and type variables    280 can range over all logical types. This catches some ill-typed terms (like    281$Cons(x,0)$, where$Cons :: [\alpha,\alpha list] \To \alpha list$and$0 ::

   282 nat$) but leaves the real work to the type checker.    283     284 In terms of the precedence grammar format introduced in    285 \S\ref{PrecedenceGrammars}, the declaration    286 \begin{ttbox}    287 consts $$c$$ :: "[$$\tau@1$$,\dots,$$\tau@n$$]$$\To\tau$$" ("$$\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$$") [$$p@1$$,\dots,$$p@n$$] $$p$$)    288 \end{ttbox}    289 defines the production    290 $N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots    291 ~\alpha@{n-1} ~N(\tau@n)@{p@n}~ \alpha@n    292$    293     294 \subsection{Copy productions *}    295     296 Productions which do not create a new node in the abstract syntax tree are    297 called {\bf copy productions}. They must have exactly one nonterminal on    298 the right hand side. The term generated when parsing that nonterminal is    299 simply passed up as the result of parsing the whole copy production. In    300 Isabelle a copy production is indicated by an empty constant name, i.e.\ by    301 \begin{ttbox}    302 consts "" :: $$\tau$$ ($$sy$$ $$ps$$ $$p$$)    303 \end{ttbox}    304     305 A special kind of copy production is one where, modulo white space,$sy$is    306 {\tt"_"}. It is called a {\bf chain production}. Chain productions should be    307 seen as an abbreviation mechanism. Conceptually, they are removed from the    308 grammar by adding appropriate new rules. Precedence information attached to    309 chain productions is ignored. The following example demonstrates the effect:    310 the grammar defined by    311 \begin{ttbox}    312 types A,B,C 0    313 consts AB :: "B => A" ("A _"  517)    314 "" :: "C => B" ("_"  100)    315 x :: "C" ("x" 5)    316 y :: "C" ("y" 15)    317 \end{ttbox}    318 admits {\tt"A y"} but not {\tt"A x"}. Had the constant in the second    319 production been some non-empty string, both {\tt"A y"} and {\tt"A x"} would    320 be legal.    321     322 \index{mixfix|)}    323     324 \section{Lexical conventions}    325     326 The lexical analyzer distinguishes the following kinds of tokens: delimiters,    327 identifiers, unknowns, type variables and type unknowns.    328     329 Delimiters are user-defined, i.e.\ they are extracted from the syntax    330 definition. If$\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$is a mixfix    331 annotation, each$\alpha@i$is decomposed into substrings    332$\beta@1~\dots~\beta@k$which are separated by and do not contain    333 \bfindex{white space} ( = blanks, tabs, newlines). Each$\beta@j$becomes a    334 delimiter. Thus a delimiter can be an arbitrary string not containing white    335 space.    336     337 The lexical syntax of identifiers and variables ( = unknowns) is defined in    338 the introductory manual. Parsing an identifier$f$generates {\tt    339 Free($f$,dummyT)}\index{*dummyT}. Parsing a variable {\tt?}$v$generates    340 {\tt Var(($u$,$i$),dummyT)} where$i$is the integer value of the longest    341 numeric suffix of$v$(possibly$0$), and$u$is the remaining prefix.    342 Parsing a variable {\tt?}$v{.}i$generates {\tt Var(($v$,$i$),dummyT)}. The    343 following table covers the four different cases that can arise:    344 \begin{center}\tt    345 \begin{tabular}{cccc}    346 "?v" & "?v.7" & "?v5" & "?v7.5" \\    347 Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$)    348 \end{tabular}    349 \end{center}    350 where$d = {\tt dummyT}$.    351     352 In mixfix annotations, \ttindexbold{id}, \ttindexbold{var},    353 \ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of    354 identifiers, unknowns, type variables and type unknowns, respectively.    355     356     357 The lexical analyzer translates input strings to token lists by repeatedly    358 taking the maximal prefix of the input string that forms a valid token. A    359 maximal prefix that is both a delimiter and an identifier or variable (like    360 {\tt ALL}) is treated as a delimiter. White spaces are separators.    361     362 An important consequence of this translation scheme is that delimiters need    363 not be separated by white space to be recognized as separate. If \verb$"-"$    364 is a delimiter but \verb$"--"$is not, the string \verb$"--"$is treated as    365 two consecutive occurrences of \verb$"-"$. This is in contrast to \ML\ which    366 would treat \verb$"--"$as a single (undeclared) identifier. The    367 consequence of Isabelle's more liberal scheme is that the same string may be    368 parsed in a different way after extending the syntax: after adding    369 \verb$"--"$as a delimiter, the input \verb$"--"$is treated as    370 a single occurrence of \verb$"--"$.    371     372 \section{Infix operators}    373     374 {\tt Infixl} and {\tt infixr} declare infix operators which associate to the    375 left and right respectively. As in \ML, prefixing infix operators with    376 \ttindexbold{op} turns them into curried functions. Infix declarations can    377 be reduced to mixfix ones as follows:    378 \begin{center}\tt    379 \begin{tabular}{l@{~~$\Longrightarrow$~~}l}    380 "$c$" ::~$\tau$(\ttindexbold{infixl}$p$) &    381 "op$c$" ::~$\tau$("_$c$_" [$p$,$p+1$]$p$) \\    382 "$c$" ::~$\tau$(\ttindexbold{infixr}$p$) &    383 "op$c$" ::~$\tau$("_$c$_" [$p+1$,$p$]$p$)    384 \end{tabular}    385 \end{center}    386     387     388 \section{Binders}    389 A {\bf binder} is a variable-binding constant, such as a quantifier.    390 The declaration    391 \begin{ttbox}    392 consts $$c$$ :: $$\tau$$ (binder $$Q$$ $$p$$)    393 \end{ttbox}\indexbold{*binder}    394 introduces a binder$c$of type$\tau$,    395 which must have the form$(\tau@1\To\tau@2)\To\tau@3$. Its concrete syntax    396 is$Q~x.t$. A binder is like a generalized quantifier where$\tau@1$is the    397 type of the bound variable$x$,$\tau@2$the type of the body$t$, and    398$\tau@3$the type of the whole term. For example$\forall$can be declared    399 like this:    400 \begin{ttbox}    401 consts All :: "('a => o) => o" (binder "ALL " 10)    402 \end{ttbox}    403 This allows us to write$\forall x.P$either as {\tt ALL$x$.$P$} or {\tt    404 All(\%$x$.$P$)}; the latter form is for purists only.    405     406 In case$\tau@2 = \tau@3$, nested quantifications can be written as$Q~x@1

   407 \dots x@n.t$. From a syntactic point of view,    408 \begin{ttbox}    409 consts $$c$$ :: "$$(\tau@1\To\tau@2)\To\tau@3$$" (binder "$$Q$$" $$p$$)    410 \end{ttbox}    411 is equivalent to    412 \begin{ttbox}    413 consts $$c$$ :: "$$(\tau@1\To\tau@2)\To\tau@3$$"    414 "$$Q$$" :: "[idts,$$\tau@2$$] => $$\tau@3$$" ("$$Q$$_. _" $$p$$)    415 \end{ttbox}    416 where {\tt idts} is the syntactic category$idts$defined in    417 Figure~\ref{MetaLogicSyntax}.    418     419 However, there is more to binders than concrete syntax: behind the scenes the    420 body of the quantified expression has to be converted into a    421$\lambda$-abstraction (when parsing) and back again (when printing). This    422 is performed by the translation mechanism, which is discussed below. For    423 binders, the definition of the required translation functions has been    424 automated. Many other syntactic forms, such as set comprehension, require    425 special treatment.    426     427     428 \section{Parse translations *}    429 \label{Parse-translations}    430 \index{parse translation|(}    431     432 So far we have pretended that there is a close enough relationship between    433 concrete and abstract syntax to allow an automatic translation from one to    434 the other using the constant name supplied with each production. In many    435 cases this scheme is not powerful enough, especially for constructs involving    436 variable bindings. Therefore the$ML$-section of a theory definition can    437 associate constant names with user-defined translation functions by including    438 a line    439 \begin{ttbox}    440 val parse_translation = \dots    441 \end{ttbox}    442 where the right-hand side of this binding must be an \ML-expression of type    443 \verb$(string * (term list -> term))list$.    444     445 After the input string has been translated into a term according to the    446 syntax definition, there is a second phase in which the term is translated    447 using the user-supplied functions in a bottom-up manner. Given a list$tab$    448 of the above type, a term$t$is translated as follows. If$t$is not of the    449 form {\tt Const($c$,$\tau$)\$$t@1\\dots\$$t@n$}, then $t$ is returned

   450 unchanged.  Otherwise all $t@i$ are translated into $t@i'$.  Let {\tt $t' =$

   451   Const($c$,$\tau$)\$$t@1'\\dots\$$t@n'$}. If there is no pair$(c,f)$in    452$tab$, return$t'$. Otherwise apply$f$to$[t@1',\dots,t@n']$. If that    453 raises an exception, return$t'$, otherwise return the result.    454 \begin{example}\label{list-enum}    455 \ML-lists are constructed by {\tt[]} and {\tt::}. For readability the    456 list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}.    457 In Isabelle the two forms of lists are declared as follows:    458 \begin{ttbox}    459 types list 1    460 enum 0    461 arities list :: (term)term    462 consts "[]" :: "'a list" ("[]")    463 ":" :: "['a, 'a list] => 'a list" (infixr 50)    464 enum :: "enum => 'a list" ("[_]")    465 sing :: "'a => enum" ("_")    466 cons :: "['a,enum] => enum" ("_, _")    467 end    468 \end{ttbox}    469 Because \verb$::$is already used for type constraints, it is replaced by    470 \verb$:$as the infix list constructor.    471     472 In order to allow list enumeration, the new type {\tt enum} is introduced.    473 Its only purpose is syntactic and hence it does not need an arity, in    474 contrast to the logical type {\tt list}. Although \hbox{\tt[$x$,$y$,$z$]} is    475 syntactically legal, it needs to be translated into a term built up from    476 \verb$[]$and \verb$:$. This is what \verb$make_list$accomplishes:    477 \begin{ttbox}    478 val cons = Const("op :", dummyT);    479     480 fun make_list (Const("sing",_)$e) = cons $e$ Const("[]", dummyT)

   481   | make_list (Const("cons",_)$e$es) = cons $e$ make_list es;

   482 \end{ttbox}

   483 To hook this translation up to Isabelle's parser, the theory definition needs

   484 to contain the following $ML$-section:

   485 \begin{ttbox}

   486 ML

   487 fun enum_tr[enum] = make_list enum;

   488 val parse_translation = [("enum",enum_tr)]

   489 \end{ttbox}

   490 This causes \verb!Const("enum",_)$!$t$to be replaced by    491 \verb$enum_tr[$$t\verb].    492     493 Of course the definition of \verbmake_list should be included in the    494 ML-section.    495 \end{example}    496 \begin{example}\label{SET}    497 Isabelle represents the set \{ x \mid P(x) \} internally by Set(\lambda    498 x.P(x)). The internal and external forms need separate    499 constants:\footnote{In practice, the external form typically has a name    500 beginning with an {\at} sign, such as {\tt {\at}SET}. This emphasizes that    501 the constant should be used only for parsing/printing.}    502 \begin{ttbox}    503 types set 1    504 arities set :: (term)term    505 consts Set :: "('a => o) => 'a set"    506 SET :: "[id,o] => 'a set" ("\{_ | _\}")    507 \end{ttbox}    508 Parsing {\tt"\{x | P\}"} according to this syntax yields the term {\tt    509 Const("SET",dummyT) \ Free("$$x$$",dummyT) \ $$p$$}, where p is the    510 result of parsing P. What we need is the term {\tt    511 Const("Set",dummyT)\Abs("x",dummyT,p')}, where p' is some    512 abstracted'' version of p. Therefore we define a function    513 \begin{ttbox}    514 fun set_tr[Free(s,T), p] = Const("Set", dummyT)     515 Abs(s, T, abstract_over(Free(s,T), p));    516 \end{ttbox}    517 where \verbabstract_over: term*term -> term is a predefined function such    518 that {\tt abstract_over(u,t)} replaces every occurrence of u in t by    519 a {\tt Bound} variable of the correct index (i.e.\ 0 at top level). Remember    520 that {\tt dummyT} is replaced by the correct types at a later stage (see    521 \S\ref{Typing}). Function {\tt set_tr} is associated with {\tt SET} by    522 including the \ML-text    523 \begin{ttbox}    524 val parse_translation = [("SET", set_tr)];    525 \end{ttbox}    526 \end{example}    527     528 If you want to run the above examples in Isabelle, you should note that an    529 ML-section needs to contain not just a definition of    530 \verbparse_translation but also of a variable \verbprint_translation. The    531 purpose of the latter is to reverse the effect of the former during printing;    532 details are found in \S\ref{Print-translations}. Hence you need to include    533 the line    534 \begin{ttbox}    535 val print_translation = [];    536 \end{ttbox}    537 This is instructive because the terms are then printed out in their internal    538 form. For example the input \hbox{\tt[x,y,z]} is echoed as    539 \hbox{\ttx:y:z:[]}. This helps to check that your parse translation is    540 working correctly.    541     542 %\begin{warn}    543 %Explicit type constraints disappear with type checking but are still    544 %visible to the parse translation functions.    545 %\end{warn}    546     547 \index{parse translation|)}    548     549 \section{Printing}    550     551 Syntax definitions provide printing information in three distinct ways:    552 through    553 \begin{itemize}    554 \item the syntax of the language (as used for parsing),    555 \item pretty printing information, and    556 \item print translation functions.    557 \end{itemize}    558 The bare mixfix declarations enable Isabelle to print terms, but the result    559 will not necessarily be pretty and may look different from what you expected.    560 To produce a pleasing layout, you need to read the following sections.    561     562 \subsection{Printing with mixfix declarations}    563     564 Let {\ttt = Const(c,_)\$$t@1$\$\dots\$$t@n} be a term and let    565 \begin{ttbox}    566 consts $$c$$ :: $$\tau$$ ($$sy$$)    567 \end{ttbox}    568 be a mixfix declaration where sy is of the form    569 \alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n. Printing t according to    570 sy means printing the string    571 \alpha@0\beta@1\alpha@1\ldots\alpha@{n-1}\beta@n\alpha@n, where \beta@i    572 is the result of printing t@i.    573     574 Note that the system does {\em not\/} insert blanks. They should be part of    575 the mixfix syntax if they are required to separate tokens or achieve a    576 certain layout.    577     578 \subsection{Pretty printing}    579 \label{PrettyPrinting}    580 \index{pretty printing}    581     582 In order to format the output, it is possible to embed pretty printing    583 directives in mixfix annotations. These directives are ignored during parsing    584 and affect only printing. The characters {\tt(}, {\tt)} and {\tt/} are    585 interpreted as meta-characters\index{meta-character} when found in a mixfix    586 annotation. Their meaning is    587 \begin{description}    588 \item[~{\tt(}~] Open a block. A sequence of digits following it is    589 interpreted as the \bfindex{indentation} of this block. It causes the    590 output to be indented by n positions if a line break occurs within the    591 block. If {\tt(} is not followed by a digit, the indentation defaults to    592 0.    593 \item[~{\tt)}~] Close a block.    594 \item[~\ttindex{/}~] Allow a \bfindex{line break}. White space immediately    595 following {\tt/} is not printed if the line is broken at this point.    596 \end{description}    597     598 \subsection{Print translations *}    599 \label{Print-translations}    600 \index{print translation|(}    601     602 Since terms are translated after parsing (see \S\ref{Parse-translations}),    603 there is a similar mechanism to translate them back before printing.    604 Therefore the ML-section of a theory definition can associate constant    605 names with user-defined translation functions by including a line    606 \begin{ttbox}    607 val print_translation = \dots    608 \end{ttbox}    609 where the right-hand side of this binding is again an \ML-expression of type    610 \verb(string * (term list -> term))list.    611 Including a pair (c,f) in this list causes the printer to print    612 f[t@1,\dots,t@n] whenever it finds {\tt Const(c,_)\$$t@1$\$\dots\$$t@n}.    613 \begin{example}    614 Reversing the effect of the parse translation in Example~\ref{list-enum} is    615 accomplished by the following function:    616 \begin{ttbox}    617 fun make_enum (Const("op :",_)  e  es) = case es of    618 Const("[]",_) => Const("sing",dummyT)  e    619 | _ => Const("enum",dummyT)  e  make_enum es;    620 \end{ttbox}    621 It translates \hbox{\ttx:y:z:[]} to \hbox{\tt[x,y,z]}. However,    622 if the input does not terminate with an empty list, e.g.\ \hbox{\ttx:xs},    623 \verbmake_enum raises exception {\tt Match}. This signals that the    624 attempted translation has failed and the term should be printed as is.    625 The connection with Isabelle's pretty printer is established as follows:    626 \begin{ttbox}    627 fun enum_tr'[x,xs] = Const("enum",dummyT)     628 make_enum(Const("op :",dummyT)xxs);    629 val print_translation = [("op :", enum_tr')];    630 \end{ttbox}    631 This declaration causes the printer to print \hbox{\tt enum_tr'[x,y]}    632 whenever it finds \verb!Const("op :",_)!x\verb!!y.    633 \end{example}    634 \begin{example}    635 In Example~\ref{SET} we showed how to translate the concrete syntax for set    636 comprehension into the proper internal form. The string {\tt"\{x |    637 P\}"} now becomes {\tt Const("Set",_)\Abs("x",_,p)}. If, however,    638 the latter term were printed without translating it back, it would result    639 in {\tt"Set(\%x.P)"}. Therefore the abstraction has to be turned back    640 into a term that matches the concrete mixfix syntax:    641 \begin{ttbox}    642 fun set_tr'[Abs(x,T,P)] =    643 let val (x',P') = variant_abs(x,T,P)    644 in Const("SET",dummyT)  Free(x',T)  P' end;    645 \end{ttbox}    646 The function \verbvariant_abs, a basic term manipulation function, replaces    647 the bound variable x by a {\tt Free} variable x' having a unique name. A    648 term produced by {\tt set_tr'} can now be printed according to the concrete    649 syntax defined in Example~\ref{SET} above.    650     651 Notice that the application of {\tt set_tr'} fails if the second component of    652 the argument is not an abstraction, but for example just a {\tt Free}    653 variable. This is intentional because it signals to the caller that the    654 translation is inapplicable. As a result {\tt Const("Set",_)\Free("P",_)}    655 prints as {\tt"Set(P)"}.    656     657 The full theory extension, including concrete syntax and both translation    658 functions, has the following form:    659 \begin{ttbox}    660 types set 1    661 arities set :: (term)term    662 consts Set :: "('a => o) => 'a set"    663 SET :: "[id,o] => 'a set" ("\{_ | _\}")    664 end    665 ML    666 fun set_tr[Free(s,T), p] = \dots;    667 val parse_translation = [("SET", set_tr)];    668 fun set_tr'[Abs(x,T,P)] = \dots;    669 val print_translation = [("Set", set_tr')];    670 \end{ttbox}    671 \end{example}    672 As during the parse translation process, the types attached to constants    673 during print translation are ignored. Thus {\tt Const("SET",dummyT)} in    674 {\tt set_tr'} above is acceptable. The types of {\tt Free}s and {\tt Var}s    675 however must be preserved because they may get printed (see {\tt    676 show_types}).    677     678 \index{print translation|)}    679     680 \subsection{Printing a term}    681     682 Let tab be the set of all string-function pairs of print translations in the    683 current syntax.    684     685 Terms are printed recursively; print translations are applied top down:    686 \begin{itemize}    687 \item {\tt Free(x,_)} is printed as x.    688 \item {\tt Var((x,i),_)} is printed as x, if i = 0 and x does not    689 end with a digit, as x followed by i, if i \neq 0 and x does not    690 end with a digit, and as {\tt x.i}, if x ends with a digit. Thus the    691 following cases can arise:    692 \begin{center}    693 {\tt\begin{tabular}{cccc}    694 \verbVar(("v",0),_) & \verbVar(("v",7),_) & \verbVar(("v5",0),_) \\    695 "?v" & "?v7" & "?v5.0"    696 \end{tabular}}    697 \end{center}    698 \item {\tt Abs(x@1,_,Abs(x@2,_,\dots Abs(x@n,_,p)\dots))}, where p    699 is not an abstraction, is printed as {\tt \%y@1\dots y@n.P}, where P    700 is the result of printing p, and the x@i are replaced by y@i. The    701 latter are (possibly new) unique names.    702 \item {\tt Bound(i)} is printed as {\tt B.i} \footnote{The occurrence of    703 such loose'' bound variables indicates that either you are trying to    704 print a subterm of an abstraction, or there is something wrong with your    705 print translations.}.    706 \item The application {\ttt = Const(c,_)\$$t@1$\$\dots\$$t@n} (where    707 n may be 0!) is printed as follows:    708     709 If there is a pair (c,f) in tab, print f[t@1,\dots,t@n]. If this    710 application raises exception {\tt Match} or there is no pair (c,f) in    711 tab, let sy be the mixfix annotation associated with c. If there is    712 no such sy, or if sy does not have exactly n argument positions, t    713 is printed as an application; otherwise t is printed according to sy.    714     715 All other applications are printed as applications.    716 \end{itemize}    717 Printing a term {\tt c\$$t@1$\$\dots\t@n$} as an application means    718 printing it as {\tt$s$($s@1$,\dots,$s@n$)}, where$s@i$is the result of    719 printing$t@i$. If$c$is a {\tt Const},$s$is its first argument;    720 otherwise$s$is the result of printing$c$as described above.    721 \medskip    722     723 The printer also inserts parentheses where they are necessary for reasons    724 of precedence.    725     726 \section{Identifiers, constants, and type inference *}    727 \label{Typing}    728     729 There is one final step in the translation from strings to terms that we have    730 not covered yet. It explains how constants are distinguished from {\tt Free}s    731 and how {\tt Free}s and {\tt Var}s are typed. Both issues arise because {\tt    732 Free}s and {\tt Var}s are not declared.    733     734 An identifier$f$that does not appear as a delimiter in the concrete syntax    735 can be either a free variable or a constant. Since the parser knows only    736 about those constants which appear in mixfix annotations, it parses$f$as    737 {\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt    738 typ}. Although the parser produces these very raw terms, most user    739 interface level functions like {\tt goal} type terms according to the given    740 theory, say$T$. In a first step, every occurrence of {\tt Free($f$,_)} or    741 {\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is    742 a constant$f$of {\tt typ}$\tau$in$T$. This means that identifiers are    743 treated as {\tt Free}s iff they are not declared in the theory. The types of    744 the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML. Type    745 constraints can be used to remove ambiguities.    746     747 One peculiarity of the current type inference algorithm is that variables    748 with the same name must have the same type, irrespective of whether they are    749 schematic, free or bound. For example, take the first-order formula$f(x) = x

   750 \land (\forall f.~ f=f)$where${=} :: [\alpha{::}term,\alpha]\To o$and    751$\forall :: (\alpha{::}term\To o)\To o$. The first conjunct forces    752$x::\alpha{::}term$and$f::\alpha\To\alpha$, the second one    753$f::\beta{::}term$. Although the two$f$'s are distinct, they are required to    754 have the same type. Unifying$\alpha\To\alpha$and$\beta{::}term$fails    755 because, in first-order logic, function types are not in class$term$.    756     757     758 \section{Putting it all together}    759     760 Having discussed the individual building blocks of a logic definition, it    761 remains to be shown how they fit together. In particular we need to say how    762 an object-logic syntax is hooked up to the meta-logic. Since all theorems    763 must conform to the syntax for$prop$(see Figure~\ref{MetaLogicSyntax}),    764 that syntax has to be extended with the object-level syntax. Assume that the    765 syntax of your object-logic defines a category$o$of formulae. These    766 formulae can now appear in axioms and theorems wherever$prop$does if you    767 add the production    768 $prop ~=~ form.$    769 More precisely, you need a coercion from formulae to propositions:    770 \begin{ttbox}    771 Base = Pure +    772 types o 0    773 arities o :: logic    774 consts Trueprop :: "o => prop" ("_" 5)    775 end    776 \end{ttbox}    777 The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible    778 coercion function. Assuming this definition resides in a file {\tt base.thy},    779 you have to load it with the command {\tt use_thy"base"}.    780     781 One of the simplest nontrivial logics is {\em minimal logic} of    782 implication. Its definition in Isabelle needs no advanced features but    783 illustrates the overall mechanism quite nicely:    784 \begin{ttbox}    785 Hilbert = Base +    786 consts "-->" :: "[o,o] => o" (infixr 10)    787 rules    788 K "P --> Q --> P"    789 S "(P --> Q --> R) --> (P --> Q) --> P --> R"    790 MP "[| P --> Q; P |] ==> Q"    791 end    792 \end{ttbox}    793 After loading this definition you can start to prove theorems in this logic:    794 \begin{ttbox}    795 goal Hilbert.thy "P --> P";    796 {\out Level 0}    797 {\out P --> P}    798 {\out 1. P --> P}    799 by (resolve_tac [Hilbert.MP] 1);    800 {\out Level 1}    801 {\out P --> P}    802 {\out 1. ?P --> P --> P}    803 {\out 2. ?P}    804 by (resolve_tac [Hilbert.MP] 1);    805 {\out Level 2}    806 {\out P --> P}    807 {\out 1. ?P1 --> ?P --> P --> P}    808 {\out 2. ?P1}    809 {\out 3. ?P}    810 by (resolve_tac [Hilbert.S] 1);    811 {\out Level 3}    812 {\out P --> P}    813 {\out 1. P --> ?Q2 --> P}    814 {\out 2. P --> ?Q2}    815 by (resolve_tac [Hilbert.K] 1);    816 {\out Level 4}    817 {\out P --> P}    818 {\out 1. P --> ?Q2}    819 by (resolve_tac [Hilbert.K] 1);    820 {\out Level 5}    821 {\out P --> P}    822 {\out No subgoals!}    823 \end{ttbox}    824 As you can see, this Hilbert-style formulation of minimal logic is easy to    825 define but difficult to use. The following natural deduction formulation is    826 far preferable:    827 \begin{ttbox}    828 MinI = Base +    829 consts "-->" :: "[o,o] => o" (infixr 10)    830 rules    831 impI "(P ==> Q) ==> P --> Q"    832 impE "[| P --> Q; P |] ==> Q"    833 end    834 \end{ttbox}    835 Note, however, that although the two systems are equivalent, this fact cannot    836 be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI\$

   837 (exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!.  The reason

   838 is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,

   839 something that can only be shown by induction over all possible proofs in

   840 \verb!Hilbert!.

   841

   842 It is a very simple matter to extend minimal logic with falsity:

   843 \begin{ttbox}

   844 MinIF = MinI +

   845 consts False :: "o"

   846 rules

   847 FalseE  "False ==> P"

   848 end

   849 \end{ttbox}

   850 On the other hand, we may wish to introduce conjunction only:

   851 \begin{ttbox}

   852 MinC = Base +

   853 consts "&" :: "[o,o] => o"  (infixr 30)

   854 rules

   855 conjI  "[| P; Q |] ==> P & Q"

   856 conjE1 "P & Q ==> P"

   857 conjE2 "P & Q ==> Q"

   858 end

   859 \end{ttbox}

   860 And if we want to have all three connectives together, we define:

   861 \begin{ttbox}

   862 MinIFC = MinIF + MinC

   863 \end{ttbox}

   864 Now we can prove mixed theorems like

   865 \begin{ttbox}

   866 goal MinIFC.thy "P & False --> Q";

   867 by (resolve_tac [MinI.impI] 1);

   868 by (dresolve_tac [MinC.conjE2] 1);

   869 by (eresolve_tac [MinIF.FalseE] 1);

   870 \end{ttbox}

   871 Try this as an exercise!