| 104 |      1 | \chapter{Defining Logics} \label{Defining-Logics}
 | 
|  |      2 | This chapter is intended for Isabelle experts.  It explains how to define new
 | 
|  |      3 | logical systems, Isabelle's {\it raison d'\^etre}.  Isabelle logics are
 | 
|  |      4 | hierarchies of theories.  A number of simple examples are contained in the
 | 
|  |      5 | introductory manual; the full syntax of theory definitions is shown in the
 | 
|  |      6 | {\em Reference Manual}.  The purpose of this chapter is to explain the
 | 
|  |      7 | remaining subtleties, especially some context conditions on the class
 | 
|  |      8 | structure and the definition of new mixfix syntax.  A full understanding of
 | 
|  |      9 | the material requires knowledge of the internal representation of terms (data
 | 
|  |     10 | type {\tt term}) as detailed in the {\em Reference Manual}.  Sections marked
 | 
|  |     11 | with a * can be skipped on first reading.
 | 
|  |     12 | 
 | 
|  |     13 | 
 | 
|  |     14 | \section{Classes and Types *}
 | 
|  |     15 | \index{*arities!context conditions}
 | 
|  |     16 | 
 | 
|  |     17 | Type declarations are subject to the following two well-formedness
 | 
|  |     18 | conditions:
 | 
|  |     19 | \begin{itemize}
 | 
|  |     20 | \item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$
 | 
|  |     21 |   with $\vec{r} \neq \vec{s}$.  For example
 | 
|  |     22 | \begin{ttbox}
 | 
|  |     23 | types ty 1
 | 
|  |     24 | arities ty :: (\{logic\}) logic
 | 
|  |     25 |         ty :: (\{\})logic
 | 
|  |     26 | \end{ttbox}
 | 
|  |     27 | leads to an error message and fails.
 | 
|  |     28 | \item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::
 | 
|  |     29 |   (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold
 | 
|  |     30 |   for $i=1,\dots,n$.  The relationship $\preceq$, defined as
 | 
|  |     31 | \[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]
 | 
|  |     32 | expresses that the set of types represented by $s'$ is a subset of the set of
 | 
|  |     33 | types represented by $s$.  For example
 | 
|  |     34 | \begin{ttbox}
 | 
|  |     35 | classes term < logic
 | 
|  |     36 | types ty 1
 | 
|  |     37 | arities ty :: (\{logic\})logic
 | 
|  |     38 |         ty :: (\{\})term
 | 
|  |     39 | \end{ttbox}
 | 
|  |     40 | leads to an error message and fails.
 | 
|  |     41 | \end{itemize}
 | 
|  |     42 | These conditions guarantee principal types~\cite{nipkow-prehofer}.
 | 
|  |     43 | 
 | 
|  |     44 | \section{Precedence Grammars}
 | 
|  |     45 | \label{PrecedenceGrammars}
 | 
|  |     46 | \index{precedence grammar|(}
 | 
|  |     47 | 
 | 
|  |     48 | The precise syntax of a logic is best defined by a context-free grammar.
 | 
|  |     49 | These grammars obey the following conventions: identifiers denote
 | 
|  |     50 | nonterminals, {\tt typewriter} fount denotes terminals, repetition is
 | 
|  |     51 | indicated by \dots, and alternatives are separated by $|$.
 | 
|  |     52 | 
 | 
|  |     53 | In order to simplify the description of mathematical languages, we introduce
 | 
|  |     54 | an extended format which permits {\bf precedences}\index{precedence}.  This
 | 
|  |     55 | scheme generalizes precedence declarations in \ML\ and {\sc prolog}.  In this
 | 
|  |     56 | extended grammar format, nonterminals are decorated by integers, their
 | 
|  |     57 | precedence.  In the sequel, precedences are shown as subscripts.  A nonterminal
 | 
|  |     58 | $A@p$ on the right-hand side of a production may only be replaced using a
 | 
|  |     59 | production $A@q = \gamma$ where $p \le q$.
 | 
|  |     60 | 
 | 
|  |     61 | Formally, a set of context free productions $G$ induces a derivation
 | 
|  |     62 | relation $\rew@G$ on strings as follows:
 | 
|  |     63 | \[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~
 | 
|  |     64 |    \exists q \ge p.~(A@q=\gamma) \in G
 | 
|  |     65 | \]
 | 
|  |     66 | Any extended grammar of this kind can be translated into a normal context
 | 
|  |     67 | free grammar.  However, this translation may require the introduction of a
 | 
|  |     68 | large number of new nonterminals and productions.
 | 
|  |     69 | 
 | 
|  |     70 | \begin{example}
 | 
|  |     71 | \label{PrecedenceEx}
 | 
|  |     72 | The following simple grammar for arithmetic expressions demonstrates how
 | 
|  |     73 | binding power and associativity of operators can be enforced by precedences.
 | 
|  |     74 | \begin{center}
 | 
|  |     75 | \begin{tabular}{rclr}
 | 
|  |     76 | $A@9$ & = & {\tt0} \\
 | 
|  |     77 | $A@9$ & = & {\tt(} $A@0$ {\tt)} \\
 | 
|  |     78 | $A@0$ & = & $A@0$ {\tt+} $A@1$ \\
 | 
|  |     79 | $A@2$ & = & $A@3$ {\tt*} $A@2$ \\
 | 
|  |     80 | $A@3$ & = & {\tt-} $A@3$
 | 
|  |     81 | \end{tabular}
 | 
|  |     82 | \end{center}
 | 
|  |     83 | The choice of precedences determines that \verb$-$ binds tighter than
 | 
|  |     84 | \verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$
 | 
|  |     85 | associate to the left and right, respectively.
 | 
|  |     86 | \end{example}
 | 
|  |     87 | 
 | 
|  |     88 | To minimize the number of subscripts, we adopt the following conventions:
 | 
|  |     89 | \begin{itemize}
 | 
|  |     90 | \item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for
 | 
|  |     91 |   some fixed $max_pri$.
 | 
|  |     92 | \item precedence $0$ on the right-hand side and precedence $max_pri$ on the
 | 
|  |     93 |   left-hand side may be omitted.
 | 
|  |     94 | \end{itemize}
 | 
|  |     95 | In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.
 | 
|  |     96 | 
 | 
|  |     97 | Using these conventions and assuming $max_pri=9$, the grammar in
 | 
|  |     98 | Example~\ref{PrecedenceEx} becomes
 | 
|  |     99 | \begin{center}
 | 
|  |    100 | \begin{tabular}{rclc}
 | 
|  |    101 | $A$ & = & {\tt0} & \hspace*{4em} \\
 | 
|  |    102 |  & $|$ & {\tt(} $A$ {\tt)} \\
 | 
|  |    103 |  & $|$ & $A$ {\tt+} $A@1$ & (0) \\
 | 
|  |    104 |  & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\
 | 
|  |    105 |  & $|$ & {\tt-} $A@3$ & (3)
 | 
|  |    106 | \end{tabular}
 | 
|  |    107 | \end{center}
 | 
|  |    108 | 
 | 
|  |    109 | \index{precedence grammar|)}
 | 
|  |    110 | 
 | 
|  |    111 | \section{Basic syntax *}
 | 
|  |    112 | 
 | 
|  |    113 | An informal account of most of Isabelle's syntax (meta-logic, types etc) is
 | 
|  |    114 | contained in {\em Introduction to Isabelle}.  A precise description using a
 | 
|  |    115 | precedence grammar is shown in Figure~\ref{MetaLogicSyntax}.  This description
 | 
|  |    116 | is the basis of all extensions by object-logics.
 | 
|  |    117 | \begin{figure}[htb]
 | 
|  |    118 | \begin{center}
 | 
|  |    119 | \begin{tabular}{rclc}
 | 
|  |    120 | $prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\
 | 
|  |    121 |      &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\
 | 
|  |    122 |      &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\
 | 
|  |    123 |      &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\
 | 
|  |    124 |      &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\
 | 
|  |    125 | $logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\
 | 
|  |    126 | $aprop$ &=& $id$ ~~$|$~~ $var$
 | 
|  |    127 |     ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\
 | 
|  |    128 | $fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\
 | 
|  |    129 |     &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\
 | 
|  |    130 | $idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\
 | 
|  |    131 | $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
 | 
|  |    132 |     &$|$& $id$ \ttindex{::} $type$ & (0) \\\\
 | 
|  |    133 | $type$ &=& $tfree$ ~~$|$~~ $tvar$ \\
 | 
|  |    134 |      &$|$& $tfree$ {\tt::} $sort$ ~~$|$~~ $tvar$ {\tt::} $sort$ \\
 | 
|  |    135 |      &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$
 | 
|  |    136 |                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
 | 
|  |    137 |      &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\
 | 
|  |    138 |      &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\
 | 
|  |    139 |      &$|$& {\tt(} $type$ {\tt)} \\\\
 | 
|  |    140 | $sort$ &=& $id$ ~~$|$~~ {\tt\{\}}
 | 
|  |    141 |                 ~~$|$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}} 
 | 
|  |    142 | \end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}
 | 
|  |    143 | \indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}
 | 
|  |    144 | \indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}
 | 
|  |    145 | \end{center}
 | 
|  |    146 | \caption{Meta-Logic Syntax}
 | 
|  |    147 | \label{MetaLogicSyntax}
 | 
|  |    148 | \end{figure}
 | 
|  |    149 | The following main categories are defined:
 | 
|  |    150 | \begin{description}
 | 
|  |    151 | \item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.
 | 
|  |    152 | \item[$aprop$] Atomic propositions.
 | 
|  |    153 | \item[$logic$] Terms of types in class $logic$.  Initially, $logic$ contains
 | 
|  |    154 |   merely $prop$.  As the syntax is extended by new object-logics, more
 | 
|  |    155 |   productions for $logic$ are added (see below).
 | 
|  |    156 | \item[$fun$] Terms potentially of function type.
 | 
|  |    157 | \item[$type$] Types.
 | 
|  |    158 | \item[$idts$] a list of identifiers, possibly constrained by types.  Note
 | 
|  |    159 |   that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a
 | 
|  |    160 |   type constructor applied to $nat$.
 | 
|  |    161 | \end{description}
 | 
|  |    162 | 
 | 
|  |    163 | The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers
 | 
|  |    164 | ({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns
 | 
|  |    165 | ({\tt ?'a}) respectively.  If we think of them as nonterminals with
 | 
|  |    166 | predefined syntax, we may assume that all their productions have precedence
 | 
|  |    167 | $max_pri$.
 | 
|  |    168 | 
 | 
|  |    169 | \subsection{Logical types and default syntax}
 | 
|  |    170 | 
 | 
|  |    171 | Isabelle is concerned with mathematical languages which have a certain
 | 
|  |    172 | minimal vocabulary: identifiers, variables, parentheses, and the lambda
 | 
|  |    173 | calculus.  Logical types, i.e.\ those of class $logic$, are automatically
 | 
|  |    174 | equipped with this basic syntax.  More precisely, for any type constructor
 | 
|  |    175 | $ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following
 | 
|  |    176 | productions are added:
 | 
|  |    177 | \begin{center}
 | 
|  |    178 | \begin{tabular}{rclc}
 | 
|  |    179 | $ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\
 | 
|  |    180 |   &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\
 | 
|  |    181 |   &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\
 | 
|  |    182 | $logic$ &=& $ty$
 | 
|  |    183 | \end{tabular}
 | 
|  |    184 | \end{center}
 | 
|  |    185 | 
 | 
|  |    186 | 
 | 
|  |    187 | \section{Mixfix syntax}
 | 
|  |    188 | \index{mixfix|(}
 | 
|  |    189 | 
 | 
|  |    190 | We distinguish between abstract and concrete syntax.  The {\em abstract}
 | 
|  |    191 | syntax is given by the typed constants of a theory.  Abstract syntax trees are
 | 
|  |    192 | well-typed terms, i.e.\ values of \ML\ type {\tt term}.  If none of the
 | 
|  |    193 | constants are introduced with mixfix annotations, there is no concrete syntax
 | 
|  |    194 | to speak of: terms can only be abstractions or applications of the form
 | 
|  |    195 | $f(t@1,\dots,t@n)$, where $f$ is a constant or variable.  Since this notation
 | 
|  |    196 | quickly becomes unreadable, Isabelle supports syntax definitions in the form
 | 
|  |    197 | of unrestricted context-free grammars using mixfix annotations.
 | 
|  |    198 | 
 | 
|  |    199 | Mixfix annotations describe the {\em concrete} syntax, its translation into
 | 
|  |    200 | the abstract syntax, and a pretty-printing scheme, all in one.  Isabelle
 | 
|  |    201 | syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.
 | 
|  |    202 | Each mixfix annotation defines a precedence grammar production and associates
 | 
|  |    203 | an Isabelle constant with it.
 | 
|  |    204 | 
 | 
|  |    205 | A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is
 | 
|  |    206 | interpreted as a grammar pro\-duction as follows:
 | 
|  |    207 | \begin{itemize}
 | 
|  |    208 | \item $sy$ is the right-hand side of this production, specified as a {\em
 | 
|  |    209 |     mixfix annotation}.  In general, $sy$ is of the form
 | 
|  |    210 |   $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$, where each occurrence of
 | 
|  |    211 |   ``\ttindex{_}'' denotes an argument/nonterminal and the strings
 | 
|  |    212 |   $\alpha@i$ do not contain ``{\tt_}''.
 | 
|  |    213 | \item $\tau$ specifies the types of the nonterminals on the left and right
 | 
|  |    214 |   hand side. If $sy$ is of the form above, $\tau$ must be of the form
 | 
|  |    215 |   $[\tau@1,\dots,\tau@n] \To \tau'$.  Then argument $i$ is of type $\tau@i$
 | 
|  |    216 |   and the result, i.e.\ the left-hand side of the production, is of type
 | 
|  |    217 |   $\tau'$.  Both the $\tau@i$ and $\tau'$ may be function types.
 | 
|  |    218 | \item $c$ is the name of the Isabelle constant associated with this production.
 | 
|  |    219 |   Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt
 | 
|  |    220 |     Const($c$,dummyT\footnote{Proper types are inserted later on.  See
 | 
|  |    221 |       \S\ref{Typing}})\$$a@1$\$$\dots$\$$a@n$}\index{*dummyT}, where $a@i$ is
 | 
|  |    222 |   the term generated by parsing the $i^{th}$ argument.
 | 
|  |    223 | \item $ps$ must be of the form $[p@1,\dots,p@n]$, where $p@i$ is the
 | 
|  |    224 |   minimal precedence\index{precedence} required of any phrase that may appear
 | 
|  |    225 |   as the $i^{th}$ argument.  The null list is interpreted as a list of 0's of
 | 
|  |    226 |   the appropriate length.
 | 
|  |    227 | \item $p$ is the precedence of this production.
 | 
|  |    228 | \end{itemize}
 | 
|  |    229 | Notice that there is a close connection between abstract and concrete syntax:
 | 
|  |    230 | each production has an associated constant, and types act as {\bf syntactic
 | 
|  |    231 |   categories} in the concrete syntax.  To emphasize this connection, we
 | 
|  |    232 | sometimes refer to the nonterminals on the right-hand side of a production as
 | 
|  |    233 | its arguments and to the nonterminal on the left-hand side as its result.
 | 
|  |    234 | 
 | 
|  |    235 | The maximal legal precedence is called \ttindexbold{max_pri}, which is
 | 
|  |    236 | currently 1000.  If you want to ignore precedences, the safest way to do so is
 | 
|  |    237 | to use the annotation {\tt($sy$)}: this production puts no precedence
 | 
|  |    238 | constraints on any of its arguments and has maximal precedence itself, i.e.\ 
 | 
|  |    239 | it is always applicable and does not exclude any productions of its
 | 
|  |    240 | arguments.
 | 
|  |    241 | 
 | 
|  |    242 | \begin{example}
 | 
|  |    243 | In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written
 | 
|  |    244 | as follows:
 | 
|  |    245 | \begin{ttbox}
 | 
|  |    246 | types exp 0
 | 
|  |    247 | consts "0"  ::              "exp"  ("0" 9)
 | 
|  |    248 |        "+"  :: "[exp,exp] => exp"  ("_ + _" [0,1] 0)
 | 
|  |    249 |        "*"  :: "[exp,exp] => exp"  ("_ * _" [3,2] 2)
 | 
|  |    250 |        "-"  ::       "exp => exp"  ("- _"   [3]   3)
 | 
|  |    251 | \end{ttbox}
 | 
|  |    252 | Parsing the string \verb!"0 + - 0 + 0"! produces the term {\tt
 | 
|  |    253 |   $p$\$($p$\$($m$\$$z$)\$$z$)\$$z$} where {\tt$p =$ Const("+",dummyT)},
 | 
|  |    254 | {\tt$m =$ Const("-",dummyT)}, and {\tt$z =$ Const("0",dummyT)}.
 | 
|  |    255 | \end{example}
 | 
|  |    256 | 
 | 
|  |    257 | The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf
 | 
|  |    258 |   meta-character}\index{meta-character} which does not represent itself but
 | 
|  |    259 | an argument position.  The following characters are also meta-characters:
 | 
|  |    260 | \begin{ttbox}
 | 
|  |    261 | '   (   )   /
 | 
|  |    262 | \end{ttbox}
 | 
|  |    263 | Preceding any character with a quote (\verb$'$) turns it into an ordinary
 | 
|  |    264 | character.  Thus you can write \verb!''! if you really want a single quote.
 | 
|  |    265 | The purpose of the other meta-characters is explained in
 | 
|  |    266 | \S\ref{PrettyPrinting}.  Remember that in \ML\ strings \verb$\$ is already a
 | 
|  |    267 | (different kind of) meta-character.
 | 
|  |    268 | 
 | 
|  |    269 | 
 | 
|  |    270 | \subsection{Types and syntactic categories *}
 | 
|  |    271 | 
 | 
|  |    272 | The precise mapping from types to syntactic categories is defined by the
 | 
|  |    273 | following function:
 | 
|  |    274 | \begin{eqnarray*}
 | 
|  |    275 | N(\tau@1\To\tau@2) &=& fun \\
 | 
|  |    276 | N((\tau@1,\dots,\tau@n)ty) &=& ty \\
 | 
|  |    277 | N(\alpha) &=& logic
 | 
|  |    278 | \end{eqnarray*}
 | 
|  |    279 | Only the outermost type constructor is taken into account and type variables
 | 
|  |    280 | can range over all logical types.  This catches some ill-typed terms (like
 | 
|  |    281 | $Cons(x,0)$, where $Cons :: [\alpha,\alpha list] \To \alpha list$ and $0 ::
 | 
|  |    282 | nat$) but leaves the real work to the type checker.
 | 
|  |    283 | 
 | 
|  |    284 | In terms of the precedence grammar format introduced in
 | 
|  |    285 | \S\ref{PrecedenceGrammars}, the declaration
 | 
|  |    286 | \begin{ttbox}
 | 
|  |    287 | consts \(c\) :: "[\(\tau@1\),\dots,\(\tau@n\)]\(\To\tau\)" ("\(\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n\)") [\(p@1\),\dots,\(p@n\)] \(p\))
 | 
|  |    288 | \end{ttbox}
 | 
|  |    289 | defines the production
 | 
|  |    290 | \[ N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots
 | 
|  |    291 |                   ~\alpha@{n-1} ~N(\tau@n)@{p@n}~ \alpha@n
 | 
|  |    292 | \]
 | 
|  |    293 | 
 | 
|  |    294 | \subsection{Copy productions *}
 | 
|  |    295 | 
 | 
|  |    296 | Productions which do not create a new node in the abstract syntax tree are
 | 
|  |    297 | called {\bf copy productions}.  They must have exactly one nonterminal on
 | 
|  |    298 | the right hand side.  The term generated when parsing that nonterminal is
 | 
|  |    299 | simply passed up as the result of parsing the whole copy production.  In
 | 
|  |    300 | Isabelle a copy production is indicated by an empty constant name, i.e.\ by
 | 
|  |    301 | \begin{ttbox}
 | 
|  |    302 | consts "" :: \(\tau\)  (\(sy\) \(ps\) \(p\))
 | 
|  |    303 | \end{ttbox}
 | 
|  |    304 | 
 | 
|  |    305 | A special kind of copy production is one where, modulo white space, $sy$ is
 | 
|  |    306 | {\tt"_"}.  It is called a {\bf chain production}.  Chain productions should be
 | 
|  |    307 | seen as an abbreviation mechanism.  Conceptually, they are removed from the
 | 
|  |    308 | grammar by adding appropriate new rules.  Precedence information attached to
 | 
|  |    309 | chain productions is ignored.  The following example demonstrates the effect:
 | 
|  |    310 | the grammar defined by
 | 
|  |    311 | \begin{ttbox}
 | 
|  |    312 | types A,B,C 0
 | 
|  |    313 | consts AB :: "B => A"  ("A _" [10] 517)
 | 
|  |    314 |        "" :: "C => B"  ("_"   [0]  100)
 | 
|  |    315 |        x  :: "C"       ("x"          5)
 | 
|  |    316 |        y  :: "C"       ("y"         15)
 | 
|  |    317 | \end{ttbox}
 | 
|  |    318 | admits {\tt"A y"} but not {\tt"A x"}.  Had the constant in the second
 | 
|  |    319 | production been some non-empty string, both {\tt"A y"} and {\tt"A x"} would
 | 
|  |    320 | be legal.
 | 
|  |    321 | 
 | 
|  |    322 | \index{mixfix|)}
 | 
|  |    323 | 
 | 
|  |    324 | \section{Lexical conventions}
 | 
|  |    325 | 
 | 
|  |    326 | The lexical analyzer distinguishes the following kinds of tokens: delimiters,
 | 
|  |    327 | identifiers, unknowns, type variables and type unknowns.
 | 
|  |    328 | 
 | 
|  |    329 | Delimiters are user-defined, i.e.\ they are extracted from the syntax
 | 
|  |    330 | definition.  If $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$ is a mixfix
 | 
|  |    331 | annotation, each $\alpha@i$ is decomposed into substrings
 | 
|  |    332 | $\beta@1~\dots~\beta@k$ which are separated by and do not contain
 | 
|  |    333 | \bfindex{white space} ( = blanks, tabs, newlines).  Each $\beta@j$ becomes a
 | 
|  |    334 | delimiter.  Thus a delimiter can be an arbitrary string not containing white
 | 
|  |    335 | space.
 | 
|  |    336 | 
 | 
|  |    337 | The lexical syntax of identifiers and variables ( = unknowns) is defined in
 | 
|  |    338 | the introductory manual.  Parsing an identifier $f$ generates {\tt
 | 
|  |    339 |   Free($f$,dummyT)}\index{*dummyT}.  Parsing a variable {\tt?}$v$ generates
 | 
|  |    340 | {\tt Var(($u$,$i$),dummyT)} where $i$ is the integer value of the longest
 | 
|  |    341 | numeric suffix of $v$ (possibly $0$), and $u$ is the remaining prefix.
 | 
|  |    342 | Parsing a variable {\tt?}$v{.}i$ generates {\tt Var(($v$,$i$),dummyT)}.  The
 | 
|  |    343 | following table covers the four different cases that can arise:
 | 
|  |    344 | \begin{center}\tt
 | 
|  |    345 | \begin{tabular}{cccc}
 | 
|  |    346 | "?v" & "?v.7" & "?v5" & "?v7.5" \\
 | 
|  |    347 | Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$)
 | 
|  |    348 | \end{tabular}
 | 
|  |    349 | \end{center}
 | 
|  |    350 | where $d = {\tt dummyT}$.
 | 
|  |    351 | 
 | 
|  |    352 | In mixfix annotations, \ttindexbold{id}, \ttindexbold{var},
 | 
|  |    353 | \ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of
 | 
|  |    354 | identifiers, unknowns, type variables and type unknowns, respectively.
 | 
|  |    355 | 
 | 
|  |    356 | 
 | 
|  |    357 | The lexical analyzer translates input strings to token lists by repeatedly
 | 
|  |    358 | taking the maximal prefix of the input string that forms a valid token.  A
 | 
|  |    359 | maximal prefix that is both a delimiter and an identifier or variable (like
 | 
|  |    360 | {\tt ALL}) is treated as a delimiter.  White spaces are separators.
 | 
|  |    361 | 
 | 
|  |    362 | An important consequence of this translation scheme is that delimiters need
 | 
|  |    363 | not be separated by white space to be recognized as separate.  If \verb$"-"$
 | 
|  |    364 | is a delimiter but \verb$"--"$ is not, the string \verb$"--"$ is treated as
 | 
|  |    365 | two consecutive occurrences of \verb$"-"$.  This is in contrast to \ML\ which
 | 
|  |    366 | would treat \verb$"--"$ as a single (undeclared) identifier.  The
 | 
|  |    367 | consequence of Isabelle's more liberal scheme is that the same string may be
 | 
|  |    368 | parsed in a different way after extending the syntax: after adding
 | 
|  |    369 | \verb$"--"$ as a delimiter, the input \verb$"--"$ is treated as
 | 
|  |    370 | a single occurrence of \verb$"--"$.
 | 
|  |    371 | 
 | 
|  |    372 | \section{Infix operators}
 | 
|  |    373 | 
 | 
|  |    374 | {\tt Infixl} and {\tt infixr} declare infix operators which associate to the
 | 
|  |    375 | left and right respectively.  As in \ML, prefixing infix operators with
 | 
|  |    376 | \ttindexbold{op} turns them into curried functions.  Infix declarations can
 | 
|  |    377 | be reduced to mixfix ones as follows:
 | 
|  |    378 | \begin{center}\tt
 | 
|  |    379 | \begin{tabular}{l@{~~$\Longrightarrow$~~}l}
 | 
|  |    380 | "$c$" ::~$\tau$ (\ttindexbold{infixl} $p$) &
 | 
|  |    381 | "op $c$" ::~$\tau$ ("_ $c$ _" [$p$,$p+1$] $p$) \\
 | 
|  |    382 | "$c$" ::~$\tau$ (\ttindexbold{infixr} $p$) &
 | 
|  |    383 | "op $c$" ::~$\tau$ ("_ $c$ _" [$p+1$,$p$] $p$)
 | 
|  |    384 | \end{tabular}
 | 
|  |    385 | \end{center}
 | 
|  |    386 | 
 | 
|  |    387 | 
 | 
|  |    388 | \section{Binders}
 | 
|  |    389 | A {\bf binder} is a variable-binding constant, such as a quantifier.
 | 
|  |    390 | The declaration
 | 
|  |    391 | \begin{ttbox}
 | 
|  |    392 | consts \(c\) :: \(\tau\)  (binder \(Q\) \(p\))
 | 
|  |    393 | \end{ttbox}\indexbold{*binder}
 | 
|  |    394 | introduces a binder $c$ of type $\tau$,
 | 
|  |    395 | which must have the form $(\tau@1\To\tau@2)\To\tau@3$.  Its concrete syntax
 | 
|  |    396 | is $Q~x.t$.  A binder is like a generalized quantifier where $\tau@1$ is the
 | 
|  |    397 | type of the bound variable $x$, $\tau@2$ the type of the body $t$, and
 | 
|  |    398 | $\tau@3$ the type of the whole term.  For example $\forall$ can be declared
 | 
|  |    399 | like this:
 | 
|  |    400 | \begin{ttbox}
 | 
|  |    401 | consts All :: "('a => o) => o"  (binder "ALL " 10)
 | 
|  |    402 | \end{ttbox}
 | 
|  |    403 | This allows us to write $\forall x.P$ either as {\tt ALL $x$.$P$} or {\tt
 | 
|  |    404 |   All(\%$x$.$P$)}; the latter form is for purists only.
 | 
|  |    405 | 
 | 
|  |    406 | In case $\tau@2 = \tau@3$, nested quantifications can be written as $Q~x@1
 | 
|  |    407 | \dots x@n.t$.  From a syntactic point of view,
 | 
|  |    408 | \begin{ttbox}
 | 
|  |    409 | consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)"  (binder "\(Q\)" \(p\))
 | 
|  |    410 | \end{ttbox}
 | 
|  |    411 | is equivalent to
 | 
|  |    412 | \begin{ttbox}
 | 
|  |    413 | consts \(c\)   :: "\((\tau@1\To\tau@2)\To\tau@3\)"
 | 
|  |    414 |        "\(Q\)" :: "[idts,\(\tau@2\)] => \(\tau@3\)"  ("\(Q\)_.  _" \(p\))
 | 
|  |    415 | \end{ttbox}
 | 
|  |    416 | where {\tt idts} is the syntactic category $idts$ defined in
 | 
|  |    417 | Figure~\ref{MetaLogicSyntax}.
 | 
|  |    418 | 
 | 
|  |    419 | However, there is more to binders than concrete syntax: behind the scenes the
 | 
|  |    420 | body of the quantified expression has to be converted into a
 | 
|  |    421 | $\lambda$-abstraction (when parsing) and back again (when printing).  This
 | 
|  |    422 | is performed by the translation mechanism, which is discussed below.  For
 | 
|  |    423 | binders, the definition of the required translation functions has been
 | 
|  |    424 | automated.  Many other syntactic forms, such as set comprehension, require
 | 
|  |    425 | special treatment.
 | 
|  |    426 | 
 | 
|  |    427 | 
 | 
|  |    428 | \section{Parse translations *}
 | 
|  |    429 | \label{Parse-translations}
 | 
|  |    430 | \index{parse translation|(}
 | 
|  |    431 | 
 | 
|  |    432 | So far we have pretended that there is a close enough relationship between
 | 
|  |    433 | concrete and abstract syntax to allow an automatic translation from one to
 | 
|  |    434 | the other using the constant name supplied with each production.  In many
 | 
|  |    435 | cases this scheme is not powerful enough, especially for constructs involving
 | 
|  |    436 | variable bindings.  Therefore the $ML$-section of a theory definition can
 | 
|  |    437 | associate constant names with user-defined translation functions by including
 | 
|  |    438 | a line
 | 
|  |    439 | \begin{ttbox}
 | 
|  |    440 | val parse_translation = \dots
 | 
|  |    441 | \end{ttbox}
 | 
|  |    442 | where the right-hand side of this binding must be an \ML-expression of type
 | 
|  |    443 | \verb$(string * (term list -> term))list$.
 | 
|  |    444 | 
 | 
|  |    445 | After the input string has been translated into a term according to the
 | 
|  |    446 | syntax definition, there is a second phase in which the term is translated
 | 
|  |    447 | using the user-supplied functions in a bottom-up manner.  Given a list $tab$
 | 
|  |    448 | of the above type, a term $t$ is translated as follows.  If $t$ is not of the
 | 
|  |    449 | form {\tt Const($c$,$\tau$)\$$t@1$\$\dots\$$t@n$}, then $t$ is returned
 | 
|  |    450 | unchanged.  Otherwise all $t@i$ are translated into $t@i'$.  Let {\tt $t' =$
 | 
|  |    451 |   Const($c$,$\tau$)\$$t@1'$\$\dots\$$t@n'$}.  If there is no pair $(c,f)$ in
 | 
|  |    452 | $tab$, return $t'$.  Otherwise apply $f$ to $[t@1',\dots,t@n']$.  If that
 | 
|  |    453 | raises an exception, return $t'$, otherwise return the result.
 | 
|  |    454 | \begin{example}\label{list-enum}
 | 
|  |    455 | \ML-lists are constructed by {\tt[]} and {\tt::}.  For readability the
 | 
|  |    456 | list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}.
 | 
|  |    457 | In Isabelle the two forms of lists are declared as follows:
 | 
|  |    458 | \begin{ttbox}
 | 
|  |    459 | types list 1
 | 
|  |    460 |       enum 0
 | 
|  |    461 | arities list :: (term)term
 | 
|  |    462 | consts "[]" :: "'a list"                   ("[]")
 | 
|  |    463 |        ":"  :: "['a, 'a list] => 'a list"  (infixr 50)
 | 
|  |    464 |        enum :: "enum => 'a list"           ("[_]")
 | 
|  |    465 |        sing :: "'a => enum"                ("_")
 | 
|  |    466 |        cons :: "['a,enum] => enum"         ("_, _")
 | 
|  |    467 | end
 | 
|  |    468 | \end{ttbox}
 | 
|  |    469 | Because \verb$::$ is already used for type constraints, it is replaced by
 | 
|  |    470 | \verb$:$ as the infix list constructor.
 | 
|  |    471 | 
 | 
|  |    472 | In order to allow list enumeration, the new type {\tt enum} is introduced.
 | 
|  |    473 | Its only purpose is syntactic and hence it does not need an arity, in
 | 
|  |    474 | contrast to the logical type {\tt list}.  Although \hbox{\tt[$x$,$y$,$z$]} is
 | 
|  |    475 | syntactically legal, it needs to be translated into a term built up from
 | 
|  |    476 | \verb$[]$ and \verb$:$.  This is what \verb$make_list$ accomplishes:
 | 
|  |    477 | \begin{ttbox}
 | 
|  |    478 | val cons = Const("op :", dummyT);
 | 
|  |    479 | 
 | 
|  |    480 | fun make_list (Const("sing",_)$e) = cons $ e $ Const("[]", dummyT)
 | 
|  |    481 |   | make_list (Const("cons",_)$e$es) = cons $ e $ make_list es;
 | 
|  |    482 | \end{ttbox}
 | 
|  |    483 | To hook this translation up to Isabelle's parser, the theory definition needs
 | 
|  |    484 | to contain the following $ML$-section:
 | 
|  |    485 | \begin{ttbox}
 | 
|  |    486 | ML
 | 
|  |    487 | fun enum_tr[enum] = make_list enum;
 | 
|  |    488 | val parse_translation = [("enum",enum_tr)]
 | 
|  |    489 | \end{ttbox}
 | 
|  |    490 | This causes \verb!Const("enum",_)$!$t$ to be replaced by
 | 
|  |    491 | \verb$enum_tr[$$t$\verb$]$.
 | 
|  |    492 | 
 | 
|  |    493 | Of course the definition of \verb$make_list$ should be included in the
 | 
|  |    494 | $ML$-section.
 | 
|  |    495 | \end{example}
 | 
|  |    496 | \begin{example}\label{SET}
 | 
|  |    497 |   Isabelle represents the set $\{ x \mid P(x) \}$ internally by $Set(\lambda
 | 
|  |    498 |   x.P(x))$.  The internal and external forms need separate
 | 
|  |    499 | constants:\footnote{In practice, the external form typically has a name
 | 
|  |    500 | beginning with an {\at} sign, such as {\tt {\at}SET}.  This emphasizes that
 | 
|  |    501 | the constant should be used only for parsing/printing.}
 | 
|  |    502 | \begin{ttbox}
 | 
|  |    503 | types set 1
 | 
|  |    504 | arities set :: (term)term
 | 
|  |    505 | consts Set :: "('a => o) => 'a set"
 | 
|  |    506 |        SET :: "[id,o] => 'a set"  ("\{_ | _\}")
 | 
|  |    507 | \end{ttbox}
 | 
|  |    508 | Parsing {\tt"\{$x$ | $P$\}"} according to this syntax yields the term {\tt
 | 
|  |    509 |   Const("SET",dummyT) \$ Free("\(x\)",dummyT) \$ \(p\)}, where $p$ is the
 | 
|  |    510 | result of parsing $P$.  What we need is the term {\tt
 | 
|  |    511 |   Const("Set",dummyT)\$Abs("$x$",dummyT,$p'$)}, where $p'$ is some
 | 
|  |    512 | ``abstracted'' version of $p$.  Therefore we define a function
 | 
|  |    513 | \begin{ttbox}
 | 
|  |    514 | fun set_tr[Free(s,T), p] = Const("Set", dummyT) $
 | 
|  |    515 |                            Abs(s, T, abstract_over(Free(s,T), p));
 | 
|  |    516 | \end{ttbox}
 | 
|  |    517 | where \verb$abstract_over: term*term -> term$ is a predefined function such
 | 
|  |    518 | that {\tt abstract_over($u$,$t$)} replaces every occurrence of $u$ in $t$ by
 | 
|  |    519 | a {\tt Bound} variable of the correct index (i.e.\ 0 at top level).  Remember
 | 
|  |    520 | that {\tt dummyT} is replaced by the correct types at a later stage (see
 | 
|  |    521 | \S\ref{Typing}).  Function {\tt set_tr} is associated with {\tt SET} by
 | 
|  |    522 | including the \ML-text
 | 
|  |    523 | \begin{ttbox}
 | 
|  |    524 | val parse_translation = [("SET", set_tr)];
 | 
|  |    525 | \end{ttbox}
 | 
|  |    526 | \end{example}
 | 
|  |    527 | 
 | 
|  |    528 | If you want to run the above examples in Isabelle, you should note that an
 | 
|  |    529 | $ML$-section needs to contain not just a definition of
 | 
|  |    530 | \verb$parse_translation$ but also of a variable \verb$print_translation$.  The
 | 
|  |    531 | purpose of the latter is to reverse the effect of the former during printing;
 | 
|  |    532 | details are found in \S\ref{Print-translations}.  Hence you need to include
 | 
|  |    533 | the line
 | 
|  |    534 | \begin{ttbox}
 | 
|  |    535 | val print_translation = [];
 | 
|  |    536 | \end{ttbox}
 | 
|  |    537 | This is instructive because the terms are then printed out in their internal
 | 
|  |    538 | form.  For example the input \hbox{\tt[$x$,$y$,$z$]} is echoed as
 | 
|  |    539 | \hbox{\tt$x$:$y$:$z$:[]}.  This helps to check that your parse translation is
 | 
|  |    540 | working correctly.
 | 
|  |    541 | 
 | 
|  |    542 | %\begin{warn}
 | 
|  |    543 | %Explicit type constraints disappear with type checking but are still
 | 
|  |    544 | %visible to the parse translation functions.
 | 
|  |    545 | %\end{warn}
 | 
|  |    546 | 
 | 
|  |    547 | \index{parse translation|)}
 | 
|  |    548 | 
 | 
|  |    549 | \section{Printing}
 | 
|  |    550 | 
 | 
|  |    551 | Syntax definitions provide printing information in three distinct ways:
 | 
|  |    552 | through
 | 
|  |    553 | \begin{itemize}
 | 
|  |    554 | \item the syntax of the language (as used for parsing),
 | 
|  |    555 | \item pretty printing information, and
 | 
|  |    556 | \item print translation functions.
 | 
|  |    557 | \end{itemize}
 | 
|  |    558 | The bare mixfix declarations enable Isabelle to print terms, but the result
 | 
|  |    559 | will not necessarily be pretty and may look different from what you expected.
 | 
|  |    560 | To produce a pleasing layout, you need to read the following sections.
 | 
|  |    561 | 
 | 
|  |    562 | \subsection{Printing with mixfix declarations}
 | 
|  |    563 | 
 | 
|  |    564 | Let {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} be a term and let
 | 
|  |    565 | \begin{ttbox}
 | 
|  |    566 | consts \(c\) :: \(\tau\) (\(sy\))
 | 
|  |    567 | \end{ttbox}
 | 
|  |    568 | be a mixfix declaration where $sy$ is of the form
 | 
|  |    569 | $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$.  Printing $t$ according to
 | 
|  |    570 | $sy$ means printing the string
 | 
|  |    571 | $\alpha@0\beta@1\alpha@1\ldots\alpha@{n-1}\beta@n\alpha@n$, where $\beta@i$
 | 
|  |    572 | is the result of printing $t@i$.
 | 
|  |    573 | 
 | 
|  |    574 | Note that the system does {\em not\/} insert blanks.  They should be part of
 | 
|  |    575 | the mixfix syntax if they are required to separate tokens or achieve a
 | 
|  |    576 | certain layout.
 | 
|  |    577 | 
 | 
|  |    578 | \subsection{Pretty printing}
 | 
|  |    579 | \label{PrettyPrinting}
 | 
|  |    580 | \index{pretty printing}
 | 
|  |    581 | 
 | 
|  |    582 | In order to format the output, it is possible to embed pretty printing
 | 
|  |    583 | directives in mixfix annotations.  These directives are ignored during parsing
 | 
|  |    584 | and affect only printing.  The characters {\tt(}, {\tt)} and {\tt/} are
 | 
|  |    585 | interpreted as meta-characters\index{meta-character} when found in a mixfix
 | 
|  |    586 | annotation.  Their meaning is
 | 
|  |    587 | \begin{description}
 | 
|  |    588 | \item[~{\tt(}~] Open a block.  A sequence of digits following it is
 | 
|  |    589 |   interpreted as the \bfindex{indentation} of this block.  It causes the
 | 
|  |    590 |   output to be indented by $n$ positions if a line break occurs within the
 | 
|  |    591 |   block.  If {\tt(} is not followed by a digit, the indentation defaults to
 | 
|  |    592 |   $0$.
 | 
|  |    593 | \item[~{\tt)}~] Close a block.
 | 
|  |    594 | \item[~\ttindex{/}~] Allow a \bfindex{line break}.  White space immediately
 | 
|  |    595 |   following {\tt/} is not printed if the line is broken at this point.
 | 
|  |    596 | \end{description}
 | 
|  |    597 | 
 | 
|  |    598 | \subsection{Print translations *}
 | 
|  |    599 | \label{Print-translations}
 | 
|  |    600 | \index{print translation|(}
 | 
|  |    601 | 
 | 
|  |    602 | Since terms are translated after parsing (see \S\ref{Parse-translations}),
 | 
|  |    603 | there is a similar mechanism to translate them back before printing.
 | 
|  |    604 | Therefore the $ML$-section of a theory definition can associate constant
 | 
|  |    605 | names with user-defined translation functions by including a line
 | 
|  |    606 | \begin{ttbox}
 | 
|  |    607 | val print_translation = \dots
 | 
|  |    608 | \end{ttbox}
 | 
|  |    609 | where the right-hand side of this binding is again an \ML-expression of type
 | 
|  |    610 | \verb$(string * (term list -> term))list$.
 | 
|  |    611 | Including a pair $(c,f)$ in this list causes the printer to print
 | 
|  |    612 | $f[t@1,\dots,t@n]$ whenever it finds {\tt Const($c$,_)\$$t@1$\$\dots\$$t@n$}.
 | 
|  |    613 | \begin{example}
 | 
|  |    614 | Reversing the effect of the parse translation in Example~\ref{list-enum} is
 | 
|  |    615 | accomplished by the following function:
 | 
|  |    616 | \begin{ttbox}
 | 
|  |    617 | fun make_enum (Const("op :",_) $ e $ es) = case es of
 | 
|  |    618 |         Const("[]",_)  =>  Const("sing",dummyT) $ e
 | 
|  |    619 |       | _  =>  Const("enum",dummyT) $ e $ make_enum es;
 | 
|  |    620 | \end{ttbox}
 | 
|  |    621 | It translates \hbox{\tt$x$:$y$:$z$:[]} to \hbox{\tt[$x$,$y$,$z$]}.  However,
 | 
|  |    622 | if the input does not terminate with an empty list, e.g.\ \hbox{\tt$x$:$xs$},
 | 
|  |    623 | \verb$make_enum$ raises exception {\tt Match}.  This signals that the
 | 
|  |    624 | attempted translation has failed and the term should be printed as is.
 | 
|  |    625 | The connection with Isabelle's pretty printer is established as follows:
 | 
|  |    626 | \begin{ttbox}
 | 
|  |    627 | fun enum_tr'[x,xs] = Const("enum",dummyT) $
 | 
|  |    628 |                      make_enum(Const("op :",dummyT)$x$xs);
 | 
|  |    629 | val print_translation = [("op :", enum_tr')];
 | 
|  |    630 | \end{ttbox}
 | 
|  |    631 | This declaration causes the printer to print \hbox{\tt enum_tr'[$x$,$y$]}
 | 
|  |    632 | whenever it finds \verb!Const("op :",_)$!$x$\verb!$!$y$.
 | 
|  |    633 | \end{example}
 | 
|  |    634 | \begin{example}
 | 
|  |    635 |   In Example~\ref{SET} we showed how to translate the concrete syntax for set
 | 
|  |    636 |   comprehension into the proper internal form.  The string {\tt"\{$x$ |
 | 
|  |    637 |     $P$\}"} now becomes {\tt Const("Set",_)\$Abs("$x$",_,$p$)}.  If, however,
 | 
|  |    638 |   the latter term were printed without translating it back, it would result
 | 
|  |    639 |   in {\tt"Set(\%$x$.$P$)"}.  Therefore the abstraction has to be turned back
 | 
|  |    640 |   into a term that matches the concrete mixfix syntax:
 | 
|  |    641 | \begin{ttbox}
 | 
|  |    642 | fun set_tr'[Abs(x,T,P)] =
 | 
|  |    643 |       let val (x',P') = variant_abs(x,T,P)
 | 
|  |    644 |       in Const("SET",dummyT) $ Free(x',T) $ P' end;
 | 
|  |    645 | \end{ttbox}
 | 
|  |    646 | The function \verb$variant_abs$, a basic term manipulation function, replaces
 | 
|  |    647 | the bound variable $x$ by a {\tt Free} variable $x'$ having a unique name.  A
 | 
|  |    648 | term produced by {\tt set_tr'} can now be printed according to the concrete
 | 
|  |    649 | syntax defined in Example~\ref{SET} above.
 | 
|  |    650 | 
 | 
|  |    651 | Notice that the application of {\tt set_tr'} fails if the second component of
 | 
|  |    652 | the argument is not an abstraction, but for example just a {\tt Free}
 | 
|  |    653 | variable.  This is intentional because it signals to the caller that the
 | 
|  |    654 | translation is inapplicable.  As a result {\tt Const("Set",_)\$Free("P",_)}
 | 
|  |    655 | prints as {\tt"Set(P)"}.
 | 
|  |    656 | 
 | 
|  |    657 | The full theory extension, including concrete syntax and both translation
 | 
|  |    658 | functions, has the following form:
 | 
|  |    659 | \begin{ttbox}
 | 
|  |    660 | types set 1
 | 
|  |    661 | arities set :: (term)term
 | 
|  |    662 | consts Set :: "('a => o) => 'a set"
 | 
|  |    663 |        SET :: "[id,o] => 'a set"  ("\{_ | _\}")
 | 
|  |    664 | end
 | 
|  |    665 | ML
 | 
|  |    666 | fun set_tr[Free(s,T), p] = \dots;
 | 
|  |    667 | val parse_translation = [("SET", set_tr)];
 | 
|  |    668 | fun set_tr'[Abs(x,T,P)] = \dots;
 | 
|  |    669 | val print_translation = [("Set", set_tr')];
 | 
|  |    670 | \end{ttbox}
 | 
|  |    671 | \end{example}
 | 
|  |    672 | As during the parse translation process, the types attached to constants
 | 
|  |    673 | during print translation are ignored.  Thus {\tt Const("SET",dummyT)} in
 | 
|  |    674 | {\tt set_tr'} above is acceptable.  The types of {\tt Free}s and {\tt Var}s
 | 
|  |    675 | however must be preserved because they may get printed (see {\tt
 | 
|  |    676 | show_types}).
 | 
|  |    677 | 
 | 
|  |    678 | \index{print translation|)}
 | 
|  |    679 | 
 | 
|  |    680 | \subsection{Printing a term}
 | 
|  |    681 | 
 | 
|  |    682 | Let $tab$ be the set of all string-function pairs of print translations in the
 | 
|  |    683 | current syntax.
 | 
|  |    684 | 
 | 
|  |    685 | Terms are printed recursively; print translations are applied top down:
 | 
|  |    686 | \begin{itemize}
 | 
|  |    687 | \item {\tt Free($x$,_)} is printed as $x$.
 | 
|  |    688 | \item {\tt Var(($x$,$i$),_)} is printed as $x$, if $i = 0$ and $x$ does not
 | 
|  |    689 |   end with a digit, as $x$ followed by $i$, if $i \neq 0$ and $x$ does not
 | 
|  |    690 |   end with a digit, and as {\tt $x$.$i$}, if $x$ ends with a digit.  Thus the
 | 
|  |    691 |   following cases can arise:
 | 
|  |    692 | \begin{center}
 | 
|  |    693 | {\tt\begin{tabular}{cccc}
 | 
|  |    694 | \verb$Var(("v",0),_)$ & \verb$Var(("v",7),_)$ & \verb$Var(("v5",0),_)$ \\
 | 
|  |    695 | "?v" & "?v7" & "?v5.0"
 | 
|  |    696 | \end{tabular}}
 | 
|  |    697 | \end{center}
 | 
|  |    698 | \item {\tt Abs($x@1$,_,Abs($x@2$,_,\dots Abs($x@n$,_,$p$)\dots))}, where $p$
 | 
|  |    699 |   is not an abstraction, is printed as {\tt \%$y@1\dots y@n$.$P$}, where $P$
 | 
|  |    700 |   is the result of printing $p$, and the $x@i$ are replaced by $y@i$.  The
 | 
|  |    701 |   latter are (possibly new) unique names.
 | 
|  |    702 | \item {\tt Bound($i$)} is printed as {\tt B.$i$} \footnote{The occurrence of
 | 
|  |    703 |     such ``loose'' bound variables indicates that either you are trying to
 | 
|  |    704 |     print a subterm of an abstraction, or there is something wrong with your
 | 
|  |    705 |     print translations.}.
 | 
|  |    706 | \item The application {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} (where
 | 
|  |    707 |   $n$ may be $0$!) is printed as follows:
 | 
|  |    708 | 
 | 
|  |    709 |   If there is a pair $(c,f)$ in $tab$, print $f[t@1,\dots,t@n]$.  If this
 | 
|  |    710 |   application raises exception {\tt Match} or there is no pair $(c,f)$ in
 | 
|  |    711 |   $tab$, let $sy$ be the mixfix annotation associated with $c$.  If there is
 | 
|  |    712 |   no such $sy$, or if $sy$ does not have exactly $n$ argument positions, $t$
 | 
|  |    713 |   is printed as an application; otherwise $t$ is printed according to $sy$.
 | 
|  |    714 | 
 | 
|  |    715 |   All other applications are printed as applications.
 | 
|  |    716 | \end{itemize}
 | 
|  |    717 | Printing a term {\tt $c$\$$t@1$\$\dots\$$t@n$} as an application means
 | 
|  |    718 | printing it as {\tt $s$($s@1$,\dots,$s@n$)}, where $s@i$ is the result of
 | 
|  |    719 | printing $t@i$.  If $c$ is a {\tt Const}, $s$ is its first argument;
 | 
|  |    720 | otherwise $s$ is the result of printing $c$ as described above.
 | 
|  |    721 | \medskip
 | 
|  |    722 | 
 | 
|  |    723 | The printer also inserts parentheses where they are necessary for reasons
 | 
|  |    724 | of precedence.
 | 
|  |    725 | 
 | 
|  |    726 | \section{Identifiers, constants, and type inference *}
 | 
|  |    727 | \label{Typing}
 | 
|  |    728 | 
 | 
|  |    729 | There is one final step in the translation from strings to terms that we have
 | 
|  |    730 | not covered yet.  It explains how constants are distinguished from {\tt Free}s
 | 
|  |    731 | and how {\tt Free}s and {\tt Var}s are typed.  Both issues arise because {\tt
 | 
|  |    732 |   Free}s and {\tt Var}s are not declared.
 | 
|  |    733 | 
 | 
|  |    734 | An identifier $f$ that does not appear as a delimiter in the concrete syntax
 | 
|  |    735 | can be either a free variable or a constant.  Since the parser knows only
 | 
|  |    736 | about those constants which appear in mixfix annotations, it parses $f$ as
 | 
|  |    737 | {\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt
 | 
|  |    738 |   typ}.  Although the parser produces these very raw terms, most user
 | 
|  |    739 | interface level functions like {\tt goal} type terms according to the given
 | 
|  |    740 | theory, say $T$.  In a first step, every occurrence of {\tt Free($f$,_)} or
 | 
|  |    741 | {\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is
 | 
|  |    742 | a constant $f$ of {\tt typ} $\tau$ in $T$.  This means that identifiers are
 | 
|  |    743 | treated as {\tt Free}s iff they are not declared in the theory.  The types of
 | 
|  |    744 | the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML.  Type
 | 
|  |    745 | constraints can be used to remove ambiguities.
 | 
|  |    746 | 
 | 
|  |    747 | One peculiarity of the current type inference algorithm is that variables
 | 
|  |    748 | with the same name must have the same type, irrespective of whether they are
 | 
|  |    749 | schematic, free or bound.  For example, take the first-order formula $f(x) = x
 | 
|  |    750 | \land (\forall f.~ f=f)$ where ${=} :: [\alpha{::}term,\alpha]\To o$ and
 | 
|  |    751 | $\forall :: (\alpha{::}term\To o)\To o$.  The first conjunct forces
 | 
|  |    752 | $x::\alpha{::}term$ and $f::\alpha\To\alpha$, the second one
 | 
|  |    753 | $f::\beta{::}term$.  Although the two $f$'s are distinct, they are required to
 | 
|  |    754 | have the same type.  Unifying $\alpha\To\alpha$ and $\beta{::}term$ fails
 | 
|  |    755 | because, in first-order logic, function types are not in class $term$.
 | 
|  |    756 | 
 | 
|  |    757 | 
 | 
|  |    758 | \section{Putting it all together}
 | 
|  |    759 | 
 | 
|  |    760 | Having discussed the individual building blocks of a logic definition, it
 | 
|  |    761 | remains to be shown how they fit together.  In particular we need to say how
 | 
|  |    762 | an object-logic syntax is hooked up to the meta-logic.  Since all theorems
 | 
|  |    763 | must conform to the syntax for $prop$ (see Figure~\ref{MetaLogicSyntax}),
 | 
|  |    764 | that syntax has to be extended with the object-level syntax.  Assume that the
 | 
|  |    765 | syntax of your object-logic defines a category $o$ of formulae.  These
 | 
|  |    766 | formulae can now appear in axioms and theorems wherever $prop$ does if you
 | 
|  |    767 | add the production
 | 
|  |    768 | \[ prop ~=~ form.  \]
 | 
|  |    769 | More precisely, you need a coercion from formulae to propositions:
 | 
|  |    770 | \begin{ttbox}
 | 
|  |    771 | Base = Pure +
 | 
|  |    772 | types o 0
 | 
|  |    773 | arities o :: logic
 | 
|  |    774 | consts Trueprop :: "o => prop"  ("_"  5)
 | 
|  |    775 | end
 | 
|  |    776 | \end{ttbox}
 | 
|  |    777 | The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible
 | 
|  |    778 | coercion function.  Assuming this definition resides in a file {\tt base.thy},
 | 
|  |    779 | you have to load it with the command {\tt use_thy"base"}.
 | 
|  |    780 | 
 | 
|  |    781 | One of the simplest nontrivial logics is {\em minimal logic} of
 | 
|  |    782 | implication.  Its definition in Isabelle needs no advanced features but
 | 
|  |    783 | illustrates the overall mechanism quite nicely:
 | 
|  |    784 | \begin{ttbox}
 | 
|  |    785 | Hilbert = Base +
 | 
|  |    786 | consts "-->" :: "[o,o] => o"  (infixr 10)
 | 
|  |    787 | rules
 | 
|  |    788 | K   "P --> Q --> P"
 | 
|  |    789 | S   "(P --> Q --> R) --> (P --> Q) --> P --> R"
 | 
|  |    790 | MP  "[| P --> Q; P |] ==> Q"
 | 
|  |    791 | end
 | 
|  |    792 | \end{ttbox}
 | 
|  |    793 | After loading this definition you can start to prove theorems in this logic:
 | 
|  |    794 | \begin{ttbox}
 | 
|  |    795 | goal Hilbert.thy "P --> P";
 | 
|  |    796 | {\out Level 0}
 | 
|  |    797 | {\out P --> P}
 | 
|  |    798 | {\out  1.  P --> P}
 | 
|  |    799 | by (resolve_tac [Hilbert.MP] 1);
 | 
|  |    800 | {\out Level 1}
 | 
|  |    801 | {\out P --> P}
 | 
|  |    802 | {\out  1.  ?P --> P --> P}
 | 
|  |    803 | {\out  2.  ?P}
 | 
|  |    804 | by (resolve_tac [Hilbert.MP] 1);
 | 
|  |    805 | {\out Level 2}
 | 
|  |    806 | {\out P --> P}
 | 
|  |    807 | {\out  1.  ?P1 --> ?P --> P --> P}
 | 
|  |    808 | {\out  2.  ?P1}
 | 
|  |    809 | {\out  3.  ?P}
 | 
|  |    810 | by (resolve_tac [Hilbert.S] 1);
 | 
|  |    811 | {\out Level 3}
 | 
|  |    812 | {\out P --> P}
 | 
|  |    813 | {\out  1.  P --> ?Q2 --> P}
 | 
|  |    814 | {\out  2.  P --> ?Q2}
 | 
|  |    815 | by (resolve_tac [Hilbert.K] 1);
 | 
|  |    816 | {\out Level 4}
 | 
|  |    817 | {\out P --> P}
 | 
|  |    818 | {\out  1.  P --> ?Q2}
 | 
|  |    819 | by (resolve_tac [Hilbert.K] 1);
 | 
|  |    820 | {\out Level 5}
 | 
|  |    821 | {\out P --> P}
 | 
|  |    822 | {\out No subgoals!}
 | 
|  |    823 | \end{ttbox}
 | 
|  |    824 | As you can see, this Hilbert-style formulation of minimal logic is easy to
 | 
|  |    825 | define but difficult to use.  The following natural deduction formulation is
 | 
|  |    826 | far preferable:
 | 
|  |    827 | \begin{ttbox}
 | 
|  |    828 | MinI = Base +
 | 
|  |    829 | consts "-->" :: "[o,o] => o"  (infixr 10)
 | 
|  |    830 | rules
 | 
|  |    831 | impI  "(P ==> Q) ==> P --> Q"
 | 
|  |    832 | impE  "[| P --> Q; P |] ==> Q"
 | 
|  |    833 | end
 | 
|  |    834 | \end{ttbox}
 | 
|  |    835 | Note, however, that although the two systems are equivalent, this fact cannot
 | 
|  |    836 | be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI$
 | 
|  |    837 | (exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!.  The reason
 | 
|  |    838 | is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,
 | 
|  |    839 | something that can only be shown by induction over all possible proofs in
 | 
|  |    840 | \verb!Hilbert!.
 | 
|  |    841 | 
 | 
|  |    842 | It is a very simple matter to extend minimal logic with falsity:
 | 
|  |    843 | \begin{ttbox}
 | 
|  |    844 | MinIF = MinI +
 | 
|  |    845 | consts False :: "o"
 | 
|  |    846 | rules
 | 
|  |    847 | FalseE  "False ==> P"
 | 
|  |    848 | end
 | 
|  |    849 | \end{ttbox}
 | 
|  |    850 | On the other hand, we may wish to introduce conjunction only:
 | 
|  |    851 | \begin{ttbox}
 | 
|  |    852 | MinC = Base +
 | 
|  |    853 | consts "&" :: "[o,o] => o"  (infixr 30)
 | 
|  |    854 | rules
 | 
|  |    855 | conjI  "[| P; Q |] ==> P & Q"
 | 
|  |    856 | conjE1 "P & Q ==> P"
 | 
|  |    857 | conjE2 "P & Q ==> Q"
 | 
|  |    858 | end
 | 
|  |    859 | \end{ttbox}
 | 
|  |    860 | And if we want to have all three connectives together, we define:
 | 
|  |    861 | \begin{ttbox}
 | 
|  |    862 | MinIFC = MinIF + MinC
 | 
|  |    863 | \end{ttbox}
 | 
|  |    864 | Now we can prove mixed theorems like
 | 
|  |    865 | \begin{ttbox}
 | 
|  |    866 | goal MinIFC.thy "P & False --> Q";
 | 
|  |    867 | by (resolve_tac [MinI.impI] 1);
 | 
|  |    868 | by (dresolve_tac [MinC.conjE2] 1);
 | 
|  |    869 | by (eresolve_tac [MinIF.FalseE] 1);
 | 
|  |    870 | \end{ttbox}
 | 
|  |    871 | Try this as an exercise!
 |