\chapter{Defining Logics} \label{Defining-Logics}
This chapter is intended for Isabelle experts.  It explains how to define new
logical systems, Isabelle's {\it raison d'\^etre}.  Isabelle logics are
hierarchies of theories.  A number of simple examples are contained in the
introductory manual; the full syntax of theory definitions is shown in the
{\em Reference Manual}.  The purpose of this chapter is to explain the
remaining subtleties, especially some context conditions on the class
structure and the definition of new mixfix syntax.  A full understanding of
the material requires knowledge of the internal representation of terms (data
type {\tt term}) as detailed in the {\em Reference Manual}.  Sections marked
with a * can be skipped on first reading.


\section{Classes and Types *}
\index{*arities!context conditions}

Type declarations are subject to the following two well-formedness
conditions:
\begin{itemize}
\item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$
  with $\vec{r} \neq \vec{s}$.  For example
\begin{ttbox}
types ty 1
arities ty :: (\{logic\}) logic
        ty :: (\{\})logic
\end{ttbox}
leads to an error message and fails.
\item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::
  (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold
  for $i=1,\dots,n$.  The relationship $\preceq$, defined as
\[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]
expresses that the set of types represented by $s'$ is a subset of the set of
types represented by $s$.  For example
\begin{ttbox}
classes term < logic
types ty 1
arities ty :: (\{logic\})logic
        ty :: (\{\})term
\end{ttbox}
leads to an error message and fails.
\end{itemize}
These conditions guarantee principal types~\cite{nipkow-prehofer}.
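For contrast, here is a hypothetical variant of the last example in which the
second condition is satisfied, since $\{term\} \preceq \{logic\}$ holds:
\begin{ttbox}
classes term < logic
types ty 1
arities ty :: (\{logic\})logic
        ty :: (\{term\})term
\end{ttbox}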

\section{Precedence Grammars}
\label{PrecedenceGrammars}
\index{precedence grammar|(}

The precise syntax of a logic is best defined by a context-free grammar.
These grammars obey the following conventions: identifiers denote
nonterminals, {\tt typewriter} font denotes terminals, repetition is
indicated by \dots, and alternatives are separated by $|$.

In order to simplify the description of mathematical languages, we introduce
an extended format which permits {\bf precedences}\index{precedence}.  This
scheme generalizes precedence declarations in \ML\ and {\sc prolog}.  In this
extended grammar format, nonterminals are decorated by integers, their
precedence.  In the sequel, precedences are shown as subscripts.  A nonterminal
$A@p$ on the right-hand side of a production may only be replaced using a
production $A@q = \gamma$ where $p \le q$.

Formally, a set of context-free productions $G$ induces a derivation
relation $\rew@G$ on strings as follows:
\[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta \quad\hbox{iff}\quad
   \exists q \ge p.~(A@q=\gamma) \in G
\]
Any extended grammar of this kind can be translated into a normal
context-free grammar.  However, this translation may require the
introduction of a large number of new nonterminals and productions.

\begin{example}
\label{PrecedenceEx}
The following simple grammar for arithmetic expressions demonstrates how
binding power and associativity of operators can be enforced by precedences.
\begin{center}
\begin{tabular}{rclr}
$A@9$ & = & {\tt0} \\
$A@9$ & = & {\tt(} $A@0$ {\tt)} \\
$A@0$ & = & $A@0$ {\tt+} $A@1$ \\
$A@2$ & = & $A@3$ {\tt*} $A@2$ \\
$A@3$ & = & {\tt-} $A@3$
\end{tabular}
\end{center}
The choice of precedences determines that \verb$-$ binds tighter than
\verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$
associate to the left and right, respectively.
\end{example}
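To see the precedences at work, consider the string {\tt0 + 0 + 0}.  One
possible derivation is
\[ A@0 ~\rew@G~ A@0 \hbox{\tt~+~} A@1 ~\rew@G~
   A@0 \hbox{\tt~+~} A@1 \hbox{\tt~+~} A@1
   ~\rew@G~ \cdots ~\rew@G~ \hbox{\tt0 + 0 + 0} \]
which corresponds to the left-associated reading.  A right-associated reading
is impossible: $A@1$ may only be replaced using productions $A@q = \gamma$
with $q \ge 1$, and the only production for {\tt+} has $q = 0$.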

To minimize the number of subscripts, we adopt the following conventions:
\begin{itemize}
\item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for
  some fixed $max_pri$.
\item precedence $0$ on the right-hand side and precedence $max_pri$ on the
  left-hand side may be omitted.
\end{itemize}
In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.

Using these conventions and assuming $max_pri=9$, the grammar in
Example~\ref{PrecedenceEx} becomes
\begin{center}
\begin{tabular}{rclc}
$A$ & = & {\tt0} & \hspace*{4em} \\
 & $|$ & {\tt(} $A$ {\tt)} \\
 & $|$ & $A$ {\tt+} $A@1$ & (0) \\
 & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\
 & $|$ & {\tt-} $A@3$ & (3)
\end{tabular}
\end{center}

\index{precedence grammar|)}

\section{Basic syntax *}

An informal account of most of Isabelle's syntax (meta-logic, types etc) is
contained in {\em Introduction to Isabelle}.  A precise description using a
precedence grammar is shown in Figure~\ref{MetaLogicSyntax}.  This description
is the basis of all extensions by object-logics.
\begin{figure}[htb]
\begin{center}
\begin{tabular}{rclc}
$prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\
     &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\
     &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\
     &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\
     &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\
$logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\
$aprop$ &=& $id$ ~~$|$~~ $var$
    ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\
$fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\
    &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\
$idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\
$idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
    &$|$& $id$ \ttindex{::} $type$ & (0) \\\\
$type$ &=& $tfree$ ~~$|$~~ $tvar$ \\
     &$|$& $tfree$ {\tt::} $sort$ ~~$|$~~ $tvar$ {\tt::} $sort$ \\
     &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$
                ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
     &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\
     &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\
     &$|$& {\tt(} $type$ {\tt)} \\\\
$sort$ &=& $id$ ~~$|$~~ {\tt\{\}}
                ~~$|$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}}
\end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}
\indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}
\indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}
\end{center}
\caption{Meta-Logic Syntax}
\label{MetaLogicSyntax}
\end{figure}
The following main categories are defined:
\begin{description}
\item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.
\item[$aprop$] Atomic propositions.
\item[$logic$] Terms of types in class $logic$.  Initially, $logic$ contains
  merely $prop$.  As the syntax is extended by new object-logics, more
  productions for $logic$ are added (see below).
\item[$fun$] Terms potentially of function type.
\item[$type$] Types.
\item[$idts$] A list of identifiers, possibly constrained by types.  Note
  that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a
  type constructor applied to $nat$.
\end{description}

The predefined categories $id$, $var$, $tfree$ and $tvar$ represent
identifiers ({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and
type unknowns ({\tt ?'a}) respectively.  If we think of them as nonterminals
with predefined syntax, we may assume that all their productions have
precedence $max_pri$.

\subsection{Logical types and default syntax}

Isabelle is concerned with mathematical languages which have a certain
minimal vocabulary: identifiers, variables, parentheses, and the lambda
calculus.  Logical types, i.e.\ those of class $logic$, are automatically
equipped with this basic syntax.  More precisely, for any type constructor
$ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following
productions are added:
\begin{center}
\begin{tabular}{rclc}
$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\
  &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\
  &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\
$logic$ &=& $ty$
\end{tabular}
\end{center}
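For instance, assuming a class {\tt term} declared as a subclass of {\tt
logic} (as in the examples of the first section), the hypothetical
declarations
\begin{ttbox}
types nat 0
arities nat :: term
\end{ttbox}
automatically equip terms of type {\tt nat} with this default syntax; no
mixfix annotations are needed for it.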


\section{Mixfix syntax}
\index{mixfix|(}

We distinguish between abstract and concrete syntax.  The {\em abstract}
syntax is given by the typed constants of a theory.  Abstract syntax trees are
well-typed terms, i.e.\ values of \ML\ type {\tt term}.  If none of the
constants are introduced with mixfix annotations, there is no concrete syntax
to speak of: terms can only be abstractions or applications of the form
$f(t@1,\dots,t@n)$, where $f$ is a constant or variable.  Since this notation
quickly becomes unreadable, Isabelle supports syntax definitions in the form
of unrestricted context-free grammars using mixfix annotations.

Mixfix annotations describe the {\em concrete} syntax, its translation into
the abstract syntax, and a pretty-printing scheme, all in one.  Isabelle
syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.
Each mixfix annotation defines a precedence grammar production and associates
an Isabelle constant with it.

A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is
interpreted as a grammar pro\-duction as follows:
\begin{itemize}
\item $sy$ is the right-hand side of this production, specified as a {\em
    mixfix annotation}.  In general, $sy$ is of the form
  $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$, where each occurrence of
  ``\ttindex{_}'' denotes an argument/nonterminal and the strings
  $\alpha@i$ do not contain ``{\tt_}''.
\item $\tau$ specifies the types of the nonterminals on the left and right
  hand side.  If $sy$ is of the form above, $\tau$ must be of the form
  $[\tau@1,\dots,\tau@n] \To \tau'$.  Then argument $i$ is of type $\tau@i$
  and the result, i.e.\ the left-hand side of the production, is of type
  $\tau'$.  Both the $\tau@i$ and $\tau'$ may be function types.
\item $c$ is the name of the Isabelle constant associated with this production.
  Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt
    Const($c$,dummyT\footnote{Proper types are inserted later on.  See
      \S\ref{Typing}})\$$a@1$\$$\dots$\$$a@n$}\index{*dummyT}, where $a@i$ is
  the term generated by parsing the $i^{th}$ argument.
\item $ps$ must be of the form $[p@1,\dots,p@n]$, where $p@i$ is the
  minimal precedence\index{precedence} required of any phrase that may appear
  as the $i^{th}$ argument.  The empty list is interpreted as a list of 0's of
  the appropriate length.
\item $p$ is the precedence of this production.
\end{itemize}
Notice that there is a close connection between abstract and concrete syntax:
each production has an associated constant, and types act as {\bf syntactic
  categories} in the concrete syntax.  To emphasize this connection, we
sometimes refer to the nonterminals on the right-hand side of a production as
its arguments and to the nonterminal on the left-hand side as its result.

The maximal legal precedence is called \ttindexbold{max_pri}, which is
currently 1000.  If you want to ignore precedences, the safest way to do so is
to use the annotation {\tt($sy$)}: this production puts no precedence
constraints on any of its arguments and has maximal precedence itself, i.e.\
it is always applicable and does not exclude any productions of its
arguments.
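For instance, a hypothetical conditional connective on a type {\tt o} of
formulae could be declared without any precedence information:
\begin{ttbox}
consts If :: "[o,o,o] => o"  ("if _ then _ else _")
\end{ttbox}
Each argument position then accepts phrases of any precedence, and the
production itself has precedence $max_pri$.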

\begin{example}
In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written
as follows:
\begin{ttbox}
types exp 0
consts "0"  ::              "exp"  ("0" 9)
       "+"  :: "[exp,exp] => exp"  ("_ + _" [0,1] 0)
       "*"  :: "[exp,exp] => exp"  ("_ * _" [3,2] 2)
       "-"  ::       "exp => exp"  ("- _"   [3]   3)
\end{ttbox}
Parsing the string \verb!"0 + - 0 + 0"! produces the term {\tt
  $p$\$($p$\$$z$\$($m$\$$z$))\$$z$} where {\tt$p =$ Const("+",dummyT)},
{\tt$m =$ Const("-",dummyT)}, and {\tt$z =$ Const("0",dummyT)}.
\end{example}

The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf
  meta-character}\index{meta-character} which does not represent itself but
an argument position.  The following characters are also meta-characters:
\begin{ttbox}
'   (   )   /
\end{ttbox}
Preceding any character with a quote (\verb$'$) turns it into an ordinary
character.  Thus you can write \verb!''! if you really want a single quote.
The purpose of the other meta-characters is explained in
\S\ref{PrettyPrinting}.  Remember that in \ML\ strings \verb$\$ is already a
(different kind of) meta-character.
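For instance, literal parentheses in a hypothetical pairing syntax must be
quoted, because unquoted {\tt(} and {\tt)} would be taken as pretty printing
directives (see \S\ref{PrettyPrinting}):
\begin{ttbox}
types pair 2
arities pair :: (term,term)term
consts Pair :: "['a,'b] => ('a,'b)pair"  ("'(_, _')")
\end{ttbox}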


\subsection{Types and syntactic categories *}

The precise mapping from types to syntactic categories is defined by the
following function:
\begin{eqnarray*}
N(\tau@1\To\tau@2) &=& fun \\
N((\tau@1,\dots,\tau@n)ty) &=& ty \\
N(\alpha) &=& logic
\end{eqnarray*}
Only the outermost type constructor is taken into account and type variables
can range over all logical types.  This catches some ill-typed terms (like
$Cons(x,0)$, where $Cons :: [\alpha,\alpha list] \To \alpha list$ and $0 ::
nat$) but leaves the real work to the type checker.

In terms of the precedence grammar format introduced in
\S\ref{PrecedenceGrammars}, the declaration
\begin{ttbox}
consts \(c\) :: "[\(\tau@1\),\dots,\(\tau@n\)]\(\To\tau\)"  ("\(\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n\)" [\(p@1\),\dots,\(p@n\)] \(p\))
\end{ttbox}
defines the production
\[ N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots
                  ~\alpha@{n-1} ~N(\tau@n)@{p@n}~ \alpha@n
\]
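For example, the declaration of {\tt"+"} in the mixfix grammar for {\tt exp}
above defines the production
\[ exp@0 ~~=~~ exp@0 ~\hbox{\tt+}~ exp@1 \]
because $N(exp) = exp$, the argument precedences are $[0,1]$ and the
production's precedence is $0$.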

\subsection{Copy productions *}

Productions which do not create a new node in the abstract syntax tree are
called {\bf copy productions}.  They must have exactly one nonterminal on
the right hand side.  The term generated when parsing that nonterminal is
simply passed up as the result of parsing the whole copy production.  In
Isabelle a copy production is indicated by an empty constant name, i.e.\ by
\begin{ttbox}
consts "" :: \(\tau\)  (\(sy\) \(ps\) \(p\))
\end{ttbox}

A special kind of copy production is one where, modulo white space, $sy$ is
{\tt"_"}.  It is called a {\bf chain production}.  Chain productions should be
seen as an abbreviation mechanism.  Conceptually, they are removed from the
grammar by adding appropriate new rules.  Precedence information attached to
chain productions is ignored.  The following example demonstrates the effect:
the grammar defined by
\begin{ttbox}
types A,B,C 0
consts AB :: "B => A"  ("A _" [10] 517)
       "" :: "C => B"  ("_"   [0]  100)
       x  :: "C"       ("x"          5)
       y  :: "C"       ("y"         15)
\end{ttbox}
admits {\tt"A y"} but not {\tt"A x"}.  The chain production makes the
productions for {\tt C} directly available for {\tt B}, so the argument of
{\tt AB} must have precedence at least 10: {\tt y} (precedence 15) qualifies,
whereas {\tt x} (precedence 5) does not; the precedences attached to the
chain production itself play no role.  Had the constant in the second
production been some non-empty string, both {\tt"A y"} and {\tt"A x"} would
be legal.

\index{mixfix|)}

\section{Lexical conventions}

The lexical analyzer distinguishes the following kinds of tokens: delimiters,
identifiers, unknowns, type variables and type unknowns.

Delimiters are user-defined, i.e.\ they are extracted from the syntax
definition.  If $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$ is a mixfix
annotation, each $\alpha@i$ is decomposed into substrings
$\beta@1~\dots~\beta@k$ which are separated by and do not contain
\bfindex{white space} ( = blanks, tabs, newlines).  Each $\beta@j$ becomes a
delimiter.  Thus a delimiter can be an arbitrary string not containing white
space.
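For example, the annotation {\tt"_ + _"} of the {\tt exp} grammar yields the
single delimiter {\tt+}, while an annotation such as {\tt"if _ then _ else _"}
(the hypothetical conditional above) yields the three delimiters {\tt if},
{\tt then} and {\tt else}.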

The lexical syntax of identifiers and variables ( = unknowns) is defined in
the introductory manual.  Parsing an identifier $f$ generates {\tt
  Free($f$,dummyT)}\index{*dummyT}.  Parsing a variable {\tt?}$v$ generates
{\tt Var(($u$,$i$),dummyT)} where $i$ is the value of the longest numeric
suffix of $v$ (or $0$ if there is none), and $u$ is the remaining prefix.
Parsing a variable {\tt?}$v{.}i$ generates {\tt Var(($v$,$i$),dummyT)}.  The
following table covers the four different cases that can arise:
\begin{center}\tt
\begin{tabular}{cccc}
"?v" & "?v.7" & "?v5" & "?v7.5" \\
Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$)
\end{tabular}
\end{center}
where $d = {\tt dummyT}$.

In mixfix annotations, \ttindexbold{id}, \ttindexbold{var},
\ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of
identifiers, unknowns, type variables and type unknowns, respectively.


The lexical analyzer translates input strings to token lists by repeatedly
taking the maximal prefix of the input string that forms a valid token.  A
maximal prefix that is both a delimiter and an identifier or variable (like
{\tt ALL}) is treated as a delimiter.  White space acts as a separator.

An important consequence of this translation scheme is that delimiters need
not be separated by white space to be recognized as separate.  If \verb$"-"$
is a delimiter but \verb$"--"$ is not, the string \verb$"--"$ is treated as
two consecutive occurrences of \verb$"-"$.  This is in contrast to \ML\ which
would treat \verb$"--"$ as a single (undeclared) identifier.  The
consequence of Isabelle's more liberal scheme is that the same string may be
parsed in a different way after extending the syntax: after adding
\verb$"--"$ as a delimiter, the input \verb$"--"$ is treated as
a single occurrence of \verb$"--"$.

\section{Infix operators}

The annotations {\tt infixl} and {\tt infixr} declare infix operators which
associate to the left and right respectively.  As in \ML, prefixing infix
operators with \ttindexbold{op} turns them into curried functions.  Infix
declarations can be reduced to mixfix ones as follows:
\begin{center}\tt
\begin{tabular}{l@{~~$\Longrightarrow$~~}l}
"$c$" ::~$\tau$ (\ttindexbold{infixl} $p$) &
"op $c$" ::~$\tau$ ("_ $c$ _" [$p$,$p+1$] $p$) \\
"$c$" ::~$\tau$ (\ttindexbold{infixr} $p$) &
"op $c$" ::~$\tau$ ("_ $c$ _" [$p+1$,$p$] $p$)
\end{tabular}
\end{center}
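For example, an implication declared by
\begin{ttbox}
consts "-->" :: "[o,o] => o"  (infixr 10)
\end{ttbox}
abbreviates
\begin{ttbox}
consts "op -->" :: "[o,o] => o"  ("_ --> _" [11,10] 10)
\end{ttbox}
so that {\tt P --> Q --> R} is parsed as {\tt P --> (Q --> R)}.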


\section{Binders}
A {\bf binder} is a variable-binding constant, such as a quantifier.
The declaration
\begin{ttbox}
consts \(c\) :: \(\tau\)  (binder \(Q\) \(p\))
\end{ttbox}\indexbold{*binder}
introduces a binder $c$ of type $\tau$,
which must have the form $(\tau@1\To\tau@2)\To\tau@3$.  Its concrete syntax
is $Q~x.t$.  A binder is like a generalized quantifier where $\tau@1$ is the
type of the bound variable $x$, $\tau@2$ the type of the body $t$, and
$\tau@3$ the type of the whole term.  For example $\forall$ can be declared
like this:
\begin{ttbox}
consts All :: "('a => o) => o"  (binder "ALL " 10)
\end{ttbox}
This allows us to write $\forall x.P$ either as {\tt ALL $x$.$P$} or {\tt
  All(\%$x$.$P$)}; the latter form is for purists only.

In case $\tau@2 = \tau@3$, nested quantifications can be written as $Q~x@1
\dots x@n.t$.  From a syntactic point of view,
\begin{ttbox}
consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)"  (binder "\(Q\)" \(p\))
\end{ttbox}
is equivalent to
\begin{ttbox}
consts \(c\)   :: "\((\tau@1\To\tau@2)\To\tau@3\)"
       "\(Q\)" :: "[idts,\(\tau@2\)] => \(\tau@3\)"  ("\(Q\)_.  _" \(p\))
\end{ttbox}
where {\tt idts} is the syntactic category $idts$ defined in
Figure~\ref{MetaLogicSyntax}.

However, there is more to binders than concrete syntax: behind the scenes the
body of the quantified expression has to be converted into a
$\lambda$-abstraction (when parsing) and back again (when printing).  This
is performed by the translation mechanism, which is discussed below.  For
binders, the definition of the required translation functions has been
automated.  Many other syntactic forms, such as set comprehension, require
special treatment.


\section{Parse translations *}
\label{Parse-translations}
\index{parse translation|(}

So far we have pretended that there is a close enough relationship between
concrete and abstract syntax to allow an automatic translation from one to
the other using the constant name supplied with each production.  In many
cases this scheme is not powerful enough, especially for constructs involving
variable bindings.  Therefore the $ML$-section of a theory definition can
associate constant names with user-defined translation functions by including
a line
\begin{ttbox}
val parse_translation = \dots
\end{ttbox}
where the right-hand side of this binding must be an \ML-expression of type
\verb$(string * (term list -> term))list$.

After the input string has been translated into a term according to the
syntax definition, there is a second phase in which the term is translated
using the user-supplied functions in a bottom-up manner.  Given a list $tab$
of the above type, a term $t$ is translated as follows.  If $t$ is not of the
form {\tt Const($c$,$\tau$)\$$t@1$\$\dots\$$t@n$}, then $t$ is returned
unchanged.  Otherwise all $t@i$ are translated into $t@i'$.  Let {\tt $t' =$
  Const($c$,$\tau$)\$$t@1'$\$\dots\$$t@n'$}.  If there is no pair $(c,f)$ in
$tab$, return $t'$.  Otherwise apply $f$ to $[t@1',\dots,t@n']$.  If that
raises an exception, return $t'$, otherwise return the result.
\begin{example}\label{list-enum}
\ML-lists are constructed by {\tt[]} and {\tt::}.  For readability the
list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}.
In Isabelle the two forms of lists are declared as follows:
\begin{ttbox}
types list 1
      enum 0
arities list :: (term)term
consts "[]" :: "'a list"                   ("[]")
       ":"  :: "['a, 'a list] => 'a list"  (infixr 50)
       enum :: "enum => 'a list"           ("[_]")
       sing :: "'a => enum"                ("_")
       cons :: "['a,enum] => enum"         ("_, _")
end
\end{ttbox}
Because \verb$::$ is already used for type constraints, it is replaced by
\verb$:$ as the infix list constructor.

In order to allow list enumeration, the new type {\tt enum} is introduced.
Its only purpose is syntactic and hence it does not need an arity, in
contrast to the logical type {\tt list}.  Although \hbox{\tt[$x$,$y$,$z$]} is
syntactically legal, it needs to be translated into a term built up from
\verb$[]$ and \verb$:$.  This is what \verb$make_list$ accomplishes:
\begin{ttbox}
val cons = Const("op :", dummyT);

fun make_list (Const("sing",_)$e) = cons $ e $ Const("[]", dummyT)
  | make_list (Const("cons",_)$e$es) = cons $ e $ make_list es;
\end{ttbox}
To hook this translation up to Isabelle's parser, the theory definition needs
to contain the following $ML$-section:
\begin{ttbox}
ML
fun enum_tr[enum] = make_list enum;
val parse_translation = [("enum",enum_tr)]
\end{ttbox}
This causes \verb!Const("enum",_)$!$t$ to be replaced by
\verb$enum_tr[$$t$\verb$]$.

Of course the definition of \verb$make_list$ should be included in the
$ML$-section.
\end{example}
\begin{example}\label{SET}
  Isabelle represents the set $\{ x \mid P(x) \}$ internally by $Set(\lambda
  x.P(x))$.  The internal and external forms need separate
constants:\footnote{In practice, the external form typically has a name
beginning with an {\at} sign, such as {\tt {\at}SET}.  This emphasizes that
the constant should be used only for parsing/printing.}
\begin{ttbox}
types set 1
arities set :: (term)term
consts Set :: "('a => o) => 'a set"
       SET :: "[id,o] => 'a set"  ("\{_ | _\}")
\end{ttbox}
Parsing {\tt"\{$x$ | $P$\}"} according to this syntax yields the term {\tt
  Const("SET",dummyT) \$ Free("\(x\)",dummyT) \$ \(p\)}, where $p$ is the
result of parsing $P$.  What we need is the term {\tt
  Const("Set",dummyT)\$Abs("$x$",dummyT,$p'$)}, where $p'$ is some
``abstracted'' version of $p$.  Therefore we define a function
\begin{ttbox}
fun set_tr[Free(s,T), p] = Const("Set", dummyT) $
                           Abs(s, T, abstract_over(Free(s,T), p));
\end{ttbox}
where \verb$abstract_over: term*term -> term$ is a predefined function such
that {\tt abstract_over($u$,$t$)} replaces every occurrence of $u$ in $t$ by
a {\tt Bound} variable of the correct index (i.e.\ 0 at top level).  Remember
that {\tt dummyT} is replaced by the correct types at a later stage (see
\S\ref{Typing}).  Function {\tt set_tr} is associated with {\tt SET} by
including the \ML-text
\begin{ttbox}
val parse_translation = [("SET", set_tr)];
\end{ttbox}
\end{example}

If you want to run the above examples in Isabelle, you should note that an
$ML$-section needs to contain not just a definition of
\verb$parse_translation$ but also of a variable \verb$print_translation$.  The
purpose of the latter is to reverse the effect of the former during printing;
details are found in \S\ref{Print-translations}.  Hence you need to include
the line
\begin{ttbox}
val print_translation = [];
\end{ttbox}
This is instructive because the terms are then printed out in their internal
form.  For example the input \hbox{\tt[$x$,$y$,$z$]} is echoed as
\hbox{\tt$x$:$y$:$z$:[]}.  This helps to check that your parse translation is
working correctly.

%\begin{warn}
%Explicit type constraints disappear with type checking but are still
%visible to the parse translation functions.
%\end{warn}

\index{parse translation|)}

\section{Printing}

Syntax definitions provide printing information in three distinct ways:
through
\begin{itemize}
\item the syntax of the language (as used for parsing),
\item pretty printing information, and
\item print translation functions.
\end{itemize}
The bare mixfix declarations enable Isabelle to print terms, but the result
will not necessarily be pretty and may look different from what you expected.
To produce a pleasing layout, you need to read the following sections.

\subsection{Printing with mixfix declarations}

Let {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} be a term and let
\begin{ttbox}
consts \(c\) :: \(\tau\) (\(sy\))
\end{ttbox}
be a mixfix declaration where $sy$ is of the form
$\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$.  Printing $t$ according to
$sy$ means printing the string
$\alpha@0\beta@1\alpha@1\ldots\alpha@{n-1}\beta@n\alpha@n$, where $\beta@i$
is the result of printing $t@i$.
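For example, with the annotation {\tt"_ + _"} of the {\tt exp} grammar, a term
{\tt Const("+",_)\$$t@1$\$$t@2$} is printed as {\tt$\beta@1$ + $\beta@2$}.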

Note that the system does {\em not\/} insert blanks.  They should be part of
the mixfix syntax if they are required to separate tokens or achieve a
certain layout.

\subsection{Pretty printing}
\label{PrettyPrinting}
\index{pretty printing}

In order to format the output, it is possible to embed pretty printing
directives in mixfix annotations.  These directives are ignored during parsing
and affect only printing.  The characters {\tt(}, {\tt)} and {\tt/} are
interpreted as meta-characters\index{meta-character} when found in a mixfix
annotation.  Their meaning is
\begin{description}
\item[~{\tt(}~] Open a block.  A sequence of digits following it is
  interpreted as the \bfindex{indentation} of this block.  It causes the
  output to be indented by that many positions if a line break occurs within
  the block.  If {\tt(} is not followed by a digit, the indentation defaults
  to $0$.
\item[~{\tt)}~] Close a block.
\item[~\ttindex{/}~] Allow a \bfindex{line break}.  White space immediately
  following {\tt/} is not printed if the line is broken at this point.
\end{description}
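For instance, a variant of the {\tt"+"} declaration from the {\tt exp} grammar
might read
\begin{ttbox}
consts "+" :: "[exp,exp] => exp"  ("(_ +/ _)" [0,1] 0)
\end{ttbox}
where the enclosing block and the {\tt/} permit the printer to break long sums
after the {\tt+} sign; this is only a sketch of the layout one might choose.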

\subsection{Print translations *}
\label{Print-translations}
\index{print translation|(}

Since terms are translated after parsing (see \S\ref{Parse-translations}),
there is a similar mechanism to translate them back before printing.
Therefore the $ML$-section of a theory definition can associate constant
names with user-defined translation functions by including a line
\begin{ttbox}
val print_translation = \dots
\end{ttbox}
where the right-hand side of this binding is again an \ML-expression of type
\verb$(string * (term list -> term))list$.
Including a pair $(c,f)$ in this list causes the printer to print
$f[t@1,\dots,t@n]$ whenever it finds {\tt Const($c$,_)\$$t@1$\$\dots\$$t@n$}.
\begin{example}
Reversing the effect of the parse translation in Example~\ref{list-enum} is
accomplished by the following function:
\begin{ttbox}
fun make_enum (Const("op :",_) $ e $ es) = case es of
        Const("[]",_)  =>  Const("sing",dummyT) $ e
      | _  =>  Const("cons",dummyT) $ e $ make_enum es;
\end{ttbox}
It translates \hbox{\tt$x$:$y$:$z$:[]} to \hbox{\tt[$x$,$y$,$z$]}.  However,
if the input does not terminate with an empty list, e.g.\ \hbox{\tt$x$:$xs$},
\verb$make_enum$ raises exception {\tt Match}.  This signals that the
attempted translation has failed and the term should be printed as is.
The connection with Isabelle's pretty printer is established as follows:
\begin{ttbox}
fun enum_tr'[x,xs] = Const("enum",dummyT) $
                     make_enum(Const("op :",dummyT)$x$xs);
val print_translation = [("op :", enum_tr')];
\end{ttbox}
This declaration causes the printer to print \hbox{\tt enum_tr'[$x$,$y$]}
whenever it finds \verb!Const("op :",_)$!$x$\verb!$!$y$.
\end{example}
\begin{example}
  In Example~\ref{SET} we showed how to translate the concrete syntax for set
  comprehension into the proper internal form.  The string {\tt"\{$x$ |
    $P$\}"} now becomes {\tt Const("Set",_)\$Abs("$x$",_,$p$)}.  If, however,
  the latter term were printed without translating it back, it would result
  in {\tt"Set(\%$x$.$P$)"}.  Therefore the abstraction has to be turned back
  into a term that matches the concrete mixfix syntax:
\begin{ttbox}
fun set_tr'[Abs(x,T,P)] =
      let val (x',P') = variant_abs(x,T,P)
      in Const("SET",dummyT) $ Free(x',T) $ P' end;
\end{ttbox}
The function \verb$variant_abs$, a basic term manipulation function, replaces
the bound variable $x$ by a {\tt Free} variable $x'$ having a unique name.  A
term produced by {\tt set_tr'} can now be printed according to the concrete
syntax defined in Example~\ref{SET} above.

Notice that the application of {\tt set_tr'} fails if the argument of {\tt
Set} is not an abstraction, but for example just a {\tt Free} variable.  This
is intentional because it signals to the caller that the translation is
inapplicable.  As a result {\tt Const("Set",_)\$Free("P",_)} prints as
{\tt"Set(P)"}.

The full theory extension, including concrete syntax and both translation
functions, has the following form:
\begin{ttbox}
types set 1
arities set :: (term)term
consts Set :: "('a => o) => 'a set"
       SET :: "[id,o] => 'a set"  ("\{_ | _\}")
end
ML
fun set_tr[Free(s,T), p] = \dots;
val parse_translation = [("SET", set_tr)];
fun set_tr'[Abs(x,T,P)] = \dots;
val print_translation = [("Set", set_tr')];
\end{ttbox}
\end{example}
As during the parse translation process, the types attached to constants
during print translation are ignored.  Thus {\tt Const("SET",dummyT)} in
{\tt set_tr'} above is acceptable.  The types of {\tt Free}s and {\tt Var}s
however must be preserved because they may get printed (see {\tt
show_types}).

\index{print translation|)}

\subsection{Printing a term}

Let $tab$ be the set of all string-function pairs of print translations in the
current syntax.

Terms are printed recursively; print translations are applied top down:
\begin{itemize}
\item {\tt Free($x$,_)} is printed as $x$.
\item {\tt Var(($x$,$i$),_)} is printed as $x$, if $i = 0$ and $x$ does not
  end with a digit, as $x$ followed by $i$, if $i \neq 0$ and $x$ does not
  end with a digit, and as {\tt $x$.$i$}, if $x$ ends with a digit.  Thus the
  following cases can arise:
\begin{center}
{\tt\begin{tabular}{ccc}
\verb$Var(("v",0),_)$ & \verb$Var(("v",7),_)$ & \verb$Var(("v5",0),_)$ \\
"?v" & "?v7" & "?v5.0"
\end{tabular}}
\end{center}
\item {\tt Abs($x@1$,_,Abs($x@2$,_,\dots Abs($x@n$,_,$p$)\dots))}, where $p$
  is not an abstraction, is printed as {\tt \%$y@1\dots y@n$.$P$}, where $P$
  is the result of printing $p$, and the $x@i$ are replaced by $y@i$.  The
  latter are (possibly new) unique names.
\item {\tt Bound($i$)} is printed as {\tt B.$i$}\footnote{The occurrence of
    such ``loose'' bound variables indicates that either you are trying to
    print a subterm of an abstraction, or there is something wrong with your
    print translations.}.
\item The application {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} (where
  $n$ may be $0$!) is printed as follows:

  If there is a pair $(c,f)$ in $tab$, print $f[t@1,\dots,t@n]$.  If this
  application raises exception {\tt Match} or there is no pair $(c,f)$ in
  $tab$, let $sy$ be the mixfix annotation associated with $c$.  If there is
  no such $sy$, or if $sy$ does not have exactly $n$ argument positions, $t$
  is printed as an application; otherwise $t$ is printed according to $sy$.

  All other applications are printed as applications.
\end{itemize}
Printing a term {\tt $c$\$$t@1$\$\dots\$$t@n$} as an application means
printing it as {\tt $s$($s@1$,\dots,$s@n$)}, where $s@i$ is the result of
printing $t@i$.  If $c$ is a {\tt Const}, $s$ is its first argument, i.e.\
the constant's name; otherwise $s$ is the result of printing $c$ as described
above.
\medskip

The printer also inserts parentheses where they are necessary for reasons
of precedence.

\section{Identifiers, constants, and type inference *}
\label{Typing}

There is one final step in the translation from strings to terms that we have
not covered yet.  It explains how constants are distinguished from {\tt Free}s
and how {\tt Free}s and {\tt Var}s are typed.  Both issues arise because {\tt
  Free}s and {\tt Var}s are not declared.

An identifier $f$ that does not appear as a delimiter in the concrete syntax
can be either a free variable or a constant.  Since the parser knows only
about those constants which appear in mixfix annotations, it parses $f$ as
{\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt
  typ}.  Although the parser produces these very raw terms, most user-level
functions like {\tt goal} type terms according to the given theory, say $T$.
In a first step, every occurrence of {\tt Free($f$,_)} or
{\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is
a constant $f$ of {\tt typ} $\tau$ in $T$.  This means that identifiers are
treated as {\tt Free}s iff they are not declared in the theory.  The types of
the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML.  Type
constraints can be used to remove ambiguities.

One peculiarity of the current type inference algorithm is that variables
with the same name must have the same type, irrespective of whether they are
schematic, free or bound.  For example, take the first-order formula $f(x) = x
\land (\forall f.~ f=f)$ where ${=} :: [\alpha{::}term,\alpha]\To o$ and
$\forall :: (\alpha{::}term\To o)\To o$.  The first conjunct forces
$x::\alpha{::}term$ and $f::\alpha\To\alpha$, the second one
$f::\beta{::}term$.  Although the two $f$'s are distinct, they are required to
have the same type.  Unifying $\alpha\To\alpha$ and $\beta{::}term$ fails
because, in first-order logic, function types are not in class $term$.


\section{Putting it all together}

Having discussed the individual building blocks of a logic definition, it
remains to be shown how they fit together.  In particular we need to say how
an object-logic syntax is hooked up to the meta-logic.  Since all theorems
must conform to the syntax for $prop$ (see Figure~\ref{MetaLogicSyntax}),
that syntax has to be extended with the object-level syntax.  Assume that the
syntax of your object-logic defines a category $o$ of formulae.  These
formulae can now appear in axioms and theorems wherever $prop$ does if you
add the production
\[ prop ~=~ o.  \]
More precisely, you need a coercion from formulae to propositions:
\begin{ttbox}
Base = Pure +
types o 0
arities o :: logic
consts Trueprop :: "o => prop"  ("_"  5)
end
\end{ttbox}
The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible
coercion function.  Assuming this definition resides in a file {\tt base.thy},
you have to load it with the command {\tt use_thy"base"}.
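In terms of the grammar format of \S\ref{PrecedenceGrammars}, the declaration
of {\tt Trueprop} adds a production of the form
\[ prop ~=~ o ~~~~ (5) \]
and parsing a formula $P$ in a position where a proposition is expected yields
the term {\tt Const("Trueprop",dummyT)\$$P$}.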

One of the simplest nontrivial logics is {\em minimal logic} of
implication.  Its definition in Isabelle needs no advanced features but
illustrates the overall mechanism quite nicely:
\begin{ttbox}
Hilbert = Base +
consts "-->" :: "[o,o] => o"  (infixr 10)
rules
K   "P --> Q --> P"
S   "(P --> Q --> R) --> (P --> Q) --> P --> R"
MP  "[| P --> Q; P |] ==> Q"
end
\end{ttbox}
After loading this definition you can start to prove theorems in this logic:
\begin{ttbox}
goal Hilbert.thy "P --> P";
{\out Level 0}
{\out P --> P}
{\out  1.  P --> P}
by (resolve_tac [Hilbert.MP] 1);
{\out Level 1}
{\out P --> P}
{\out  1.  ?P --> P --> P}
{\out  2.  ?P}
by (resolve_tac [Hilbert.MP] 1);
{\out Level 2}
{\out P --> P}
{\out  1.  ?P1 --> ?P --> P --> P}
{\out  2.  ?P1}
{\out  3.  ?P}
by (resolve_tac [Hilbert.S] 1);
{\out Level 3}
{\out P --> P}
{\out  1.  P --> ?Q2 --> P}
{\out  2.  P --> ?Q2}
by (resolve_tac [Hilbert.K] 1);
{\out Level 4}
{\out P --> P}
{\out  1.  P --> ?Q2}
by (resolve_tac [Hilbert.K] 1);
{\out Level 5}
{\out P --> P}
{\out No subgoals!}
\end{ttbox}
As you can see, this Hilbert-style formulation of minimal logic is easy to
define but difficult to use.  The following natural deduction formulation is
far preferable:
\begin{ttbox}
MinI = Base +
consts "-->" :: "[o,o] => o"  (infixr 10)
rules
impI  "(P ==> Q) ==> P --> Q"
impE  "[| P --> Q; P |] ==> Q"
end
\end{ttbox}
Note, however, that although the two systems are equivalent, this fact cannot
be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI$
(exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!.  The reason
is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,
something that can only be shown by induction over all possible proofs in
\verb!Hilbert!.

It is a very simple matter to extend minimal logic with falsity:
\begin{ttbox}
MinIF = MinI +
consts False :: "o"
rules
FalseE  "False ==> P"
end
\end{ttbox}
On the other hand, we may wish to introduce conjunction only:
\begin{ttbox}
MinC = Base +
consts "&" :: "[o,o] => o"  (infixr 30)
rules
conjI  "[| P; Q |] ==> P & Q"
conjE1 "P & Q ==> P"
conjE2 "P & Q ==> Q"
end
\end{ttbox}
And if we want to have all three connectives together, we define:
\begin{ttbox}
MinIFC = MinIF + MinC
\end{ttbox}
Now we can prove mixed theorems like
\begin{ttbox}
goal MinIFC.thy "P & False --> Q";
by (resolve_tac [MinI.impI] 1);
by (dresolve_tac [MinC.conjE2] 1);
by (eresolve_tac [MinIF.FalseE] 1);
\end{ttbox}
Try this as an exercise!