| author | wenzelm | 
| Thu, 14 Oct 1999 15:03:34 +0200 | |
| changeset 7865 | d9be8bc5624e | 
| parent 6592 | c120262044b6 | 
| child 8136 | 8c65f3ca13f2 | 
| permissions | -rw-r--r-- | 
| 320 | 1 | %% $Id$ | 
| 2 | \chapter{Defining Logics} \label{Defining-Logics}
 | |
| 3 | This chapter explains how to define new formal systems --- in particular, | |
| 4 | their concrete syntax. While Isabelle can be regarded as a theorem prover | |
| 5 | for set theory, higher-order logic or the sequent calculus, its | |
| 6 | distinguishing feature is support for the definition of new logics. | |
| 7 | ||
| 8 | Isabelle logics are hierarchies of theories, which are described and | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 9 | illustrated in | 
| 320 | 10 | \iflabelundefined{sec:defining-theories}{{\em Introduction to Isabelle}}%
 | 
| 11 | {\S\ref{sec:defining-theories}}.  That material, together with the theory
 | |
| 12 | files provided in the examples directories, should suffice for all simple | |
| 13 | applications. The easiest way to define a new theory is by modifying a | |
| 14 | copy of an existing theory. | |
| 15 | ||
| 16 | This chapter documents the meta-logic syntax, mixfix declarations and | |
| 17 | pretty printing.  The extended examples in \S\ref{sec:min_logics}
 | |
| 18 | demonstrate the logical aspects of the definition of theories. | |
| 19 | ||
| 20 | ||
| 21 | \section{Priority grammars} \label{sec:priority_grammars}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 22 | \index{priority grammars|(}
 | 
| 320 | 23 | |
| 24 | A context-free grammar contains a set of {\bf nonterminal symbols}, a set of
 | |
| 25 | {\bf terminal symbols} and a set of {\bf productions}\index{productions}.
 | |
| 26 | Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and
 | |
| 27 | $\gamma$ is a string of terminals and nonterminals. One designated | |
| 28 | nonterminal is called the {\bf start symbol}.  The language defined by the
 | |
| 29 | grammar consists of all strings of terminals that can be derived from the | |
| 30 | start symbol by applying productions as rewrite rules. | |
| 31 | ||
| 32 | The syntax of an Isabelle logic is specified by a {\bf priority
 | |
| 33 |   grammar}.\index{priorities} Each nonterminal is decorated by an integer
 | |
| 34 | priority, as in~$A^{(p)}$.  A nonterminal $A^{(p)}$ in a derivation may be
 | |
| 35 | rewritten using a production $A^{(q)} = \gamma$ only if~$p \le q$.  Any
 | |
| 36 | priority grammar can be translated into a normal context free grammar by | |
| 37 | introducing new nonterminals and productions. | |
| 38 | ||
| 39 | Formally, a set of context free productions $G$ induces a derivation | |
| 40 | relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of | |
| 41 | terminal or nonterminal symbols. Then | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 42 | \[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \]
 | 
| 320 | 43 | if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$.
 | 
| 44 | ||
| 45 | The following simple grammar for arithmetic expressions demonstrates how | |
| 46 | binding power and associativity of operators can be enforced by priorities. | |
| 47 | \begin{center}
 | |
| 48 | \begin{tabular}{rclr}
 | |
| 49 |   $A^{(9)}$ & = & {\tt0} \\
 | |
| 50 |   $A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\
 | |
| 51 |   $A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\
 | |
| 52 |   $A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\
 | |
| 53 |   $A^{(3)}$ & = & {\tt-} $A^{(3)}$
 | |
| 54 | \end{tabular}
 | |
| 55 | \end{center}
 | |
| 56 | The choice of priorities determines that {\tt -} binds tighter than {\tt *},
 | |
| 57 | which binds tighter than {\tt +}.  Furthermore {\tt +} associates to the
 | |
| 58 | left and {\tt *} to the right.
 | |
| 59 | ||
| 60 | For clarity, grammars obey these conventions: | |
| 61 | \begin{itemize}
 | |
| 62 | \item All priorities must lie between~0 and \ttindex{max_pri}, which is a
 | |
| 63 |   some fixed integer.  Sometimes {\tt max_pri} is written as $\infty$.
 | |
| 64 | \item Priority 0 on the right-hand side and priority \ttindex{max_pri} on
 | |
| 65 | the left-hand side may be omitted. | |
| 66 | \item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the
 | |
| 67 | priority of the left-hand side actually appears in a column on the far | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 68 | right. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 69 | \item Alternatives are separated by~$|$. | 
| 320 | 70 | \item Repetition is indicated by dots~(\dots) in an informal but obvious | 
| 71 | way. | |
| 72 | \end{itemize}
 | |
| 73 | ||
| 74 | Using these conventions and assuming $\infty=9$, the grammar | |
| 75 | takes the form | |
| 76 | \begin{center}
 | |
| 77 | \begin{tabular}{rclc}
 | |
| 78 | $A$ & = & {\tt0} & \hspace*{4em} \\
 | |
| 79 |  & $|$ & {\tt(} $A$ {\tt)} \\
 | |
| 80 |  & $|$ & $A$ {\tt+} $A^{(1)}$ & (0) \\
 | |
| 81 |  & $|$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\
 | |
| 82 |  & $|$ & {\tt-} $A^{(3)}$ & (3)
 | |
| 83 | \end{tabular}
 | |
| 84 | \end{center}
 | |
| 85 | \index{priority grammars|)}
 | |
| 86 | ||
| 87 | ||
| 4597 
a0bdee64194c
Fixed a lot of overfull and underfull lines (hboxes)
 paulson parents: 
4543diff
changeset | 88 | \begin{figure}\small
 | 
| 320 | 89 | \begin{center}
 | 
| 90 | \begin{tabular}{rclc}
 | |
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 91 | $any$ &=& $prop$ ~~$|$~~ $logic$ \\\\ | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 92 | $prop$ &=& {\tt(} $prop$ {\tt)} \\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 93 |      &$|$& $prop^{(4)}$ {\tt::} $type$ & (3) \\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 94 |      &$|$& {\tt PROP} $aprop$ \\
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 95 |      &$|$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\
 | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 96 |      &$|$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\
 | 
| 320 | 97 |      &$|$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\
 | 
| 98 |      &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 99 |      &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 100 |      &$|$& {\tt OFCLASS} {\tt(} $type$ {\tt,} $logic$ {\tt)} \\\\
 | 
| 4543 | 101 | $aprop$ &=& $id$ ~~$|$~~ $longid$ ~~$|$~~ $var$ | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 102 |     ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 103 | $logic$ &=& {\tt(} $logic$ {\tt)} \\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 104 |       &$|$& $logic^{(4)}$ {\tt::} $type$ & (3) \\
 | 
| 4543 | 105 | &$|$& $id$ ~~$|$~~ $longid$ ~~$|$~~ $var$ | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 106 |     ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\
 | 
| 3694 | 107 |       &$|$& {\tt \%} $pttrns$ {\tt.} $any^{(3)}$ & (3) \\\\
 | 
| 320 | 108 | $idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\
 | 
| 109 | $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
 | |
| 110 |     &$|$& $id$ {\tt ::} $type$ & (0) \\\\
 | |
| 3694 | 111 | $pttrns$ &=& $pttrn$ ~~$|$~~ $pttrn^{(1)}$ $pttrns$ \\\\
 | 
| 112 | $pttrn$ &=& $idt$ \\\\ | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 113 | $type$ &=& {\tt(} $type$ {\tt)} \\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 114 |      &$|$& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 115 |        ~~$|$~~ $tvar$ {\tt::} $sort$ \\
 | 
| 320 | 116 |      &$|$& $id$ ~~$|$~~ $type^{(\infty)}$ $id$
 | 
| 117 |                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
 | |
| 4543 | 118 |      &$|$& $longid$ ~~$|$~~ $type^{(\infty)}$ $longid$
 | 
| 119 |                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $longid$ \\
 | |
| 320 | 120 |      &$|$& $type^{(1)}$ {\tt =>} $type$ & (0) \\
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 121 |      &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0) \\\\
 | 
| 4543 | 122 | $sort$ &=& $id$ ~~$|$~~ $longid$ ~~$|$~~ {\tt\ttlbrace\ttrbrace} ~~$|$~~
 | 
| 123 |   {\tt\ttlbrace} $id$ ~$|$~ $longid$ {\tt,} \dots {\tt,} $id$ ~$|$~ $longid$ {\tt\ttrbrace}
 | |
| 320 | 124 | \end{tabular}
 | 
| 125 | \index{*PROP symbol}
 | |
| 126 | \index{*== symbol}\index{*=?= symbol}\index{*==> symbol}
 | |
| 127 | \index{*:: symbol}\index{*=> symbol}
 | |
| 332 | 128 | \index{sort constraints}
 | 
| 129 | %the index command: a percent is permitted, but braces must match! | |
| 320 | 130 | \index{%@{\tt\%} symbol}
 | 
| 131 | \index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol}
 | |
| 132 | \index{*[ symbol}\index{*] symbol}
 | |
| 133 | \index{*"!"! symbol}
 | |
| 134 | \index{*"["| symbol}
 | |
| 135 | \index{*"|"] symbol}
 | |
| 136 | \end{center}
 | |
| 137 | \caption{Meta-logic syntax}\label{fig:pure_gram}
 | |
| 138 | \end{figure}
 | |
| 139 | ||
| 140 | ||
| 141 | \section{The Pure syntax} \label{sec:basic_syntax}
 | |
| 142 | \index{syntax!Pure|(}
 | |
| 143 | ||
| 144 | At the root of all object-logics lies the theory \thydx{Pure}.  It
 | |
| 145 | contains, among many other things, the Pure syntax. An informal account of | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 146 | this basic syntax (types, terms and formulae) appears in | 
| 320 | 147 | \iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}%
 | 
| 148 | {\S\ref{sec:forward}}.  A more precise description using a priority grammar
 | |
| 149 | appears in Fig.\ts\ref{fig:pure_gram}.  It defines the following
 | |
| 150 | nonterminals: | |
| 151 | \begin{ttdescription}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 152 |   \item[\ndxbold{any}] denotes any term.
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 153 | |
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 154 |   \item[\ndxbold{prop}] denotes terms of type {\tt prop}.  These are formulae
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 155 |     of the meta-logic.  Note that user constants of result type {\tt prop}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 156 | (i.e.\ $c :: \ldots \To prop$) should always provide concrete syntax. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 157 | Otherwise atomic propositions with head $c$ may be printed incorrectly. | 
| 320 | 158 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 159 |   \item[\ndxbold{aprop}] denotes atomic propositions.
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 160 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 161 | %% FIXME huh!? | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 162 | % These typically | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 163 | % include the judgement forms of the object-logic; its definition | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 164 | % introduces a meta-level predicate for each judgement form. | 
| 320 | 165 | |
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 166 |   \item[\ndxbold{logic}] denotes terms whose type belongs to class
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 167 |     \cldx{logic}, excluding type \tydx{prop}.
 | 
| 320 | 168 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 169 |   \item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 170 | by types. | 
| 3694 | 171 | |
| 172 |   \item[\ndxbold{pttrn}, \ndxbold{pttrns}] denote patterns for
 | |
| 173 | abstraction, cases etc. Initially the same as $idt$ and $idts$, | |
| 174 | these are indetended to be augmented by user extensions. | |
| 320 | 175 | |
| 176 |   \item[\ndxbold{type}] denotes types of the meta-logic.
 | |
| 177 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 178 |   \item[\ndxbold{sort}] denotes meta-level sorts.
 | 
| 320 | 179 | \end{ttdescription}
 | 
| 180 | ||
| 181 | \begin{warn}
 | |
| 182 |   In {\tt idts}, note that \verb|x::nat y| is parsed as \verb|x::(nat y)|,
 | |
| 183 |   treating {\tt y} like a type constructor applied to {\tt nat}.  The
 | |
| 184 | likely result is an error message. To avoid this interpretation, use | |
| 185 | parentheses and write \verb|(x::nat) y|. | |
| 332 | 186 |   \index{type constraints}\index{*:: symbol}
 | 
| 320 | 187 | |
| 188 | Similarly, \verb|x::nat y::nat| is parsed as \verb|x::(nat y::nat)| and | |
| 189 | yields an error. The correct form is \verb|(x::nat) (y::nat)|. | |
| 190 | \end{warn}
 | |
| 191 | ||
| 452 | 192 | \begin{warn}
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 193 | Type constraints bind very weakly. For example, \verb!x<y::nat! is normally | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 194 | parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 195 | which case the string is likely to be ambiguous. The correct form is | 
| 452 | 196 | \verb!x<(y::nat)!. | 
| 197 | \end{warn}
 | |
| 320 | 198 | |
| 867 | 199 | \subsection{Logical types and default syntax}\label{logical-types}
 | 
| 200 | \index{lambda calc@$\lambda$-calculus}
 | |
| 201 | ||
| 202 | Isabelle's representation of mathematical languages is based on the | |
| 203 | simply typed $\lambda$-calculus. All logical types, namely those of | |
| 204 | class \cldx{logic}, are automatically equipped with a basic syntax of
 | |
| 205 | types, identifiers, variables, parentheses, $\lambda$-abstraction and | |
| 206 | application. | |
| 207 | \begin{warn}
 | |
| 208 |   Isabelle combines the syntaxes for all types of class \cldx{logic} by
 | |
| 209 | mapping all those types to the single nonterminal $logic$. Thus all | |
| 210 | productions of $logic$, in particular $id$, $var$ etc, become available. | |
| 211 | \end{warn}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 212 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 213 | |
| 320 | 214 | \subsection{Lexical matters}
 | 
| 215 | The parser does not process input strings directly. It operates on token | |
| 216 | lists provided by Isabelle's \bfindex{lexer}.  There are two kinds of
 | |
| 217 | tokens: \bfindex{delimiters} and \bfindex{name tokens}.
 | |
| 218 | ||
| 219 | \index{reserved words}
 | |
| 220 | Delimiters can be regarded as reserved words of the syntax. You can | |
| 221 | add new ones when extending theories.  In Fig.\ts\ref{fig:pure_gram} they
 | |
| 222 | appear in typewriter font, for example {\tt ==}, {\tt =?=} and
 | |
| 223 | {\tt PROP}\@.
 | |
| 224 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 225 | Name tokens have a predefined syntax. The lexer distinguishes six disjoint | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 226 | classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 227 | identifiers\index{type identifiers}, type unknowns\index{type unknowns},
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 228 | \rmindex{numerals}, \rmindex{strings}.  They are denoted by \ndxbold{id},
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 229 | \ndxbold{var}, \ndxbold{tid}, \ndxbold{tvar}, \ndxbold{xnum}, \ndxbold{xstr},
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 230 | respectively.  Typical examples are {\tt x}, {\tt ?x7}, {\tt 'a}, {\tt ?'a3},
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 231 | {\tt \#42}, {\tt ''foo bar''}.  Here is the precise syntax:
 | 
| 320 | 232 | \begin{eqnarray*}
 | 
| 233 | id & = & letter~quasiletter^* \\ | |
| 4543 | 234 | longid    & =   & id\mbox{\tt .}id~\dots~id \\
 | 
| 320 | 235 | var       & =   & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\
 | 
| 236 | tid       & =   & \mbox{\tt '}id \\
 | |
| 237 | tvar      & =   & \mbox{\tt ?}tid ~~|~~
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 238 |                   \mbox{\tt ?}tid\mbox{\tt .}nat \\
 | 
| 5542 | 239 | xnum      & =   & \mbox{\tt \#}nat ~~|~~ \mbox{\tt \#-}nat \\
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 240 | xstr      & =   & \mbox{\tt ''\rm text\tt ''} \\[1ex]
 | 
| 320 | 241 | letter    & =   & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\
 | 
| 242 | digit     & =   & \mbox{one of {\tt 0}\dots {\tt 9}} \\
 | |
| 243 | quasiletter & =  & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\
 | |
| 244 | nat & = & digit^+ | |
| 245 | \end{eqnarray*}
 | |
| 4543 | 246 | The lexer repeatedly takes the longest prefix of the input string that | 
| 247 | forms a valid token. A maximal prefix that is both a delimiter and a | |
| 248 | name is treated as a delimiter. Spaces, tabs, newlines and formfeeds | |
| 249 | are separators; they never occur within tokens, except those of class | |
| 250 | $xstr$. | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 251 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 252 | \medskip | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 253 | Delimiters need not be separated by white space.  For example, if {\tt -}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 254 | is a delimiter but {\tt --} is not, then the string {\tt --} is treated as
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 255 | two consecutive occurrences of the token~{\tt -}.  In contrast, \ML\
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 256 | treats {\tt --} as a single symbolic name.  The consequence of Isabelle's
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 257 | more liberal scheme is that the same string may be parsed in different ways | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 258 | after extending the syntax: after adding {\tt --} as a delimiter, the input
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 259 | {\tt --} is treated as a single token.
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 260 | |
| 320 | 261 | A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally
 | 
| 262 | a pair of base name and index (\ML\ type \mltydx{indexname}).  These
 | |
| 263 | components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or
 | |
| 264 | run together as in {\tt ?x1}.  The latter form is possible if the base name
 | |
| 265 | does not end with digits. If the index is 0, it may be dropped altogether: | |
| 266 | {\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}.
 | |
| 267 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 268 | Tokens of class $xnum$ or $xstr$ are not used by the meta-logic. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 269 | Object-logics may provide numerals and string constants by adding appropriate | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 270 | productions and translation functions. | 
| 320 | 271 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 272 | \medskip | 
| 320 | 273 | Although name tokens are returned from the lexer rather than the parser, it | 
| 274 | is more logical to regard them as nonterminals. Delimiters, however, are | |
| 275 | terminals; they are just syntactic sugar and contribute nothing to the | |
| 276 | abstract syntax tree. | |
| 277 | ||
| 278 | ||
| 3108 | 279 | \subsection{*Inspecting the syntax} \label{pg:print_syn}
 | 
| 320 | 280 | \begin{ttbox}
 | 
| 281 | syn_of : theory -> Syntax.syntax | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 282 | print_syntax : theory -> unit | 
| 320 | 283 | Syntax.print_syntax : Syntax.syntax -> unit | 
| 284 | Syntax.print_gram : Syntax.syntax -> unit | |
| 285 | Syntax.print_trans : Syntax.syntax -> unit | |
| 286 | \end{ttbox}
 | |
| 287 | The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes
 | |
| 288 | in \ML. You can display values of this type by calling the following | |
| 289 | functions: | |
| 290 | \begin{ttdescription}
 | |
| 291 | \item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle
 | |
| 292 |   theory~{\it thy} as an \ML\ value.
 | |
| 293 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 294 | \item[\ttindexbold{print_syntax} $thy$] displays the syntax part of $thy$
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 295 |   using {\tt Syntax.print_syntax}.
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 296 | |
| 320 | 297 | \item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all
 | 
| 298 |   information contained in the syntax {\it syn}.  The displayed output can
 | |
| 299 | be large. The following two functions are more selective. | |
| 300 | ||
| 301 | \item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 302 |   of~{\it syn}, namely the lexicon, logical types and productions.  These are
 | 
| 320 | 303 | discussed below. | 
| 304 | ||
| 305 | \item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation
 | |
| 306 |   part of~{\it syn}, namely the constants, parse/print macros and
 | |
| 307 | parse/print translations. | |
| 308 | \end{ttdescription}
 | |
| 309 | ||
| 310 | Let us demonstrate these functions by inspecting Pure's syntax. Even that | |
| 311 | is too verbose to display in full. | |
| 312 | \begin{ttbox}\index{*Pure theory}
 | |
| 313 | Syntax.print_syntax (syn_of Pure.thy); | |
| 314 | {\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 315 | {\out logtypes: fun itself}
 | 
| 320 | 316 | {\out prods:}
 | 
| 317 | {\out   type = tid  (1000)}
 | |
| 318 | {\out   type = tvar  (1000)}
 | |
| 319 | {\out   type = id  (1000)}
 | |
| 320 | {\out   type = tid "::" sort[0]  => "_ofsort" (1000)}
 | |
| 321 | {\out   type = tvar "::" sort[0]  => "_ofsort" (1000)}
 | |
| 322 | {\out   \vdots}
 | |
| 323 | \ttbreak | |
| 3108 | 324 | {\out print modes: "symbols" "xterm"}
 | 
| 320 | 325 | {\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots}
 | 
| 326 | {\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"}
 | |
| 327 | {\out   "_idtyp" "_lambda" "_tapp" "_tappl"}
 | |
| 328 | {\out parse_rules:}
 | |
| 329 | {\out parse_translation: "!!" "_K" "_abs" "_aprop"}
 | |
| 330 | {\out print_translation: "all"}
 | |
| 331 | {\out print_rules:}
 | |
| 332 | {\out print_ast_translation: "==>" "_abs" "_idts" "fun"}
 | |
| 333 | \end{ttbox}
 | |
| 334 | ||
| 332 | 335 | As you can see, the output is divided into labelled sections. The grammar | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 336 | is represented by {\tt lexicon}, {\tt logtypes} and {\tt prods}.  The rest
 | 
| 320 | 337 | refers to syntactic translations and macro expansion. Here is an | 
| 338 | explanation of the various sections. | |
| 339 | \begin{description}
 | |
| 340 |   \item[{\tt lexicon}] lists the delimiters used for lexical
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 341 |     analysis.\index{delimiters}
 | 
| 320 | 342 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 343 |   \item[{\tt logtypes}] lists the types that are regarded the same as {\tt
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 344 |     logic} syntactically.  Thus types of object-logics (e.g.\ {\tt nat}, say)
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 345 | will be automatically equipped with the standard syntax of | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 346 | $\lambda$-calculus. | 
| 320 | 347 | |
| 348 |   \item[{\tt prods}] lists the \rmindex{productions} of the priority grammar.
 | |
| 349 |     The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}.
 | |
| 350 |     Each delimiter is quoted.  Some productions are shown with {\tt =>} and
 | |
| 351 | an attached string. These strings later become the heads of parse | |
| 352 | trees; they also play a vital role when terms are printed (see | |
| 353 |     \S\ref{sec:asts}).
 | |
| 354 | ||
| 355 |     Productions with no strings attached are called {\bf copy
 | |
| 356 |       productions}\indexbold{productions!copy}.  Their right-hand side must
 | |
| 357 | have exactly one nonterminal symbol (or name token). The parser does | |
| 358 | not create a new parse tree node for copy productions, but simply | |
| 359 | returns the parse tree of the right-hand symbol. | |
| 360 | ||
| 361 | If the right-hand side consists of a single nonterminal with no | |
| 362 |     delimiters, then the copy production is called a {\bf chain
 | |
| 363 | production}. Chain productions act as abbreviations: | |
| 364 | conceptually, they are removed from the grammar by adding new | |
| 365 | productions. Priority information attached to chain productions is | |
| 366 | ignored; only the dummy value $-1$ is displayed. | |
| 3108 | 367 | |
| 368 |   \item[\ttindex{print_modes}] lists the alternative print modes
 | |
| 369 |     provided by this syntax (see \S\ref{sec:prmodes}).
 | |
| 320 | 370 | |
| 371 |   \item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}]
 | |
| 372 |     relate to macros (see \S\ref{sec:macros}).
 | |
| 373 | ||
| 374 |   \item[{\tt parse_ast_translation}, {\tt print_ast_translation}]
 | |
| 375 | list sets of constants that invoke translation functions for abstract | |
| 376 |     syntax trees.  Section \S\ref{sec:asts} below discusses this obscure
 | |
| 377 |     matter.\index{constants!for translations}
 | |
| 378 | ||
| 4597 
a0bdee64194c
Fixed a lot of overfull and underfull lines (hboxes)
 paulson parents: 
4543diff
changeset | 379 |   \item[{\tt parse_translation}, {\tt print_translation}] list the sets
 | 
| 320 | 380 | of constants that invoke translation functions for terms (see | 
| 381 |     \S\ref{sec:tr_funs}).
 | |
| 382 | \end{description}
 | |
| 383 | \index{syntax!Pure|)}
 | |
| 384 | ||
| 385 | ||
| 386 | \section{Mixfix declarations} \label{sec:mixfix}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 387 | \index{mixfix declarations|(}
 | 
| 320 | 388 | |
| 389 | When defining a theory, you declare new constants by giving their names, | |
| 390 | their type, and an optional {\bf mixfix annotation}.  Mixfix annotations
 | |
| 391 | allow you to extend Isabelle's basic $\lambda$-calculus syntax with | |
| 392 | readable notation. They can express any context-free priority grammar. | |
| 393 | Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 394 | general than the priority declarations of \ML\ and Prolog. | 
| 320 | 395 | |
| 396 | A mixfix annotation defines a production of the priority grammar. It | |
| 397 | describes the concrete syntax, the translation to abstract syntax, and the | |
| 398 | pretty printing. Special case annotations provide a simple means of | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 399 | specifying infix operators and binders. | 
| 320 | 400 | |
| 401 | \subsection{The general mixfix form}
 | |
| 402 | Here is a detailed account of mixfix declarations. Suppose the following | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 403 | line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 404 | file: | 
| 320 | 405 | \begin{center}
 | 
| 406 |   {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}
 | |
| 407 | \end{center}
 | |
| 332 | 408 | This constant declaration and mixfix annotation are interpreted as follows: | 
| 320 | 409 | \begin{itemize}\index{productions}
 | 
| 410 | \item The string {\tt $c$} is the name of the constant associated with the
 | |
| 411 | production; unless it is a valid identifier, it must be enclosed in | |
| 412 |   quotes.  If $c$ is empty (given as~{\tt ""}) then this is a copy
 | |
| 413 |   production.\index{productions!copy} Otherwise, parsing an instance of the
 | |
| 414 |   phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$
 | |
| 415 |     $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th
 | |
| 416 | argument. | |
| 417 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 418 | \item The constant $c$, if non-empty, is declared to have type $\sigma$ | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 419 |     ({\tt consts} section only).
 | 
| 320 | 420 | |
| 421 | \item The string $template$ specifies the right-hand side of | |
| 422 | the production. It has the form | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 423 | \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] | 
| 320 | 424 |     where each occurrence of {\tt_} denotes an argument position and
 | 
| 425 |     the~$w@i$ do not contain~{\tt _}.  (If you want a literal~{\tt _} in
 | |
| 426 | the concrete syntax, you must escape it as described below.) The $w@i$ | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 427 |     may consist of \rmindex{delimiters}, spaces or
 | 
| 320 | 428 |     \rmindex{pretty printing} annotations (see below).
 | 
| 429 | ||
| 430 | \item The type $\sigma$ specifies the production's nonterminal symbols | |
| 431 | (or name tokens). If $template$ is of the form above then $\sigma$ | |
| 432 | must be a function type with at least~$n$ argument positions, say | |
| 433 | $\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are | |
| 434 | derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 435 | below. Any of these may be function types. | 
| 320 | 436 | |
| 437 |   \item The optional list~$ps$ may contain at most $n$ integers, say {\tt
 | |
| 438 | [$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal | |
| 439 |     priority\indexbold{priorities} required of any phrase that may appear
 | |
| 440 | as the $i$-th argument. Missing priorities default to~0. | |
| 4543 | 441 | |
| 442 | \item The integer $p$ is the priority of this production. If | |
| 443 | omitted, it defaults to the maximal priority. Priorities range | |
| 444 |     between 0 and \ttindexbold{max_pri} (= 1000).
 | |
| 320 | 445 | |
| 446 | \end{itemize}
 | |
| 447 | % | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 448 | The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\,
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 449 | A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 450 | nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 451 | The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 452 | this is a logical type (namely one of class {\tt logic} excluding {\tt
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 453 | prop}). Otherwise it is $ty$ (note that only the outermost type constructor | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 454 | is taken into account).  Finally, the nonterminal of a type variable is {\tt
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 455 | any}. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 456 | |
| 911 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 457 | \begin{warn}
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 458 | Theories must sometimes declare types for purely syntactic purposes --- | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 459 |   merely playing the role of nonterminals.  One example is \tydx{type}, the
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 460 | built-in type of types. This is a `type of all types' in the syntactic | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 461 |   sense only.  Do not declare such types under {\tt arities} as belonging to
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 462 |   class {\tt logic}\index{*logic class}, for that would make them useless as
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 463 | separate nonterminal symbols. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 464 | \end{warn}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 465 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 466 | Associating nonterminals with types allows a constant's type to specify | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 467 | syntax as well. We can declare the function~$f$ to have type $[\tau@1, | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 468 | \ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 469 | of the function's $n$ arguments. The constant's name, in this case~$f$, will | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 470 | also serve as the label in the abstract syntax tree. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 471 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 472 | You may also declare mixfix syntax without adding constants to the theory's | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 473 | signature, by using a {\tt syntax} section instead of {\tt consts}.  Thus a
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 474 | production need not map directly to a logical function (this typically | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 475 | requires additional syntactic translations, see also | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 476 | Chapter~\ref{chap:syntax}).
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 477 | |
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 478 | |
| 911 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 479 | \medskip | 
| 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 480 | As a special case of the general mixfix declaration, the form | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 481 | \begin{center}
 | 
| 911 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 482 |   {\tt $c$ ::\ "$\sigma$" ("$template$")}
 | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 483 | \end{center}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 484 | specifies no priorities. The resulting production puts no priority | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 485 | constraints on any of its arguments and has maximal priority itself. | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 486 | Omitting priorities in this manner is prone to syntactic ambiguities unless | 
| 3098 | 487 | the production's right-hand side is fully bracketed, as in | 
| 488 | \verb|"if _ then _ else _ fi"|. | |
| 320 | 489 | |
| 490 | Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},
 | |
| 491 | is sensible only if~$c$ is an identifier. Otherwise you will be unable to | |
| 492 | write terms involving~$c$. | |
| 493 | ||
| 494 | ||
| 495 | \subsection{Example: arithmetic expressions}
 | |
| 496 | \index{examples!of mixfix declarations}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 497 | This theory specification contains a {\tt syntax} section with mixfix
 | 
| 320 | 498 | declarations encoding the priority grammar from | 
| 499 | \S\ref{sec:priority_grammars}:
 | |
| 500 | \begin{ttbox}
 | |
| 3108 | 501 | ExpSyntax = Pure + | 
| 320 | 502 | types | 
| 503 | exp | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 504 | syntax | 
| 1387 | 505 |   "0" :: exp                 ("0"      9)
 | 
| 506 |   "+" :: [exp, exp] => exp   ("_ + _"  [0, 1] 0)
 | |
| 507 |   "*" :: [exp, exp] => exp   ("_ * _"  [3, 2] 2)
 | |
| 508 |   "-" :: exp => exp          ("- _"    [3] 3)
 | |
| 320 | 509 | end | 
| 510 | \end{ttbox}
 | |
| 3108 | 511 | If you put this into a file {\tt ExpSyntax.thy} and load it via {\tt
 | 
| 512 | use_thy"ExpSyntax"}, you can run some tests: | |
| 320 | 513 | \begin{ttbox}
 | 
| 3108 | 514 | val read_exp = Syntax.test_read (syn_of ExpSyntax.thy) "exp"; | 
| 320 | 515 | {\out val it = fn : string -> unit}
 | 
| 516 | read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0"; | |
| 517 | {\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}
 | |
| 518 | {\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")}
 | |
| 519 | {\out \vdots}
 | |
| 520 | read_exp "0 + - 0 + 0"; | |
| 521 | {\out tokens: "0" "+" "-" "0" "+" "0"}
 | |
| 522 | {\out raw: ("+" ("+" "0" ("-" "0")) "0")}
 | |
| 523 | {\out \vdots}
 | |
| 524 | \end{ttbox}
 | |
| 525 | The output of \ttindex{Syntax.test_read} includes the token list ({\tt
 | |
| 526 |   tokens}) and the raw \AST{} directly derived from the parse tree,
 | |
| 527 | ignoring parse \AST{} translations.  The rest is tracing information
 | |
| 528 | provided by the macro expander (see \S\ref{sec:macros}).
 | |
| 529 | ||
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 530 | Executing {\tt Syntax.print_gram} reveals the productions derived from the
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 531 | above mixfix declarations (lots of additional information deleted): | 
| 320 | 532 | \begin{ttbox}
 | 
| 3108 | 533 | Syntax.print_gram (syn_of ExpSyntax.thy); | 
| 320 | 534 | {\out exp = "0"  => "0" (9)}
 | 
| 535 | {\out exp = exp[0] "+" exp[1]  => "+" (0)}
 | |
| 536 | {\out exp = exp[3] "*" exp[2]  => "*" (2)}
 | |
| 537 | {\out exp = "-" exp[3]  => "-" (3)}
 | |
| 538 | \end{ttbox}
 | |
| 539 | ||
| 3108 | 540 | Note that because {\tt exp} is not of class {\tt logic}, it has been
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 541 | retained as a separate nonterminal. This also entails that the syntax | 
| 3108 | 542 | does not provide for identifiers or paranthesized expressions. | 
| 543 | Normally you would also want to add the declaration {\tt arities
 | |
| 544 |   exp::logic} after {\tt types} and use {\tt consts} instead of {\tt
 | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 545 | syntax}. Try this as an exercise and study the changes in the | 
| 867 | 546 | grammar. | 
| 320 | 547 | |
| 548 | \subsection{The mixfix template}
 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 549 | Let us now take a closer look at the string $template$ appearing in mixfix | 
| 320 | 550 | annotations. This string specifies a list of parsing and printing | 
| 551 | directives: delimiters\index{delimiters}, arguments, spaces, blocks of
 | |
| 552 | indentation and line breaks. These are encoded by the following character | |
| 553 | sequences: | |
| 554 | \index{pretty printing|(}
 | |
| 555 | \begin{description}
 | |
| 556 | \item[~$d$~] is a delimiter, namely a non-empty sequence of characters | |
| 557 |   other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}.
 | |
| 558 | Even these characters may appear if escaped; this means preceding it with | |
| 559 |   a~{\tt '} (single quote).  Thus you have to write {\tt ''} if you really
 | |
| 911 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 560 |   want a single quote.  Furthermore, a~{\tt '} followed by a space separates
 | 
| 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
 wenzelm parents: 
885diff
changeset | 561 | delimiters without extra white space being added for printing. | 
| 320 | 562 | |
| 563 | \item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol
 | |
| 564 | or name token. | |
| 565 | ||
| 566 | \item[~$s$~] is a non-empty sequence of spaces for printing. This and the | |
| 567 | following specifications do not affect parsing at all. | |
| 568 | ||
| 569 | \item[~{\tt(}$n$~] opens a pretty printing block.  The optional number $n$
 | |
| 570 | specifies how much indentation to add when a line break occurs within the | |
| 571 |   block.  If {\tt(} is not followed by digits, the indentation defaults
 | |
| 572 | to~0. | |
| 573 | ||
| 574 | \item[~{\tt)}~] closes a pretty printing block.
 | |
| 575 | ||
| 576 | \item[~{\tt//}~] forces a line break.
 | |
| 577 | ||
| 578 | \item[~{\tt/}$s$~] allows a line break.  Here $s$ stands for the string of
 | |
| 579 |   spaces (zero or more) right after the {\tt /} character.  These spaces
 | |
| 580 | are printed if the break is not taken. | |
| 581 | \end{description}
 | |
| 582 | For example, the template {\tt"(_ +/ _)"} specifies an infix operator.
 | |
| 583 | There are two argument positions; the delimiter~{\tt+} is preceded by a
 | |
| 584 | space and followed by a space or line break; the entire phrase is a pretty | |
| 585 | printing block.  Other examples appear in Fig.\ts\ref{fig:set_trans} below.
 | |
| 586 | Isabelle's pretty printer resembles the one described in | |
| 6592 | 587 | Paulson~\cite{paulson-ml2}.
 | 
| 320 | 588 | |
| 589 | \index{pretty printing|)}
 | |
| 590 | ||
| 591 | ||
| 592 | \subsection{Infixes}
 | |
| 593 | \indexbold{infixes}
 | |
| 594 | ||
| 3108 | 595 | Infix operators associating to the left or right can be declared using | 
| 596 | {\tt infixl} or {\tt infixr}.  Basically, the form {\tt $c$ ::\ 
 | |
| 597 | $\sigma$ (infixl $p$)} abbreviates the mixfix declarations | |
| 320 | 598 | \begin{ttbox}
 | 
| 1387 | 599 | "op \(c\)" :: \(\sigma\)   ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))
 | 
| 600 | "op \(c\)" :: \(\sigma\)   ("op \(c\)")
 | |
| 320 | 601 | \end{ttbox}
 | 
| 1387 | 602 | and {\tt $c$ ::\ $\sigma$ (infixr $p$)} abbreviates the mixfix declarations
 | 
| 320 | 603 | \begin{ttbox}
 | 
| 1387 | 604 | "op \(c\)" :: \(\sigma\)   ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))
 | 
| 605 | "op \(c\)" :: \(\sigma\)   ("op \(c\)")
 | |
| 320 | 606 | \end{ttbox}
 | 
| 607 | The infix operator is declared as a constant with the prefix {\tt op}.
 | |
| 608 | Thus, prefixing infixes with \sdx{op} makes them behave like ordinary
 | |
| 609 | function symbols, as in \ML. Special characters occurring in~$c$ must be | |
| 610 | escaped, as in delimiters, using a single quote. | |
| 611 | ||
| 3108 | 612 | A slightly more general form of infix declarations allows constant | 
| 613 | names to be independent from their concrete syntax, namely \texttt{$c$
 | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 614 |   ::\ $\sigma$\ (infixl "$sy$" $p$)}, the same for \texttt{infixr}.  As
 | 
| 3108 | 615 | an example consider: | 
| 616 | \begin{ttbox}
 | |
| 617 | and :: [bool, bool] => bool (infixr "&" 35) | |
| 618 | \end{ttbox}
 | |
| 619 | The internal constant name will then be just \texttt{and}, without any
 | |
| 620 | \texttt{op} prefixed.
 | |
| 621 | ||
| 320 | 622 | |
| 623 | \subsection{Binders}
 | |
| 624 | \indexbold{binders}
 | |
| 625 | \begingroup | |
| 626 | \def\Q{{\cal Q}}
 | |
| 627 | A {\bf binder} is a variable-binding construct such as a quantifier.  The
 | |
| 628 | constant declaration | |
| 629 | \begin{ttbox}
 | |
| 1387 | 630 | \(c\) :: \(\sigma\) (binder "\(\Q\)" [\(pb\)] \(p\)) | 
| 320 | 631 | \end{ttbox}
 | 
| 632 | introduces a constant~$c$ of type~$\sigma$, which must have the form | |
| 633 | $(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where | |
| 634 | $x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$ | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 635 | and the whole term has type~$\tau@3$. The optional integer $pb$ | 
| 1060 
a122584b5cc5
In binders, the default body priority is now p instead of 0.
 lcp parents: 
911diff
changeset | 636 | specifies the body's priority, by default~$p$. Special characters | 
| 877 | 637 | in $\Q$ must be escaped using a single quote. | 
| 320 | 638 | |
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 639 | The declaration is expanded internally to something like | 
| 320 | 640 | \begin{ttbox}
 | 
| 3098 | 641 | \(c\)\hskip3pt :: (\(\tau@1\) => \(\tau@2\)) => \(\tau@3\) | 
| 642 | "\(\Q\)"  :: [idts, \(\tau@2\)] => \(\tau@3\)   ("(3\(\Q\)_./ _)" [0, \(pb\)] \(p\))
 | |
| 320 | 643 | \end{ttbox}
 | 
| 644 | Here \ndx{idts} is the nonterminal symbol for a list of identifiers with
 | |
| 332 | 645 | \index{type constraints}
 | 
| 320 | 646 | optional type constraints (see Fig.\ts\ref{fig:pure_gram}).  The
 | 
| 647 | declaration also installs a parse translation\index{translations!parse}
 | |
| 648 | for~$\Q$ and a print translation\index{translations!print} for~$c$ to
 | |
| 649 | translate between the internal and external forms. | |
| 650 | ||
| 651 | A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a | |
| 652 | list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$ | |
| 653 | corresponds to the internal form | |
| 654 | \[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \] | |
| 655 | ||
| 656 | \medskip | |
| 657 | For example, let us declare the quantifier~$\forall$:\index{quantifiers}
 | |
| 658 | \begin{ttbox}
 | |
| 1387 | 659 | All :: ('a => o) => o   (binder "ALL " 10)
 | 
| 320 | 660 | \end{ttbox}
 | 
| 661 | This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL
 | |
| 662 | $x$.$P$}. When printing, Isabelle prefers the latter form, but must fall | |
| 663 | back on ${\tt All}(P)$ if $P$ is not an abstraction.  Both $P$ and {\tt ALL
 | |
| 664 | $x$.$P$} have type~$o$, the type of formulae, while the bound variable | |
| 665 | can be polymorphic. | |
| 666 | \endgroup | |
| 667 | ||
| 668 | \index{mixfix declarations|)}
 | |
| 669 | ||
| 3108 | 670 | |
| 671 | \section{*Alternative print modes} \label{sec:prmodes}
 | |
| 672 | \index{print modes|(}
 | |
| 673 | % | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 674 | Isabelle's pretty printer supports alternative output syntaxes. These | 
| 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 675 | may be used independently or in cooperation. The currently active | 
| 3108 | 676 | print modes (with precedence from left to right) are determined by a | 
| 677 | reference variable. | |
| 678 | \begin{ttbox}\index{*print_mode}
 | |
| 679 | print_mode: string list ref | |
| 680 | \end{ttbox}
 | |
| 681 | Initially this may already contain some print mode identifiers, | |
| 682 | depending on how Isabelle has been invoked (e.g.\ by some user | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 683 | interface). So changes should be incremental --- adding or deleting | 
| 3108 | 684 | modes relative to the current value. | 
| 685 | ||
| 686 | Any \ML{} string is a legal print mode identifier, without any
 | |
| 687 | predeclaration required. The following names should be considered | |
| 688 | reserved, though: \texttt{""} (yes, the empty string),
 | |
| 689 | \texttt{symbols}, \texttt{latex}, \texttt{xterm}.
 | |
| 690 | ||
| 691 | There is a separate table of mixfix productions for pretty printing | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 692 | associated with each print mode. The currently active ones are | 
| 3108 | 693 | conceptually just concatenated from left to right, with the standard | 
| 694 | syntax output table always coming last as default. Thus mixfix | |
| 695 | productions of preceding modes in the list may override those of later | |
| 696 | ones. Also note that token translations are always relative to some | |
| 697 | print mode (see \S\ref{sec:tok_tr}).
 | |
| 698 | ||
| 699 | \medskip The canonical application of print modes is optional printing | |
| 700 | of mathematical symbols from a special screen font instead of {\sc
 | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 701 | ascii}. Another example is to re-use Isabelle's advanced | 
| 3108 | 702 | $\lambda$-term printing mechanisms to generate completely different | 
| 3228 | 703 | output, say for interfacing external tools like \rmindex{model
 | 
| 704 |   checkers} (see also \texttt{HOL/Modelcheck}).
 | |
| 3108 | 705 | |
| 706 | \index{print modes|)}
 | |
| 707 | ||
| 708 | ||
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 709 | \section{Ambiguity of parsed expressions} \label{sec:ambiguity}
 | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 710 | \index{ambiguity!of parsed expressions}
 | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 711 | |
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 712 | To keep the grammar small and allow common productions to be shared | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 713 | all logical types (except {\tt prop}) are internally represented
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 714 | by one nonterminal, namely {\tt logic}.  This and omitted or too freely
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 715 | chosen priorities may lead to ways of parsing an expression that were | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 716 | not intended by the theory's maker. In most cases Isabelle is able to | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 717 | select one of multiple parse trees that an expression has lead | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 718 | to by checking which of them can be typed correctly. But this may not | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 719 | work in every case and always slows down parsing. | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 720 | The warning and error messages that can be produced during this process are | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 721 | as follows: | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 722 | |
| 880 | 723 | If an ambiguity can be resolved by type inference the following | 
| 724 | warning is shown to remind the user that parsing is (unnecessarily) | |
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 725 | slowed down. In cases where it's not easily possible to eliminate the | 
| 880 | 726 | ambiguity the frequency of the warning can be controlled by changing | 
| 883 
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
 clasohm parents: 
880diff
changeset | 727 | the value of {\tt Syntax.ambiguity_level} which has type {\tt int
 | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 728 | ref}. Its default value is 1 and by increasing it one can control how | 
| 883 
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
 clasohm parents: 
880diff
changeset | 729 | many parse trees are necessary to generate the warning. | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 730 | |
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 731 | \begin{ttbox}
 | 
| 3801 | 732 | {\out Ambiguous input "\dots"}
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 733 | {\out produces the following parse trees:}
 | 
| 3801 | 734 | {\out \dots}
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 735 | {\out Fortunately, only one parse tree is type correct.}
 | 
| 3801 | 736 | {\out You may still want to disambiguate your grammar or your input.}
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 737 | \end{ttbox}
 | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 738 | |
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 739 | The following message is normally caused by using the same | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 740 | syntax in two different productions: | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 741 | |
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 742 | \begin{ttbox}
 | 
| 3802 | 743 | {\out Ambiguous input "..."}
 | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 744 | {\out produces the following parse trees:}
 | 
| 3802 | 745 | {\out \dots}
 | 
| 746 | {\out More than one term is type correct:}
 | |
| 747 | {\out \dots}
 | |
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 748 | \end{ttbox}
 | 
| 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 749 | |
| 866 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
 clasohm parents: 
864diff
changeset | 750 | Ambiguities occuring in syntax translation rules cannot be resolved by | 
| 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
 clasohm parents: 
864diff
changeset | 751 | type inference because it is not necessary for these rules to be type | 
| 3485 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
 paulson parents: 
3318diff
changeset | 752 | correct. Therefore Isabelle always generates an error message and the | 
| 866 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
 clasohm parents: 
864diff
changeset | 753 | ambiguity should be eliminated by changing the grammar or the rule. | 
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 754 | |
| 320 | 755 | |
| 756 | \section{Example: some minimal logics} \label{sec:min_logics}
 | |
| 757 | \index{examples!of logic definitions}
 | |
| 758 | ||
| 759 | This section presents some examples that have a simple syntax. They | |
| 760 | demonstrate how to define new object-logics from scratch. | |
| 761 | ||
| 711 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
 clasohm parents: 
452diff
changeset | 762 | First we must define how an object-logic syntax is embedded into the | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 763 | meta-logic.  Since all theorems must conform to the syntax for~\ndx{prop}
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 764 | (see Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the
 | 
| 320 | 765 | object-level syntax. Assume that the syntax of your object-logic defines a | 
| 864 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 766 | meta-type~\tydx{o} of formulae which refers to the nonterminal {\tt logic}.
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 767 | These formulae can now appear in axioms and theorems wherever \ndx{prop} does
 | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 768 | if you add the production | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 769 | \[ prop ~=~ logic. \] | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 770 | This is not supposed to be a copy production but an implicit coercion from | 
| 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
 wenzelm parents: 
711diff
changeset | 771 | formulae to propositions: | 
| 320 | 772 | \begin{ttbox}
 | 
| 773 | Base = Pure + | |
| 774 | types | |
| 775 | o | |
| 776 | arities | |
| 777 | o :: logic | |
| 778 | consts | |
| 1387 | 779 |   Trueprop :: o => prop   ("_" 5)
 | 
| 320 | 780 | end | 
| 781 | \end{ttbox}
 | |
| 782 | The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible
 | |
| 332 | 783 | coercion function.  Assuming this definition resides in a file {\tt Base.thy},
 | 
| 320 | 784 | you have to load it with the command {\tt use_thy "Base"}.
 | 
| 785 | ||
| 786 | One of the simplest nontrivial logics is {\bf minimal logic} of
 | |
| 787 | implication. Its definition in Isabelle needs no advanced features but | |
| 788 | illustrates the overall mechanism nicely: | |
| 789 | \begin{ttbox}
 | |
| 790 | Hilbert = Base + | |
| 791 | consts | |
| 1387 | 792 | "-->" :: [o, o] => o (infixr 10) | 
| 320 | 793 | rules | 
| 794 | K "P --> Q --> P" | |
| 795 | S "(P --> Q --> R) --> (P --> Q) --> P --> R" | |
| 796 | MP "[| P --> Q; P |] ==> Q" | |
| 797 | end | |
| 798 | \end{ttbox}
 | |
| 332 | 799 | After loading this definition from the file {\tt Hilbert.thy}, you can
 | 
| 320 | 800 | start to prove theorems in the logic: | 
| 801 | \begin{ttbox}
 | |
| 5205 | 802 | Goal "P --> P"; | 
| 320 | 803 | {\out Level 0}
 | 
| 804 | {\out P --> P}
 | |
| 805 | {\out  1.  P --> P}
 | |
| 806 | \ttbreak | |
| 807 | by (resolve_tac [Hilbert.MP] 1); | |
| 808 | {\out Level 1}
 | |
| 809 | {\out P --> P}
 | |
| 810 | {\out  1.  ?P --> P --> P}
 | |
| 811 | {\out  2.  ?P}
 | |
| 812 | \ttbreak | |
| 813 | by (resolve_tac [Hilbert.MP] 1); | |
| 814 | {\out Level 2}
 | |
| 815 | {\out P --> P}
 | |
| 816 | {\out  1.  ?P1 --> ?P --> P --> P}
 | |
| 817 | {\out  2.  ?P1}
 | |
| 818 | {\out  3.  ?P}
 | |
| 819 | \ttbreak | |
| 820 | by (resolve_tac [Hilbert.S] 1); | |
| 821 | {\out Level 3}
 | |
| 822 | {\out P --> P}
 | |
| 823 | {\out  1.  P --> ?Q2 --> P}
 | |
| 824 | {\out  2.  P --> ?Q2}
 | |
| 825 | \ttbreak | |
| 826 | by (resolve_tac [Hilbert.K] 1); | |
| 827 | {\out Level 4}
 | |
| 828 | {\out P --> P}
 | |
| 829 | {\out  1.  P --> ?Q2}
 | |
| 830 | \ttbreak | |
| 831 | by (resolve_tac [Hilbert.K] 1); | |
| 832 | {\out Level 5}
 | |
| 833 | {\out P --> P}
 | |
| 834 | {\out No subgoals!}
 | |
| 835 | \end{ttbox}
 | |
| 836 | As we can see, this Hilbert-style formulation of minimal logic is easy to | |
| 837 | define but difficult to use. The following natural deduction formulation is | |
| 838 | better: | |
| 839 | \begin{ttbox}
 | |
| 840 | MinI = Base + | |
| 841 | consts | |
| 1387 | 842 | "-->" :: [o, o] => o (infixr 10) | 
| 320 | 843 | rules | 
| 844 | impI "(P ==> Q) ==> P --> Q" | |
| 845 | impE "[| P --> Q; P |] ==> Q" | |
| 846 | end | |
| 847 | \end{ttbox}
 | |
| 848 | Note, however, that although the two systems are equivalent, this fact | |
| 849 | cannot be proved within Isabelle.  Axioms {\tt S} and {\tt K} can be
 | |
| 850 | derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt
 | |
| 851 |   Hilbert}.  The reason is that {\tt impI} is only an {\bf admissible} rule
 | |
| 852 | in {\tt Hilbert}, something that can only be shown by induction over all
 | |
| 853 | possible proofs in {\tt Hilbert}.
 | |
| 854 | ||
| 855 | We may easily extend minimal logic with falsity: | |
| 856 | \begin{ttbox}
 | |
| 857 | MinIF = MinI + | |
| 858 | consts | |
| 1387 | 859 | False :: o | 
| 320 | 860 | rules | 
| 861 | FalseE "False ==> P" | |
| 862 | end | |
| 863 | \end{ttbox}
 | |
| 864 | On the other hand, we may wish to introduce conjunction only: | |
| 865 | \begin{ttbox}
 | |
| 866 | MinC = Base + | |
| 867 | consts | |
| 1387 | 868 | "&" :: [o, o] => o (infixr 30) | 
| 320 | 869 | \ttbreak | 
| 870 | rules | |
| 871 | conjI "[| P; Q |] ==> P & Q" | |
| 872 | conjE1 "P & Q ==> P" | |
| 873 | conjE2 "P & Q ==> Q" | |
| 874 | end | |
| 875 | \end{ttbox}
 | |
| 876 | And if we want to have all three connectives together, we create and load a | |
| 3108 | 877 | theory file consisting of a single line: | 
| 320 | 878 | \begin{ttbox}
 | 
| 879 | MinIFC = MinIF + MinC | |
| 880 | \end{ttbox}
 | |
| 881 | Now we can prove mixed theorems like | |
| 882 | \begin{ttbox}
 | |
| 5205 | 883 | Goal "P & False --> Q"; | 
| 320 | 884 | by (resolve_tac [MinI.impI] 1); | 
| 885 | by (dresolve_tac [MinC.conjE2] 1); | |
| 886 | by (eresolve_tac [MinIF.FalseE] 1); | |
| 887 | \end{ttbox}
 | |
| 888 | Try this as an exercise! | |
| 5371 | 889 | |
| 890 | ||
| 891 | %%% Local Variables: | |
| 892 | %%% mode: latex | |
| 893 | %%% TeX-master: "ref" | |
| 894 | %%% End: |