doc-src/Logics/defining.tex
author wenzelm
Wed, 19 Jan 1994 14:22:37 +0100
changeset 238 6af40e3a2bcb
parent 188 6be0856cdf49
child 291 a615050a7494
permissions -rw-r--r--
MAJOR INTERNAL CHANGE: extend and merge operations of syntax tables now much leaner (eliminated gramgraph, all data except tables of old parser are shared); simplified the internal interfaces for syntax extension; added translations for _explode, _implode (experimental);
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     1
%% $Id$
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
     2
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     3
\newcommand{\rmindex}[1]{{#1}\index{#1}\@}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     4
\newcommand{\mtt}[1]{\mbox{\tt #1}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     5
\newcommand{\ttfct}[1]{\mathop{\mtt{#1}}\nolimits}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     6
\newcommand{\ttrel}[1]{\mathrel{\mtt{#1}}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     7
\newcommand{\Constant}{\ttfct{Constant}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     8
\newcommand{\Variable}{\ttfct{Variable}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
     9
\newcommand{\Appl}[1]{\ttfct{Appl}\mathopen{\mtt[}#1\mathclose{\mtt]}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    10
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    11
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    12
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    13
\chapter{Defining Logics} \label{Defining-Logics}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    14
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    15
This chapter is intended for Isabelle experts. It explains how to define new
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    16
logical systems, Isabelle's {\em raison d'\^etre}. Isabelle logics are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    17
hierarchies of theories. A number of simple examples are contained in {\em
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    18
Introduction to Isabelle}; the full syntax of theory definition files ({\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    19
.thy} files) is shown in {\em The Isabelle Reference Manual}. This chapter's
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    20
chief purpose is a thorough description of all syntax related matters
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    21
concerning theories. The most important sections are \S\ref{sec:mixfix} about
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    22
mixfix declarations and \S\ref{sec:macros} describing the macro system. The
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    23
concluding examples of \S\ref{sec:min_logics} are more concerned with the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    24
logical aspects of the definition of theories. Sections marked with * can be
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
    25
skipped on first reading.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    26
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    27
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    28
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    29
\section{Precedence grammars} \label{sec:precedence_grammars}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    30
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    31
The precise syntax of a logic is best defined by a \rmindex{context-free
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    32
grammar}. In order to simplify the description of mathematical languages, we
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    33
introduce an extended format which permits {\bf
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    34
precedences}\indexbold{precedence}. This scheme generalizes precedence
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    35
declarations in \ML\ and {\sc prolog}. In this extended grammar format,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    36
nonterminals are decorated by integers, their precedence. In the sequel,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    37
precedences are shown as subscripts. A nonterminal $A@p$ on the right-hand
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    38
side of a production may only be replaced using a production $A@q = \gamma$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    39
where $p \le q$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    40
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    41
Formally, a set of context free productions $G$ induces a derivation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    42
relation $\rew@G$ on strings as follows:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    43
\[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    44
   \exists (A@q=\gamma) \in G.~q \ge p
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    45
\]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    46
Any extended grammar of this kind can be translated into a normal context
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    47
free grammar. However, this translation may require the introduction of a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    48
large number of new nonterminals and productions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    49
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    50
\begin{example} \label{ex:precedence}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    51
The following simple grammar for arithmetic expressions demonstrates how
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    52
binding power and associativity of operators can be enforced by precedences.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    53
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    54
\begin{tabular}{rclr}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    55
  $A@9$ & = & {\tt0} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    56
  $A@9$ & = & {\tt(} $A@0$ {\tt)} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    57
  $A@0$ & = & $A@0$ {\tt+} $A@1$ \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    58
  $A@2$ & = & $A@3$ {\tt*} $A@2$ \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    59
  $A@3$ & = & {\tt-} $A@3$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    60
\end{tabular}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    61
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    62
The choice of precedences determines that {\tt -} binds tighter than {\tt *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    63
which binds tighter than {\tt +}, and that {\tt +} associates to the left and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    64
{\tt *} to the right.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    65
\end{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    66
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    67
To minimize the number of subscripts, we adopt the following conventions:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    68
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    69
\item All precedences $p$ must be in the range $0 \leq p \leq max_pri$ for
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    70
  some fixed $max_pri$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    71
\item Precedence $0$ on the right-hand side and precedence $max_pri$ on the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    72
  left-hand side may be omitted.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    73
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    74
In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    75
i.e.\ the precedence of the left-hand side actually appears in the uttermost
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    76
right. Finally, alternatives may be separated by $|$, and repetition
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    77
indicated by \dots.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    78
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    79
Using these conventions and assuming $max_pri=9$, the grammar in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    80
Example~\ref{ex:precedence} becomes
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    81
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    82
\begin{tabular}{rclc}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    83
$A$ & = & {\tt0} & \hspace*{4em} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    84
 & $|$ & {\tt(} $A$ {\tt)} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    85
 & $|$ & $A$ {\tt+} $A@1$ & (0) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    86
 & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    87
 & $|$ & {\tt-} $A@3$ & (3)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    88
\end{tabular}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    89
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    90
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    91
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    92
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    93
\section{Basic syntax} \label{sec:basic_syntax}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    94
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    95
The basis of all extensions by object-logics is the \rmindex{Pure theory},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    96
bound to the \ML-identifier \ttindex{Pure.thy}. It contains, among many other
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    97
things, the \rmindex{Pure syntax}. An informal account of this basic syntax
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    98
(meta-logic, types etc.) may be found in {\em Introduction to Isabelle}. A
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
    99
more precise description using a precedence grammar is shown in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   100
Figure~\ref{fig:pure_gram}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   101
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   102
\begin{figure}[htb]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   103
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   104
\begin{tabular}{rclc}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   105
$prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   106
     &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   107
     &$|$& $logic@3$ \ttindex{=?=} $logic@2$ & (2) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   108
     &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   109
     &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   110
     &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   111
$logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   112
$aprop$ &=& $id$ ~~$|$~~ $var$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   113
    ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   114
$fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   115
    &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   116
    &$|$& $fun@{max_pri}$ {\tt::} $type$ \\
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   117
    &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   118
$idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   119
$idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   120
    &$|$& $id$ \ttindex{::} $type$ & (0) \\\\
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   121
$type$ &=& $tfree$ ~~$|$~~ $tvar$ ~~$|$~~ $tfree$ {\tt::} $sort$
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   122
  ~~$|$~~ $tvar$ {\tt::} $sort$ \\
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   123
     &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   124
                ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   125
     &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   126
     &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   127
     &$|$& {\tt(} $type$ {\tt)} \\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   128
$sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   129
                ~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   130
\end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   131
\indexbold{type@$type$} \indexbold{sort@$sort$} \indexbold{idt@$idt$}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   132
\indexbold{idts@$idts$} \indexbold{logic@$logic$} \indexbold{prop@$prop$}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   133
\indexbold{fun@$fun$}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   134
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   135
\caption{Meta-Logic Syntax}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   136
\label{fig:pure_gram}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   137
\end{figure}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   138
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   139
The following main categories are defined:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   140
\begin{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   141
  \item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   142
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   143
  \item[$aprop$] Atomic propositions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   144
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   145
  \item[$logic$] Terms of types in class $logic$. Initially, $logic$ contains
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   146
    merely $prop$. As the syntax is extended by new object-logics, more
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   147
    productions for $logic$ are added automatically (see below).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   148
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   149
  \item[$fun$] Terms potentially of function type.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   150
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   151
  \item[$type$] Meta-types.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   152
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   153
  \item[$idts$] A list of identifiers, possibly constrained by types. Note
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   154
    that \verb|x :: nat y| is parsed as \verb|x :: (nat y)|, i.e.\ {\tt y}
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   155
    would be treated like a type constructor applied to {\tt nat}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   156
\end{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   157
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   158
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   159
\subsection{Logical types and default syntax}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   160
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   161
Isabelle is concerned with mathematical languages which have a certain
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   162
minimal vocabulary: identifiers, variables, parentheses, and the lambda
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   163
calculus. Logical types, i.e.\ those of class $logic$, are automatically
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   164
equipped with this basic syntax. More precisely, for any type constructor
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   165
$ty$ with arity $(\vec{s})c$, where $c$ is a subclass of $logic$, the
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   166
following productions are added:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   167
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   168
\begin{tabular}{rclc}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   169
$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   170
  &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   171
  &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   172
$logic$ &=& $ty$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   173
\end{tabular}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   174
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   175
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   176
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   177
\subsection{Lexical matters *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   178
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   179
The parser does not process input strings directly, it rather operates on
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   180
token lists provided by Isabelle's \rmindex{lexical analyzer} (the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   181
\bfindex{lexer}). There are two different kinds of tokens: {\bf
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   182
  delimiters}\indexbold{delimiter} and {\bf names}\indexbold{name}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   183
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   184
Delimiters can be regarded as reserved words\index{reserved word} of the
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   185
syntax and the user can add new ones, when extending theories. In
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   186
Figure~\ref{fig:pure_gram} they appear in typewriter font, e.g.\ {\tt PROP},
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   187
{\tt ==}, {\tt =?=}, {\tt ;}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   188
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   189
Names on the other hand have a fixed predefined syntax. The lexer
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   190
distinguishes four kinds of them: identifiers\index{identifier},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   191
unknowns\index{unknown}\index{scheme variable|see{unknown}}, type
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   192
variables\index{type variable}, type unknowns\index{type unknown}\index{type
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   193
scheme variable|see{type unknown}}; they are denoted by $id$\index{id@$id$},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   194
$var$\index{var@$var$}, $tfree$\index{tfree@$tfree$},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   195
$tvar$\index{tvar@$tvar$}, respectively. Typical examples are {\tt x}, {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   196
?x7}, {\tt 'a}, {\tt ?'a3}, the exact syntax is:
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   197
\begin{eqnarray*}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   198
id        & =   & letter~quasiletter^* \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   199
var       & =   & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   200
tfree     & =   & \mbox{\tt '}id \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   201
tvar      & =   & \mbox{\tt ?}tfree ~~|~~
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   202
                  \mbox{\tt ?}tfree\mbox{\tt .}nat \\[1ex]
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   203
letter    & =   & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   204
digit     & =   & \mbox{one of {\tt 0}\dots {\tt 9}} \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   205
quasiletter & =  & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   206
nat       & =   & digit^+
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   207
\end{eqnarray*}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   208
A $var$ or $tvar$ describes an \rmindex{unknown} which is internally a pair
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   209
of base name and index (\ML\ type \ttindex{indexname}).  These components are
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   210
either explicitly separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   211
directly run together as in {\tt ?x1}. The latter form is possible, if the
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   212
base name does not end with digits; if the index is 0, it may be dropped
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   213
altogether: {\tt ?x} abbreviates {\tt ?x0} or {\tt ?x.0}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   214
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   215
Note that $id$, $var$, $tfree$, $tvar$ are mutually disjoint, but it is
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   216
perfectly legal that they overlap with the set of delimiters (e.g.\ {\tt
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   217
  PROP}, {\tt ALL}, {\tt EX}).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   218
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   219
The lexical analyzer translates input strings to token lists by repeatedly
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   220
taking the maximal prefix of the input string that forms a valid token. A
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   221
maximal prefix that is both a delimiter and a name is treated as a delimiter.
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   222
Spaces, tabs and newlines are separators; they never occur within tokens.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   223
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   224
Note that delimiters need not necessarily be surrounded by white space to be
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   225
recognized as separate. For example, if {\tt -} is a delimiter but {\tt --}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   226
is not, then the string {\tt --} is treated as two consecutive occurrences of
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   227
{\tt -}. This is in contrast to \ML\ which always treats {\tt --} as a single
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   228
symbolic name. The consequence of Isabelle's more liberal scheme is that the
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   229
same string may be parsed in different ways after extending the syntax: after
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   230
adding {\tt --} as a delimiter, the input {\tt --} is treated as a single
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   231
token.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   232
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   233
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   234
\subsection{Inspecting syntax *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   235
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   236
You may get the \ML\ representation of the syntax of any Isabelle theory by
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   237
applying \index{*syn_of}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   238
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   239
  syn_of: theory -> Syntax.syntax
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   240
\end{ttbox}
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   241
\ttindex{Syntax.syntax} is an abstract type. Values of this type can be
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   242
displayed by the following functions: \index{*Syntax.print_syntax}
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   243
\index{*Syntax.print_gram} \index{*Syntax.print_trans}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   244
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   245
  Syntax.print_syntax: Syntax.syntax -> unit
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   246
  Syntax.print_gram: Syntax.syntax -> unit
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   247
  Syntax.print_trans: Syntax.syntax -> unit
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   248
\end{ttbox}
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   249
{\tt Syntax.print_syntax} shows virtually all information contained in a
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   250
syntax. Its output is divided into labeled sections. The syntax proper is
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   251
represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest refers to
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   252
the manifold facilities to apply syntactic translations (macro expansion
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   253
etc.).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   254
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   255
To cope with the verbosity of {\tt Syntax.print_syntax}, there are
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   256
\ttindex{Syntax.print_gram} and \ttindex{Syntax.print_trans} to print the
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   257
syntax proper and the translation related information only.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   258
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   259
Let's have a closer look at part of Pure's syntax:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   260
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   261
Syntax.print_syntax (syn_of Pure.thy);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   262
{\out lexicon: "!!" "%" "(" ")" "," "." "::" ";" "==" "==>" \dots}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   263
{\out roots: logic type fun prop}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   264
{\out prods:}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   265
{\out   type = tfree  (1000)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   266
{\out   type = tvar  (1000)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   267
{\out   type = id  (1000)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   268
{\out   type = tfree "::" sort[0]  => "_ofsort" (1000)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   269
{\out   type = tvar "::" sort[0]  => "_ofsort" (1000)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   270
{\out   \vdots}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   271
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   272
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   273
{\out   "_idtyp" "_lambda" "_tapp" "_tappl"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   274
{\out parse_rules:}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   275
{\out parse_translation: "!!" "_K" "_abs" "_aprop"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   276
{\out print_translation: "all"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   277
{\out print_rules:}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   278
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   279
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   280
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   281
\begin{description}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   282
  \item[\ttindex{lexicon}] The set of delimiters used for lexical analysis.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   283
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   284
  \item[\ttindex{roots}]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   285
    The legal syntactic categories to start parsing with. You name the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   286
    desired root directly as a string when calling lower level functions or
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   287
    specifying macros. Higher level functions usually expect a type and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   288
    derive the actual root as follows:\index{root_of_type@$root_of_type$}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   289
    \begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   290
      \item $root_of_type(\tau@1 \To \tau@2) = \mtt{fun}$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   291
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   292
      \item $root_of_type(\tau@1 \mathrel{ty} \tau@2) = ty$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   293
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   294
      \item $root_of_type((\tau@1, \dots, \tau@n)ty) = ty$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   295
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   296
      \item $root_of_type(\alpha) = \mtt{logic}$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   297
    \end{itemize}
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   298
    Thereby $\tau@1, \dots, \tau@n$ are types, $ty$ an infix or ordinary type
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   299
    constructor and $\alpha$ a type variable or unknown. Note that only the
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   300
    outermost type constructor is taken into account.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   301
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   302
  \item[\ttindex{prods}]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   303
    The list of productions describing the precedence grammar. Nonterminals
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   304
    $A@n$ are rendered in {\sc ascii} as {\tt $A$[$n$]}, delimiters are
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   305
    quoted. Some productions have strings attached after an {\tt =>}. These
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   306
    strings later become the heads of parse trees, but they also play a vital
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   307
    role when terms are printed (see \S\ref{sec:asts}).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   308
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   309
    Productions which do not have string attached and thus do not create a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   310
    new parse tree node are called {\bf copy productions}\indexbold{copy
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   311
    production}. They must have exactly one
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   312
    argument\index{argument!production} (i.e.\ nonterminal or name token)
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   313
    on the right-hand side. The parse tree generated when parsing that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   314
    argument is simply passed up as the result of parsing the whole copy
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   315
    production.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   316
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   317
    A special kind of copy production is one where the argument is a
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   318
    nonterminal and no additional syntactic sugar (delimiters) exists. It is
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   319
    called a \bfindex{chain production}. Chain productions should be seen as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   320
    an abbreviation mechanism: conceptually, they are removed from the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   321
    grammar by adding appropriate new rules. Precedence information attached
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   322
    to chain productions is ignored, only the dummy value $-1$ is displayed.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   323
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   324
  \item[\ttindex{consts}, \ttindex{parse_rules}, \ttindex{print_rules}]
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   325
    This is macro related information (see \S\ref{sec:macros}).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   326
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   327
  \item[\tt *_translation]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   328
    \index{*parse_ast_translation} \index{*parse_translation}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   329
    \index{*print_translation} \index{*print_ast_translation}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   330
    The sets of constants that invoke translation functions. These are more
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   331
    arcane matters (see \S\ref{sec:asts} and \S\ref{sec:tr_funs}).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   332
\end{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   333
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   334
You may inspect the syntax of any theory using the above calling
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   335
sequence. Beware that, as more and more material accumulates, the output
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   336
becomes longer and longer.
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   337
% When extending syntaxes, new {\tt roots}, {\tt prods}, {\tt parse_rules}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   338
% and {\tt print_rules} are appended to the end. The other lists are
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   339
% displayed sorted.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   340
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   341
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   342
\section{Abstract syntax trees} \label{sec:asts}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   343
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   344
Figure~\ref{fig:parse_print} shows a simplified model of the parsing and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   345
printing process.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   346
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   347
\begin{figure}[htb]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   348
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   349
\begin{tabular}{cl}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   350
string          & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   351
$\downarrow$    & parser \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   352
parse tree      & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   353
$\downarrow$    & \rmindex{parse ast translation} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   354
ast             & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   355
$\downarrow$    & ast rewriting (macros) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   356
ast             & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   357
$\downarrow$    & \rmindex{parse translation}, type-inference \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   358
--- well-typed term --- & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   359
$\downarrow$    & \rmindex{print translation} \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   360
ast             & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   361
$\downarrow$    & ast rewriting (macros) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   362
ast             & \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   363
$\downarrow$    & \rmindex{print ast translation}, printer \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   364
string          &
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   365
\end{tabular}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   366
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   367
\caption{Parsing and Printing}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   368
\label{fig:parse_print}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   369
\end{figure}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   370
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   371
The parser takes an input string (more precisely the token list produced by
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   372
the lexer) and produces a \rmin«ôx{parse tree}, which is directly derived
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   373
from the productions involved. By applying some internal transformations the
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   374
parse tree becomes an \bfindex{abstract syntax tree} (\bfindex{ast}). Macro
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   375
expansion, further translations and finally type inference yields a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   376
well-typed term\index{term!well-typed}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   377
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   378
The printing process is the reverse, except for some subtleties to be
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   379
discussed later.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   380
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   381
Asts are an intermediate form between the very raw parse trees and the typed
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   382
$\lambda$-terms. An ast is either an atom (constant or variable) or a list of
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   383
{\em at least two\/} subasts. Internally, they have type
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   384
\ttindex{Syntax.ast}: \index{*Constant} \index{*Variable} \index{*Appl}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   385
\begin{ttbox}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   386
datatype ast = Constant of string
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   387
             | Variable of string
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   388
             | Appl of ast list
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   389
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   390
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   391
We write constant atoms as quoted strings, variable atoms as non-quoted
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   392
strings and applications as list of subasts separated by space and enclosed
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   393
in parentheses. For example:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   394
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   395
  Appl [Constant "_constrain",
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   396
    Appl [Constant "_abs", Variable "x", Variable "t"],
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   397
    Appl [Constant "fun", Variable "'a", Variable "'b"]]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   398
\end{ttbox}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   399
is written as {\tt ("_constrain" ("_abs" x t) ("fun" 'a 'b))}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   400
Note that {\tt ()} and {\tt (f)} are both illegal.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   401
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   402
The resemblance of Lisp's S-expressions is intentional, but notice the two
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   403
kinds of atomic symbols: $\Constant x$ and $\Variable x$. This distinction
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   404
has some more obscure reasons and you can ignore it about half of the time.
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   405
You should not take the names ``{\tt Constant}'' and ``{\tt Variable}'' too
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   406
literally. In the later translation to terms, $\Variable x$ may become a
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   407
constant, free or bound variable, even a type constructor or class name; the
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   408
actual outcome depends on the context.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   409
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   410
Similarly, you can think of ${\tt (} f~x@1~\ldots~x@n{\tt )}$ as some sort of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   411
application of $f$ to the arguments $x@1, \ldots, x@n$. But the actual kind
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   412
of application (say a type constructor $f$ applied to types $x@1, \ldots,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   413
x@n$) is determined later by context, too.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   414
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   415
Forms like {\tt (("_abs" x $t$) $u$)} are perfectly legal, but asts are not
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   416
higher-order: the {\tt "_abs"} does not yet bind the {\tt x} in any way,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   417
though later at the term level, {\tt ("_abs" x $t$)} will become an {\tt Abs}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   418
node and occurrences of {\tt x} in $t$ will be replaced by {\tt Bound}s. Even
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   419
if non-constant heads of applications may seem unusual, asts should be
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   420
regarded as first-order. Purists may think of ${\tt (} f~x@1~\ldots~x@n{\tt
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   421
)}$ as a first-order application of some invisible $(n+1)$-ary constant.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   422
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   423
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   424
\subsection{Parse trees to asts}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   425
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   426
Asts are generated from parse trees by applying some translation functions,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   427
which are internally called {\bf parse ast translations}\indexbold{parse ast
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   428
translation}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   429
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   430
We can think of the raw output of the parser as trees built up by nesting the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   431
right-hand sides of those productions that were used to derive a word that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   432
matches the input token list. Then parse trees are simply lists of tokens and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   433
sub parse trees, the latter replacing the nonterminals of the productions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   434
Note that we refer here to the actual productions in their internal form as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   435
displayed by {\tt Syntax.print_syntax}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   436
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   437
Ignoring parse ast translations, the mapping
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   438
$ast_of_pt$\index{ast_of_pt@$ast_of_pt$} from parse trees to asts is derived
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   439
from the productions involved as follows:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   440
\begin{itemize}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   441
  \item Name tokens: $ast_of_pt(t) = \Variable s$, where $t$ is an $id$,
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   442
    $var$, $tfree$ or $tvar$ token, and $s$ its associated string.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   443
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   444
  \item Copy productions: $ast_of_pt(\ldots P \ldots) = ast_of_pt(P)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   445
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   446
  \item $0$-ary productions: $ast_of_pt(\ldots \mtt{=>} c) = \Constant c$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   447
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   448
  \item $n$-ary productions: $ast_of_pt(\ldots P@1 \ldots P@n \ldots \mtt{=>}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   449
    c) = \Appl{\Constant c,$ $ast_of_pt(P@1),$ $\ldots,$ $ast_of_pt(P@n)}$,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   450
    where $n \ge 1$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   451
\end{itemize}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   452
where $P, P@1, \ldots, P@n$ denote parse trees or name tokens and ``\dots''
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   453
zero or more delimiters. Delimiters are stripped and do not appear in asts.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   454
The following table presents some simple examples:
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   455
\begin{center}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   456
{\tt\begin{tabular}{ll}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   457
\rm input string    & \rm ast \\\hline
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   458
"f"                 & f \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   459
"'a"                & 'a \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   460
"t == u"            & ("==" t u) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   461
"f(x)"              & ("_appl" f x) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   462
"f(x, y)"           & ("_appl" f ("_args" x y)) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   463
"f(x, y, z)"        & ("_appl" f ("_args" x ("_args" y z))) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   464
"\%x y.\ t"         & ("_lambda" ("_idts" x y) t) \\
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   465
\end{tabular}}  
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   466
\end{center}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   467
Some of these examples illustrate why further translations are desirable in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   468
order to provide some nice standard form of asts before macro expansion takes
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   469
place. Hence the Pure syntax provides predefined parse ast
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   470
translations\index{parse ast translation!of Pure} for ordinary applications,
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   471
type applications, nested abstractions, meta implications and function types.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   472
Their net effect on some representative input strings is shown in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   473
Figure~\ref{fig:parse_ast_tr}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   474
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   475
\begin{figure}[htb]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   476
\begin{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   477
{\tt\begin{tabular}{ll}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   478
\rm string                  & \rm ast \\\hline
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   479
"f(x, y, z)"                & (f x y z) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   480
"'a ty"                     & (ty 'a) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   481
"('a, 'b) ty"               & (ty 'a 'b) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   482
"\%x y z.\ t"               & ("_abs" x ("_abs" y ("_abs" z t))) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   483
"\%x ::\ 'a.\ t"            & ("_abs" ("_constrain" x 'a) t) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   484
"[| P; Q; R |] => S"        & ("==>" P ("==>" Q ("==>" R S))) \\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   485
"['a, 'b, 'c] => 'd"        & ("fun" 'a ("fun" 'b ("fun" 'c 'd)))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   486
\end{tabular}}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   487
\end{center}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   488
\caption{Built-in Parse Ast Translations}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   489
\label{fig:parse_ast_tr}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   490
\end{figure}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   491
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   492
This translation process is guided by names of constant heads of asts. The
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   493
list of constants invoking parse ast translations is shown in the output of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   494
{\tt Syntax.print_syntax} under {\tt parse_ast_translation}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   495
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   496
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   497
\subsection{Asts to terms *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   498
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   499
After the ast has been normalized by the macro expander (see
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   500
\S\ref{sec:macros}), it is transformed into a term with yet another set of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   501
translation functions interspersed: {\bf parse translations}\indexbold{parse
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   502
translation}. The result is a non-typed term\index{term!non-typed} which may
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   503
contain type constraints, that is 2-place applications with head {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   504
"_constrain"}. The second argument of a constraint is a type encoded as term.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   505
Real types will be introduced later during type inference, which is not
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   506
discussed here.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   507
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   508
If we ignore the built-in parse translations of Pure first, then the mapping
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   509
$term_of_ast$\index{term_of_ast@$term_of_ast$} from asts to (non-typed) terms
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   510
is defined by:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   511
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   512
  \item $term_of_ast(\Constant x) = \ttfct{Const} (x, \mtt{dummyT})$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   513
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   514
  \item $term_of_ast(\Variable \mtt{"?}xi\mtt") = \ttfct{Var} ((x, i),
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   515
    \mtt{dummyT})$, where $x$ is the base name and $i$ the index extracted
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   516
    from $xi$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   517
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   518
  \item $term_of_ast(\Variable x) = \ttfct{Free} (x, \mtt{dummyT})$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   519
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   520
  \item $term_of_ast(\Appl{f, x@1, \ldots, x@n}) = term_of_ast(f) ~{\tt\$}~
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   521
    term_of_ast(x@1)$ {\tt\$} \dots\ {\tt\$} $term_of_ast(x@n)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   522
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   523
Note that {\tt Const}, {\tt Var}, {\tt Free} belong to the datatype {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   524
term} and \ttindex{dummyT} is bound to some dummy {\tt typ}, which is ignored
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   525
during type-inference.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   526
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   527
So far the outcome is still a first-order term. {\tt Abs}s and {\tt Bound}s
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   528
are introduced by parse translations associated with {\tt "_abs"} and {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   529
"!!"} (and any other user defined binder).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   530
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   531
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   532
\subsection{Printing of terms *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   533
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   534
When terms are prepared for printing, they are first transformed into asts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   535
via $ast_of_term$\index{ast_of_term@$ast_of_term$} (ignoring {\bf print
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   536
translations}\indexbold{print translation}):
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   537
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   538
  \item $ast_of_term(\ttfct{Const} (x, \tau)) = \Constant x$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   539
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   540
  \item $ast_of_term(\ttfct{Free} (x, \tau)) = constrain (\Variable x,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   541
    \tau)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   542
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   543
  \item $ast_of_term(\ttfct{Var} ((x, i), \tau)) = constrain (\Variable
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   544
    \mtt{"?}xi\mtt", \tau)$, where $\mtt?xi$ is the string representation of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   545
    the {\tt indexname} $(x, i)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   546
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   547
  \item $ast_of_term(\ttfct{Abs} (x, \tau, t)) = \ttfct{Appl}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   548
    \mathopen{\mtt[} \Constant \mtt{"_abs"}, constrain(\Variable x', \tau),$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   549
    $ast_of_term(t') \mathclose{\mtt]}$, where $x'$ is a version of $x$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   550
    renamed away from all names occurring in $t$, and $t'$ the body $t$ with
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   551
    all {\tt Bound}s referring to this {\tt Abs} replaced by $\ttfct{Free}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   552
    (x', \mtt{dummyT})$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   553
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   554
  \item $ast_of_term(\ttfct{Bound} i) = \Variable \mtt{"B.}i\mtt"$. Note that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   555
    the occurrence of loose bound variables is abnormal and should never
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   556
    happen when printing well-typed terms.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   557
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   558
  \item $ast_of_term(f \ttrel{\$} x@1 \ttrel{\$} \ldots \ttrel{\$} x@n) =
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   559
    \ttfct{Appl} \mathopen{\mtt[} ast_of_term(f), ast_of_term(x@1), \ldots,$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   560
    $ast_of_term(x@n) \mathclose{\mtt]}$, where $f$ is not an application.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   561
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   562
  \item $constrain(x, \tau) = x$, if $\tau = \mtt{dummyT}$ \index{*dummyT} or
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   563
    \ttindex{show_types} not set to {\tt true}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   564
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   565
  \item $constrain(x, \tau) = \Appl{\Constant \mtt{"_constrain"}, x, ty}$,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   566
    where $ty$ is the ast encoding of $\tau$. That is: type constructors as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   567
    {\tt Constant}s, type variables as {\tt Variable}s and type applications
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   568
    as {\tt Appl}s with the head type constructor as first element.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   569
    Additionally, if \ttindex{show_sorts} is set to {\tt true}, some type
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   570
    variables are decorated with an ast encoding of their sort.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   571
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   572
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   573
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   574
After an ast has been normalized wrt.\ the print macros, it is transformed
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   575
into the final output string. The built-in {\bf print ast
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   576
translations}\indexbold{print ast translation} are essentially the reverse
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   577
ones of the parse ast translations of Figure~\ref{fig:parse_ast_tr}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   578
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   579
For the actual printing process, the names attached to grammar productions of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   580
the form $\ldots A@{p@1}^1 \ldots A@{p@n}^n \ldots \mtt{=>} c$ play a vital
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   581
role. Whenever an ast with constant head $c$, i.e.\ $\mtt"c\mtt"$ or
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   582
$(\mtt"c\mtt"~ x@1 \ldots x@n)$, is encountered it is printed according to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   583
the production for $c$. This means that first the arguments $x@1$ \dots $x@n$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   584
are printed, then put in parentheses if necessary for reasons of precedence,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   585
and finally joined to a single string, separated by the syntactic sugar of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   586
the production (denoted by ``\dots'' above).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   587
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   588
If an application $(\mtt"c\mtt"~ x@1 \ldots x@m)$ has more arguments than the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   589
corresponding production, it is first split into $((\mtt"c\mtt"~ x@1 \ldots
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   590
x@n) ~ x@{n+1} \ldots x@m)$. Applications with too few arguments or with
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   591
non-constant head or without a corresponding production are printed as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   592
$f(x@1, \ldots, x@l)$ or $(\alpha@1, \ldots, \alpha@l) ty$. A single
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   593
$\Variable x$ is simply printed as $x$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   594
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   595
Note that the system does {\em not\/} insert blanks automatically. They
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   596
should be part of the mixfix declaration the production has been derived
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   597
from, if they are required to separate tokens. Mixfix declarations may also
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   598
contain pretty printing annotations.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   599
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   600
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   601
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   602
\section{Mixfix declarations} \label{sec:mixfix}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   603
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   604
When defining theories, the user usually declares new constants as names
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   605
associated with a type followed by an optional \bfindex{mixfix annotation}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   606
If none of the constants are introduced with mixfix annotations, there is no
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   607
concrete syntax to speak of: terms can only be abstractions or applications
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   608
of the form $f(t@1, \dots, t@n)$. Since this notation quickly becomes
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   609
unreadable, Isabelle supports syntax definitions in the form of unrestricted
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   610
context-free \index{context-free grammar} \index{precedence grammar}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   611
precedence grammars using mixfix annotations.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   612
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   613
Mixfix annotations describe the {\em concrete\/} syntax, a standard
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   614
translation into the abstract syntax and a pretty printing scheme, all in
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   615
one. Isabelle syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   616
mixfix\/} syntax. Each mixfix annotation defines a precedence grammar
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   617
production and optionally associates a constant with it.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   618
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   619
There is a general form of mixfix annotation and some shortcuts for common
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   620
cases like infix operators.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   621
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   622
The general \bfindex{mixfix declaration} as it may occur within the {\tt
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   623
consts} section\index{consts section@{\tt consts} section} of a {\tt .thy}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   624
file, specifies a constant declaration and a grammar production at the same
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   625
time. It has the form {\tt $c$ ::\ "$\tau$" ("$sy$" $ps$ $p$)} and is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   626
interpreted as follows:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   627
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   628
  \item {\tt $c$ ::\ "$\tau$"} is the actual constant declaration without any
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   629
    syntactic significance.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   630
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   631
  \item $sy$ is the right-hand side of the production, specified as a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   632
    symbolic string. In general, $sy$ is of the form $\alpha@0 \_ \alpha@1
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   633
    \dots \alpha@{n-1} \_ \alpha@n$, where each occurrence of \ttindex{_}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   634
    denotes an argument\index{argument!mixfix} position and the strings
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   635
    $\alpha@i$ do not contain (non-escaped) {\tt _}. The $\alpha@i$ may
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   636
    consist of delimiters\index{delimiter}, spaces\index{space (pretty
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   637
      printing)} or \rmindex{pretty printing} annotations (see below).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   638
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   639
  \item $\tau$ specifies the syntactic categories of the arguments on the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   640
    left-hand and of the right-hand side. Arguments may be nonterminals or
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   641
    name tokens. If $sy$ is of the form above, $\tau$ must be a nested
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   642
    function type with at least $n$ argument positions, say $\tau = [\tau@1,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   643
    \dots, \tau@n] \To \tau'$. The syntactic category of argument $i$ is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   644
    derived from type $\tau@i$ (see $root_of_type$ in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   645
    \S\ref{sec:basic_syntax}). The result, i.e.\ the left-hand side of the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   646
    production, is derived from type $\tau'$. Note that the $\tau@i$ and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   647
    $\tau'$ may be function types.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   648
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   649
  \item $c$ is the name of the constant associated with the production. If
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   650
    $c$ is the empty string {\tt ""} then this is a \rmindex{copy
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   651
    production}. Otherwise, parsing an instance of the phrase $sy$ generates
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   652
    the ast {\tt ("$c$" $a@1$ $\ldots$ $a@n$)}, where $a@i$ is the ast
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   653
    generated by parsing the $i$-th argument.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   654
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   655
  \item $ps$ is an optional list of at most $n$ integers, say {\tt [$p@1$,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   656
    $\ldots$, $p@m$]}, where $p@i$ is the minimal \rmindex{precedence}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   657
    required of any phrase that may appear as the $i$-th argument. Missing
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   658
    precedences default to $0$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   659
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   660
  \item $p$ is the \rmindex{precedence} the of this production.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   661
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   662
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   663
Precedences\index{precedence!range of} may range between $0$ and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   664
$max_pri$\indexbold{max_pri@$max_pri$} (= 1000). If you want to ignore them,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   665
the safest way to do so is to use the declaration {\tt $c$ ::\ "$\tau$"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   666
("$sy$")}. The resulting production puts no precedence constraints on any of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   667
its arguments and has maximal precedence itself.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   668
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   669
\begin{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   670
The following theory specification contains a {\tt consts} section that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   671
encodes the precedence grammar of Example~\ref{ex:precedence} as mixfix
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   672
declarations:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   673
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   674
EXP = Pure +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   675
types
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   676
  exp 0
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   677
arities
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   678
  exp :: logic
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   679
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   680
  "0" :: "exp"                ("0" 9)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   681
  "+" :: "[exp, exp] => exp"  ("_ + _" [0, 1] 0)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   682
  "*" :: "[exp, exp] => exp"  ("_ * _" [3, 2] 2)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   683
  "-" :: "exp => exp"         ("- _" [3] 3)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   684
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   685
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   686
Note that the {\tt arities} declaration causes {\tt exp} to be added to the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   687
syntax' roots. If you put the above text into a file {\tt exp.thy} and load
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   688
it via {\tt use_thy "EXP"}, you can run some tests:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   689
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   690
val read_exp = Syntax.test_read (syn_of EXP.thy) "exp";
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   691
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0";
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   692
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   693
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   694
{\out \vdots}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   695
read_exp "0 + - 0 + 0";
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   696
{\out tokens: "0" "+" "-" "0" "+" "0"}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   697
{\out raw: ("+" ("+" "0" ("-" "0")) "0")}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   698
{\out \vdots}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   699
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   700
The output of \ttindex{Syntax.test_read} includes the token list ({\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   701
tokens}) and the raw ast directly derived from the parse tree, ignoring parse
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   702
ast translations. The rest is tracing information provided by the macro
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   703
expander (see \S\ref{sec:macros}).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   704
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   705
Executing {\tt Syntax.print_gram (syn_of EXP.thy)} reveals the actual grammar
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   706
productions derived from the above mixfix declarations (lots of additional
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   707
information deleted):
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   708
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   709
exp = "0"  => "0" (9)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   710
exp = exp[0] "+" exp[1]  => "+" (0)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   711
exp = exp[3] "*" exp[2]  => "*" (2)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   712
exp = "-" exp[3]  => "-" (3)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   713
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   714
\end{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   715
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   716
Let us now have a closer look at the structure of the string $sy$ appearing
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   717
in mixfix annotations. This string specifies a list of parsing and printing
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   718
directives, namely delimiters\index{delimiter},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   719
arguments\index{argument!mixfix}, spaces\index{space (pretty printing)},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   720
blocks of indentation\index{block (pretty printing)} and optional or forced
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   721
line breaks\index{break (pretty printing)}. These are encoded via the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   722
following character sequences:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   723
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   724
\begin{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   725
  \item[~\ttindex_~] An argument\index{argument!mixfix} position.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   726
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   727
  \item[~$d$~] A \rmindex{delimiter}, i.e.\ a non-empty sequence of
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   728
    non-special or escaped characters. Escaping a character\index{escape
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   729
      character} means preceding it with a {\tt '} (quote). Thus you have to
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   730
    write {\tt ''} if you really want a single quote. Delimiters may never
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   731
    contain white space, though.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   732
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   733
  \item[~$s$~] A non-empty sequence of spaces\index{space (pretty printing)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   734
    for printing.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   735
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   736
  \item[~{\ttindex($n$}~] Open a block\index{block (pretty printing)}. $n$ is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   737
    an optional sequence of digits that specifies the amount of indentation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   738
    to be added when a line break occurs within the block. If {\tt(} is not
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   739
    followed by a digit, the indentation defaults to $0$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   740
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   741
  \item[~\ttindex)~] Close a block.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   742
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   743
  \item[~\ttindex{//}~] Force a line break\index{break (pretty printing)}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   744
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   745
  \item[~\ttindex/$s$~] Allow a line break\index{break (pretty printing)}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   746
    Spaces $s$ right after {\tt /} are only printed if the break is not
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   747
    taken.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   748
\end{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   749
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   750
In terms of parsing, arguments are nonterminals or name tokens and
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   751
delimiters are delimiters. The other directives have only significance for
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   752
printing. The \rmindex{pretty printing} mechanisms of Isabelle is essentially
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   753
the one described in \cite{paulson91}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   754
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   755
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   756
\subsection{Infixes}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   757
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   758
Infix\index{infix} operators associating to the left or right can be declared
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   759
conveniently using \ttindex{infixl} or \ttindex{infixr}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   760
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   761
Roughly speaking, the form {\tt $c$ ::\ "$\tau$" (infixl $p$)} abbreviates:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   762
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   763
"op \(c\)" ::\ "\(\tau\)"   ("op \(c\)")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   764
"op \(c\)" ::\ "\(\tau\)"   ("(_ \(c\)/ _)" [\(p\), \(p + 1\)] \(p\))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   765
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   766
and {\tt $c$ ::\ "$\tau$" (infixr $p$)} abbreviates:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   767
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   768
"op \(c\)" ::\ "\(\tau\)"   ("op \(c\)")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   769
"op \(c\)" ::\ "\(\tau\)"   ("(_ \(c\)/ _)" [\(p + 1\), \(p\)] \(p\))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   770
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   771
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   772
Thus, prefixing infixes with \ttindex{op} makes them behave like ordinary
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   773
function symbols. Special characters occurring in $c$ have to be escaped as
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   774
in delimiters. Also note that the expanded forms above would be actually
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   775
illegal at the user level because of duplicate declarations of constants.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   776
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   777
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   778
\subsection{Binders}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   779
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   780
A \bfindex{binder} is a variable-binding construct, such as a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   781
\rmindex{quantifier}. The constant declaration \indexbold{*binder}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   782
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   783
\(c\) ::\ "\(\tau\)"   (binder "\(Q\)" \(p\))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   784
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   785
introduces a binder $c$ of type $\tau$, which must have the form $(\tau@1 \To
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   786
\tau@2) \To \tau@3$. Its concrete syntax is $Q~x.P$. A binder is like a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   787
generalized quantifier where $\tau@1$ is the type of the bound variable $x$,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   788
$\tau@2$ the type of the body $P$, and $\tau@3$ the type of the whole term.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   789
For example $\forall$ can be declared like this:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   790
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   791
All :: "('a => o) => o"   (binder "ALL " 10)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   792
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   793
This allows us to write $\forall x.P$ either as {\tt All(\%$x$.$P$)} or {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   794
ALL $x$.$P$}. When printing terms, Isabelle usually uses the latter form, but
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   795
has to fall back on $\mtt{All}(P)$, if $P$ is not an abstraction.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   796
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   797
Binders $c$ of type $(\sigma \To \tau) \To \tau$ can be nested; then the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   798
internal form $c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   799
\ldots))$ corresponds to external $Q~x@1~x@2 \ldots x@n. P$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   800
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   801
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   802
The general binder declaration
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   803
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   804
\(c\) ::\ "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"   (binder "\(Q\)" \(p\))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   805
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   806
is internally expanded to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   807
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   808
\(c\)   ::\ "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   809
"\(Q\)" ::\ "[idts, \(\tau@2\)] => \(\tau@3\)"   ("(3\(Q\)_./ _)" \(p\))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   810
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   811
with $idts$ being the syntactic category for a list of $id$s optionally
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   812
constrained (see Figure~\ref{fig:pure_gram}). Note that special characters in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   813
$Q$ have to be escaped as in delimiters.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   814
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   815
Additionally, a parse translation\index{parse translation!for binder} for $Q$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   816
and a print translation\index{print translation!for binder} for $c$ is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   817
installed. These perform behind the scenes the translation between the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   818
internal and external forms.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   819
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   820
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   821
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   822
\section{Syntactic translations (macros)} \label{sec:macros}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   823
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   824
So far we have pretended that there is a close enough relationship between
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   825
concrete and abstract syntax to allow an automatic translation from one to
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   826
the other using the constant name supplied with each non-copy production. In
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   827
many cases this scheme is not powerful enough. Some typical examples involve
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   828
variable binding constructs (e.g.\ {\tt ALL x:A.P} vs.\ {\tt Ball(A, \%x.P)}
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   829
or convenient notations for enumerations like finite sets, lists etc.\ (e.g.\
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   830
{\tt [x, y, z]} vs.\ {\tt Cons(x, Cons(y, Cons(z, Nil)))}).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   831
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   832
Isabelle offers such translation facilities at two different levels, namely
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   833
{\bf macros}\indexbold{macro} and {\bf translation functions}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   834
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   835
Macros are specified by first-order rewriting systems that operate on asts.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   836
They are usually easy to read and in most cases not very difficult to write.
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   837
Unfortunately, some more obscure translations cannot be expressed as macros
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   838
and you have to fall back on the more powerful mechanism of translation
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   839
functions written in \ML. These are quite unreadable and hard to write (see
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   840
\S\ref{sec:tr_funs}).
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   841
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   842
\medskip
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   843
Let us now get started with the macro system by a simple example:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   844
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   845
\begin{example}~ \label{ex:set_trans}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   846
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   847
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   848
SET = Pure +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   849
types
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   850
  i, o 0
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   851
arities
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   852
  i, o :: logic
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   853
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   854
  Trueprop      :: "o => prop"              ("_" 5)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   855
  Collect       :: "[i, i => o] => i"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   856
  "{\at}Collect"    :: "[idt, i, o] => i"       ("(1{\ttlbrace}_:_./ _{\ttrbrace})")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   857
  Replace       :: "[i, [i, i] => o] => i"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   858
  "{\at}Replace"    :: "[idt, idt, i, o] => i"  ("(1{\ttlbrace}_./ _:_, _{\ttrbrace})")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   859
  Ball          :: "[i, i => o] => o"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   860
  "{\at}Ball"       :: "[idt, i, o] => o"       ("(3ALL _:_./ _)" 10)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   861
translations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   862
  "{\ttlbrace}x:A. P{\ttrbrace}"    == "Collect(A, %x. P)"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   863
  "{\ttlbrace}y. x:A, Q{\ttrbrace}" == "Replace(A, %x y. Q)"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   864
  "ALL x:A. P"  == "Ball(A, %x. P)"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   865
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   866
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   867
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   868
This and the following theories are complete working examples, though they
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   869
are fragmentary as they contain merely syntax. The full theory definition can
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   870
be found in {\tt ZF/zf.thy}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   871
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   872
{\tt SET} defines constants for set comprehension ({\tt Collect}),
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   873
replacement ({\tt Replace}) and bounded universal quantification ({\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   874
Ball}). Without additional syntax you would have to express $\forall x \in A.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   875
P(x)$ as {\tt Ball(A, P)}. Since this is quite awkward, we define additional
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   876
constants with appropriate concrete syntax. These constants are decorated
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   877
with {\tt\at} to stress their pure syntactic purpose; they should never occur
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   878
within the final well-typed terms. Another consequence is that the user
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   879
cannot refer to such names directly, since they are not legal identifiers.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   880
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   881
The translations cause the replacement of external forms by internal forms
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   882
after parsing, and vice versa before printing of terms.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   883
\end{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   884
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   885
This is only a very simple but common instance of a more powerful mechanism.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   886
As a specification of what is to be translated, it should be comprehensible
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   887
without further explanations. But there are also some snags and other
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   888
peculiarities that are typical for macro systems in general. The purpose of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   889
this section is to explain how Isabelle's macro system really works.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   890
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   891
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   892
\subsection{Specifying macros}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   893
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   894
Basically macros are rewrite rules on asts. But unlike other macro systems of
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   895
various programming languages, Isabelle's macros work in both directions.
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   896
Therefore a syntax contains two lists of rules: one for parsing and one for
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   897
printing.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   898
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   899
The {\tt translations} section\index{translations section@{\tt translations}
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   900
section} consists of a list of rule specifications of the form:
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   901
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   902
\begin{center}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   903
{\tt $[$ ($root$) $]$ $string$ $[$ => $|$ <= $|$ == $]$ $[$ ($root$) $]$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   904
$string$}.
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   905
\end{center}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   906
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   907
This specifies a \rmindex{parse rule} ({\tt =>}) a \rmindex{print rule} ({\tt
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   908
  <=}) or both ({\tt ==}). The two $string$s preceded by optional
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   909
parenthesized $root$s are the left-hand and right-hand side of the rule in
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
   910
user-defined syntax.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   911
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   912
Rules are internalized wrt.\ an intermediate signature that is obtained from
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   913
the parent theories' ones by adding all material of all sections preceding
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   914
{\tt translations} in the {\tt .thy} file. Especially, new syntax defined in
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   915
{\tt consts} is already effective.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   916
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   917
Then part of the process that transforms input strings into terms is applied:
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   918
lexing, parsing and parse ast translations (see \S\ref{sec:asts}). Macros
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   919
specified in the parents are {\em not\/} expanded. Also note that the lexer
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   920
runs in a different mode that additionally accepts identifiers of the form
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   921
$\_~letter~quasiletter^*$ (like {\tt _idt}, {\tt _K}). The syntactic category
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   922
to parse is specified by $root$, which defaults to {\tt logic}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   923
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   924
Finally, Isabelle tries to guess which atoms of the resulting ast of the rule
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   925
should be treated as constants during matching (see below). These names are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   926
extracted from all class, type and constant declarations made so far.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   927
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   928
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   929
The result are two lists of translation rules in internal form, that is pairs
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   930
of asts. They can be viewed using {\tt Syntax.print_syntax} (sections
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   931
\ttindex{parse_rules} and \ttindex{print_rules}). For {\tt SET} of
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   932
Example~\ref{ex:set_trans} these are:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   933
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   934
parse_rules:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   935
  ("{\at}Collect" x A P)  ->  ("Collect" A ("_abs" x P))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   936
  ("{\at}Replace" y x A Q)  ->  ("Replace" A ("_abs" x ("_abs" y Q)))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   937
  ("{\at}Ball" x A P)  ->  ("Ball" A ("_abs" x P))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   938
print_rules:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   939
  ("Collect" A ("_abs" x P))  ->  ("{\at}Collect" x A P)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   940
  ("Replace" A ("_abs" x ("_abs" y Q)))  ->  ("{\at}Replace" y x A Q)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   941
  ("Ball" A ("_abs" x P))  ->  ("{\at}Ball" x A P)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   942
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   943
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   944
Note that in this notation all rules are oriented left to right. In the {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   945
translations} section, which has been split into two parts, print rules
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   946
appeared right to left.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   947
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   948
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   949
Be careful not to choose names for variables in rules that are actually
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   950
treated as constant. If in doubt, check the rules in their internal form or
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   951
the section labeled {\tt consts} in the output of {\tt Syntax.print_syntax}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   952
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   953
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   954
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   955
\subsection{Applying rules}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   956
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   957
In the course of parsing and printing terms, asts are generated as an
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   958
intermediate form as pictured in Figure~\ref{fig:parse_print}. These asts are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   959
normalized wrt.\ the given lists of translation rules in a uniform manner. As
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
   960
stated earlier, asts are supposed to be first-order `terms'. The rewriting
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   961
systems derived from {\tt translations} sections essentially resemble
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   962
traditional first-order term rewriting systems. We first examine how a single
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   963
rule is applied.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   964
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   965
Let $t$ be the ast to be normalized and $(l, r)$ some translation rule. A
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   966
subast $u$ of $t$ is called {\bf redex}\indexbold{redex (ast)} (reducible
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   967
expression), if it is an instance of $l$. In this case $l$ is said to {\bf
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   968
match}\indexbold{match (ast)} $u$. A redex matched by $l$ may be replaced by
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   969
the corresponding instance of $r$, thus {\bf rewriting}\index{rewrite (ast)}
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   970
the ast $t$.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   971
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   972
Matching requires some notion of {\bf place-holders}\indexbold{place-holder
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   973
(ast)} that may occur in rule patterns but not in ordinary asts, which are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   974
considered ground. Here we simply use {\tt Variable}s for this purpose.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   975
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   976
More formally, the matching of $u$ by $l$ is performed as follows (the rule
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   977
pattern is the second argument): \index{match (ast)@$match$ (ast)}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   978
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   979
  \item $match(\Constant x, \Constant x) = \mbox{OK}$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   980
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   981
  \item $match(\Variable x, \Constant x) = \mbox{OK}$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   982
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   983
  \item $match(u, \Variable x) = \mbox{OK, bind}~x~\mbox{to}~u$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   984
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   985
  \item $match(\Appl{u@1, \ldots, u@n}, \Appl{l@1, \ldots, l@n}) = match(u@1,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   986
    l@1), \ldots, match(u@n, l@n)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   987
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   988
  \item $match(u, l) = \mbox{FAIL}$ in any other case.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   989
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   990
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   991
This means that a {\tt Constant} pattern matches any atomic asts of the same
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   992
name, while a {\tt Variable} matches any ast. If successful, $match$ yields a
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
   993
substitution $\sigma$ that is applied to $r$, generating the appropriate
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   994
instance that replaces $u$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   995
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   996
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   997
In order to make things simple and fast, ast rewrite rules $(l, r)$ are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   998
restricted by the following conditions:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
   999
\begin{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1000
  \item Rules have to be left linear, i.e.\ $l$ must not contain any {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1001
    Variable} more than once.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1002
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1003
  \item Rules must have constant heads, i.e.\ $l = \mtt"c\mtt"$ or $l =
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1004
    (\mtt"c\mtt" ~ x@1 \ldots x@n)$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1005
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1006
  \item The set of variables contained in $r$ has to be a subset of those of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1007
    $l$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1008
\end{itemize}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1009
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1010
\medskip
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1011
Having first-order matching in mind, the second case of $match$ may look a
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1012
bit odd. But this is exactly the place, where {\tt Variable}s of non-rule
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1013
asts behave like {\tt Constant}s. The deeper meaning of this is related with
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1014
asts being very `primitive' in some sense, ignorant of the underlying
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1015
`semantics', not far removed from parse trees. At this level it is not yet
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1016
known, which $id$s will become constants, bounds, frees, types or classes. As
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1017
$ast_of_pt$ (see \S\ref{sec:asts}) shows, former parse tree heads appear in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1018
asts as {\tt Constant}s, while $id$s, $var$s, $tfree$s and $tvar$s become
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1019
{\tt Variable}s.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1020
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1021
This is at variance with asts generated from terms before printing (see
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1022
$ast_of_term$ in \S\ref{sec:asts}), where all constants and type constructors
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1023
become {\tt Constant}s.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1024
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1025
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1026
This means asts may contain quite a messy mixture of {\tt Variable}s and {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1027
Constant}s, which is insignificant at macro level because $match$ treats them
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1028
alike.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1029
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1030
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1031
Because of this behaviour, different kinds of atoms with the same name are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1032
indistinguishable, which may make some rules prone to misbehaviour. Regard
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1033
the following fragmentary example:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1034
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1035
types
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1036
  Nil 0
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1037
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1038
  Nil     :: "'a list"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1039
  "[]"    :: "'a list"    ("[]")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1040
translations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1041
  "[]"    == "Nil"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1042
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1043
Then the term {\tt Nil} will be printed as {\tt []}, just as expected. What
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1044
happens with \verb|%Nil.t| or {\tt x::Nil} is left as an exercise.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1045
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1046
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1047
\subsection{Rewriting strategy}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1048
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1049
When normalizing an ast by repeatedly applying translation rules until no
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1050
more rule is applicable, there are in each step two choices: which rule to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1051
apply next, and which redex to reduce.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1052
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1053
We could assume that the user always supplies terminating and confluent
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1054
rewriting systems, but this would often complicate things unnecessarily.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1055
Therefore, we reveal part of the actual rewriting strategy: The normalizer
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1056
always applies the first matching rule reducing an unspecified redex chosen
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1057
first.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1058
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1059
Thereby, `first rule' is roughly speaking meant wrt.\ the appearance of the
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1060
rules in the {\tt translations} sections. But this is more tricky than it
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1061
seems: If a given theory is {\em extended}, new rules are simply appended to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1062
the end. But if theories are {\em merged}, it is not clear which list of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1063
rules has priority over the other. In fact the merge order is left
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1064
unspecified. This shouldn't cause any problems in practice, since
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1065
translations of different theories usually do not overlap. {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1066
Syntax.print_syntax} shows the rules in their internal order.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1067
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1068
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1069
You can watch the normalization of asts during parsing and printing by
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1070
setting \ttindex{Syntax.trace_norm_ast} to {\tt true}. An alternative is the
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1071
use of \ttindex{Syntax.test_read}, which is always in trace mode. The
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1072
information displayed when tracing\index{tracing (ast)} includes: the ast
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1073
before normalization ({\tt pre}), redexes with results ({\tt rewrote}), the
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1074
normal form finally reached ({\tt post}) and some statistics ({\tt
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1075
normalize}). If tracing is off, \ttindex{Syntax.stat_norm_ast} can be set to
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1076
{\tt true} in order to enable printing of the normal form and statistics
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1077
only.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1078
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1079
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1080
\subsection{More examples}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1081
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1082
Let us first reconsider Example~\ref{ex:set_trans}, which is concerned with
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1083
variable binding constructs.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1084
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1085
There is a minor disadvantage over an implementation via translation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1086
functions (as done for binders):
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1087
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1088
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1089
If \ttindex{eta_contract} is set to {\tt true}, terms will be
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1090
$\eta$-contracted {\em before\/} the ast rewriter sees them. Thus some
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1091
abstraction nodes needed for print rules to match may get lost. E.g.\
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1092
\verb|Ball(A, %x. P(x))| is contracted to {\tt Ball(A, P)}, the print rule is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1093
no longer applicable and the output will be {\tt Ball(A, P)}. Note that
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1094
$\eta$-expansion via macros is {\em not\/} possible.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1095
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1096
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1097
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1098
Another common trap are meta constraints. If \ttindex{show_types} is set to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1099
{\tt true}, bound variables will be decorated by their meta types at the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1100
binding place (but not at occurrences in the body). E.g.\ matching with
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1101
\verb|Collect(A, %x. P)| binds {\tt x} to something like {\tt ("_constrain" y
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1102
"i")} rather than only {\tt y}. Ast rewriting will cause the constraint to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1103
appear in the external form, say \verb|{y::i:A::i. P::o}|. Therefore your
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1104
syntax should be ready for such constraints to be re-read. This is the case
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1105
in our example, because of the category {\tt idt} of the first argument.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1106
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1107
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1108
Choosing {\tt id} instead of {\tt idt} is a very common error, especially
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1109
since it appears in former versions of most of Isabelle's object-logics.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1110
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1111
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1112
\begin{example} \label{ex:finset_trans}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1113
This example demonstrates the use of recursive macros to implement a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1114
convenient notation for finite sets.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1115
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1116
FINSET = SET +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1117
types
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1118
  is 0
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1119
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1120
  ""            :: "i => is"                ("_")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1121
  "{\at}Enum"       :: "[i, is] => is"          ("_,/ _")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1122
  empty         :: "i"                      ("{\ttlbrace}{\ttrbrace}")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1123
  insert        :: "[i, i] => i"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1124
  "{\at}Finset"     :: "is => i"                ("{\ttlbrace}(_){\ttrbrace}")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1125
translations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1126
  "{\ttlbrace}x, xs{\ttrbrace}"     == "insert(x, {\ttlbrace}xs{\ttrbrace})"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1127
  "{\ttlbrace}x{\ttrbrace}"         == "insert(x, {\ttlbrace}{\ttrbrace})"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1128
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1129
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1130
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1131
Finite sets are internally built up by {\tt empty} and {\tt insert}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1132
Externally we would like to see \verb|{x, y, z}| rather than {\tt insert(x,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1133
insert(y, insert(z, empty)))}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1134
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1135
First we define the generic syntactic category {\tt is} for one or more
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1136
objects of type {\tt i} separated by commas (including breaks for pretty
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1137
printing). The category has to be declared as a 0-place type constructor, but
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1138
without {\tt arities} declaration. Hence {\tt is} is not a logical type, no
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1139
default productions will be added, and we can cook our own syntax for {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1140
is} (first two lines of {\tt consts} section). If we had needed generic
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1141
enumerations of type $\alpha$ (i.e.\ {\tt logic}), we could have used the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1142
predefined category \ttindex{args} and skipped this part altogether.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1143
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1144
Next follows {\tt empty}, which is already equipped with its syntax
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1145
\verb|{}|, and {\tt insert} without concrete syntax. The syntactic constant
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1146
{\tt\at Finset} provides concrete syntax for enumerations of {\tt i} enclosed
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1147
in curly braces. Remember that a pair of parentheses specifies a block of
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1148
indentation for pretty printing. The category {\tt is} can later be reused
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1149
for other enumerations like lists or tuples.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1150
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1151
The translations may look a bit odd at first sight, but rules can only be
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1152
fully understood in their internal forms, which are:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1153
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1154
parse_rules:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1155
  ("{\at}Finset" ("{\at}Enum" x xs))  ->  ("insert" x ("{\at}Finset" xs))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1156
  ("{\at}Finset" x)  ->  ("insert" x "empty")
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1157
print_rules:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1158
  ("insert" x ("{\at}Finset" xs))  ->  ("{\at}Finset" ("{\at}Enum" x xs))
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1159
  ("insert" x "empty")  ->  ("{\at}Finset" x)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1160
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1161
This shows that \verb|{x, xs}| indeed matches any set enumeration of at least
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1162
two elements, binding the first to {\tt x} and the rest to {\tt xs}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1163
Likewise, \verb|{xs}| and \verb|{x}| represent any set enumeration. Note that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1164
the parse rules only work in this order.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1165
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1166
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1167
Some rules are prone to misbehaviour, as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1168
\verb|%empty insert. insert(x, empty)| shows, which is printed as
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1169
\verb|%empty insert. {x}|. This problem arises, because the ast rewriter
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1170
cannot discern constants, frees, bounds etc.\ and looks only for names of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1171
atoms.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1172
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1173
Thus the names of {\tt Constant}s occurring in the (internal) left-hand side
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1174
of translation rules should be regarded as `reserved keywords'. It is good
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1175
practice to choose non-identifiers here like {\tt\at Finset} or sufficiently
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1176
long and strange names.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1177
\end{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1178
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1179
\begin{example} \label{ex:prod_trans}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1180
One of the well-formedness conditions for ast rewrite rules stated earlier
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1181
implies that you can never introduce new {\tt Variable}s on the right-hand
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1182
side. Something like \verb|"K(B)" => "%x. B"| is illegal and could cause
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1183
variable capturing, if it were allowed. In such cases you usually have to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1184
fall back on translation functions. But there is a trick that makes things
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1185
quite readable in some cases: {\em calling parse translations by parse
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1186
rules}. This is demonstrated here.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1187
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1188
PROD = FINSET +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1189
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1190
  Pi            :: "[i, i => i] => i"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1191
  "{\at}PROD"       :: "[idt, i, i] => i"     ("(3PROD _:_./ _)" 10)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1192
  "{\at}->"         :: "[i, i] => i"          ("(_ ->/ _)" [51, 50] 50)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1193
translations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1194
  "PROD x:A. B" => "Pi(A, %x. B)"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1195
  "A -> B"      => "Pi(A, _K(B))"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1196
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1197
ML
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1198
  val print_translation = [("Pi", dependent_tr' ("{\at}PROD", "{\at}->"))];
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1199
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1200
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1201
{\tt Pi} is an internal constant for constructing dependent products. Two
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1202
external forms exist: {\tt PROD x:A.B}, the general case, and {\tt A -> B},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1203
an abbreviation for \verb|Pi(A, %x.B)| with {\tt B} not actually depending on
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1204
{\tt x}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1205
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1206
Now the second parse rule is where the trick comes in: {\tt _K(B)} is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1207
introduced during ast rewriting, which later becomes \verb|%x.B| due to a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1208
parse translation associated with \ttindex{_K}. Note that a leading {\tt _}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1209
in $id$s is allowed in translation rules, but not in ordinary terms. This
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1210
special behaviour of the lexer is very useful for `forging' asts containing
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1211
names that are not directly accessible normally.
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1212
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1213
Unfortunately, there is no such trick for printing, so we have to add a {\tt
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1214
ML} section for the print translation \ttindex{dependent_tr'}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1215
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1216
The parse translation for {\tt _K} is already installed in Pure, and {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1217
dependent_tr'} is exported by the syntax module for public use. See
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1218
\S\ref{sec:tr_funs} for more of the arcane lore of translation functions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1219
\end{example}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1220
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1221
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1222
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1223
\section{Translation functions *} \label{sec:tr_funs}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1224
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1225
This section is about the remaining translation mechanism which enables the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1226
designer of theories to do almost everything with terms or asts during the
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1227
parsing or printing process, by writing \ML-functions. The logic \LK\ is a
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1228
good example of a quite sophisticated use of this to transform between
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1229
internal and external representations of associative sequences. The high
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1230
level macro system described in \S\ref{sec:macros} fails here completely.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1231
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1232
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1233
A full understanding of the matters presented here requires some familiarity
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1234
with Isabelle's internals, especially the datatypes {\tt term}, {\tt typ},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1235
{\tt Syntax.ast} and the encodings of types and terms as such at the various
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1236
stages of the parsing or printing process. You probably do not really want to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1237
use translation functions at all!
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1238
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1239
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1240
As already mentioned in \S\ref{sec:asts}, there are four kinds of translation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1241
functions. All such functions are associated with a name which specifies an
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1242
ast's or term's head invoking that function. Such names can be (logical or
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1243
syntactic) constants or type constructors.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1244
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1245
{\tt Syntax.print_syntax} displays the sets of names associated with the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1246
translation functions of a {\tt Syntax.syntax} under
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1247
\ttindex{parse_ast_translation}, \ttindex{parse_translation},
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1248
\ttindex{print_translation} and \ttindex{print_ast_translation}. The user can
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1249
add new ones via the {\tt ML} section\index{ML section@{\tt ML} section} of a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1250
{\tt .thy} file. But there may never be more than one function of the same
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1251
kind per name.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1252
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1253
\begin{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1254
Conceptually, the {\tt ML} section should appear between {\tt consts} and
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1255
{\tt translations}, i.e.\ newly installed translation functions are already
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1256
effective when macros and logical rules are parsed. {\tt ML} has to be the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1257
last section because the {\tt .thy} file parser is unable to detect the end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1258
of \ML\ code in another way than by end-of-file.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1259
\end{warn}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1260
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1261
All text of the {\tt ML} section is simply copied verbatim into the \ML\ file
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1262
generated from a {\tt .thy} file. Definitions made here by the user become
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1263
components of a \ML\ structure of the same name as the theory to be created.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1264
Therefore local things should be declared within {\tt local}. The following
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1265
special \ML\ values, which are all optional, serve as the interface for the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1266
installation of user defined translation functions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1267
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1268
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1269
val parse_ast_translation: (string * (ast list -> ast)) list
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1270
val parse_translation: (string * (term list -> term)) list
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1271
val print_translation: (string * (term list -> term)) list
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1272
val print_ast_translation: (string * (ast list -> ast)) list
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1273
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1274
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1275
The basic idea behind all four kinds of functions is relatively simple (see
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1276
also Figure~\ref{fig:parse_print}): Whenever --- during the transformations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1277
between parse trees, asts and terms --- a combination of the form
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1278
$(\mtt"c\mtt"~x@1 \ldots x@n)$ is encountered, and a translation function $f$
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1279
of appropriate kind exists for $c$, the result will be $f \mtt[ x@1, \ldots,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1280
x@n \mtt]$. Thereby, $x@1, \ldots, x@n$ (with $n \ge 0$) are asts for ast
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1281
translations and terms for term translations. A `combination' at ast level is
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1282
of the form $\Constant c$ or $\Appl{\Constant c, x@1, \ldots, x@n}$, and at
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1283
term level $\ttfct{Const} (c, \tau)$ or $\ttfct{Const} (c, \tau) \ttrel{\$}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1284
x@1 \ttrel{\$} \dots \ttrel{\$} x@n$.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1285
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1286
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1287
Translation functions at ast level differ from those at term level only in
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1288
the same way, as asts and terms differ. Terms, being more complex and more
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1289
specific, allow more sophisticated transformations (typically involving
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1290
abstractions and bound variables).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1291
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1292
On the other hand, {\em parse\/} (ast) translations differ from {\em print\/}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1293
(ast) translations more fundamentally:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1294
\begin{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1295
  \item[Parse (ast) translations] are applied bottom-up, i.e.\ the arguments
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1296
    supplied ($x@1, \ldots, x@n$ above) are already in translated form.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1297
    Additionally, they may not fail, exceptions are re-raised after printing
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1298
    an error message.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1299
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1300
  \item[Print (ast) translations] are applied top-down, i.e.\ supplied with
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1301
    arguments that are partly still in internal form. The result is again fed
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1302
    into the translation machinery as a whole. Therefore a print (ast)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1303
    translation should not introduce as head a constant of the same name that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1304
    invoked it in the first place. Alternatively, exception \ttindex{Match}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1305
    may be raised, indicating failure of translation.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1306
\end{description}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1307
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1308
Another difference between the parsing and the printing process is, which
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1309
atoms are {\tt Constant}s or {\tt Const}s, i.e.\ able to invoke translation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1310
functions.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1311
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1312
For parse ast translations only former parse tree heads are {\tt Constant}s
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1313
(see also $ast_of_pt$ in \S\ref{sec:asts}). These and additionally introduced
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1314
{\tt Constant}s (e.g.\ by macros), become {\tt Const}s for parse translations
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1315
(see also $term_of_ast$ in \S\ref{sec:asts}).
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1316
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1317
The situation is slightly different, when terms are prepared for printing,
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1318
since the role of atoms is known. Initially, all logical constants and type
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1319
constructors may invoke print translations. New constants may be introduced
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1320
by these or by macros, able to invoke parse ast translations.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1321
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1322
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1323
\subsection{A simple example *}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1324
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1325
Presenting a simple and useful example of translation functions is not that
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1326
easy, since the macro system is sufficient for most simple applications. By
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1327
convention, translation functions always have names ending with {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1328
_ast_tr}, {\tt _tr}, {\tt _tr'} or {\tt _ast_tr'}. You may look for such
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1329
names in the sources of Pure Isabelle for more examples.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1330
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1331
\begin{example} \label{ex:tr_funs}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1332
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1333
We continue Example~\ref{ex:prod_trans} by presenting the \ML\ sources of the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1334
parse translation for \ttindex{_K} and the print translation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1335
\ttindex{dependent_tr'}:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1336
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1337
(* nondependent abstraction *)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1338
fun k_tr (*"_K"*) [t] = Abs ("x", dummyT, incr_boundvars 1 t)
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1339
  | k_tr (*"_K"*) ts = raise TERM("k_tr",ts);
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1340
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1341
(* dependent / nondependent quantifiers *)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1342
fun dependent_tr' (q, r) (A :: Abs (x, T, B) :: ts) =
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1343
      if 0 mem (loose_bnos B) then
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1344
        let val (x', B') = variant_abs (x, dummyT, B);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1345
        in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1346
        end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1347
      else list_comb (Const (r, dummyT) $ A $ B, ts)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1348
  | dependent_tr' _ _ = raise Match;
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1349
\end{ttbox}
188
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1350
This text is taken from the Pure sources. Ordinary user translations
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1351
would typically appear within the {\tt ML} section of the {\tt .thy} file. To
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1352
link {\tt k_tr} into the parser requires the definition
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1353
\begin{ttbox}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1354
val parse_translation = [("_K",k_tr)];
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1355
\end{ttbox}
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1356
For {\tt k_tr} this is not necessary, as it is built-in, but user-defined
6be0856cdf49 last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents: 142
diff changeset
  1357
translation functions are added in exactly the same way.  \medskip
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1358
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1359
If {\tt k_tr} is called with exactly one argument $t$, it creates a new {\tt
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1360
Abs} node with a body derived from $t$: loose {\tt Bound}s, i.e.\ those
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1361
referring to outer {\tt Abs}s, are incremented using
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1362
\ttindex{incr_boundvars}. This and many other basic term manipulation
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1363
functions are defined in {\tt Pure/term.ML}, the comments there being in most
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1364
cases the only documentation.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1365
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1366
Since terms fed into parse translations are not yet typed, the type of the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1367
bound variable in the new {\tt Abs} is simply {\tt dummyT}.
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1368
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1369
\medskip
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1370
The argument $(q, r)$ for {\tt dependent_tr'} is supplied already during the
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1371
installation within an {\tt ML} section. This yields a parse translation that
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1372
transforms something like $c(A, \mtt{Abs}(x, T, B), t@1, \ldots, t@n)$ into
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1373
$q(x', A, B', t@1, \ldots, t@n)$ or $r(A, B, t@1, \ldots, t@n)$. The latter
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1374
form, if $B$ does not actually depend on $x$. This is checked using
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1375
\ttindex{loose_bnos}, yet another function of {\tt Pure/term.ML}. Note that
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1376
$x'$ is a version of $x$ renamed away from all names in $B$, and $B'$ the
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1377
body $B$ with {\tt Bound}s referring to our {\tt Abs} node replaced by
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1378
$\ttfct{Free} (x', \mtt{dummyT})$.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1379
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1380
We have to be more careful with types here. While types of {\tt Const}s are
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1381
completely ignored, type constraints may be printed for some {\tt Free}s and
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1382
{\tt Var}s (if \ttindex{show_types} is set to {\tt true}). Variables of type
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1383
\ttindex{dummyT} are never printed with constraint, though. Thus, a
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1384
constraint of $x'$ may only appear at its binding place, since {\tt Free}s of
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1385
$B'$ replacing the appropriate {\tt Bound}s of $B$ via \ttindex{variant_abs}
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1386
have all type {\tt dummyT}. \end{example}
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1387
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1388
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1389
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1390
\section{Example: some minimal logics} \label{sec:min_logics}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1391
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1392
This concluding section presents some examples that are very simple from a
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1393
syntactic point of view. They should rather demonstrate how to define new
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1394
object-logics from scratch. In particular we need to say how an object-logic
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1395
syntax is hooked up to the meta-logic. Since all theorems must conform to the
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1396
syntax for $prop$ (see Figure~\ref{fig:pure_gram}), that syntax has to be
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1397
extended with the object-level syntax. Assume that the syntax of your
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1398
object-logic defines a category $o$ of formulae. These formulae can now
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1399
appear in axioms and theorems wherever $prop$ does if you add the production
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1400
\[ prop ~=~ o. \]
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1401
More precisely, you need a coercion from formulae to propositions:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1402
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1403
Base = Pure +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1404
types
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1405
  o 0
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1406
arities
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1407
  o :: logic
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1408
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1409
  Trueprop :: "o => prop"   ("_" 5)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1410
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1411
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1412
The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1413
coercion function. Assuming this definition resides in a file {\tt base.thy},
135
493308514ea8 various minor changes;
wenzelm
parents: 108
diff changeset
  1414
you have to load it with the command {\tt use_thy "Base"}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1415
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1416
One of the simplest nontrivial logics is {\em minimal logic\/} of
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1417
implication. Its definition in Isabelle needs no advanced features but
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1418
illustrates the overall mechanism quite nicely:
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1419
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1420
Hilbert = Base +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1421
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1422
  "-->" :: "[o, o] => o"   (infixr 10)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1423
rules
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1424
  K     "P --> Q --> P"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1425
  S     "(P --> Q --> R) --> (P --> Q) --> P --> R"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1426
  MP    "[| P --> Q; P |] ==> Q"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1427
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1428
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1429
After loading this definition you can start to prove theorems in this logic:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1430
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1431
goal Hilbert.thy "P --> P";
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1432
{\out Level 0}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1433
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1434
{\out  1.  P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1435
by (resolve_tac [Hilbert.MP] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1436
{\out Level 1}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1437
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1438
{\out  1.  ?P --> P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1439
{\out  2.  ?P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1440
by (resolve_tac [Hilbert.MP] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1441
{\out Level 2}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1442
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1443
{\out  1.  ?P1 --> ?P --> P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1444
{\out  2.  ?P1}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1445
{\out  3.  ?P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1446
by (resolve_tac [Hilbert.S] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1447
{\out Level 3}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1448
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1449
{\out  1.  P --> ?Q2 --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1450
{\out  2.  P --> ?Q2}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1451
by (resolve_tac [Hilbert.K] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1452
{\out Level 4}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1453
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1454
{\out  1.  P --> ?Q2}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1455
by (resolve_tac [Hilbert.K] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1456
{\out Level 5}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1457
{\out P --> P}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1458
{\out No subgoals!}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1459
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1460
As you can see, this Hilbert-style formulation of minimal logic is easy to
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1461
define but difficult to use. The following natural deduction formulation is
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1462
far preferable:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1463
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1464
MinI = Base +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1465
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1466
  "-->" :: "[o, o] => o"   (infixr 10)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1467
rules
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1468
  impI  "(P ==> Q) ==> P --> Q"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1469
  impE  "[| P --> Q; P |] ==> Q"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1470
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1471
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1472
Note, however, that although the two systems are equivalent, this fact cannot
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1473
be proved within Isabelle: {\tt S} and {\tt K} can be derived in {\tt MinI}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1474
(exercise!), but {\tt impI} cannot be derived in {\tt Hilbert}. The reason is
108
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1475
that {\tt impI} is only an {\em admissible\/} rule in {\tt Hilbert},
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1476
something that can only be shown by induction over all possible proofs in
e332c5bf9e1f replaced by current version;
wenzelm
parents: 104
diff changeset
  1477
{\tt Hilbert}.
104
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1478
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1479
It is a very simple matter to extend minimal logic with falsity:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1480
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1481
MinIF = MinI +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1482
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1483
  False :: "o"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1484
rules
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1485
  FalseE "False ==> P"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1486
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1487
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1488
On the other hand, we may wish to introduce conjunction only:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1489
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1490
MinC = Base +
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1491
consts
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1492
  "&" :: "[o, o] => o"   (infixr 30)
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1493
rules
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1494
  conjI  "[| P; Q |] ==> P & Q"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1495
  conjE1 "P & Q ==> P"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1496
  conjE2 "P & Q ==> Q"
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1497
end
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1498
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1499
And if we want to have all three connectives together, we define:
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1500
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1501
MinIFC = MinIF + MinC
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1502
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1503
Now we can prove mixed theorems like
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1504
\begin{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1505
goal MinIFC.thy "P & False --> Q";
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1506
by (resolve_tac [MinI.impI] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1507
by (dresolve_tac [MinC.conjE2] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1508
by (eresolve_tac [MinIF.FalseE] 1);
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1509
\end{ttbox}
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1510
Try this as an exercise!
d8205bb279a7 Initial revision
lcp
parents:
diff changeset
  1511