doc-src/Ref/defining.tex
changeset 864 d63b111b917a
parent 711 bb868a30e66f
child 866 2d3d020eef11
equal deleted inserted replaced
863:67692db44c70 864:d63b111b917a
     4 their concrete syntax.  While Isabelle can be regarded as a theorem prover
     4 their concrete syntax.  While Isabelle can be regarded as a theorem prover
     5 for set theory, higher-order logic or the sequent calculus, its
     5 for set theory, higher-order logic or the sequent calculus, its
     6 distinguishing feature is support for the definition of new logics.
     6 distinguishing feature is support for the definition of new logics.
     7 
     7 
     8 Isabelle logics are hierarchies of theories, which are described and
     8 Isabelle logics are hierarchies of theories, which are described and
     9 illustrated in 
     9 illustrated in
    10 \iflabelundefined{sec:defining-theories}{{\em Introduction to Isabelle}}%
    10 \iflabelundefined{sec:defining-theories}{{\em Introduction to Isabelle}}%
    11 {\S\ref{sec:defining-theories}}.  That material, together with the theory
    11 {\S\ref{sec:defining-theories}}.  That material, together with the theory
    12 files provided in the examples directories, should suffice for all simple
    12 files provided in the examples directories, should suffice for all simple
    13 applications.  The easiest way to define a new theory is by modifying a
    13 applications.  The easiest way to define a new theory is by modifying a
    14 copy of an existing theory.
    14 copy of an existing theory.
    17 pretty printing.  The extended examples in \S\ref{sec:min_logics}
    17 pretty printing.  The extended examples in \S\ref{sec:min_logics}
    18 demonstrate the logical aspects of the definition of theories.
    18 demonstrate the logical aspects of the definition of theories.
    19 
    19 
    20 
    20 
    21 \section{Priority grammars} \label{sec:priority_grammars}
    21 \section{Priority grammars} \label{sec:priority_grammars}
    22 \index{priority grammars|(} 
    22 \index{priority grammars|(}
    23 
    23 
    24 A context-free grammar contains a set of {\bf nonterminal symbols}, a set of
    24 A context-free grammar contains a set of {\bf nonterminal symbols}, a set of
    25 {\bf terminal symbols} and a set of {\bf productions}\index{productions}.
    25 {\bf terminal symbols} and a set of {\bf productions}\index{productions}.
    26 Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and
    26 Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and
    27 $\gamma$ is a string of terminals and nonterminals.  One designated
    27 $\gamma$ is a string of terminals and nonterminals.  One designated
    37 introducing new nonterminals and productions.
    37 introducing new nonterminals and productions.
    38 
    38 
    39 Formally, a set of context free productions $G$ induces a derivation
    39 Formally, a set of context free productions $G$ induces a derivation
    40 relation $\longrightarrow@G$.  Let $\alpha$ and $\beta$ denote strings of
    40 relation $\longrightarrow@G$.  Let $\alpha$ and $\beta$ denote strings of
    41 terminal or nonterminal symbols.  Then
    41 terminal or nonterminal symbols.  Then
    42 \[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] 
    42 \[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \]
    43 if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$.
    43 if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$.
    44 
    44 
    45 The following simple grammar for arithmetic expressions demonstrates how
    45 The following simple grammar for arithmetic expressions demonstrates how
    46 binding power and associativity of operators can be enforced by priorities.
    46 binding power and associativity of operators can be enforced by priorities.
    47 \begin{center}
    47 \begin{center}
    63   some fixed integer.  Sometimes {\tt max_pri} is written as $\infty$.
    63   some fixed integer.  Sometimes {\tt max_pri} is written as $\infty$.
    64 \item Priority 0 on the right-hand side and priority \ttindex{max_pri} on
    64 \item Priority 0 on the right-hand side and priority \ttindex{max_pri} on
    65   the left-hand side may be omitted.
    65   the left-hand side may be omitted.
    66 \item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the
    66 \item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the
    67   priority of the left-hand side actually appears in a column on the far
    67   priority of the left-hand side actually appears in a column on the far
    68   right.  
    68   right.
    69 \item Alternatives are separated by~$|$.  
    69 \item Alternatives are separated by~$|$.
    70 \item Repetition is indicated by dots~(\dots) in an informal but obvious
    70 \item Repetition is indicated by dots~(\dots) in an informal but obvious
    71   way.
    71   way.
    72 \end{itemize}
    72 \end{itemize}
    73 
    73 
    74 Using these conventions and assuming $\infty=9$, the grammar
    74 Using these conventions and assuming $\infty=9$, the grammar
    87 
    87 
    88 \begin{figure}
    88 \begin{figure}
    89 \begin{center}
    89 \begin{center}
    90 \begin{tabular}{rclc}
    90 \begin{tabular}{rclc}
    91 $any$ &=& $prop$ ~~$|$~~ $logic$ \\\\
    91 $any$ &=& $prop$ ~~$|$~~ $logic$ \\\\
    92 $prop$ &=& {\tt PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\
    92 $prop$ &=& {\tt(} $prop$ {\tt)} \\
       
    93      &$|$& $prop^{(4)}$ {\tt::} $type$ & (3) \\
       
    94      &$|$& {\tt PROP} $aprop$ \\
    93      &$|$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\
    95      &$|$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\
    94      &$|$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\
    96      &$|$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\
    95      &$|$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\
    97      &$|$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\
    96      &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\
    98      &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\
    97      &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\
    99      &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\
       
   100      &$|$& {\tt OFCLASS} {\tt(} $type$ {\tt,} $logic$ {\tt)} \\\\
    98 $aprop$ &=& $id$ ~~$|$~~ $var$
   101 $aprop$ &=& $id$ ~~$|$~~ $var$
    99     ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\
   102     ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\
   100 $logic$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $logic$ {\tt)} \\
   103 $logic$ &=& {\tt(} $logic$ {\tt)} \\
   101     &$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\
   104       &$|$& $logic^{(4)}$ {\tt::} $type$ & (3) \\
   102     &$|$& $logic^{(4)}$ {\tt::} $type$ & (4) \\
   105       &$|$& $id$ ~~$|$~~ $var$
   103     &$|$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\
   106     ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\
       
   107       &$|$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\
   104 $idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\
   108 $idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\
   105 $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
   109 $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\
   106     &$|$& $id$ {\tt ::} $type$ & (0) \\\\
   110     &$|$& $id$ {\tt ::} $type$ & (0) \\\\
   107 $type$ &=& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$
   111 $type$ &=& {\tt(} $type$ {\tt)} \\
   108   ~~$|$~~ $tvar$ {\tt::} $sort$ \\
   112      &$|$& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$
       
   113        ~~$|$~~ $tvar$ {\tt::} $sort$ \\
   109      &$|$& $id$ ~~$|$~~ $type^{(\infty)}$ $id$
   114      &$|$& $id$ ~~$|$~~ $type^{(\infty)}$ $id$
   110                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
   115                 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\
   111      &$|$& $type^{(1)}$ {\tt =>} $type$ & (0) \\
   116      &$|$& $type^{(1)}$ {\tt =>} $type$ & (0) \\
   112      &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\
   117      &$|$& {\tt[}  $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0) \\\\
   113      &$|$& {\tt(} $type$ {\tt)} \\\\
       
   114 $sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace}
   118 $sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace}
   115                 ~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace}
   119                 ~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace}
   116 \end{tabular}
   120 \end{tabular}
   117 \index{*PROP symbol}
   121 \index{*PROP symbol}
   118 \index{*== symbol}\index{*=?= symbol}\index{*==> symbol}
   122 \index{*== symbol}\index{*=?= symbol}\index{*==> symbol}
   133 \section{The Pure syntax} \label{sec:basic_syntax}
   137 \section{The Pure syntax} \label{sec:basic_syntax}
   134 \index{syntax!Pure|(}
   138 \index{syntax!Pure|(}
   135 
   139 
   136 At the root of all object-logics lies the theory \thydx{Pure}.  It
   140 At the root of all object-logics lies the theory \thydx{Pure}.  It
   137 contains, among many other things, the Pure syntax.  An informal account of
   141 contains, among many other things, the Pure syntax.  An informal account of
   138 this basic syntax (types, terms and formulae) appears in 
   142 this basic syntax (types, terms and formulae) appears in
   139 \iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}%
   143 \iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}%
   140 {\S\ref{sec:forward}}.  A more precise description using a priority grammar
   144 {\S\ref{sec:forward}}.  A more precise description using a priority grammar
   141 appears in Fig.\ts\ref{fig:pure_gram}.  It defines the following
   145 appears in Fig.\ts\ref{fig:pure_gram}.  It defines the following
   142 nonterminals:
   146 nonterminals:
   143 \begin{ttdescription}
   147 \begin{ttdescription}
       
   148   \item[\ndxbold{any}] denotes any term.
       
   149 
   144   \item[\ndxbold{prop}] denotes terms of type {\tt prop}.  These are formulae
   150   \item[\ndxbold{prop}] denotes terms of type {\tt prop}.  These are formulae
   145   of the meta-logic.
   151     of the meta-logic.  Note that user constants of result type {\tt prop}
   146 
   152     (i.e.\ $c :: \ldots \To prop$) should always provide concrete syntax.
   147   \item[\ndxbold{aprop}] denotes atomic propositions.  These typically
   153     Otherwise atomic propositions with head $c$ may be printed incorrectly.
   148   include the judgement forms of the object-logic; its definition
   154 
   149   introduces a meta-level predicate for each judgement form.
   155   \item[\ndxbold{aprop}] denotes atomic propositions.
       
   156 
       
   157 %% FIXME huh!?
       
   158 %  These typically
       
   159 %  include the judgement forms of the object-logic; its definition
       
   160 %  introduces a meta-level predicate for each judgement form.
   150 
   161 
   151   \item[\ndxbold{logic}] denotes terms whose type belongs to class
   162   \item[\ndxbold{logic}] denotes terms whose type belongs to class
   152   \cldx{logic}.  As the syntax is extended by new object-logics, more
   163     \cldx{logic}, excluding type \tydx{prop}.
   153   productions for {\tt logic} are added automatically (see below).
       
   154 
       
   155   \item[\ndxbold{any}] denotes terms that either belong to {\tt prop}
       
   156     or {\tt logic}.
       
   157 
       
   158   \item[\ndxbold{type}] denotes types of the meta-logic.
       
   159 
   164 
   160   \item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained
   165   \item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained
   161     by types.
   166     by types.
       
   167 
       
   168   \item[\ndxbold{type}] denotes types of the meta-logic.
       
   169 
       
   170   \item[\ndxbold{sort}] denotes meta-level sorts.
   162 \end{ttdescription}
   171 \end{ttdescription}
   163 
   172 
   164 \begin{warn}
   173 \begin{warn}
   165   In {\tt idts}, note that \verb|x::nat y| is parsed as \verb|x::(nat y)|,
   174   In {\tt idts}, note that \verb|x::nat y| is parsed as \verb|x::(nat y)|,
   166   treating {\tt y} like a type constructor applied to {\tt nat}.  The
   175   treating {\tt y} like a type constructor applied to {\tt nat}.  The
   170 
   179 
   171   Similarly, \verb|x::nat y::nat| is parsed as \verb|x::(nat y::nat)| and
   180   Similarly, \verb|x::nat y::nat| is parsed as \verb|x::(nat y::nat)| and
   172   yields an error.  The correct form is \verb|(x::nat) (y::nat)|.
   181   yields an error.  The correct form is \verb|(x::nat) (y::nat)|.
   173 \end{warn}
   182 \end{warn}
   174 
   183 
   175 \subsection{Logical types and default syntax}\label{logical-types}
       
   176 \index{lambda calc@$\lambda$-calculus}
       
   177 
       
   178 Isabelle's representation of mathematical languages is based on the
       
   179 simply typed $\lambda$-calculus.  All logical types, namely those of
       
   180 class \cldx{logic}, are automatically equipped with a basic syntax of
       
   181 types, identifiers, variables, parentheses, $\lambda$-abstractions and
       
   182 applications.
       
   183 
       
   184 More precisely, Isabelle internally replaces every nonterminal by
       
   185 $logic$ if it belongs to a subclass of \cldx{logic}.  Thereby these
       
   186 productions (which actually are productions of the nonterminal
       
   187 $logic$) can be used for $ty$:
       
   188 
       
   189 \begin{center}
       
   190 \begin{tabular}{rclc}
       
   191 $ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\
       
   192   &$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)}\\
       
   193   &$|$& $ty^{(4)}$ {\tt::} $type$ ~~~~~~~ (3) \\\\
       
   194 \end{tabular}
       
   195 \end{center}
       
   196 
       
   197 \begin{warn}
   184 \begin{warn}
   198   Type constraints bind very weakly. For example, \verb!x<y::nat! is normally
   185   Type constraints bind very weakly. For example, \verb!x<y::nat! is normally
   199   parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in
   186   parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in
   200   which case the string is likely to be ambiguous. The correct form is
   187   which case the string is likely to be ambiguous. The correct form is
   201   \verb!x<(y::nat)!.
   188   \verb!x<(y::nat)!.
   202 \end{warn}
   189 \end{warn}
   203 
   190 
       
   191 Isabelle's representation of mathematical languages is based on the simply
       
   192 typed $\lambda$-calculus\index{lambda calc@$\lambda$-calculus}.  All user
       
   193 defined logical types, namely those of class \cldx{logic}, refer to the
       
   194 nonterminal {\tt logic}. Thus they are automatically equipped with a basic
       
   195 syntax of types, identifiers, variables, parentheses, $\lambda$-abstractions
       
   196 and applications.
       
   197 
       
   198 
   204 \subsection{Lexical matters}
   199 \subsection{Lexical matters}
   205 The parser does not process input strings directly.  It operates on token
   200 The parser does not process input strings directly.  It operates on token
   206 lists provided by Isabelle's \bfindex{lexer}.  There are two kinds of
   201 lists provided by Isabelle's \bfindex{lexer}.  There are two kinds of
   207 tokens: \bfindex{delimiters} and \bfindex{name tokens}.
   202 tokens: \bfindex{delimiters} and \bfindex{name tokens}.
   208 
   203 
   210 Delimiters can be regarded as reserved words of the syntax.  You can
   205 Delimiters can be regarded as reserved words of the syntax.  You can
   211 add new ones when extending theories.  In Fig.\ts\ref{fig:pure_gram} they
   206 add new ones when extending theories.  In Fig.\ts\ref{fig:pure_gram} they
   212 appear in typewriter font, for example {\tt ==}, {\tt =?=} and
   207 appear in typewriter font, for example {\tt ==}, {\tt =?=} and
   213 {\tt PROP}\@.
   208 {\tt PROP}\@.
   214 
   209 
   215 Name tokens have a predefined syntax.  The lexer distinguishes four
   210 Name tokens have a predefined syntax.  The lexer distinguishes six disjoint
   216 disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type
   211 classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type
   217 identifiers\index{type identifiers}, type unknowns\index{type unknowns}.
   212 identifiers\index{type identifiers}, type unknowns\index{type unknowns},
   218 They are denoted by \ndxbold{id}, \ndxbold{var}, \ndxbold{tid},
   213 \rmindex{numerals}, \rmindex{strings}. They are denoted by \ndxbold{id},
   219 \ndxbold{tvar}, respectively.  Typical examples are {\tt x}, {\tt ?x7},
   214 \ndxbold{var}, \ndxbold{tid}, \ndxbold{tvar}, \ndxbold{xnum}, \ndxbold{xstr},
   220 {\tt 'a}, {\tt ?'a3}.  Here is the precise syntax:
   215 respectively.  Typical examples are {\tt x}, {\tt ?x7}, {\tt 'a}, {\tt ?'a3},
       
   216 {\tt \#42}, {\tt ''foo bar''}. Here is the precise syntax:
   221 \begin{eqnarray*}
   217 \begin{eqnarray*}
   222 id        & =   & letter~quasiletter^* \\
   218 id        & =   & letter~quasiletter^* \\
   223 var       & =   & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\
   219 var       & =   & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\
   224 tid       & =   & \mbox{\tt '}id \\
   220 tid       & =   & \mbox{\tt '}id \\
   225 tvar      & =   & \mbox{\tt ?}tid ~~|~~
   221 tvar      & =   & \mbox{\tt ?}tid ~~|~~
   226                   \mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex]
   222                   \mbox{\tt ?}tid\mbox{\tt .}nat \\
       
   223 xnum      & =   & \mbox{\tt \#}nat ~~|~~ \mbox{\tt \#\char`\~}nat \\
       
   224 xstr      & =   & \mbox{\tt ''\rm text\tt ''} \\[1ex]
   227 letter    & =   & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\
   225 letter    & =   & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\
   228 digit     & =   & \mbox{one of {\tt 0}\dots {\tt 9}} \\
   226 digit     & =   & \mbox{one of {\tt 0}\dots {\tt 9}} \\
   229 quasiletter & =  & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\
   227 quasiletter & =  & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\
   230 nat       & =   & digit^+
   228 nat       & =   & digit^+
   231 \end{eqnarray*}
   229 \end{eqnarray*}
       
   230 The lexer repeatedly takes the maximal prefix of the input string that forms
       
   231 a valid token.  A maximal prefix that is both a delimiter and a name is
       
   232 treated as a delimiter.  Spaces, tabs, newlines and formfeeds are separators;
       
   233 they never occur within tokens, except those of class $xstr$.
       
   234 
       
   235 \medskip
       
   236 Delimiters need not be separated by white space.  For example, if {\tt -}
       
   237 is a delimiter but {\tt --} is not, then the string {\tt --} is treated as
       
   238 two consecutive occurrences of the token~{\tt -}.  In contrast, \ML\
       
   239 treats {\tt --} as a single symbolic name.  The consequence of Isabelle's
       
   240 more liberal scheme is that the same string may be parsed in different ways
       
   241 after extending the syntax: after adding {\tt --} as a delimiter, the input
       
   242 {\tt --} is treated as a single token.
       
   243 
   232 A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally
   244 A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally
   233 a pair of base name and index (\ML\ type \mltydx{indexname}).  These
   245 a pair of base name and index (\ML\ type \mltydx{indexname}).  These
   234 components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or
   246 components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or
   235 run together as in {\tt ?x1}.  The latter form is possible if the base name
   247 run together as in {\tt ?x1}.  The latter form is possible if the base name
   236 does not end with digits.  If the index is 0, it may be dropped altogether:
   248 does not end with digits.  If the index is 0, it may be dropped altogether:
   237 {\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}.
   249 {\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}.
   238 
   250 
   239 The lexer repeatedly takes the maximal prefix of the input string that
   251 Tokens of class $xnum$ or $xstr$ are not used by the meta-logic.
   240 forms a valid token.  A maximal prefix that is both a delimiter and a name
   252 Object-logics may provide numerals and string constants by adding appropriate
   241 is treated as a delimiter.  Spaces, tabs and newlines are separators; they
   253 productions and translation functions.
   242 never occur within tokens.
   254 
   243 
   255 \medskip
   244 Delimiters need not be separated by white space.  For example, if {\tt -}
       
   245 is a delimiter but {\tt --} is not, then the string {\tt --} is treated as
       
   246 two consecutive occurrences of the token~{\tt -}.  In contrast, \ML\ 
       
   247 treats {\tt --} as a single symbolic name.  The consequence of Isabelle's
       
   248 more liberal scheme is that the same string may be parsed in different ways
       
   249 after extending the syntax: after adding {\tt --} as a delimiter, the input
       
   250 {\tt --} is treated as a single token.
       
   251 
       
   252 Although name tokens are returned from the lexer rather than the parser, it
   256 Although name tokens are returned from the lexer rather than the parser, it
   253 is more logical to regard them as nonterminals.  Delimiters, however, are
   257 is more logical to regard them as nonterminals.  Delimiters, however, are
   254 terminals; they are just syntactic sugar and contribute nothing to the
   258 terminals; they are just syntactic sugar and contribute nothing to the
   255 abstract syntax tree.
   259 abstract syntax tree.
   256 
   260 
   257 
   261 
   258 \subsection{*Inspecting the syntax}
   262 \subsection{*Inspecting the syntax}
   259 \begin{ttbox}
   263 \begin{ttbox}
   260 syn_of              : theory -> Syntax.syntax
   264 syn_of              : theory -> Syntax.syntax
       
   265 print_syntax        : theory -> unit
   261 Syntax.print_syntax : Syntax.syntax -> unit
   266 Syntax.print_syntax : Syntax.syntax -> unit
   262 Syntax.print_gram   : Syntax.syntax -> unit
   267 Syntax.print_gram   : Syntax.syntax -> unit
   263 Syntax.print_trans  : Syntax.syntax -> unit
   268 Syntax.print_trans  : Syntax.syntax -> unit
   264 \end{ttbox}
   269 \end{ttbox}
   265 The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes
   270 The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes
   267 functions:
   272 functions:
   268 \begin{ttdescription}
   273 \begin{ttdescription}
   269 \item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle
   274 \item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle
   270   theory~{\it thy} as an \ML\ value.
   275   theory~{\it thy} as an \ML\ value.
   271 
   276 
       
   277 \item[\ttindexbold{print_syntax} $thy$] displays the syntax part of $thy$
       
   278   using {\tt Syntax.print_syntax}.
       
   279 
   272 \item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all
   280 \item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all
   273   information contained in the syntax {\it syn}.  The displayed output can
   281   information contained in the syntax {\it syn}.  The displayed output can
   274   be large.  The following two functions are more selective.
   282   be large.  The following two functions are more selective.
   275 
   283 
   276 \item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part
   284 \item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part
   277   of~{\it syn}, namely the lexicon, roots and productions.  These are
   285   of~{\it syn}, namely the lexicon, logical types and productions.  These are
   278   discussed below.
   286   discussed below.
   279 
   287 
   280 \item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation
   288 \item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation
   281   part of~{\it syn}, namely the constants, parse/print macros and
   289   part of~{\it syn}, namely the constants, parse/print macros and
   282   parse/print translations.
   290   parse/print translations.
   285 Let us demonstrate these functions by inspecting Pure's syntax.  Even that
   293 Let us demonstrate these functions by inspecting Pure's syntax.  Even that
   286 is too verbose to display in full.
   294 is too verbose to display in full.
   287 \begin{ttbox}\index{*Pure theory}
   295 \begin{ttbox}\index{*Pure theory}
   288 Syntax.print_syntax (syn_of Pure.thy);
   296 Syntax.print_syntax (syn_of Pure.thy);
   289 {\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots}
   297 {\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots}
   290 {\out roots: logic type fun prop}
   298 {\out logtypes: fun itself}
   291 {\out prods:}
   299 {\out prods:}
   292 {\out   type = tid  (1000)}
   300 {\out   type = tid  (1000)}
   293 {\out   type = tvar  (1000)}
   301 {\out   type = tvar  (1000)}
   294 {\out   type = id  (1000)}
   302 {\out   type = id  (1000)}
   295 {\out   type = tid "::" sort[0]  => "_ofsort" (1000)}
   303 {\out   type = tid "::" sort[0]  => "_ofsort" (1000)}
   305 {\out print_rules:}
   313 {\out print_rules:}
   306 {\out print_ast_translation: "==>" "_abs" "_idts" "fun"}
   314 {\out print_ast_translation: "==>" "_abs" "_idts" "fun"}
   307 \end{ttbox}
   315 \end{ttbox}
   308 
   316 
   309 As you can see, the output is divided into labelled sections.  The grammar
   317 As you can see, the output is divided into labelled sections.  The grammar
   310 is represented by {\tt lexicon}, {\tt roots} and {\tt prods}.  The rest
   318 is represented by {\tt lexicon}, {\tt logtypes} and {\tt prods}.  The rest
   311 refers to syntactic translations and macro expansion.  Here is an
   319 refers to syntactic translations and macro expansion.  Here is an
   312 explanation of the various sections.
   320 explanation of the various sections.
   313 \begin{description}
   321 \begin{description}
   314   \item[{\tt lexicon}] lists the delimiters used for lexical
   322   \item[{\tt lexicon}] lists the delimiters used for lexical
   315     analysis.\index{delimiters} 
   323     analysis.\index{delimiters}
   316 
   324 
   317   \item[{\tt roots}] lists the grammar's nonterminal symbols.  You must
   325   \item[{\tt logtypes}] lists the types that are regarded the same as {\tt
   318     name the desired root when calling lower level functions or specifying
   326     logic} syntactically. Thus types of object-logics (e.g.\ {\tt nat}, say)
   319     macros.  Higher level functions usually expect a type and derive the
   327     will be automatically equipped with the standard syntax of
   320     actual root as described in~\S\ref{sec:grammar}.
   328     $\lambda$-calculus.
   321 
   329 
   322   \item[{\tt prods}] lists the \rmindex{productions} of the priority grammar.
   330   \item[{\tt prods}] lists the \rmindex{productions} of the priority grammar.
   323     The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}.
   331     The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}.
   324     Each delimiter is quoted.  Some productions are shown with {\tt =>} and
   332     Each delimiter is quoted.  Some productions are shown with {\tt =>} and
   325     an attached string.  These strings later become the heads of parse
   333     an attached string.  These strings later become the heads of parse
   353 \end{description}
   361 \end{description}
   354 \index{syntax!Pure|)}
   362 \index{syntax!Pure|)}
   355 
   363 
   356 
   364 
   357 \section{Mixfix declarations} \label{sec:mixfix}
   365 \section{Mixfix declarations} \label{sec:mixfix}
   358 \index{mixfix declarations|(} 
   366 \index{mixfix declarations|(}
   359 
   367 
   360 When defining a theory, you declare new constants by giving their names,
   368 When defining a theory, you declare new constants by giving their names,
   361 their type, and an optional {\bf mixfix annotation}.  Mixfix annotations
   369 their type, and an optional {\bf mixfix annotation}.  Mixfix annotations
   362 allow you to extend Isabelle's basic $\lambda$-calculus syntax with
   370 allow you to extend Isabelle's basic $\lambda$-calculus syntax with
   363 readable notation.  They can express any context-free priority grammar.
   371 readable notation.  They can express any context-free priority grammar.
   364 Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more
   372 Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more
   365 general than the priority declarations of \ML\ and Prolog.  
   373 general than the priority declarations of \ML\ and Prolog.
   366 
   374 
   367 A mixfix annotation defines a production of the priority grammar.  It
   375 A mixfix annotation defines a production of the priority grammar.  It
   368 describes the concrete syntax, the translation to abstract syntax, and the
   376 describes the concrete syntax, the translation to abstract syntax, and the
   369 pretty printing.  Special case annotations provide a simple means of
   377 pretty printing.  Special case annotations provide a simple means of
   370 specifying infix operators, binders and so forth.
   378 specifying infix operators and binders.
   371 
   379 
   372 \subsection{Grammar productions}\label{sec:grammar}\index{productions}
   380 
   373 
   381 %% FIXME remove
   374 Let us examine the treatment of the production
   382 %\subsection{Grammar productions}\label{sec:grammar}\index{productions}
   375 \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\,  
   383 %
   376                   A@n^{(p@n)}\, w@n. \]
   384 %Let us examine the treatment of the production
   377 Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$,
   385 %\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\,
   378 \ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals.
   386 %                  A@n^{(p@n)}\, w@n. \]
   379 In the corresponding mixfix annotation, the priorities are given separately
   387 %Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$,
   380 as $[p@1,\ldots,p@n]$ and~$p$.  The nonterminal symbols are identified with
   388 %\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals.
   381 types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's
   389 %In the corresponding mixfix annotation, the priorities are given separately
   382 effect on nonterminals is expressed as the function type
   390 %as $[p@1,\ldots,p@n]$ and~$p$.  The nonterminal symbols are derived from
   383 \[ [\tau@1, \ldots, \tau@n]\To \tau. \]
   391 %types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's
   384 Finally, the template
   392 %effect on nonterminals is expressed as the function type
   385 \[ w@0  \;_\; w@1 \;_\; \ldots \;_\; w@n \]
   393 %\[ [\tau@1, \ldots, \tau@n]\To \tau. \]
   386 describes the strings of terminals.
   394 %Finally, the template
   387 
   395 %\[ w@0  \;_\; w@1 \;_\; \ldots \;_\; w@n \]
   388 A simple type is typically declared for each nonterminal symbol.  In
   396 %describes the strings of terminals.
   389 first-order logic, type~$i$ stands for terms and~$o$ for formulae.  Only
   397 
   390 the outermost type constructor is taken into account.  For example, any
       
   391 type of the form $\sigma list$ stands for a list;  productions may refer
       
   392 to the symbol {\tt list} and will apply to lists of any type.
       
   393 
       
   394 The symbol associated with a type is called its {\bf root} since it may
       
   395 serve as the root of a parse tree.  Precisely, the root of $(\tau@1, \dots,
       
   396 \tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is
       
   397 a type constructor.  Type infixes are a special case of this; in
       
   398 particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}.  Finally, the
       
   399 root of a type variable is {\tt logic}; general productions might
       
   400 refer to this nonterminal.
       
   401 
       
   402 Identifying nonterminals with types allows a constant's type to specify
       
   403 syntax as well.  We can declare the function~$f$ to have type $[\tau@1,
       
   404 \ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the
       
   405 layout of the function's $n$ arguments.  The constant's name, in this
       
   406 case~$f$, will also serve as the label in the abstract syntax tree.  There
       
   407 are two exceptions to this treatment of constants:
       
   408 \begin{enumerate}\index{constants!syntactic}
       
   409   \item A production need not map directly to a logical function.  In this
       
   410     case, you must declare a constant whose purpose is purely syntactic.
       
   411     By convention such constants begin with the symbol~{\tt\at}, 
       
   412     ensuring that they can never be written in formulae.
       
   413 
       
   414   \item A copy production has no associated constant.\index{productions!copy}
       
   415 \end{enumerate}
       
   416 There is something artificial about this representation of productions,
       
   417 but it is convenient, particularly for simple theory extensions.
       
   418 
   398 
   419 \subsection{The general mixfix form}
   399 \subsection{The general mixfix form}
   420 Here is a detailed account of mixfix declarations.  Suppose the following
   400 Here is a detailed account of mixfix declarations.  Suppose the following
   421 line occurs within the {\tt consts} section of a {\tt .thy} file:
   401 line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy}
       
   402 file:
   422 \begin{center}
   403 \begin{center}
   423   {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}
   404   {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}
   424 \end{center}
   405 \end{center}
   425 This constant declaration and mixfix annotation are interpreted as follows:
   406 This constant declaration and mixfix annotation are interpreted as follows:
   426 \begin{itemize}\index{productions}
   407 \begin{itemize}\index{productions}
   430   production.\index{productions!copy} Otherwise, parsing an instance of the
   411   production.\index{productions!copy} Otherwise, parsing an instance of the
   431   phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$
   412   phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$
   432     $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th
   413     $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th
   433   argument.
   414   argument.
   434 
   415 
   435   \item The constant $c$, if non-empty, is declared to have type $\sigma$.
   416   \item The constant $c$, if non-empty, is declared to have type $\sigma$
       
   417     ({\tt consts} section only).
   436 
   418 
   437   \item The string $template$ specifies the right-hand side of
   419   \item The string $template$ specifies the right-hand side of
   438     the production.  It has the form
   420     the production.  It has the form
   439     \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] 
   421     \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \]
   440     where each occurrence of {\tt_} denotes an argument position and
   422     where each occurrence of {\tt_} denotes an argument position and
   441     the~$w@i$ do not contain~{\tt _}.  (If you want a literal~{\tt _} in
   423     the~$w@i$ do not contain~{\tt _}.  (If you want a literal~{\tt _} in
   442     the concrete syntax, you must escape it as described below.)  The $w@i$
   424     the concrete syntax, you must escape it as described below.)  The $w@i$
   443     may consist of \rmindex{delimiters}, spaces or 
   425     may consist of \rmindex{delimiters}, spaces or
   444     \rmindex{pretty printing} annotations (see below).
   426     \rmindex{pretty printing} annotations (see below).
   445 
   427 
   446   \item The type $\sigma$ specifies the production's nonterminal symbols
   428   \item The type $\sigma$ specifies the production's nonterminal symbols
   447     (or name tokens).  If $template$ is of the form above then $\sigma$
   429     (or name tokens).  If $template$ is of the form above then $\sigma$
   448     must be a function type with at least~$n$ argument positions, say
   430     must be a function type with at least~$n$ argument positions, say
   449     $\sigma = [\tau@1, \dots, \tau@n] \To \tau$.  Nonterminal symbols are
   431     $\sigma = [\tau@1, \dots, \tau@n] \To \tau$.  Nonterminal symbols are
   450     derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described
   432     derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described
   451     above.  Any of these may be function types; the corresponding root is
   433     below.  Any of these may be function types.
   452     then \tydx{fun}.
       
   453 
   434 
   454   \item The optional list~$ps$ may contain at most $n$ integers, say {\tt
   435   \item The optional list~$ps$ may contain at most $n$ integers, say {\tt
   455       [$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal
   436       [$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal
   456     priority\indexbold{priorities} required of any phrase that may appear
   437     priority\indexbold{priorities} required of any phrase that may appear
   457     as the $i$-th argument.  Missing priorities default to~0.
   438     as the $i$-th argument.  Missing priorities default to~0.
   459   \item The integer $p$ is the priority of this production.  If omitted, it
   440   \item The integer $p$ is the priority of this production.  If omitted, it
   460     defaults to the maximal priority.
   441     defaults to the maximal priority.
   461     Priorities range between 0 and \ttindexbold{max_pri} (= 1000).
   442     Priorities range between 0 and \ttindexbold{max_pri} (= 1000).
   462 \end{itemize}
   443 \end{itemize}
   463 %
   444 %
   464 The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no
   445 The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\,
   465 priorities.  The resulting production puts no priority constraints on any
   446 A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the
   466 of its arguments and has maximal priority itself.  Omitting priorities in
   447 nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively.
   467 this manner will introduce syntactic ambiguities unless the production's
   448 The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if
   468 right-hand side is fully bracketed, as in \verb|"if _ then _ else _ fi"|.
   449 this is a logical type (namely one of class {\tt logic} excluding {\tt
       
   450 prop}).  Otherwise it is $ty$ (note that only the outermost type constructor
       
   451 is taken into account).  Finally, the nonterminal of a type variable is {\tt
       
   452 any}.
       
   453 
       
   454 \begin{warn} 
       
   455   Theories must sometimes declare types for purely syntactic purposes ---
       
   456   merely playing the role of nonterminals. One example is \tydx{type}, the
       
   457   built-in type of types.  This is a `type of all types' in the syntactic
       
   458   sense only.  Do not declare such types under {\tt arities} as belonging to
       
   459   class {\tt logic}\index{*logic class}, for that would make them useless as
       
   460   separate nonterminal symbols.
       
   461 \end{warn}
       
   462 
       
   463 Associating nonterminals with types allows a constant's type to specify
       
   464 syntax as well.  We can declare the function~$f$ to have type $[\tau@1,
       
   465 \ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout
       
   466 of the function's $n$ arguments.  The constant's name, in this case~$f$, will
       
   467 also serve as the label in the abstract syntax tree.
       
   468 
       
   469 You may also declare mixfix syntax without adding constants to the theory's
       
   470 signature, by using a {\tt syntax} section instead of {\tt consts}.  Thus a
       
   471 production need not map directly to a logical function (this typically
       
   472 requires additional syntactic translations, see also
       
   473 Chapter~\ref{chap:syntax}).
       
   474 
       
   475 
       
   476 \medskip 
       
   477 As a special case of the general mixfix declaration, the form 
       
   478 \begin{center}
       
   479   {\tt $c$ ::\ "$\sigma$" ("$template$")} 
       
   480 \end{center}
       
   481 specifies no priorities.  The resulting production puts no priority
       
   482 constraints on any of its arguments and has maximal priority itself.
       
   483 Omitting priorities in this manner is prone to syntactic ambiguities unless
       
   484 the production's right-hand side is fully bracketed, as in \verb|"if _ then _
       
   485 else _ fi"|.
   469 
   486 
   470 Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},
   487 Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},
   471 is sensible only if~$c$ is an identifier.  Otherwise you will be unable to
   488 is sensible only if~$c$ is an identifier.  Otherwise you will be unable to
   472 write terms involving~$c$.
   489 write terms involving~$c$.
   473 
   490 
   474 \begin{warn}
   491 \medskip 
   475   Theories must sometimes declare types for purely syntactic purposes.  One
   492 There is something artificial about the representation of productions as
   476   example is \tydx{type}, the built-in type of types.  This is a `type of
   493 mixfix declarations, but it is convenient, particularly for simple theory
   477   all types' in the syntactic sense only.  Do not declare such types under
   494 extensions.
   478   {\tt arities} as belonging to class {\tt logic}\index{*logic class}, for
   495 
   479   that would allow their use in arbitrary Isabelle
       
   480   expressions~(\S\ref{logical-types}).
       
   481 \end{warn}
       
   482 
   496 
   483 \subsection{Example: arithmetic expressions}
   497 \subsection{Example: arithmetic expressions}
   484 \index{examples!of mixfix declarations}
   498 \index{examples!of mixfix declarations}
   485 This theory specification contains a {\tt consts} section with mixfix
   499 This theory specification contains a {\tt syntax} section with mixfix
   486 declarations encoding the priority grammar from
   500 declarations encoding the priority grammar from
   487 \S\ref{sec:priority_grammars}:
   501 \S\ref{sec:priority_grammars}:
   488 \begin{ttbox}
   502 \begin{ttbox}
   489 EXP = Pure +
   503 EXP = Pure +
   490 types
   504 types
   491   exp
   505   exp
   492 arities
   506 syntax
   493   exp :: logic
       
   494 consts
       
   495   "0" :: "exp"                ("0"      9)
   507   "0" :: "exp"                ("0"      9)
   496   "+" :: "[exp, exp] => exp"  ("_ + _"  [0, 1] 0)
   508   "+" :: "[exp, exp] => exp"  ("_ + _"  [0, 1] 0)
   497   "*" :: "[exp, exp] => exp"  ("_ * _"  [3, 2] 2)
   509   "*" :: "[exp, exp] => exp"  ("_ * _"  [3, 2] 2)
   498   "-" :: "exp => exp"         ("- _"    [3] 3)
   510   "-" :: "exp => exp"         ("- _"    [3] 3)
   499 end
   511 end
   500 \end{ttbox}
   512 \end{ttbox}
   501 The {\tt arities} declaration causes {\tt exp} to be added as a new root.
   513 If you put this into a file {\tt EXP.thy} and load it via {\tt use_thy"EXP"},
   502 If you put this into a file {\tt EXP.thy} and load it via {\tt
   514 you can run some tests:
   503   use_thy "EXP"}, you can run some tests:
       
   504 \begin{ttbox}
   515 \begin{ttbox}
   505 val read_exp = Syntax.test_read (syn_of EXP.thy) "exp";
   516 val read_exp = Syntax.test_read (syn_of EXP.thy) "exp";
   506 {\out val it = fn : string -> unit}
   517 {\out val it = fn : string -> unit}
   507 read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0";
   518 read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0";
   508 {\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}
   519 {\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}
   516 The output of \ttindex{Syntax.test_read} includes the token list ({\tt
   527 The output of \ttindex{Syntax.test_read} includes the token list ({\tt
   517   tokens}) and the raw \AST{} directly derived from the parse tree,
   528   tokens}) and the raw \AST{} directly derived from the parse tree,
   518 ignoring parse \AST{} translations.  The rest is tracing information
   529 ignoring parse \AST{} translations.  The rest is tracing information
   519 provided by the macro expander (see \S\ref{sec:macros}).
   530 provided by the macro expander (see \S\ref{sec:macros}).
   520 
   531 
   521 Executing {\tt Syntax.print_gram} reveals the productions derived
   532 Executing {\tt Syntax.print_gram} reveals the productions derived from the
   522 from our mixfix declarations (lots of additional information deleted):
   533 above mixfix declarations (lots of additional information deleted):
   523 \begin{ttbox}
   534 \begin{ttbox}
   524 Syntax.print_gram (syn_of EXP.thy);
   535 Syntax.print_gram (syn_of EXP.thy);
   525 {\out exp = "0"  => "0" (9)}
   536 {\out exp = "0"  => "0" (9)}
   526 {\out exp = exp[0] "+" exp[1]  => "+" (0)}
   537 {\out exp = exp[0] "+" exp[1]  => "+" (0)}
   527 {\out exp = exp[3] "*" exp[2]  => "*" (2)}
   538 {\out exp = exp[3] "*" exp[2]  => "*" (2)}
   528 {\out exp = "-" exp[3]  => "-" (3)}
   539 {\out exp = "-" exp[3]  => "-" (3)}
   529 \end{ttbox}
   540 \end{ttbox}
   530 
   541 
   531 
   542 
   532 \subsection{The mixfix template}
   543 \subsection{The mixfix template}
   533 Let us take a closer look at the string $template$ appearing in mixfix
   544 Let us now take a closer look at the string $template$ appearing in mixfix
   534 annotations.  This string specifies a list of parsing and printing
   545 annotations.  This string specifies a list of parsing and printing
   535 directives: delimiters\index{delimiters}, arguments, spaces, blocks of
   546 directives: delimiters\index{delimiters}, arguments, spaces, blocks of
   536 indentation and line breaks.  These are encoded by the following character
   547 indentation and line breaks.  These are encoded by the following character
   537 sequences:
   548 sequences:
   538 \index{pretty printing|(}
   549 \index{pretty printing|(}
   576 \indexbold{infixes}
   587 \indexbold{infixes}
   577 
   588 
   578 Infix operators associating to the left or right can be declared
   589 Infix operators associating to the left or right can be declared
   579 using {\tt infixl} or {\tt infixr}.
   590 using {\tt infixl} or {\tt infixr}.
   580 Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)}
   591 Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)}
   581 abbreviates the constant declarations
   592 abbreviates the mixfix declarations
   582 \begin{ttbox}
   593 \begin{ttbox}
       
   594 "op \(c\)" :: "\(\sigma\)"   ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))
   583 "op \(c\)" :: "\(\sigma\)"   ("op \(c\)")
   595 "op \(c\)" :: "\(\sigma\)"   ("op \(c\)")
   584 "op \(c\)" :: "\(\sigma\)"   ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))
   596 \end{ttbox}
   585 \end{ttbox}
   597 and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the mixfix declarations
   586 and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the constant declarations
   598 \begin{ttbox}
   587 \begin{ttbox}
   599 "op \(c\)" :: "\(\sigma\)"   ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))
   588 "op \(c\)" :: "\(\sigma\)"   ("op \(c\)")
   600 "op \(c\)" :: "\(\sigma\)"   ("op \(c\)")
   589 "op \(c\)" :: "\(\sigma\)"   ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))
       
   590 \end{ttbox}
   601 \end{ttbox}
   591 The infix operator is declared as a constant with the prefix {\tt op}.
   602 The infix operator is declared as a constant with the prefix {\tt op}.
   592 Thus, prefixing infixes with \sdx{op} makes them behave like ordinary
   603 Thus, prefixing infixes with \sdx{op} makes them behave like ordinary
   593 function symbols, as in \ML.  Special characters occurring in~$c$ must be
   604 function symbols, as in \ML.  Special characters occurring in~$c$ must be
   594 escaped, as in delimiters, using a single quote.
   605 escaped, as in delimiters, using a single quote.
   595 
       
   596 The expanded forms above would be actually illegal in a {\tt .thy} file
       
   597 because they declare the constant \hbox{\tt"op \(c\)"} twice.
       
   598 
   606 
   599 
   607 
   600 \subsection{Binders}
   608 \subsection{Binders}
   601 \indexbold{binders}
   609 \indexbold{binders}
   602 \begingroup
   610 \begingroup
   610 $(\tau@1 \To \tau@2) \To \tau@3$.  Its concrete syntax is $\Q~x.P$, where
   618 $(\tau@1 \To \tau@2) \To \tau@3$.  Its concrete syntax is $\Q~x.P$, where
   611 $x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$
   619 $x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$
   612 and the whole term has type~$\tau@3$.  Special characters in $\Q$ must be
   620 and the whole term has type~$\tau@3$.  Special characters in $\Q$ must be
   613 escaped using a single quote.
   621 escaped using a single quote.
   614 
   622 
   615 The declaration is expanded internally to
   623 The declaration is expanded internally to something like
   616 \begin{ttbox}
   624 \begin{ttbox}
   617 \(c\)    :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"
   625 \(c\)    :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"
   618 "\(\Q\)"\hskip-3pt  :: "[idts, \(\tau@2\)] => \(\tau@3\)"   ("(3\(\Q\)_./ _)" \(p\))
   626 "\(\Q\)"\hskip-3pt  :: "[idts, \(\tau@2\)] => \(\tau@3\)"   ("(3\(\Q\)_./ _)" \(p\))
   619 \end{ttbox}
   627 \end{ttbox}
   620 Here \ndx{idts} is the nonterminal symbol for a list of identifiers with
   628 Here \ndx{idts} is the nonterminal symbol for a list of identifiers with
   645 
   653 
   646 \section{Ambiguity of parsed expressions} \label{sec:ambiguity}
   654 \section{Ambiguity of parsed expressions} \label{sec:ambiguity}
   647 \index{ambiguity!of parsed expressions}
   655 \index{ambiguity!of parsed expressions}
   648 
   656 
   649 To keep the grammar small and allow common productions to be shared
   657 To keep the grammar small and allow common productions to be shared
   650 all logical types are internally represented
   658 all logical types (except {\tt prop}) are internally represented
   651 by one nonterminal, namely {\tt logic}. This and omitted or too freely 
   659 by one nonterminal, namely {\tt logic}. This and omitted or too freely
   652 chosen priorities may lead to ways of parsing an expression that were
   660 chosen priorities may lead to ways of parsing an expression that were
   653 not intended by the theory's maker. In most cases Isabelle is able to
   661 not intended by the theory's maker. In most cases Isabelle is able to
   654 select one of multiple parse trees that an expression has lead 
   662 select one of multiple parse trees that an expression has lead
   655 to by checking which of them can be typed correctly. But this may not
   663 to by checking which of them can be typed correctly. But this may not
   656 work in every case and always slows down parsing.
   664 work in every case and always slows down parsing.
   657 The warning and error messages that can be produced during this process are 
   665 The warning and error messages that can be produced during this process are
   658 as follows:
   666 as follows:
   659 
   667 
   660 If an ambiguity can be resolved by type inference this warning
   668 If an ambiguity can be resolved by type inference this warning
   661 is shown to remind the user that parsing is (unnecessarily) slowed
   669 is shown to remind the user that parsing is (unnecessarily) slowed
   662 down:
   670 down:
   679 {\out Error: More than one term is type correct:}
   687 {\out Error: More than one term is type correct:}
   680 {\out ...}
   688 {\out ...}
   681 \end{ttbox}
   689 \end{ttbox}
   682 
   690 
   683 On the other hand it's also possible that none of the parse trees can be
   691 On the other hand it's also possible that none of the parse trees can be
   684 typed correctly though the user did not make a mistake. By default Isabelle 
   692 typed correctly although the user did not make a mistake.
   685 assumes that the type of a syntax translation rule is {\tt logic} but does 
   693 
   686 not look at the type unless parsing the rule produces more than one parse 
   694 %% FIXME remove?
   687 tree. In that case this message is output if the rule's type is different 
   695 %By default Isabelle
   688 from {\tt logic}:
   696 %assumes that the type of a syntax translation rule is {\tt logic} but does
   689 
   697 %not look at the type unless parsing the rule produces more than one parse
   690 \begin{ttbox}
   698 %tree. In that case this message is output if the rule's type is different
   691 {\out Warning: Ambiguous input "..."}
   699 %from {\tt logic}:
   692 {\out produces the following parse trees:}
   700 %
   693 {\out ...}
   701 %\begin{ttbox}
   694 {\out This occured in syntax translation rule: "..."  ->  "..."}
   702 %{\out Warning: Ambiguous input "..."}
   695 {\out Type checking error: Term does not have expected type}
   703 %{\out produces the following parse trees:}
   696 {\out ...}
   704 %{\out ...}
   697 \end{ttbox}
   705 %{\out This occured in syntax translation rule: "..."  ->  "..."}
   698 
   706 %{\out Type checking error: Term does not have expected type}
   699 To circumvent this the rule's type has to be stated.
   707 %{\out ...}
       
   708 %\end{ttbox}
       
   709 %
       
   710 %To circumvent this the rule's type has to be stated.
   700 
   711 
   701 
   712 
   702 \section{Example: some minimal logics} \label{sec:min_logics}
   713 \section{Example: some minimal logics} \label{sec:min_logics}
   703 \index{examples!of logic definitions}
   714 \index{examples!of logic definitions}
   704 
   715 
   705 This section presents some examples that have a simple syntax.  They
   716 This section presents some examples that have a simple syntax.  They
   706 demonstrate how to define new object-logics from scratch.
   717 demonstrate how to define new object-logics from scratch.
   707 
   718 
   708 First we must define how an object-logic syntax is embedded into the
   719 First we must define how an object-logic syntax is embedded into the
   709 meta-logic.  Since all theorems must conform to the syntax for~\ndx{prop} (see
   720 meta-logic.  Since all theorems must conform to the syntax for~\ndx{prop}
   710 Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the
   721 (see Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the
   711 object-level syntax.  Assume that the syntax of your object-logic defines a
   722 object-level syntax.  Assume that the syntax of your object-logic defines a
   712 nonterminal symbol~\ndx{o} of formulae.  These formulae can now appear in
   723 meta-type~\tydx{o} of formulae which refers to the nonterminal {\tt logic}.
   713 axioms and theorems wherever \ndx{prop} does if you add the production
   724 These formulae can now appear in axioms and theorems wherever \ndx{prop} does
   714 \[ prop ~=~ o. \]
   725 if you add the production
   715 This is not a copy production but a coercion from formulae to propositions:
   726 \[ prop ~=~ logic. \]
       
   727 This is not supposed to be a copy production but an implicit coercion from
       
   728 formulae to propositions:
   716 \begin{ttbox}
   729 \begin{ttbox}
   717 Base = Pure +
   730 Base = Pure +
   718 types
   731 types
   719   o
   732   o
   720 arities
   733 arities