summary |
shortlog |
changelog |
graph |
tags |
bookmarks |
branches |
files |
changeset |
raw | gz |
help

author | wenzelm |

Wed, 18 Jan 1995 10:17:55 +0100 | |

changeset 864 | d63b111b917a |

parent 863 | 67692db44c70 |

child 865 | b38c67991122 |

quite a lot of minor and major revisions (inspecting theories, read_axm,
cert_axm, grammar generation, lexical matters, macro examples, ...);

--- a/doc-src/Ref/defining.tex Fri Jan 13 02:02:00 1995 +0100 +++ b/doc-src/Ref/defining.tex Wed Jan 18 10:17:55 1995 +0100 @@ -6,7 +6,7 @@ distinguishing feature is support for the definition of new logics. Isabelle logics are hierarchies of theories, which are described and -illustrated in +illustrated in \iflabelundefined{sec:defining-theories}{{\em Introduction to Isabelle}}% {\S\ref{sec:defining-theories}}. That material, together with the theory files provided in the examples directories, should suffice for all simple @@ -19,7 +19,7 @@ \section{Priority grammars} \label{sec:priority_grammars} -\index{priority grammars|(} +\index{priority grammars|(} A context-free grammar contains a set of {\bf nonterminal symbols}, a set of {\bf terminal symbols} and a set of {\bf productions}\index{productions}. @@ -39,7 +39,7 @@ Formally, a set of context free productions $G$ induces a derivation relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of terminal or nonterminal symbols. Then -\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] +\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$. The following simple grammar for arithmetic expressions demonstrates how @@ -65,8 +65,8 @@ the left-hand side may be omitted. \item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the priority of the left-hand side actually appears in a column on the far - right. -\item Alternatives are separated by~$|$. + right. +\item Alternatives are separated by~$|$. \item Repetition is indicated by dots~(\dots) in an informal but obvious way. \end{itemize} @@ -89,28 +89,32 @@ \begin{center} \begin{tabular}{rclc} $any$ &=& $prop$ ~~$|$~~ $logic$ \\\\ -$prop$ &=& {\tt PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\ +$prop$ &=& {\tt(} $prop$ {\tt)} \\ + &$|$& $prop^{(4)}$ {\tt::} $type$ & (3) \\ + &$|$& {\tt PROP} $aprop$ \\ &$|$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\ &$|$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\ &$|$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\ &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\ - &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\ + &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\ + &$|$& {\tt OFCLASS} {\tt(} $type$ {\tt,} $logic$ {\tt)} \\\\ $aprop$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\ -$logic$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $logic$ {\tt)} \\ - &$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\ - &$|$& $logic^{(4)}$ {\tt::} $type$ & (4) \\ - &$|$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\ +$logic$ &=& {\tt(} $logic$ {\tt)} \\ + &$|$& $logic^{(4)}$ {\tt::} $type$ & (3) \\ + &$|$& $id$ ~~$|$~~ $var$ + ~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\ + &$|$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\ $idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\ $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\ &$|$& $id$ {\tt ::} $type$ & (0) \\\\ -$type$ &=& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$ - ~~$|$~~ $tvar$ {\tt::} $sort$ \\ +$type$ &=& {\tt(} $type$ {\tt)} \\ + &$|$& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$ + ~~$|$~~ $tvar$ {\tt::} $sort$ \\ &$|$& $id$ ~~$|$~~ $type^{(\infty)}$ $id$ ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ &$|$& $type^{(1)}$ {\tt =>} $type$ & (0) \\ - &$|$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\ - &$|$& {\tt(} $type$ {\tt)} \\\\ + &$|$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0) \\\\ $sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace} ~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace} \end{tabular} @@ -135,30 +139,35 @@ At the root of all object-logics lies the theory \thydx{Pure}. It contains, among many other things, the Pure syntax. An informal account of -this basic syntax (types, terms and formulae) appears in +this basic syntax (types, terms and formulae) appears in \iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}% {\S\ref{sec:forward}}. A more precise description using a priority grammar appears in Fig.\ts\ref{fig:pure_gram}. It defines the following nonterminals: \begin{ttdescription} + \item[\ndxbold{any}] denotes any term. + \item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae - of the meta-logic. + of the meta-logic. Note that user constants of result type {\tt prop} + (i.e.\ $c :: \ldots \To prop$) should always provide concrete syntax. + Otherwise atomic propositions with head $c$ may be printed incorrectly. - \item[\ndxbold{aprop}] denotes atomic propositions. These typically - include the judgement forms of the object-logic; its definition - introduces a meta-level predicate for each judgement form. + \item[\ndxbold{aprop}] denotes atomic propositions. + +%% FIXME huh!? +% These typically +% include the judgement forms of the object-logic; its definition +% introduces a meta-level predicate for each judgement form. \item[\ndxbold{logic}] denotes terms whose type belongs to class - \cldx{logic}. As the syntax is extended by new object-logics, more - productions for {\tt logic} are added automatically (see below). + \cldx{logic}, excluding type \tydx{prop}. - \item[\ndxbold{any}] denotes terms that either belong to {\tt prop} - or {\tt logic}. + \item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained + by types. \item[\ndxbold{type}] denotes types of the meta-logic. - \item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained - by types. + \item[\ndxbold{sort}] denotes meta-level sorts. \end{ttdescription} \begin{warn} @@ -172,28 +181,6 @@ yields an error. The correct form is \verb|(x::nat) (y::nat)|. \end{warn} -\subsection{Logical types and default syntax}\label{logical-types} -\index{lambda calc@$\lambda$-calculus} - -Isabelle's representation of mathematical languages is based on the -simply typed $\lambda$-calculus. All logical types, namely those of -class \cldx{logic}, are automatically equipped with a basic syntax of -types, identifiers, variables, parentheses, $\lambda$-abstractions and -applications. - -More precisely, Isabelle internally replaces every nonterminal by -$logic$ if it belongs to a subclass of \cldx{logic}. Thereby these -productions (which actually are productions of the nonterminal -$logic$) can be used for $ty$: - -\begin{center} -\begin{tabular}{rclc} -$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\ - &$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)}\\ - &$|$& $ty^{(4)}$ {\tt::} $type$ ~~~~~~~ (3) \\\\ -\end{tabular} -\end{center} - \begin{warn} Type constraints bind very weakly. For example, \verb!x<y::nat! is normally parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in @@ -201,6 +188,14 @@ \verb!x<(y::nat)!. \end{warn} +Isabelle's representation of mathematical languages is based on the simply +typed $\lambda$-calculus\index{lambda calc@$\lambda$-calculus}. All user +defined logical types, namely those of class \cldx{logic}, refer to the +nonterminal {\tt logic}. Thus they are automatically equipped with a basic +syntax of types, identifiers, variables, parentheses, $\lambda$-abstractions +and applications. + + \subsection{Lexical matters} The parser does not process input strings directly. It operates on token lists provided by Isabelle's \bfindex{lexer}. There are two kinds of @@ -212,23 +207,40 @@ appear in typewriter font, for example {\tt ==}, {\tt =?=} and {\tt PROP}\@. -Name tokens have a predefined syntax. The lexer distinguishes four -disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type -identifiers\index{type identifiers}, type unknowns\index{type unknowns}. -They are denoted by \ndxbold{id}, \ndxbold{var}, \ndxbold{tid}, -\ndxbold{tvar}, respectively. Typical examples are {\tt x}, {\tt ?x7}, -{\tt 'a}, {\tt ?'a3}. Here is the precise syntax: +Name tokens have a predefined syntax. The lexer distinguishes six disjoint +classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type +identifiers\index{type identifiers}, type unknowns\index{type unknowns}, +\rmindex{numerals}, \rmindex{strings}. They are denoted by \ndxbold{id}, +\ndxbold{var}, \ndxbold{tid}, \ndxbold{tvar}, \ndxbold{xnum}, \ndxbold{xstr}, +respectively. Typical examples are {\tt x}, {\tt ?x7}, {\tt 'a}, {\tt ?'a3}, +{\tt \#42}, {\tt ''foo bar''}. Here is the precise syntax: \begin{eqnarray*} id & = & letter~quasiletter^* \\ var & = & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ tid & = & \mbox{\tt '}id \\ tvar & = & \mbox{\tt ?}tid ~~|~~ - \mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex] + \mbox{\tt ?}tid\mbox{\tt .}nat \\ +xnum & = & \mbox{\tt \#}nat ~~|~~ \mbox{\tt \#\char`\~}nat \\ +xstr & = & \mbox{\tt ''\rm text\tt ''} \\[1ex] letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\ digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\ quasiletter & = & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\ nat & = & digit^+ \end{eqnarray*} +The lexer repeatedly takes the maximal prefix of the input string that forms +a valid token. A maximal prefix that is both a delimiter and a name is +treated as a delimiter. Spaces, tabs, newlines and formfeeds are separators; +they never occur within tokens, except those of class $xstr$. + +\medskip +Delimiters need not be separated by white space. For example, if {\tt -} +is a delimiter but {\tt --} is not, then the string {\tt --} is treated as +two consecutive occurrences of the token~{\tt -}. In contrast, \ML\ +treats {\tt --} as a single symbolic name. The consequence of Isabelle's +more liberal scheme is that the same string may be parsed in different ways +after extending the syntax: after adding {\tt --} as a delimiter, the input +{\tt --} is treated as a single token. + A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally a pair of base name and index (\ML\ type \mltydx{indexname}). These components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or @@ -236,19 +248,11 @@ does not end with digits. If the index is 0, it may be dropped altogether: {\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}. -The lexer repeatedly takes the maximal prefix of the input string that -forms a valid token. A maximal prefix that is both a delimiter and a name -is treated as a delimiter. Spaces, tabs and newlines are separators; they -never occur within tokens. +Tokens of class $xnum$ or $xstr$ are not used by the meta-logic. +Object-logics may provide numerals and string constants by adding appropriate +productions and translation functions. -Delimiters need not be separated by white space. For example, if {\tt -} -is a delimiter but {\tt --} is not, then the string {\tt --} is treated as -two consecutive occurrences of the token~{\tt -}. In contrast, \ML\ -treats {\tt --} as a single symbolic name. The consequence of Isabelle's -more liberal scheme is that the same string may be parsed in different ways -after extending the syntax: after adding {\tt --} as a delimiter, the input -{\tt --} is treated as a single token. - +\medskip Although name tokens are returned from the lexer rather than the parser, it is more logical to regard them as nonterminals. Delimiters, however, are terminals; they are just syntactic sugar and contribute nothing to the @@ -258,6 +262,7 @@ \subsection{*Inspecting the syntax} \begin{ttbox} syn_of : theory -> Syntax.syntax +print_syntax : theory -> unit Syntax.print_syntax : Syntax.syntax -> unit Syntax.print_gram : Syntax.syntax -> unit Syntax.print_trans : Syntax.syntax -> unit @@ -269,12 +274,15 @@ \item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle theory~{\it thy} as an \ML\ value. +\item[\ttindexbold{print_syntax} $thy$] displays the syntax part of $thy$ + using {\tt Syntax.print_syntax}. + \item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all information contained in the syntax {\it syn}. The displayed output can be large. The following two functions are more selective. \item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part - of~{\it syn}, namely the lexicon, roots and productions. These are + of~{\it syn}, namely the lexicon, logical types and productions. These are discussed below. \item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation @@ -287,7 +295,7 @@ \begin{ttbox}\index{*Pure theory} Syntax.print_syntax (syn_of Pure.thy); {\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots} -{\out roots: logic type fun prop} +{\out logtypes: fun itself} {\out prods:} {\out type = tid (1000)} {\out type = tvar (1000)} @@ -307,17 +315,17 @@ \end{ttbox} As you can see, the output is divided into labelled sections. The grammar -is represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest +is represented by {\tt lexicon}, {\tt logtypes} and {\tt prods}. The rest refers to syntactic translations and macro expansion. Here is an explanation of the various sections. \begin{description} \item[{\tt lexicon}] lists the delimiters used for lexical - analysis.\index{delimiters} + analysis.\index{delimiters} - \item[{\tt roots}] lists the grammar's nonterminal symbols. You must - name the desired root when calling lower level functions or specifying - macros. Higher level functions usually expect a type and derive the - actual root as described in~\S\ref{sec:grammar}. + \item[{\tt logtypes}] lists the types that are regarded the same as {\tt + logic} syntactically. Thus types of object-logics (e.g.\ {\tt nat}, say) + will be automatically equipped with the standard syntax of + $\lambda$-calculus. \item[{\tt prods}] lists the \rmindex{productions} of the priority grammar. The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}. @@ -355,70 +363,43 @@ \section{Mixfix declarations} \label{sec:mixfix} -\index{mixfix declarations|(} +\index{mixfix declarations|(} When defining a theory, you declare new constants by giving their names, their type, and an optional {\bf mixfix annotation}. Mixfix annotations allow you to extend Isabelle's basic $\lambda$-calculus syntax with readable notation. They can express any context-free priority grammar. Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more -general than the priority declarations of \ML\ and Prolog. +general than the priority declarations of \ML\ and Prolog. A mixfix annotation defines a production of the priority grammar. It describes the concrete syntax, the translation to abstract syntax, and the pretty printing. Special case annotations provide a simple means of -specifying infix operators, binders and so forth. - -\subsection{Grammar productions}\label{sec:grammar}\index{productions} +specifying infix operators and binders. -Let us examine the treatment of the production -\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\, - A@n^{(p@n)}\, w@n. \] -Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$, -\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals. -In the corresponding mixfix annotation, the priorities are given separately -as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are identified with -types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's -effect on nonterminals is expressed as the function type -\[ [\tau@1, \ldots, \tau@n]\To \tau. \] -Finally, the template -\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \] -describes the strings of terminals. - -A simple type is typically declared for each nonterminal symbol. In -first-order logic, type~$i$ stands for terms and~$o$ for formulae. Only -the outermost type constructor is taken into account. For example, any -type of the form $\sigma list$ stands for a list; productions may refer -to the symbol {\tt list} and will apply to lists of any type. -The symbol associated with a type is called its {\bf root} since it may -serve as the root of a parse tree. Precisely, the root of $(\tau@1, \dots, -\tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is -a type constructor. Type infixes are a special case of this; in -particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}. Finally, the -root of a type variable is {\tt logic}; general productions might -refer to this nonterminal. +%% FIXME remove +%\subsection{Grammar productions}\label{sec:grammar}\index{productions} +% +%Let us examine the treatment of the production +%\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\, +% A@n^{(p@n)}\, w@n. \] +%Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$, +%\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals. +%In the corresponding mixfix annotation, the priorities are given separately +%as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are derived from +%types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's +%effect on nonterminals is expressed as the function type +%\[ [\tau@1, \ldots, \tau@n]\To \tau. \] +%Finally, the template +%\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \] +%describes the strings of terminals. -Identifying nonterminals with types allows a constant's type to specify -syntax as well. We can declare the function~$f$ to have type $[\tau@1, -\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the -layout of the function's $n$ arguments. The constant's name, in this -case~$f$, will also serve as the label in the abstract syntax tree. There -are two exceptions to this treatment of constants: -\begin{enumerate}\index{constants!syntactic} - \item A production need not map directly to a logical function. In this - case, you must declare a constant whose purpose is purely syntactic. - By convention such constants begin with the symbol~{\tt\at}, - ensuring that they can never be written in formulae. - - \item A copy production has no associated constant.\index{productions!copy} -\end{enumerate} -There is something artificial about this representation of productions, -but it is convenient, particularly for simple theory extensions. \subsection{The general mixfix form} Here is a detailed account of mixfix declarations. Suppose the following -line occurs within the {\tt consts} section of a {\tt .thy} file: +line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy} +file: \begin{center} {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)} \end{center} @@ -432,15 +413,16 @@ $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th argument. - \item The constant $c$, if non-empty, is declared to have type $\sigma$. + \item The constant $c$, if non-empty, is declared to have type $\sigma$ + ({\tt consts} section only). \item The string $template$ specifies the right-hand side of the production. It has the form - \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] + \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] where each occurrence of {\tt_} denotes an argument position and the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in the concrete syntax, you must escape it as described below.) The $w@i$ - may consist of \rmindex{delimiters}, spaces or + may consist of \rmindex{delimiters}, spaces or \rmindex{pretty printing} annotations (see below). \item The type $\sigma$ specifies the production's nonterminal symbols @@ -448,8 +430,7 @@ must be a function type with at least~$n$ argument positions, say $\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described - above. Any of these may be function types; the corresponding root is - then \tydx{fun}. + below. Any of these may be function types. \item The optional list~$ps$ may contain at most $n$ integers, say {\tt [$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal @@ -461,46 +442,76 @@ Priorities range between 0 and \ttindexbold{max_pri} (= 1000). \end{itemize} % -The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no -priorities. The resulting production puts no priority constraints on any -of its arguments and has maximal priority itself. Omitting priorities in -this manner will introduce syntactic ambiguities unless the production's -right-hand side is fully bracketed, as in \verb|"if _ then _ else _ fi"|. +The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, +A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the +nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively. +The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if +this is a logical type (namely one of class {\tt logic} excluding {\tt +prop}). Otherwise it is $ty$ (note that only the outermost type constructor +is taken into account). Finally, the nonterminal of a type variable is {\tt +any}. + +\begin{warn} + Theories must sometimes declare types for purely syntactic purposes --- + merely playing the role of nonterminals. One example is \tydx{type}, the + built-in type of types. This is a `type of all types' in the syntactic + sense only. Do not declare such types under {\tt arities} as belonging to + class {\tt logic}\index{*logic class}, for that would make them useless as + separate nonterminal symbols. +\end{warn} + +Associating nonterminals with types allows a constant's type to specify +syntax as well. We can declare the function~$f$ to have type $[\tau@1, +\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout +of the function's $n$ arguments. The constant's name, in this case~$f$, will +also serve as the label in the abstract syntax tree. + +You may also declare mixfix syntax without adding constants to the theory's +signature, by using a {\tt syntax} section instead of {\tt consts}. Thus a +production need not map directly to a logical function (this typically +requires additional syntactic translations, see also +Chapter~\ref{chap:syntax}). + + +\medskip +As a special case of the general mixfix declaration, the form +\begin{center} + {\tt $c$ ::\ "$\sigma$" ("$template$")} +\end{center} +specifies no priorities. The resulting production puts no priority +constraints on any of its arguments and has maximal priority itself. +Omitting priorities in this manner is prone to syntactic ambiguities unless +the production's right-hand side is fully bracketed, as in \verb|"if _ then _ +else _ fi"|. Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"}, is sensible only if~$c$ is an identifier. Otherwise you will be unable to write terms involving~$c$. -\begin{warn} - Theories must sometimes declare types for purely syntactic purposes. One - example is \tydx{type}, the built-in type of types. This is a `type of - all types' in the syntactic sense only. Do not declare such types under - {\tt arities} as belonging to class {\tt logic}\index{*logic class}, for - that would allow their use in arbitrary Isabelle - expressions~(\S\ref{logical-types}). -\end{warn} +\medskip +There is something artificial about the representation of productions as +mixfix declarations, but it is convenient, particularly for simple theory +extensions. + \subsection{Example: arithmetic expressions} \index{examples!of mixfix declarations} -This theory specification contains a {\tt consts} section with mixfix +This theory specification contains a {\tt syntax} section with mixfix declarations encoding the priority grammar from \S\ref{sec:priority_grammars}: \begin{ttbox} EXP = Pure + types exp -arities - exp :: logic -consts +syntax "0" :: "exp" ("0" 9) "+" :: "[exp, exp] => exp" ("_ + _" [0, 1] 0) "*" :: "[exp, exp] => exp" ("_ * _" [3, 2] 2) "-" :: "exp => exp" ("- _" [3] 3) end \end{ttbox} -The {\tt arities} declaration causes {\tt exp} to be added as a new root. -If you put this into a file {\tt EXP.thy} and load it via {\tt - use_thy "EXP"}, you can run some tests: +If you put this into a file {\tt EXP.thy} and load it via {\tt use_thy"EXP"}, +you can run some tests: \begin{ttbox} val read_exp = Syntax.test_read (syn_of EXP.thy) "exp"; {\out val it = fn : string -> unit} @@ -518,8 +529,8 @@ ignoring parse \AST{} translations. The rest is tracing information provided by the macro expander (see \S\ref{sec:macros}). -Executing {\tt Syntax.print_gram} reveals the productions derived -from our mixfix declarations (lots of additional information deleted): +Executing {\tt Syntax.print_gram} reveals the productions derived from the +above mixfix declarations (lots of additional information deleted): \begin{ttbox} Syntax.print_gram (syn_of EXP.thy); {\out exp = "0" => "0" (9)} @@ -530,7 +541,7 @@ \subsection{The mixfix template} -Let us take a closer look at the string $template$ appearing in mixfix +Let us now take a closer look at the string $template$ appearing in mixfix annotations. This string specifies a list of parsing and printing directives: delimiters\index{delimiters}, arguments, spaces, blocks of indentation and line breaks. These are encoded by the following character @@ -578,24 +589,21 @@ Infix operators associating to the left or right can be declared using {\tt infixl} or {\tt infixr}. Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)} -abbreviates the constant declarations +abbreviates the mixfix declarations \begin{ttbox} +"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) "op \(c\)" :: "\(\sigma\)" ("op \(c\)") -"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) \end{ttbox} -and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the constant declarations +and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the mixfix declarations \begin{ttbox} +"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) "op \(c\)" :: "\(\sigma\)" ("op \(c\)") -"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) \end{ttbox} The infix operator is declared as a constant with the prefix {\tt op}. Thus, prefixing infixes with \sdx{op} makes them behave like ordinary function symbols, as in \ML. Special characters occurring in~$c$ must be escaped, as in delimiters, using a single quote. -The expanded forms above would be actually illegal in a {\tt .thy} file -because they declare the constant \hbox{\tt"op \(c\)"} twice. - \subsection{Binders} \indexbold{binders} @@ -612,7 +620,7 @@ and the whole term has type~$\tau@3$. Special characters in $\Q$ must be escaped using a single quote. -The declaration is expanded internally to +The declaration is expanded internally to something like \begin{ttbox} \(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)" "\(\Q\)"\hskip-3pt :: "[idts, \(\tau@2\)] => \(\tau@3\)" ("(3\(\Q\)_./ _)" \(p\)) @@ -647,14 +655,14 @@ \index{ambiguity!of parsed expressions} To keep the grammar small and allow common productions to be shared -all logical types are internally represented -by one nonterminal, namely {\tt logic}. This and omitted or too freely +all logical types (except {\tt prop}) are internally represented +by one nonterminal, namely {\tt logic}. This and omitted or too freely chosen priorities may lead to ways of parsing an expression that were not intended by the theory's maker. In most cases Isabelle is able to -select one of multiple parse trees that an expression has lead +select one of multiple parse trees that an expression has lead to by checking which of them can be typed correctly. But this may not work in every case and always slows down parsing. -The warning and error messages that can be produced during this process are +The warning and error messages that can be produced during this process are as follows: If an ambiguity can be resolved by type inference this warning @@ -681,22 +689,25 @@ \end{ttbox} On the other hand it's also possible that none of the parse trees can be -typed correctly though the user did not make a mistake. By default Isabelle -assumes that the type of a syntax translation rule is {\tt logic} but does -not look at the type unless parsing the rule produces more than one parse -tree. In that case this message is output if the rule's type is different -from {\tt logic}: +typed correctly although the user did not make a mistake. -\begin{ttbox} -{\out Warning: Ambiguous input "..."} -{\out produces the following parse trees:} -{\out ...} -{\out This occured in syntax translation rule: "..." -> "..."} -{\out Type checking error: Term does not have expected type} -{\out ...} -\end{ttbox} - -To circumvent this the rule's type has to be stated. +%% FIXME remove? +%By default Isabelle +%assumes that the type of a syntax translation rule is {\tt logic} but does +%not look at the type unless parsing the rule produces more than one parse +%tree. In that case this message is output if the rule's type is different +%from {\tt logic}: +% +%\begin{ttbox} +%{\out Warning: Ambiguous input "..."} +%{\out produces the following parse trees:} +%{\out ...} +%{\out This occured in syntax translation rule: "..." -> "..."} +%{\out Type checking error: Term does not have expected type} +%{\out ...} +%\end{ttbox} +% +%To circumvent this the rule's type has to be stated. \section{Example: some minimal logics} \label{sec:min_logics} @@ -706,13 +717,15 @@ demonstrate how to define new object-logics from scratch. First we must define how an object-logic syntax is embedded into the -meta-logic. Since all theorems must conform to the syntax for~\ndx{prop} (see -Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the +meta-logic. Since all theorems must conform to the syntax for~\ndx{prop} +(see Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the object-level syntax. Assume that the syntax of your object-logic defines a -nonterminal symbol~\ndx{o} of formulae. These formulae can now appear in -axioms and theorems wherever \ndx{prop} does if you add the production -\[ prop ~=~ o. \] -This is not a copy production but a coercion from formulae to propositions: +meta-type~\tydx{o} of formulae which refers to the nonterminal {\tt logic}. +These formulae can now appear in axioms and theorems wherever \ndx{prop} does +if you add the production +\[ prop ~=~ logic. \] +This is not supposed to be a copy production but an implicit coercion from +formulae to propositions: \begin{ttbox} Base = Pure + types

--- a/doc-src/Ref/syntax.tex Fri Jan 13 02:02:00 1995 +0100 +++ b/doc-src/Ref/syntax.tex Wed Jan 18 10:17:55 1995 +0100 @@ -14,7 +14,7 @@ \section{Abstract syntax trees} \label{sec:asts} -\index{ASTs|(} +\index{ASTs|(} The parser, given a token list from the lexer, applies productions to yield a parse tree\index{parse trees}. By applying some internal transformations @@ -72,7 +72,7 @@ \end{ttbox} is shown as {\tt ("_constrain" ("_abs" x t) ("fun" 'a 'b))}. Both {\tt ()} and {\tt (f)} are illegal because they have too few -subtrees. +subtrees. The resemblance to Lisp's S-expressions is intentional, but there are two kinds of atomic symbols: $\Constant x$ and $\Variable x$. Do not take the @@ -105,14 +105,17 @@ productions used to recognize the input. Such parse trees are simply lists of tokens and constituent parse trees, the latter representing the nonterminals of the productions. Let us refer to the actual productions in -the form displayed by {\tt Syntax.print_syntax}. +the form displayed by {\tt print_syntax} (see \S\ref{sec:inspct-thy} for an +example). Ignoring parse \AST{} translations, parse trees are transformed to \AST{}s by stripping out delimiters and copy productions. More precisely, the mapping $\astofpt{-}$ is derived from the productions as follows: \begin{itemize} \item Name tokens: $\astofpt{t} = \Variable s$, where $t$ is an \ndx{id}, - \ndx{var}, \ndx{tid} or \ndx{tvar} token, and $s$ its associated string. + \ndx{var}, \ndx{tid}, \ndx{tvar}, \ndx{xnum} or \ndx{xstr} token, and $s$ + its associated string. Note that for {\tt xstr} this does not include the + quotes. \item Copy productions:\index{productions!copy} $\astofpt{\ldots P \ldots} = \astofpt{P}$. Here $\ldots$ stands for @@ -122,12 +125,12 @@ \item 0-ary productions: $\astofpt{\ldots \mtt{=>} c} = \Constant c$. Here there are no constituents other than delimiters, which are - discarded. + discarded. \item $n$-ary productions, where $n \ge 1$: delimiters are discarded and the remaining constituents $P@1$, \ldots, $P@n$ are built into an application whose head constant is~$c$: - \[ \astofpt{\ldots P@1 \ldots P@n \ldots \mtt{=>} c} = + \[ \astofpt{\ldots P@1 \ldots P@n \ldots \mtt{=>} c} = \Appl{\Constant c, \astofpt{P@1}, \ldots, \astofpt{P@n}} \] \end{itemize} @@ -154,7 +157,7 @@ "\%x y.\ t" & ("_lambda" ("_idts" x y) t) \\ \end{tabular} \end{center} -\caption{Parsing examples using the Pure syntax}\label{fig:parse_ast} +\caption{Parsing examples using the Pure syntax}\label{fig:parse_ast} \end{figure} \begin{figure} @@ -175,7 +178,7 @@ The names of constant heads in the \AST{} control the translation process. The list of constants invoking parse \AST{} translations appears in the -output of {\tt Syntax.print_syntax} under {\tt parse_ast_translation}. +output of {\tt print_syntax} under {\tt parse_ast_translation}. \section{Transforming \AST{}s to terms}\label{sec:termofast} @@ -208,7 +211,7 @@ \mtt{dummyT})$. \item Function applications with $n$ arguments: - \[ \termofast{\Appl{f, x@1, \ldots, x@n}} = + \[ \termofast{\Appl{f, x@1, \ldots, x@n}} = \termofast{f} \ttapp \termofast{x@1} \ttapp \ldots \ttapp \termofast{x@n} \] @@ -254,18 +257,18 @@ the free variable $x'$. This replaces corresponding occurrences of the constructor \ttindex{Bound} by the term $\ttfct{Free} (x', \mtt{dummyT})$: - \[ \astofterm{\ttfct{Abs} (x, \tau, t)} = - \Appl{\Constant \mtt{"_abs"}, + \[ \astofterm{\ttfct{Abs} (x, \tau, t)} = + \Appl{\Constant \mtt{"_abs"}, constrain(\Variable x', \tau), \astofterm{t'}}. \] - \item $\astofterm{\ttfct{Bound} i} = \Variable \mtt{"B.}i\mtt"$. + \item $\astofterm{\ttfct{Bound} i} = \Variable \mtt{"B.}i\mtt"$. The occurrence of constructor \ttindex{Bound} should never happen when printing well-typed terms; it indicates a de Bruijn index with no matching abstraction. \item Where $f$ is not an application, - \[ \astofterm{f \ttapp x@1 \ttapp \ldots \ttapp x@n} = + \[ \astofterm{f \ttapp x@1 \ttapp \ldots \ttapp x@n} = \Appl{\astofterm{f}, \astofterm{x@1}, \ldots,\astofterm{x@n}} \] \end{itemize} @@ -276,8 +279,8 @@ \item $constrain(x, \tau) = x$ \ if $\tau = \mtt{dummyT}$ \index{*dummyT} or \ttindex{show_types} is set to {\tt false}. - \item $constrain(x, \tau) = \Appl{\Constant \mtt{"_constrain"}, x, - \astofterm{\tau}}$ \ otherwise. + \item $constrain(x, \tau) = \Appl{\Constant \mtt{"_constrain"}, x, + \astofterm{\tau}}$ \ otherwise. Here, $\astofterm{\tau}$ is the \AST{} encoding of $\tau$: type constructors go to {\tt Constant}s; type identifiers go to {\tt @@ -316,7 +319,7 @@ \section{Macros: Syntactic rewriting} \label{sec:macros} -\index{macros|(}\index{rewriting!syntactic|(} +\index{macros|(}\index{rewriting!syntactic|(} Mixfix declarations alone can handle situations where there is a direct connection between the concrete syntax and the underlying term. Sometimes @@ -338,7 +341,7 @@ Figure~\ref{fig:set_trans} defines a fragment of first-order logic and set theory.\footnote{This and the following theories are complete working examples, though they specify only syntax, no axioms. The file {\tt - ZF/ZF.thy} presents the full set theory definition, including many + ZF/ZF.thy} presents a full set theory definition, including many macro rules.} Theory {\tt SET} defines constants for set comprehension ({\tt Collect}), replacement ({\tt Replace}) and bounded universal quantification ({\tt Ball}). Each of these binds some variables. Without @@ -349,16 +352,17 @@ \begin{ttbox} SET = Pure + types - i, o + i o arities i, o :: logic consts Trueprop :: "o => prop" ("_" 5) Collect :: "[i, i => o] => i" + Replace :: "[i, [i, i] => o] => i" + Ball :: "[i, i => o] => o" +syntax "{\at}Collect" :: "[idt, i, o] => i" ("(1{\ttlbrace}_:_./ _{\ttrbrace})") - Replace :: "[i, [i, i] => o] => i" "{\at}Replace" :: "[idt, idt, i, o] => i" ("(1{\ttlbrace}_./ _:_, _{\ttrbrace})") - Ball :: "[i, i => o] => o" "{\at}Ball" :: "[idt, i, o] => o" ("(3ALL _:_./ _)" 10) translations "{\ttlbrace}x:A. P{\ttrbrace}" == "Collect(A, \%x. P)" @@ -369,13 +373,12 @@ \caption{Macro example: set theory}\label{fig:set_trans} \end{figure} -The theory specifies a variable-binding syntax through additional -productions that have mixfix declarations. Each non-copy production must -specify some constant, which is used for building \AST{}s. -\index{constants!syntactic} The additional constants are decorated with -{\tt\at} to stress their purely syntactic purpose; they should never occur -within the final well-typed terms. Furthermore, they cannot be written in -formulae because they are not legal identifiers. +The theory specifies a variable-binding syntax through additional productions +that have mixfix declarations. Each non-copy production must specify some +constant, which is used for building \AST{}s. \index{constants!syntactic} The +additional constants are decorated with {\tt\at} to stress their purely +syntactic purpose; they may not occur within the final well-typed terms, +being declared as {\tt syntax} rather than {\tt consts}. The translations cause the replacement of external forms by internal forms after parsing, and vice versa before printing of terms. As a specification @@ -405,7 +408,7 @@ \[ (root)\; string \quad \left\{\begin{array}[c]{c} \mtt{=>} \\ \mtt{<=} \\ \mtt{==} \end{array} \right\} \quad - (root)\; string + (root)\; string \] % This specifies a parse rule ({\tt =>}), a print rule ({\tt <=}), or both @@ -435,10 +438,10 @@ Some atoms of the macro rule's \AST{} are designated as constants for matching. These are all names that have been declared as classes, types or -constants. +constants (logical and syntactic). The result of this preprocessing is two lists of macro rules, each stored -as a pair of \AST{}s. They can be viewed using {\tt Syntax.print_syntax} +as a pair of \AST{}s. They can be viewed using {\tt print_syntax} (sections \ttindex{parse_rules} and \ttindex{print_rules}). For theory~{\tt SET} of Fig.~\ref{fig:set_trans} these are \begin{ttbox} @@ -455,7 +458,7 @@ \begin{warn} Avoid choosing variable names that have previously been used as constants, types or type classes; the {\tt consts} section in the output - of {\tt Syntax.print_syntax} lists all such names. If a macro rule works + of {\tt print_syntax} lists all such names. If a macro rule works incorrectly, inspect its internal form as shown above, recalling that constants appear as quoted strings and variables without quotes. \end{warn} @@ -477,7 +480,7 @@ binding place (but not at occurrences in the body). Matching with \verb|Collect(A, %x. P)| binds {\tt x} to something like {\tt ("_constrain" y "i")} rather than only {\tt y}. \AST{} rewriting will cause the constraint to -appear in the external form, say \verb|{y::i:A::i. P::o}|. +appear in the external form, say \verb|{y::i:A::i. P::o}|. To allow such constraints to be re-read, your syntax should specify bound variables using the nonterminal~\ndx{idt}. This is the case in our @@ -526,11 +529,11 @@ far removed from parse trees; at this level it is not yet known which identifiers will become constants, bounds, frees, types or classes. As \S\ref{sec:asts} describes, former parse tree heads appear in \AST{}s as -{\tt Constant}s, while the name tokens \ndx{id}, \ndx{var}, \ndx{tid} and -\ndx{tvar} become {\tt Variable}s. On the other hand, when \AST{}s -generated from terms for printing, all constants and type constructors -become {\tt Constant}s; see \S\ref{sec:asts}. Thus \AST{}s may contain a -messy mixture of {\tt Variable}s and {\tt Constant}s. This is +{\tt Constant}s, while the name tokens \ndx{id}, \ndx{var}, \ndx{tid}, +\ndx{tvar}, \ndx{xnum} and \ndx{xstr} become {\tt Variable}s. On the other +hand, when \AST{}s generated from terms for printing, all constants and type +constructors become {\tt Constant}s; see \S\ref{sec:asts}. Thus \AST{}s may +contain a messy mixture of {\tt Variable}s and {\tt Constant}s. This is insignificant at macro level because matching treats them alike. Because of this behaviour, different kinds of atoms with the same name are @@ -540,13 +543,14 @@ Nil consts Nil :: "'a list" +syntax "[]" :: "'a list" ("[]") translations "[]" == "Nil" \end{ttbox} The term {\tt Nil} will be printed as {\tt []}, just as expected. The term \verb|%Nil.t| will be printed as \verb|%[].t|, which might not be -expected! How is the type~{\tt Nil} printed? +expected! Guess how type~{\tt Nil} is printed? Normalizing an \AST{} involves repeatedly applying macro rules until none are applicable. Macro rules are chosen in the order that they appear in the @@ -573,11 +577,13 @@ FINSET = SET + types is -consts +syntax "" :: "i => is" ("_") "{\at}Enum" :: "[i, is] => is" ("_,/ _") +consts empty :: "i" ("{\ttlbrace}{\ttrbrace}") insert :: "[i, i] => i" +syntax "{\at}Finset" :: "is => i" ("{\ttlbrace}(_){\ttrbrace}") translations "{\ttlbrace}x, xs{\ttrbrace}" == "insert(x, {\ttlbrace}xs{\ttrbrace})" @@ -596,12 +602,11 @@ line break is required then a space is printed instead. The nonterminal is declared as the type~{\tt is}, but with no {\tt arities} -declaration. Hence {\tt is} is not a logical type and no default -productions are added. If we had needed enumerations of the nonterminal -{\tt logic}, which would include all the logical types, we could have used -the predefined nonterminal symbol \ndx{args} and skipped this part -altogether. The nonterminal~{\tt is} can later be reused for other -enumerations of type~{\tt i} like lists or tuples. +declaration. Hence {\tt is} is not a logical type and may be used safely as +a new nonterminal for custom syntax. The nonterminal~{\tt is} can later be +re-used for other enumerations of type~{\tt i} like lists or tuples. If we +had needed polymorphic enumerations, we could have used the predefined +nonterminal symbol \ndx{args} and skipped this part altogether. \index{"@Finset@{\tt\at Finset} constant} Next follows {\tt empty}, which is already equipped with its syntax @@ -620,9 +625,9 @@ ("insert" x ("{\at}Finset" xs)) -> ("{\at}Finset" ("{\at}Enum" x xs)) ("insert" x "empty") -> ("{\at}Finset" x) \end{ttbox} -This shows that \verb|{x, xs}| indeed matches any set enumeration of at least +This shows that \verb|{x,xs}| indeed matches any set enumeration of at least two elements, binding the first to {\tt x} and the rest to {\tt xs}. -Likewise, \verb|{xs}| and \verb|{x}| represent any set enumeration. +Likewise, \verb|{xs}| and \verb|{x}| represent any set enumeration. The parse rules only work in the order given. \begin{warn} @@ -635,7 +640,7 @@ \begin{ttbox} \%empty insert. insert(x, empty) \end{ttbox} -\par\noindent is printed as \verb|%empty insert. {x}|. +\par\noindent is incorrectly printed as \verb|%empty insert. {x}|. \end{warn} @@ -643,15 +648,16 @@ \index{examples!of macros} As stated earlier, a macro rule may not introduce new {\tt Variable}s on -the right-hand side. Something like \verb|"K(B)" => "%x. B"| is illegal; +the right-hand side. Something like \verb|"K(B)" => "%x.B"| is illegal; if allowed, it could cause variable capture. In such cases you usually must fall back on translation functions. But a trick can make things -readable in some cases: {\em calling translation functions by parse - macros}: +readable in some cases: {\em calling\/} translation functions by parse +macros: \begin{ttbox} PROD = FINSET + consts Pi :: "[i, i => i] => i" +syntax "{\at}PROD" :: "[idt, i, i] => i" ("(3PROD _:_./ _)" 10) "{\at}->" :: "[i, i] => i" ("(_ ->/ _)" [51, 50] 50) \ttbreak @@ -663,16 +669,15 @@ val print_translation = [("Pi", dependent_tr' ("{\at}PROD", "{\at}->"))]; \end{ttbox} -Here {\tt Pi} is an internal constant for constructing general products. +Here {\tt Pi} is a logical constant for constructing general products. Two external forms exist: the general case {\tt PROD x:A.B} and the function space {\tt A -> B}, which abbreviates \verb|Pi(A, %x.B)| when {\tt B} does not depend on~{\tt x}. -The second parse macro introduces {\tt _K(B)}, which later becomes \verb|%x.B| -due to a parse translation associated with \cdx{_K}. The order of the -parse rules is critical. Unfortunately there is no such trick for -printing, so we have to add a {\tt ML} section for the print translation -\ttindex{dependent_tr'}. +The second parse macro introduces {\tt _K(B)}, which later becomes +\verb|%x.B| due to a parse translation associated with \cdx{_K}. +Unfortunately there is no such trick for printing, so we have to add a {\tt +ML} section for the print translation \ttindex{dependent_tr'}. Recall that identifiers with a leading {\tt _} are allowed in translation rules, but not in ordinary terms. Thus we can create \AST{}s containing @@ -685,13 +690,13 @@ \section{Translation functions} \label{sec:tr_funs} -\index{translations|(} +\index{translations|(} % This section describes the translation function mechanism. By writing \ML{} functions, you can do almost everything with terms or \AST{}s during parsing and printing. The logic \LK\ is a good example of sophisticated transformations between internal and external representations of sequents; -here, macros would be useless. +here, macros would be useless. A full understanding of translations requires some familiarity with Isabelle's internals, especially the datatypes {\tt term}, {\tt typ}, @@ -704,21 +709,21 @@ associated with a name, which triggers calls to it. Such names can be constants (logical or syntactic) or type constructors. -{\tt Syntax.print_syntax} displays the sets of names associated with the +{\tt print_syntax} displays the sets of names associated with the translation functions of a {\tt Syntax.syntax} under \ttindex{parse_ast_translation}, \ttindex{parse_translation}, \ttindex{print_translation} and \ttindex{print_ast_translation}. You can add new ones via the {\tt ML} section\index{*ML section} of a {\tt .thy} file. There may never be more than one function of the same -kind per name. Conceptually, the {\tt ML} section should appear between -{\tt consts} and {\tt translations}; newly installed translation functions -are already effective when macros and logical rules are parsed. +kind per name. Even though the {\tt ML} section is the very last part of a +{\tt .thy} file, newly installed translation functions are effective when +processing the preceding section. -The {\tt ML} section is copied verbatim into the \ML\ file generated from a -{\tt .thy} file. Definitions made here are accessible as components of an -\ML\ structure; to make some definitions private, use an \ML{} {\tt local} -declaration. The {\tt ML} section may install translation functions by -declaring any of the following identifiers: +The {\tt ML} section is copied verbatim near the beginning of the \ML\ file +generated from a {\tt .thy} file. Definitions made here are accessible as +components of an \ML\ structure; to make some definitions private, use an +\ML{} {\tt local} declaration. The {\tt ML} section may install translation +functions by declaring any of the following identifiers: \begin{ttbox} val parse_ast_translation : (string * (ast list -> ast)) list val print_ast_translation : (string * (ast list -> ast)) list @@ -742,8 +747,8 @@ transformations than \AST{}s do, typically involving abstractions and bound variables. -Regardless of whether they act on terms or \AST{}s, -parse translations differ from print translations fundamentally: +Regardless of whether they act on terms or \AST{}s, parse translations differ +from print translations in their overall behaviour fundamentally: \begin{description} \item[Parse translations] are applied bottom-up. The arguments are already in translated form. The translations must not fail; exceptions trigger @@ -786,7 +791,7 @@ Here is the parse translation for {\tt _K}: \begin{ttbox} fun k_tr [t] = Abs ("x", dummyT, incr_boundvars 1 t) - | k_tr ts = raise TERM("k_tr",ts); + | k_tr ts = raise TERM ("k_tr", ts); \end{ttbox} If {\tt k_tr} is called with exactly one argument~$t$, it creates a new {\tt Abs} node with a body derived from $t$. Since terms given to parse @@ -797,7 +802,7 @@ Here is the print translation for dependent types: \begin{ttbox} -fun dependent_tr' (q,r) (A :: Abs (x, T, B) :: ts) = +fun dependent_tr' (q, r) (A :: Abs (x, T, B) :: ts) = if 0 mem (loose_bnos B) then let val (x', B') = variant_abs (x, dummyT, B); in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts) @@ -833,6 +838,6 @@ replaces bound variable occurrences in~$B$ by the free variable $x'$ with type {\tt dummyT}. Only the binding occurrence of~$x'$ is given the correct type~{\tt T}, so this is the only place where a type -constraint might appear. +constraint might appear. \index{translations|)} \index{syntax!transformations|)}

--- a/doc-src/Ref/theories.tex Fri Jan 13 02:02:00 1995 +0100 +++ b/doc-src/Ref/theories.tex Wed Jan 18 10:17:55 1995 +0100 @@ -11,7 +11,7 @@ Signatures, which contain information about sorts, types, constants and syntax, have the \ML\ type~\mltydx{Sign.sg}. For identification, each -signature carries a unique list of \bfindex{stamps}, which are \ML\ +signature carries a unique list of \bfindex{stamps}, which are \ML\ references to strings. The strings serve as human-readable names; the references serve as unique identifiers. Each primitive signature has a single stamp. When two signatures are merged, their lists of stamps are @@ -26,65 +26,68 @@ \section{Defining theories}\label{sec:ref-defining-theories} -Theories can be defined either using concrete syntax or by calling certain -\ML{} functions (see \S\ref{BuildingATheory}). Appendix~\ref{app:TheorySyntax} -presents the concrete syntax for theories; here is an explanation of the -constituent parts: -\begin{description} -\item[{\it theoryDef}] +Theories are usually defined using theory definition files (which have a name +suffix {\tt .thy}). There is also a low level interface provided by certain +\ML{} functions (see \S\ref{BuildingATheory}). +Appendix~\ref{app:TheorySyntax} presents the concrete syntax for theory +definitions; here is an explanation of the constituent parts: +\begin{description} +\item[{\it theoryDef}] is the full definition. The new theory is called $id$. It is the union of the named {\bf parent theories}\indexbold{theories!parent}, possibly extended with new classes, etc. The basic theory, which contains only the meta-logic, is called \thydx{Pure}. - Normally each {\it name\/} is an identifier, the name of the parent - theory. Strings can be used to document additional dependencies; see + Normally each {\it name\/} is an identifier, the name of the parent theory. + Quoted strings can be used to document additional file dependencies; see \S\ref{LoadingTheories} for details. -\item[$classes$] - is a series of class declarations. Declaring {\tt$id$ < $id@1$ \dots\ +\item[$classes$] + is a series of class declarations. Declaring {\tt$id$ < $id@1$ \dots\ $id@n$} makes $id$ a subclass of the existing classes $id@1\dots id@n$. This rules out cyclic class structures. Isabelle automatically computes the transitive closure of subclass hierarchies; it is not necessary to declare {\tt c < e} in addition to {\tt c < d} and {\tt d < e}. -\item[$default$] - introduces $sort$ as the new default sort for type variables. This - applies to unconstrained type variables in an input string but not to - type variables created internally. If omitted, the default sort is the - union of the default sorts of the parent theories. +\item[$default$] + introduces $sort$ as the new default sort for type variables. This applies + to unconstrained type variables in an input string but not to type + variables created internally. If omitted, the default sort is the listwise + union of the default sorts of the parent theories (i.e.\ their logical + intersection). -\item[$sort$] +\item[$sort$] is a finite set of classes. A single class $id$ abbreviates the singleton set {\tt\{}$id${\tt\}}. -\item[$types$] +\item[$types$] is a series of type declarations. Each declares a new type constructor or type synonym. An $n$-place type constructor is specified by $(\alpha@1,\dots,\alpha@n)name$, where the type variables serve only to indicate the number~$n$. - Only 2-place type constructors can have infix status and symbolic names; - an example is {\tt ('a,'b)"*"}, which expresses binary product types. - A {\bf type synonym}\indexbold{type synonyms} is an abbreviation $(\alpha@1,\dots,\alpha@n)name = \mbox{\tt"}\tau\mbox{\tt"}$, where $name$ can be a string and $\tau$ must be enclosed in quotation marks. - -\item[$infix$] + +\item[$infix$] declares a type or constant to be an infix operator of priority $nat$ - associating to the left ({\tt infixl}) or right ({\tt infixr}). + associating to the left ({\tt infixl}) or right ({\tt infixr}). Only + 2-place type constructors can have infix status; an example is {\tt + ('a,'b)~"*"~(infixr~20)}, which expresses binary product types. -\item[$arities$] +\item[$arities$] is a series of arity declarations. Each assigns arities to type constructors. The $name$ must be an existing type constructor, which is given the additional arity $arity$. -\item[$constDecl$] +\item[$constDecl$] is a series of constant declarations. Each new constant $name$ is given - the type specified by the $string$. The optional $mixfix$ annotations - may attach concrete syntax to the constant. + the type specified by the $string$. The optional $mixfix$ annotations may + attach concrete syntax to the constant. A variant of {\tt consts} is the + {\tt syntax} section\index{*syntax section}, which adds just syntax without + declaring logical constants. \item[$mixfix$] \index{mixfix declarations} annotations can take three forms: @@ -104,20 +107,22 @@ like $f(F)$, where $p$ is the priority. \end{itemize} -\item[$trans$] - specifies syntactic translation rules. There are three forms: parse - rules ({\tt =>}), print rules ({\tt <=}), and parse/print rules ({\tt ==}). +\item[$trans$] + specifies syntactic translation rules (macros). There are three forms: + parse rules ({\tt =>}), print rules ({\tt <=}), and parse/print rules ({\tt + ==}). -\item[$rule$] - is a series of rule declarations. Each has a name $id$ and the - formula is given by the $string$. Rule names must be distinct. +\item[$rule$] + is a series of rule declarations. Each has a name $id$ and the formula is + given by the $string$. Rule names must be distinct within any single + theory file. \item[$ml$] \index{*ML section} - consists of \ML\ code, typically for parse and print translations. + consists of \ML\ code, typically for parse and print translation functions. \end{description} % -Chapter~\ref{chap:syntax} explains mixfix declarations, translation rules -and the {\tt ML} section in more detail. +Chapters~\ref{Defining-Logics} and \ref{chap:syntax} explain mixfix +declarations, translation rules and the {\tt ML} section in more detail. \subsection{*Classes and arities} @@ -130,9 +135,11 @@ (\vec{s})c$ with $\vec{r} \neq \vec{s}$. For example, the following is forbidden: \begin{ttbox} -types 'a ty -arities ty :: ({\ttlbrace}logic{\ttrbrace}) logic - ty :: ({\ttlbrace}{\ttrbrace})logic +types + 'a ty +arities + ty :: ({\ttlbrace}logic{\ttrbrace}) logic + ty :: ({\ttlbrace}{\ttrbrace})logic \end{ttbox} \item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty :: @@ -142,10 +149,13 @@ expresses that the set of types represented by $s'$ is a subset of the set of types represented by $s$. For example, the following is forbidden: \begin{ttbox} -classes term < logic -types 'a ty -arities ty :: ({\ttlbrace}logic{\ttrbrace})logic - ty :: ({\ttlbrace}{\ttrbrace})term +classes + term < logic +types + 'a ty +arities + ty :: ({\ttlbrace}logic{\ttrbrace})logic + ty :: ({\ttlbrace}{\ttrbrace})term \end{ttbox} \end{itemize} @@ -153,7 +163,7 @@ \section{Loading a new theory}\label{LoadingTheories} \index{theories!loading}\index{files!reading} -\begin{ttbox} +\begin{ttbox} use_thy : string -> unit time_use_thy : string -> unit loadpath : string list ref \hfill{\bf initially {\tt["."]}} @@ -161,18 +171,18 @@ \end{ttbox} \begin{ttdescription} -\item[\ttindexbold{use_thy} $thyname$] +\item[\ttindexbold{use_thy} $thyname$] reads the theory $thyname$ and creates an \ML{} structure as described below. -\item[\ttindexbold{time_use_thy} $thyname$] +\item[\ttindexbold{time_use_thy} $thyname$] calls {\tt use_thy} $thyname$ and reports the time taken. -\item[\ttindexbold{loadpath}] +\item[\ttindexbold{loadpath}] contains a list of directories to search when locating the files that define a theory. This list is only used if the theory name in {\tt use_thy} does not specify the path explicitly. -\item[\ttindexbold{delete_tmpfiles} := false;] +\item[\ttindexbold{delete_tmpfiles} := false;] suppresses the deletion of temporary files. \end{ttdescription} % @@ -185,34 +195,34 @@ previously; the recursive calls may continue to any depth. One {\tt use_thy} call can read an entire logic provided all theories are linked appropriately. -The result is an \ML\ structure~$T$ containing a component {\tt thy} for -the new theory and components for each of the rules. The structure also +The result is an \ML\ structure~$T$ containing at least a component {\tt thy} +for the new theory and components for each of the rules. The structure also contains the definitions of the {\tt ML} section, if present. The file -{\tt.{\it T}.thy.ML} is then deleted if {\tt delete_tmpfiles} is set -to {\tt true} and no errors occurred. +{\tt.{\it T}.thy.ML} is then deleted if {\tt delete_tmpfiles} is set to {\tt +true} and no errors occurred. Finally the file {\it T}{\tt.ML} is read, if it exists. This file normally begins with the declaration {\tt open~$T$} and contains proofs involving the new theory. -Special applications, such as \ZF's inductive definition package, construct -theories directly by calling \ML\ functions. In this situation there is no -{\tt.thy} file, only an {\tt.ML} file. The {\tt.ML} file must declare an -\ML\ structure having the theory's name. See the file {\tt ZF/ex/LList.ML} -for an example. Section~\ref{sec:pseudo-theories} below describes a way of -linking such theories to their parents. +Some applications construct theories directly by calling \ML\ functions. In +this situation there is no {\tt.thy} file, only an {\tt.ML} file. The +{\tt.ML} file must declare an \ML\ structure having the theory's name and a +component {\tt thy} containing the new theory object. +Section~\ref{sec:pseudo-theories} below describes a way of linking such +theories to their parents. \begin{warn} -Temporary files are written to the current directory, which therefore must -be writable. Isabelle inherits the current directory from the operating -system; you can change it within Isabelle by typing \hbox{\tt\ \ - \ttindex{cd} "{\it dir}";\ \ }. + Temporary files are written to the current directory, so this must be + writable. Isabelle inherits the current directory from the operating + system; you can change it within Isabelle by typing {\tt + cd"$dir$"}\index{*cd}. \end{warn} \section{Reloading modified theories}\label{sec:reloading-theories} \indexbold{theories!reloading} -\begin{ttbox} +\begin{ttbox} update : unit -> unit unlink_thy : string -> unit \end{ttbox} @@ -231,8 +241,8 @@ marks the children as out-of-date. \begin{ttdescription} -\item[\ttindexbold{update}()] - reloads all modified theories and their descendants in the correct order. +\item[\ttindexbold{update}()] + reloads all modified theories and their descendants in the correct order. \item[\ttindexbold{unlink_thy} $thyname$]\indexbold{theories!removing} informs Isabelle that theory $thyname$ no longer exists. If you delete the @@ -245,9 +255,9 @@ \subsection{Important note for Poly/ML users}\index{Poly/{\ML} compiler} The theory mechanism depends upon reference variables. At the end of a Poly/\ML{} session, the contents of references are lost unless they are -declared in the current database. Assignments to references in the {\tt - Pure} database are lost, including all information about loaded theories. -To avoid losing this information simply call +declared in the current database. In particular, assignments to references +of the {\tt Pure} database are lost, including all information about loaded +theories. To avoid losing this information simply call \begin{ttbox} init_thy_reader(); \end{ttbox} @@ -258,60 +268,65 @@ \indexbold{theories!pseudo}% Any automatic reloading facility requires complete knowledge of all dependencies. Sometimes theories depend on objects created in \ML{} files -with no associated {\tt.thy} file. These objects may be theories but they -could also be theorems, proof procedures, etc. +with no associated theory definition file. These objects may be theories but +they could also be theorems, proof procedures, etc. Unless such dependencies are documented, {\tt update} fails to reload these \ML{} files and the system is left in a state where some objects, such as theorems, still refer to old versions of theories. This may lead to the error \begin{ttbox} -Attempt to merge different versions of theory: \(T\) +Attempt to merge different versions of theories: \dots \end{ttbox} Therefore there is a way to link theories and {\bf orphaned} \ML{} files --- -those without associated {\tt.thy} file. +those not associated with a theory definition. Let us assume we have an orphaned \ML{} file named {\tt orphan.ML} and a theory~$B$ that depends on {\tt orphan.ML} --- for example, {\tt B.ML} uses theorems proved in {\tt orphan.ML}. Then {\tt B.thy} should -mention this dependence by means of a string: +mention this dependency as follows: \begin{ttbox} -B = ... + "orphan" + ... +B = \(\ldots\) + "orphan" + \(\ldots\) \end{ttbox} -Strings stand for \ML{} files rather than {\tt.thy} files, and merely -document additional dependencies. Thus {\tt orphan} is not a theory and is -not used in building the base of theory~$B$, but {\tt orphan.ML} is loaded -automatically whenever~$B$ is (re)built. +Quoted strings stand for \ML{} files rather than theories, and merely +document additional dependencies. Thus {\tt orphan} is not used in building +the base of theory~$B$, but {\tt orphan.ML} is loaded automatically +whenever~$B$ is (re)built. -The orphaned file may have its own dependencies. If {\tt orphan.ML} -depends on theories $A@1$, \ldots, $A@n$, record this by creating a {\bf +The orphaned file may have its own dependencies. If {\tt orphan.ML} depends +on theories or files $A@1$, \ldots, $A@n$, record this by creating a {\bf pseudo theory} in the file {\tt orphan.thy}: \begin{ttbox} -orphan = A1 + \(...\) + An +orphan = \(A@1\) + \(\ldots\) + \(A@n\) \end{ttbox} The resulting theory is a dummy, but it ensures that {\tt update} reloads -theory {\it orphan} whenever it reloads one of the $A@i$. +{\tt orphan} whenever it reloads one of the $A@i$. -For an extensive example of how this technique can be used to link over 30 -theory files and load them by just two {\tt use_thy} calls, consult the -source files of {\tt ZF} set theory. +For an extensive example of how this technique can be used to link lots of +theory files and load them by just a few {\tt use_thy} calls, consult the +sources of \ZF{} set theory. \section{Basic operations on theories} -\subsection{Extracting an axiom from a theory} +\subsection{Extracting an axiom or theorem from a theory} \index{theories!axioms of}\index{axioms!extracting} -\begin{ttbox} -get_axiom: theory -> string -> thm -assume_ax: theory -> string -> thm +\index{theories!theorems of}\index{theorems!extracting} +\begin{ttbox} +get_axiom : theory -> string -> thm +get_thm : theory -> string -> thm +assume_ax : theory -> string -> thm \end{ttbox} \begin{ttdescription} -\item[\ttindexbold{get_axiom} $thy$ $name$] +\item[\ttindexbold{get_axiom} $thy$ $name$] returns an axiom with the given $name$ from $thy$, raising exception \xdx{THEORY} if none exists. Merging theories can cause several axioms to have the same name; {\tt get_axiom} returns an arbitrary one. -\item[\ttindexbold{assume_ax} $thy$ $formula$] +\item[\ttindexbold{get_thm} $thy$ $name$] + is analogous to {\tt get_axiom}, but looks for a stored theorem. + +\item[\ttindexbold{assume_ax} $thy$ $formula$] reads the {\it formula} using the syntax of $thy$, following the same conventions as axioms in a theory definition. You can thus pretend that {\it formula} is an axiom and use the resulting theorem like an axiom. @@ -326,18 +341,28 @@ \subsection{Building a theory} \label{BuildingATheory} \index{theories!constructing|bold} -\begin{ttbox} +\begin{ttbox} pure_thy : theory merge_theories : theory * theory -> theory \end{ttbox} \begin{ttdescription} -\item[\ttindexbold{pure_thy}] contains just the types, constants, and syntax +\item[\ttindexbold{pure_thy}] contains just the syntax and signature of the meta-logic. There are no axioms: meta-level inferences are carried out by \ML\ functions. \item[\ttindexbold{merge_theories} ($thy@1$, $thy@2$)] merges the two - theories $thy@1$ and $thy@2$. The resulting theory contains all types, - constants and axioms of the constituent theories; its default sort is the - union of the default sorts of the constituent theories. + theories $thy@1$ and $thy@2$. The resulting theory contains all of the + syntax, signature and axioms of the constituent theories. Merging theories + that contain different identification stamps of the same name fails with + the following message +\begin{ttbox} +Attempt to merge different versions of theories: "\(T@1\)", \(\ldots\), "\(T@n\)" +\end{ttbox} + This error may especially occur when a theory is redeclared --- say to + change an incorrect axiom --- and bindings to old versions persist. + Isabelle ensures that old and new theories of the same name are not + involved in a proof. + +%% FIXME %\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"} $\cdots$] extends % the theory $thy$ with new types, constants, etc. $T$ identifies the theory % internally. When a theory is redeclared, say to change an incorrect axiom, @@ -345,11 +370,12 @@ % new theories are not involved in the same proof. Attempting to combine % different theories having the same name $T$ yields the fatal error %extend_theory : theory -> string -> \(\cdots\) -> theory -\begin{ttbox} -Attempt to merge different versions of theory: \(T\) -\end{ttbox} +%\begin{ttbox} +%Attempt to merge different versions of theory: \(T\) +%\end{ttbox} \end{ttdescription} +%% FIXME %\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"} % ($classes$, $default$, $types$, $arities$, $consts$, $sextopt$) $rules$] %\hfill\break %%% include if line is just too short @@ -381,34 +407,38 @@ %\end{tabular} -\subsection{Inspecting a theory} +\subsection{Inspecting a theory}\label{sec:inspct-thy} \index{theories!inspecting|bold} -\begin{ttbox} +\begin{ttbox} print_theory : theory -> unit -axioms_of : theory -> (string*thm)list +axioms_of : theory -> (string * thm) list +thms_of : theory -> (string * thm) list parents_of : theory -> theory list sign_of : theory -> Sign.sg stamps_of_thy : theory -> string ref list \end{ttbox} -These provide a simple means of viewing a theory's components. -Unfortunately, there is no direct connection between a theorem and its -theory. +These provide means of viewing a theory's components. \begin{ttdescription} -\item[\ttindexbold{print_theory} {\it thy}] -prints the theory {\it thy\/} at the terminal as a set of identifiers. +\item[\ttindexbold{print_theory} $thy$] +prints the contents of $thy$ excluding the syntax related parts (which are +shown by {\tt print_syntax}). The output is quite verbose. + +\item[\ttindexbold{axioms_of} $thy$] +returns the additional axioms of the most recent extend node of~$thy$. -\item[\ttindexbold{axioms_of} $thy$] -returns the axioms of~$thy$ and its ancestors. +\item[\ttindexbold{thms_of} $thy$] +similar to {\tt axioms_of}, but returns the stored theorems. -\item[\ttindexbold{parents_of} $thy$] -returns the complete list of ancestors of~$thy$. +\item[\ttindexbold{parents_of} $thy$] +returns the direct ancestors of~$thy$. + +\item[\ttindexbold{sign_of} $thy$] +returns the signature associated with~$thy$. It is useful with functions +like {\tt read_instantiate_sg}, which take a signature as an argument. \item[\ttindexbold{stamps_of_thy} $thy$]\index{signatures} -returns the \rmindex{stamps} of the signature associated with~$thy$. - -\item[\ttindexbold{sign_of} $thy$] -returns the signature associated with~$thy$. It is useful with functions -like {\tt read_instantiate_sg}, which take a signature as an argument. +returns the identification \rmindex{stamps} of the signature associated +with~$thy$. \end{ttdescription} @@ -431,7 +461,7 @@ is the {\bf constant} with name~$a$ and type~$T$. Constants include connectives like $\land$ and $\forall$ as well as constants like~0 and~$Suc$. Other constants may be required to define a logic's concrete - syntax. + syntax. \item[\ttindexbold{Free}($a$, $T$)] \index{variables!free|bold} is the {\bf free variable} with name~$a$ and type~$T$. @@ -457,12 +487,12 @@ and printing; it has no logical significance. \item[$t$ \$ $u$] \index{$@{\tt\$}|bold} \index{function applications|bold} -is the {\bf application} of~$t$ to~$u$. +is the {\bf application} of~$t$ to~$u$. \end{ttdescription} Application is written as an infix operator to aid readability. Here is an \ML\ pattern to recognize \FOL{} formulae of the form~$A\imp B$, binding the subformulae to~$A$ and~$B$: -\begin{ttbox} +\begin{ttbox} Const("Trueprop",_) $ (Const("op -->",_) $ A $ B) \end{ttbox} @@ -478,32 +508,32 @@ These functions are all concerned with the de Bruijn representation of bound variables. \begin{ttdescription} -\item[\ttindexbold{loose_bnos} $t$] +\item[\ttindexbold{loose_bnos} $t$] returns the list of all dangling bound variable references. In particular, {\tt Bound~0} is loose unless it is enclosed in an abstraction. Similarly {\tt Bound~1} is loose unless it is enclosed in at least two abstractions; if enclosed in just one, the list will contain the number 0. A well-formed term does not contain any loose variables. -\item[\ttindexbold{incr_boundvars} $j$] +\item[\ttindexbold{incr_boundvars} $j$] increases a term's dangling bound variables by the offset~$j$. This is required when moving a subterm into a context where it is enclosed by a different number of abstractions. Bound variables with a matching abstraction are unaffected. -\item[\ttindexbold{abstract_over} $(v,t)$] +\item[\ttindexbold{abstract_over} $(v,t)$] forms the abstraction of~$t$ over~$v$, which may be any well-formed term. It replaces every occurrence of \(v\) by a {\tt Bound} variable with the correct index. -\item[\ttindexbold{variant_abs} $(a,T,u)$] +\item[\ttindexbold{variant_abs} $(a,T,u)$] substitutes into $u$, which should be the body of an abstraction. It replaces each occurrence of the outermost bound variable by a free variable. The free variable has type~$T$ and its name is a variant of~$a$ chosen to be distinct from all constants and from all variables free in~$u$. -\item[$t$ \ttindexbold{aconv} $u$] +\item[$t$ \ttindexbold{aconv} $u$] tests whether terms~$t$ and~$u$ are \(\alpha\)-convertible: identical up to renaming of bound variables. \begin{itemize} @@ -525,12 +555,12 @@ \end{ttdescription} -\section{Certified terms}\index{terms!certified|bold}\index{signatures} -A term $t$ can be {\bf certified} under a signature to ensure that every -type in~$t$ is declared in the signature and every constant in~$t$ is -declared as a constant of the same type in the signature. It must be -well-typed and its use of bound variables must be well-formed. Meta-rules -such as {\tt forall_elim} take certified terms as arguments. +\section{Certified terms}\index{terms!certified|bold}\index{signatures} +A term $t$ can be {\bf certified} under a signature to ensure that every type +in~$t$ is well-formed and every constant in~$t$ is a type instance of a +constant declared in the signature. The term must be well-typed and its use +of bound variables must be well-formed. Meta-rules such as {\tt forall_elim} +take certified terms as arguments. Certified terms belong to the abstract type \mltydx{cterm}. Elements of the type can only be created through the certification process. @@ -538,46 +568,59 @@ \subsection{Printing terms} \index{terms!printing of} -\begin{ttbox} +\begin{ttbox} string_of_cterm : cterm -> string Sign.string_of_term : Sign.sg -> term -> string \end{ttbox} \begin{ttdescription} -\item[\ttindexbold{string_of_cterm} $ct$] +\item[\ttindexbold{string_of_cterm} $ct$] displays $ct$ as a string. -\item[\ttindexbold{Sign.string_of_term} $sign$ $t$] +\item[\ttindexbold{Sign.string_of_term} $sign$ $t$] displays $t$ as a string, using the syntax of~$sign$. \end{ttdescription} \subsection{Making and inspecting certified terms} -\begin{ttbox} +\begin{ttbox} cterm_of : Sign.sg -> term -> cterm read_cterm : Sign.sg -> string * typ -> cterm +cert_axm : Sign.sg -> string * term -> string * term +read_axm : Sign.sg -> string * string -> string * term rep_cterm : cterm -> \{T:typ, t:term, sign:Sign.sg, maxidx:int\} \end{ttbox} \begin{ttdescription} \item[\ttindexbold{cterm_of} $sign$ $t$] \index{signatures} certifies $t$ with respect to signature~$sign$. -\item[\ttindexbold{read_cterm} $sign$ ($s$, $T$)] +\item[\ttindexbold{read_cterm} $sign$ ($s$, $T$)] reads the string~$s$ using the syntax of~$sign$, creating a certified term. The term is checked to have type~$T$; this type also tells the parser what kind of phrase to parse. -\item[\ttindexbold{rep_cterm} $ct$] +\item[\ttindexbold{cert_axm} $sign$ ($name$, $t$)] +certifies $t$ with respect to $sign$ as a meta-proposition and converts all +exceptions to an error, including the final message +\begin{ttbox} +The error(s) above occurred in axiom "\(name\)" +\end{ttbox} + +\item[\ttindexbold{read_axm} $sign$ ($name$, $s$)] +similar to {\tt cert_axm}, but first reads the string $s$ using the syntax of +$sign$. + +\item[\ttindexbold{rep_cterm} $ct$] decomposes $ct$ as a record containing its type, the term itself, its signature, and the maximum subscript of its unknowns. The type and maximum subscript are computed during certification. \end{ttdescription} -\section{Types}\index{types|bold} -Types belong to the \ML\ type \mltydx{typ}, which is a concrete -datatype with three constructor functions. These correspond to type -constructors, free type variables and schematic type variables. Types are -classified by sorts, which are lists of classes. A class is represented by -a string. +\section{Types}\index{types|bold} +Types belong to the \ML\ type \mltydx{typ}, which is a concrete datatype with +three constructor functions. These correspond to type constructors, free +type variables and schematic type variables. Types are classified by sorts, +which are lists of classes (representing an intersection). A class is +represented by a string. \begin{ttbox} type class = string; type sort = class list; @@ -587,7 +630,7 @@ | TVar of indexname * sort; infixr 5 -->; -fun S --> T = Type("fun",[S,T]); +fun S --> T = Type ("fun", [S, T]); \end{ttbox} \begin{ttdescription} \item[\ttindexbold{Type}($a$, $Ts$)] \index{type constructors|bold} @@ -610,26 +653,26 @@ \section{Certified types} \index{types!certified|bold} -Certified types, which are analogous to certified terms, have type +Certified types, which are analogous to certified terms, have type \ttindexbold{ctyp}. \subsection{Printing types} \index{types!printing of} -\begin{ttbox} +\begin{ttbox} string_of_ctyp : ctyp -> string Sign.string_of_typ : Sign.sg -> typ -> string \end{ttbox} \begin{ttdescription} -\item[\ttindexbold{string_of_ctyp} $cT$] +\item[\ttindexbold{string_of_ctyp} $cT$] displays $cT$ as a string. -\item[\ttindexbold{Sign.string_of_typ} $sign$ $T$] +\item[\ttindexbold{Sign.string_of_typ} $sign$ $T$] displays $T$ as a string, using the syntax of~$sign$. \end{ttdescription} \subsection{Making and inspecting certified types} -\begin{ttbox} +\begin{ttbox} ctyp_of : Sign.sg -> typ -> ctyp rep_ctyp : ctyp -> \{T: typ, sign: Sign.sg\} \end{ttbox} @@ -637,7 +680,7 @@ \item[\ttindexbold{ctyp_of} $sign$ $T$] \index{signatures} certifies $T$ with respect to signature~$sign$. -\item[\ttindexbold{rep_ctyp} $cT$] +\item[\ttindexbold{rep_ctyp} $cT$] decomposes $cT$ as a record containing the type itself and its signature. \end{ttdescription}