diff -r 1c4500446ba4 -r 428e55887f24 doc-src/IsarRef/Thy/document/Inner_Syntax.tex --- a/doc-src/IsarRef/Thy/document/Inner_Syntax.tex Mon Jun 18 16:30:20 2012 +0200 +++ b/doc-src/IsarRef/Thy/document/Inner_Syntax.tex Mon Jun 18 21:17:34 2012 +0200 @@ -1244,7 +1244,127 @@ \isamarkuptrue% % \begin{isamarkuptext}% -FIXME% +The ML datatype \verb|Ast.ast| explicitly represents the + intermediate AST format that is used for syntax rewriting + (\secref{sec:syn-trans}). It is defined in ML as follows: + \begin{ttbox} + datatype ast = + Constant of string | + Variable of string | + Appl of ast list + \end{ttbox} + + An AST is either an atom (constant or variable) or a list of (at + least two) subtrees. Occasional diagnostic output of ASTs uses + notation that resembles S-expression of LISP. Constant atoms are + shown as quoted strings, variable atoms as non-quoted strings and + applications as a parenthesized list of subtrees. For example, the + AST + \begin{verbatim} +Ast.Appl + [Ast.Constant "_abs", Ast.Variable "x", Ast.Variable "t"] +\end{verbatim} + is pretty-printed as \verb|("_abs" x t)|. Note that + \verb|()| and \verb|(x)| are excluded as ASTs, because + they have too few subtrees. + + \medskip AST application is merely a pro-forma mechanism to indicate + certain syntactic structures. Thus \verb|(c a b)| could mean + either term application or type application, depending on the + syntactic context. + + Nested application like \verb|(("_abs" x t) u)| is also + possible, but ASTs are definitely first-order: the syntax constant + \verb|"_abs"| does not bind the \verb|x| in any way. + Proper bindings are introduced in later stages of the term syntax, + where \verb|("_abs" x t)| becomes an \verb|Abs| node and + occurrences of \verb|x| in \verb|t| are replaced by bound + variables (represented as de-Bruijn indices).% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isamarkupsubsubsection{Ast constants versus variables% +} +\isamarkuptrue% +% +\begin{isamarkuptext}% +Depending on the situation --- input syntax, output syntax, + translation patterns --- the distinction of atomic asts as \verb|Ast.Constant| versus \verb|Ast.Variable| serves slightly different + purposes. + + Input syntax of a term such as \isa{{\isaliteral{22}{\isachardoublequote}}f\ a\ b\ {\isaliteral{3D}{\isacharequal}}\ c{\isaliteral{22}{\isachardoublequote}}} does not yet + indicate the scopes of atomic entities \isa{{\isaliteral{22}{\isachardoublequote}}f{\isaliteral{2C}{\isacharcomma}}\ a{\isaliteral{2C}{\isacharcomma}}\ b{\isaliteral{2C}{\isacharcomma}}\ c{\isaliteral{22}{\isachardoublequote}}}: they + could be global constants or local variables, even bound ones + depending on the context of the term. \verb|Ast.Variable| leaves + this choice still open: later syntax layers (or translation + functions) may capture such a variable to determine its role + specifically, to make it a constant, bound variable, free variable + etc. In contrast, syntax translations that introduce already known + constants would rather do it via \verb|Ast.Constant| to prevent + accidental re-interpretation later on. + + Output syntax turns term constants into \verb|Ast.Constant| and + variables (free or schematic) into \verb|Ast.Variable|. This + information is precise when printing fully formal \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C6C616D6264613E}{\isasymlambda}}{\isaliteral{22}{\isachardoublequote}}}-terms. + + In AST translation patterns (\secref{sec:syn-trans}) the system + guesses from the current theory context which atoms should be + treated as constant versus variable for the matching process. + Sometimes this needs to be indicated more explicitly using \isa{{\isaliteral{22}{\isachardoublequote}}CONST\ c{\isaliteral{22}{\isachardoublequote}}} inside the term language. It is also possible to use + \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}} declarations (without mixfix annotation) to + enforce that certain unqualified names are always treated as + constant within the syntax machinery. + + \medskip For ASTs that represent the language of types or sorts, the + situation is much simpler, since the concrete syntax already + distinguishes type variables from type constants (constructors). So + \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{28}{\isacharparenleft}}{\isaliteral{27}{\isacharprime}}a{\isaliteral{2C}{\isacharcomma}}\ {\isaliteral{27}{\isacharprime}}b{\isaliteral{29}{\isacharparenright}}\ foo{\isaliteral{22}{\isachardoublequote}}} corresponds to an AST application of some + constant for \isa{foo} and variable arguments for \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{27}{\isacharprime}}a{\isaliteral{22}{\isachardoublequote}}} and + \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{27}{\isacharprime}}b{\isaliteral{22}{\isachardoublequote}}}. Note that the postfix application is merely a feature + of the concrete syntax, while in the AST the constructor occurs in + head position.% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isamarkupsubsubsection{Authentic syntax names% +} +\isamarkuptrue% +% +\begin{isamarkuptext}% +Naming constant entities within ASTs is another delicate + issue. Unqualified names are looked up in the name space tables in + the last stage of parsing, after all translations have been applied. + Since syntax transformations do not know about this later name + resolution yet, there can be surprises in boundary cases. + + \emph{Authentic syntax names} for \verb|Ast.Constant| avoid this + problem: the fully-qualified constant name with a special prefix for + its formal category (\isa{{\isaliteral{22}{\isachardoublequote}}class{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}type{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}const{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}fixed{\isaliteral{22}{\isachardoublequote}}}) represents the information faithfully + within the untyped AST format. Accidental overlap with free or + bound variables is excluded as well. Authentic syntax names work + implicitly in the following situations: + + \begin{itemize} + + \item Input of term constants (or fixed variables) that are + introduced by concrete syntax via \hyperlink{command.notation}{\mbox{\isa{\isacommand{notation}}}}: the + correspondence of a particular grammar production to some known term + entity is preserved. + + \item Input type constants (constructors) and type classes --- + thanks to explicit syntactic distinction independently on the + context. + + \item Output of term constants, type constants, type classes --- + this information is already available from the internal term to be + printed. + + \end{itemize} + + In other words, syntax transformations that operate on input terms + written as prefix applications are difficult to achieve. Luckily, + this case rarely occurs in practice, because syntax forms to be + translated usually correspond to some bits of concrete notation.% \end{isamarkuptext}% \isamarkuptrue% %