doc-src/IsarRef/Thy/document/Inner_Syntax.tex
changeset 48114 428e55887f24
parent 48113 1c4500446ba4
child 48115 d868e4f7905b
--- a/doc-src/IsarRef/Thy/document/Inner_Syntax.tex	Mon Jun 18 16:30:20 2012 +0200
+++ b/doc-src/IsarRef/Thy/document/Inner_Syntax.tex	Mon Jun 18 21:17:34 2012 +0200
@@ -1244,7 +1244,127 @@
 \isamarkuptrue%
 %
 \begin{isamarkuptext}%
-FIXME%
+The ML datatype \verb|Ast.ast| explicitly represents the
+  intermediate AST format that is used for syntax rewriting
+  (\secref{sec:syn-trans}).  It is defined in ML as follows:
+  \begin{ttbox}
+  datatype ast =
+    Constant of string |
+    Variable of string |
+    Appl of ast list
+  \end{ttbox}
+
+  An AST is either an atom (constant or variable) or a list of (at
+  least two) subtrees.  Occasional diagnostic output of ASTs uses
+  notation that resembles S-expression of LISP.  Constant atoms are
+  shown as quoted strings, variable atoms as non-quoted strings and
+  applications as a parenthesized list of subtrees.  For example, the
+  AST
+  \begin{verbatim}
+Ast.Appl
+  [Ast.Constant "_abs", Ast.Variable "x", Ast.Variable "t"]
+\end{verbatim}
+  is pretty-printed as \verb|("_abs" x t)|.  Note that
+  \verb|()| and \verb|(x)| are excluded as ASTs, because
+  they have too few subtrees.
+
+  \medskip AST application is merely a pro-forma mechanism to indicate
+  certain syntactic structures.  Thus \verb|(c a b)| could mean
+  either term application or type application, depending on the
+  syntactic context.
+
+  Nested application like \verb|(("_abs" x t) u)| is also
+  possible, but ASTs are definitely first-order: the syntax constant
+  \verb|"_abs"| does not bind the \verb|x| in any way.
+  Proper bindings are introduced in later stages of the term syntax,
+  where \verb|("_abs" x t)| becomes an \verb|Abs| node and
+  occurrences of \verb|x| in \verb|t| are replaced by bound
+  variables (represented as de-Bruijn indices).%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isamarkupsubsubsection{Ast constants versus variables%
+}
+\isamarkuptrue%
+%
+\begin{isamarkuptext}%
+Depending on the situation --- input syntax, output syntax,
+  translation patterns --- the distinction of atomic asts as \verb|Ast.Constant| versus \verb|Ast.Variable| serves slightly different
+  purposes.
+
+  Input syntax of a term such as \isa{{\isaliteral{22}{\isachardoublequote}}f\ a\ b\ {\isaliteral{3D}{\isacharequal}}\ c{\isaliteral{22}{\isachardoublequote}}} does not yet
+  indicate the scopes of atomic entities \isa{{\isaliteral{22}{\isachardoublequote}}f{\isaliteral{2C}{\isacharcomma}}\ a{\isaliteral{2C}{\isacharcomma}}\ b{\isaliteral{2C}{\isacharcomma}}\ c{\isaliteral{22}{\isachardoublequote}}}: they
+  could be global constants or local variables, even bound ones
+  depending on the context of the term.  \verb|Ast.Variable| leaves
+  this choice still open: later syntax layers (or translation
+  functions) may capture such a variable to determine its role
+  specifically, to make it a constant, bound variable, free variable
+  etc.  In contrast, syntax translations that introduce already known
+  constants would rather do it via \verb|Ast.Constant| to prevent
+  accidental re-interpretation later on.
+
+  Output syntax turns term constants into \verb|Ast.Constant| and
+  variables (free or schematic) into \verb|Ast.Variable|.  This
+  information is precise when printing fully formal \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C6C616D6264613E}{\isasymlambda}}{\isaliteral{22}{\isachardoublequote}}}-terms.
+
+  In AST translation patterns (\secref{sec:syn-trans}) the system
+  guesses from the current theory context which atoms should be
+  treated as constant versus variable for the matching process.
+  Sometimes this needs to be indicated more explicitly using \isa{{\isaliteral{22}{\isachardoublequote}}CONST\ c{\isaliteral{22}{\isachardoublequote}}} inside the term language.  It is also possible to use
+  \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}} declarations (without mixfix annotation) to
+  enforce that certain unqualified names are always treated as
+  constant within the syntax machinery.
+
+  \medskip For ASTs that represent the language of types or sorts, the
+  situation is much simpler, since the concrete syntax already
+  distinguishes type variables from type constants (constructors).  So
+  \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{28}{\isacharparenleft}}{\isaliteral{27}{\isacharprime}}a{\isaliteral{2C}{\isacharcomma}}\ {\isaliteral{27}{\isacharprime}}b{\isaliteral{29}{\isacharparenright}}\ foo{\isaliteral{22}{\isachardoublequote}}} corresponds to an AST application of some
+  constant for \isa{foo} and variable arguments for \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{27}{\isacharprime}}a{\isaliteral{22}{\isachardoublequote}}} and
+  \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{27}{\isacharprime}}b{\isaliteral{22}{\isachardoublequote}}}.  Note that the postfix application is merely a feature
+  of the concrete syntax, while in the AST the constructor occurs in
+  head position.%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isamarkupsubsubsection{Authentic syntax names%
+}
+\isamarkuptrue%
+%
+\begin{isamarkuptext}%
+Naming constant entities within ASTs is another delicate
+  issue.  Unqualified names are looked up in the name space tables in
+  the last stage of parsing, after all translations have been applied.
+  Since syntax transformations do not know about this later name
+  resolution yet, there can be surprises in boundary cases.
+
+  \emph{Authentic syntax names} for \verb|Ast.Constant| avoid this
+  problem: the fully-qualified constant name with a special prefix for
+  its formal category (\isa{{\isaliteral{22}{\isachardoublequote}}class{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}type{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}const{\isaliteral{22}{\isachardoublequote}}}, \isa{{\isaliteral{22}{\isachardoublequote}}fixed{\isaliteral{22}{\isachardoublequote}}}) represents the information faithfully
+  within the untyped AST format.  Accidental overlap with free or
+  bound variables is excluded as well.  Authentic syntax names work
+  implicitly in the following situations:
+
+  \begin{itemize}
+
+  \item Input of term constants (or fixed variables) that are
+  introduced by concrete syntax via \hyperlink{command.notation}{\mbox{\isa{\isacommand{notation}}}}: the
+  correspondence of a particular grammar production to some known term
+  entity is preserved.
+
+  \item Input type constants (constructors) and type classes ---
+  thanks to explicit syntactic distinction independently on the
+  context.
+
+  \item Output of term constants, type constants, type classes ---
+  this information is already available from the internal term to be
+  printed.
+
+  \end{itemize}
+
+  In other words, syntax transformations that operate on input terms
+  written as prefix applications are difficult to achieve.  Luckily,
+  this case rarely occurs in practice, because syntax forms to be
+  translated usually correspond to some bits of concrete notation.%
 \end{isamarkuptext}%
 \isamarkuptrue%
 %