--- a/doc-src/IsarRef/Thy/Inner_Syntax.thy Mon Jun 18 21:17:34 2012 +0200
+++ b/doc-src/IsarRef/Thy/Inner_Syntax.thy Tue Jun 19 20:43:09 2012 +0200
@@ -1060,7 +1060,7 @@
Isabelle/Pure. *}
-subsection {* Abstract syntax trees *}
+subsection {* Abstract syntax trees \label{sec:ast} *}
text {* The ML datatype @{ML_type Ast.ast} explicitly represents the
intermediate AST format that is used for syntax rewriting
@@ -1099,7 +1099,7 @@
*}
-subsubsection {* Ast constants versus variables *}
+subsubsection {* AST constants versus variables *}
text {* Depending on the situation --- input syntax, output syntax,
translation patterns --- the distinction of atomic asts as @{ML
@@ -1192,12 +1192,12 @@
Unlike mixfix notation for existing formal entities
(\secref{sec:notation}), raw syntax declarations provide full access
- to the priority grammar of the inner syntax. This includes
- additional syntactic categories (via @{command nonterminal}) and
- free-form grammar productions (via @{command syntax}). Additional
- syntax translations (or macros, via @{command translations}) are
- required to turn resulting parse trees into proper representations
- of formal entities again.
+ to the priority grammar of the inner syntax, without any sanity
+ checks. This includes additional syntactic categories (via
+ @{command nonterminal}) and free-form grammar productions (via
+ @{command syntax}). Additional syntax translations (or macros, via
+ @{command translations}) are required to turn resulting parse trees
+ into proper representations of formal entities again.
@{rail "
@@{command nonterminal} (@{syntax name} + @'and')
@@ -1274,10 +1274,51 @@
are interpreted in the same manner as for @{command "syntax"} above.
\item @{command "translations"}~@{text rules} specifies syntactic
- translation rules (i.e.\ macros): parse~/ print rules (@{text "\<rightleftharpoons>"}),
- parse rules (@{text "\<rightharpoonup>"}), or print rules (@{text "\<leftharpoondown>"}).
+ translation rules (i.e.\ macros) as first-order rewrite rules on
+ ASTs (see also \secref{sec:ast}). The theory context maintains two
+ independent lists translation rules: parse rules (@{verbatim "=>"}
+ or @{text "\<rightharpoonup>"}) and print rules (@{verbatim "<="} or @{text "\<leftharpoondown>"}).
+ For convenience, both can be specified simultaneously as parse~/
+ print rules (@{verbatim "=="} or @{text "\<rightleftharpoons>"}).
+
Translation patterns may be prefixed by the syntactic category to be
- used for parsing; the default is @{text logic}.
+ used for parsing; the default is @{text logic} which means that
+ regular term syntax is used. Both sides of the syntax translation
+ rule undergo parsing and parse AST translations
+ \secref{sec:tr-funs}, in order to perform some fundamental
+ normalization like @{text "\<lambda>x y. b \<leadsto> \<lambda>x. \<lambda>y. b"}, but other AST
+ translation rules are \emph{not} applied recursively here.
+
+ When processing AST patterns, the inner syntax lexer runs in a
+ different mode that allows identifiers to start with underscore.
+ This accommodates the usual naming convention for auxiliary syntax
+ constants --- those that do not have a logical counter part --- by
+ allowing to specify arbitrary AST applications within the term
+ syntax, independently of the corresponding concrete syntax.
+
+ Atomic ASTs are distinguished as @{ML Ast.Constant} versus @{ML
+ Ast.Variable} as follows: a qualified name or syntax constant
+ declared via @{command syntax}, or parse tree head of concrete
+ notation becomes @{ML Ast.Constant}, anything else @{ML
+ Ast.Variable}. Note that @{text CONST} and @{text XCONST} within
+ the term language (\secref{sec:pure-grammar}) allow to enforce
+ treatment as constants.
+
+ AST rewrite rules @{text "(lhs, rhs)"} need to obey the following
+ side-conditions:
+
+ \begin{itemize}
+
+ \item Rules must be left linear: @{text "lhs"} must not contain
+ repeated variables.\footnote{The deeper reason for this is that AST
+ equality is not well-defined: different occurrences of the ``same''
+ AST could be decorated differently by accidental type-constraints or
+ source position information, for example.}
+
+ \item Every variable in @{text "rhs"} must also occur in @{text
+ "lhs"}.
+
+ \end{itemize}
\item @{command "no_translations"}~@{text rules} removes syntactic
translation rules, which are interpreted in the same manner as for
--- a/doc-src/IsarRef/Thy/document/Inner_Syntax.tex Mon Jun 18 21:17:34 2012 +0200
+++ b/doc-src/IsarRef/Thy/document/Inner_Syntax.tex Tue Jun 19 20:43:09 2012 +0200
@@ -1239,7 +1239,7 @@
\end{isamarkuptext}%
\isamarkuptrue%
%
-\isamarkupsubsection{Abstract syntax trees%
+\isamarkupsubsection{Abstract syntax trees \label{sec:ast}%
}
\isamarkuptrue%
%
@@ -1283,7 +1283,7 @@
\end{isamarkuptext}%
\isamarkuptrue%
%
-\isamarkupsubsubsection{Ast constants versus variables%
+\isamarkupsubsubsection{AST constants versus variables%
}
\isamarkuptrue%
%
@@ -1383,12 +1383,12 @@
Unlike mixfix notation for existing formal entities
(\secref{sec:notation}), raw syntax declarations provide full access
- to the priority grammar of the inner syntax. This includes
- additional syntactic categories (via \hyperlink{command.nonterminal}{\mbox{\isa{\isacommand{nonterminal}}}}) and
- free-form grammar productions (via \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}}). Additional
- syntax translations (or macros, via \hyperlink{command.translations}{\mbox{\isa{\isacommand{translations}}}}) are
- required to turn resulting parse trees into proper representations
- of formal entities again.
+ to the priority grammar of the inner syntax, without any sanity
+ checks. This includes additional syntactic categories (via
+ \hyperlink{command.nonterminal}{\mbox{\isa{\isacommand{nonterminal}}}}) and free-form grammar productions (via
+ \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}}). Additional syntax translations (or macros, via
+ \hyperlink{command.translations}{\mbox{\isa{\isacommand{translations}}}}) are required to turn resulting parse trees
+ into proper representations of formal entities again.
\begin{railoutput}
\rail@begin{2}{}
@@ -1525,10 +1525,48 @@
are interpreted in the same manner as for \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}} above.
\item \hyperlink{command.translations}{\mbox{\isa{\isacommand{translations}}}}~\isa{rules} specifies syntactic
- translation rules (i.e.\ macros): parse~/ print rules (\isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C72696768746C656674686172706F6F6E733E}{\isasymrightleftharpoons}}{\isaliteral{22}{\isachardoublequote}}}),
- parse rules (\isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C7269676874686172706F6F6E75703E}{\isasymrightharpoonup}}{\isaliteral{22}{\isachardoublequote}}}), or print rules (\isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C6C656674686172706F6F6E646F776E3E}{\isasymleftharpoondown}}{\isaliteral{22}{\isachardoublequote}}}).
+ translation rules (i.e.\ macros) as first-order rewrite rules on
+ ASTs (see also \secref{sec:ast}). The theory context maintains two
+ independent lists translation rules: parse rules (\verb|=>|
+ or \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C7269676874686172706F6F6E75703E}{\isasymrightharpoonup}}{\isaliteral{22}{\isachardoublequote}}}) and print rules (\verb|<=| or \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C6C656674686172706F6F6E646F776E3E}{\isasymleftharpoondown}}{\isaliteral{22}{\isachardoublequote}}}).
+ For convenience, both can be specified simultaneously as parse~/
+ print rules (\verb|==| or \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C72696768746C656674686172706F6F6E733E}{\isasymrightleftharpoons}}{\isaliteral{22}{\isachardoublequote}}}).
+
Translation patterns may be prefixed by the syntactic category to be
- used for parsing; the default is \isa{logic}.
+ used for parsing; the default is \isa{logic} which means that
+ regular term syntax is used. Both sides of the syntax translation
+ rule undergo parsing and parse AST translations
+ \secref{sec:tr-funs}, in order to perform some fundamental
+ normalization like \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{5C3C6C616D6264613E}{\isasymlambda}}x\ y{\isaliteral{2E}{\isachardot}}\ b\ {\isaliteral{5C3C6C65616473746F3E}{\isasymleadsto}}\ {\isaliteral{5C3C6C616D6264613E}{\isasymlambda}}x{\isaliteral{2E}{\isachardot}}\ {\isaliteral{5C3C6C616D6264613E}{\isasymlambda}}y{\isaliteral{2E}{\isachardot}}\ b{\isaliteral{22}{\isachardoublequote}}}, but other AST
+ translation rules are \emph{not} applied recursively here.
+
+ When processing AST patterns, the inner syntax lexer runs in a
+ different mode that allows identifiers to start with underscore.
+ This accommodates the usual naming convention for auxiliary syntax
+ constants --- those that do not have a logical counter part --- by
+ allowing to specify arbitrary AST applications within the term
+ syntax, independently of the corresponding concrete syntax.
+
+ Atomic ASTs are distinguished as \verb|Ast.Constant| versus \verb|Ast.Variable| as follows: a qualified name or syntax constant
+ declared via \hyperlink{command.syntax}{\mbox{\isa{\isacommand{syntax}}}}, or parse tree head of concrete
+ notation becomes \verb|Ast.Constant|, anything else \verb|Ast.Variable|. Note that \isa{CONST} and \isa{XCONST} within
+ the term language (\secref{sec:pure-grammar}) allow to enforce
+ treatment as constants.
+
+ AST rewrite rules \isa{{\isaliteral{22}{\isachardoublequote}}{\isaliteral{28}{\isacharparenleft}}lhs{\isaliteral{2C}{\isacharcomma}}\ rhs{\isaliteral{29}{\isacharparenright}}{\isaliteral{22}{\isachardoublequote}}} need to obey the following
+ side-conditions:
+
+ \begin{itemize}
+
+ \item Rules must be left linear: \isa{{\isaliteral{22}{\isachardoublequote}}lhs{\isaliteral{22}{\isachardoublequote}}} must not contain
+ repeated variables.\footnote{The deeper reason for this is that AST
+ equality is not well-defined: different occurrences of the ``same''
+ AST could be decorated differently by accidental type-constraints or
+ source position information, for example.}
+
+ \item Every variable in \isa{{\isaliteral{22}{\isachardoublequote}}rhs{\isaliteral{22}{\isachardoublequote}}} must also occur in \isa{{\isaliteral{22}{\isachardoublequote}}lhs{\isaliteral{22}{\isachardoublequote}}}.
+
+ \end{itemize}
\item \hyperlink{command.no-translations}{\mbox{\isa{\isacommand{no{\isaliteral{5F}{\isacharunderscore}}translations}}}}~\isa{rules} removes syntactic
translation rules, which are interpreted in the same manner as for
--- a/doc-src/Ref/syntax.tex Mon Jun 18 21:17:34 2012 +0200
+++ b/doc-src/Ref/syntax.tex Tue Jun 19 20:43:09 2012 +0200
@@ -239,23 +239,6 @@
\section{Macros: syntactic rewriting} \label{sec:macros}
\index{macros|(}\index{rewriting!syntactic|(}
-Mixfix declarations alone can handle situations where there is a direct
-connection between the concrete syntax and the underlying term. Sometimes
-we require a more elaborate concrete syntax, such as quantifiers and list
-notation. Isabelle's {\bf macros} and {\bf translation functions} can
-perform translations such as
-\begin{center}\tt
- \begin{tabular}{r@{$\quad\protect\rightleftharpoons\quad$}l}
- ALL x:A.P & Ball(A, \%x.P) \\ \relax
- [x, y, z] & Cons(x, Cons(y, Cons(z, Nil)))
- \end{tabular}
-\end{center}
-Translation functions (see \S\ref{sec:tr_funs}) must be coded in ML; they
-are the most powerful translation mechanism but are difficult to read or
-write. Macros are specified by first-order rewriting systems that operate
-on abstract syntax trees. They are usually easy to read and write, and can
-express all but the most obscure translations.
-
Figure~\ref{fig:set_trans} defines a fragment of first-order logic and
set theory.\footnote{This and the following theories are complete
working examples, though they specify only syntax, no axioms. The
@@ -317,50 +300,8 @@
\subsection{Specifying macros}
-Macros are basically rewrite rules on \AST{}s. But unlike other macro
-systems found in programming languages, Isabelle's macros work in both
-directions. Therefore a syntax contains two lists of rewrites: one for
-parsing and one for printing.
-\index{*translations section}
-The {\tt translations} section specifies macros. The syntax for a macro is
-\[ (root)\; string \quad
- \left\{\begin{array}[c]{c} \mtt{=>} \\ \mtt{<=} \\ \mtt{==} \end{array}
- \right\} \quad
- (root)\; string
-\]
-%
-This specifies a parse rule ({\tt =>}), a print rule ({\tt <=}), or both
-({\tt ==}). The two strings specify the left and right-hand sides of the
-macro rule. The $(root)$ specification is optional; it specifies the
-nonterminal for parsing the $string$ and if omitted defaults to {\tt
- logic}. \AST{} rewrite rules $(l, r)$ must obey certain conditions:
-\begin{itemize}
-\item Rules must be left linear: $l$ must not contain repeated variables.
-
-\item Every variable in~$r$ must also occur in~$l$.
-\end{itemize}
-
-Macro rules may refer to any syntax from the parent theories. They
-may also refer to anything defined before the current {\tt
- translations} section --- including any mixfix declarations.
-
-Upon declaration, both sides of the macro rule undergo parsing and parse
-\AST{} translations (see \S\ref{sec:asts}), but do not themselves undergo
-macro expansion. The lexer runs in a different mode that additionally
-accepts identifiers of the form $\_~letter~quasiletter^*$ (like {\tt _idt},
-{\tt _K}). Thus, a constant whose name starts with an underscore can
-appear in macro rules but not in ordinary terms.
-
-Some atoms of the macro rule's \AST{} are designated as constants for
-matching. These are all names that have been declared as classes, types or
-constants (logical and syntactic).
-
-The result of this preprocessing is two lists of macro rules, each
-stored as a pair of \AST{}s. They can be viewed using {\tt
- print_syntax} (sections \ttindex{parse_rules} and
-\ttindex{print_rules}). For theory~{\tt SetSyntax} of
-Fig.~\ref{fig:set_trans} these are
+For theory~{\tt SetSyntax} of Fig.~\ref{fig:set_trans} these are
\begin{ttbox}
parse_rules:
("{\at}Collect" x A P) -> ("Collect" A ("_abs" x P))