src/Doc/Isar_Ref/Inner_Syntax.thy
changeset 61458 987533262fc2
parent 61439 2bf52eec4e8a
child 61459 5f2ddeb15c06
equal deleted inserted replaced
61457:3e21699bb83b 61458:987533262fc2
    60     @@{command print_state} @{syntax modes}?
    60     @@{command print_state} @{syntax modes}?
    61     ;
    61     ;
    62     @{syntax_def modes}: '(' (@{syntax name} + ) ')'
    62     @{syntax_def modes}: '(' (@{syntax name} + ) ')'
    63   \<close>}
    63   \<close>}
    64 
    64 
    65   \begin{description}
       
    66 
       
    67   \<^descr> @{command "typ"}~@{text \<tau>} reads and prints a type expression
    65   \<^descr> @{command "typ"}~@{text \<tau>} reads and prints a type expression
    68   according to the current context.
    66   according to the current context.
    69 
    67 
    70   \<^descr> @{command "typ"}~@{text "\<tau> :: s"} uses type-inference to
    68   \<^descr> @{command "typ"}~@{text "\<tau> :: s"} uses type-inference to
    71   determine the most general way to make @{text "\<tau>"} conform to sort
    69   determine the most general way to make @{text "\<tau>"} conform to sort
    99   there.
    97   there.
   100 
    98 
   101   \<^descr> @{command "print_state"} prints the current proof state (if
    99   \<^descr> @{command "print_state"} prints the current proof state (if
   102   present), including current facts and goals.
   100   present), including current facts and goals.
   103 
   101 
   104   \end{description}
       
   105 
   102 
   106   All of the diagnostic commands above admit a list of @{text modes}
   103   All of the diagnostic commands above admit a list of @{text modes}
   107   to be specified, which is appended to the current print mode; see
   104   to be specified, which is appended to the current print mode; see
   108   also \secref{sec:print-modes}.  Thus the output behavior may be
   105   also \secref{sec:print-modes}.  Thus the output behavior may be
   109   modified according particular print mode features.  For example,
   106   modified according particular print mode features.  For example,
   142 
   139 
   143   These configuration options control the detail of information that
   140   These configuration options control the detail of information that
   144   is displayed for types, terms, theorems, goals etc.  See also
   141   is displayed for types, terms, theorems, goals etc.  See also
   145   \secref{sec:config}.
   142   \secref{sec:config}.
   146 
   143 
   147   \begin{description}
       
   148 
       
   149   \<^descr> @{attribute show_markup} controls direct inlining of markup
   144   \<^descr> @{attribute show_markup} controls direct inlining of markup
   150   into the printed representation of formal entities --- notably type
   145   into the printed representation of formal entities --- notably type
   151   and sort constraints.  This enables Prover IDE users to retrieve
   146   and sort constraints.  This enables Prover IDE users to retrieve
   152   that information via tooltips or popups while hovering with the
   147   that information via tooltips or popups while hovering with the
   153   mouse over the output window, for example.  Consequently, this
   148   mouse over the output window, for example.  Consequently, this
   236   \<^descr> @{attribute show_question_marks} controls printing of question
   231   \<^descr> @{attribute show_question_marks} controls printing of question
   237   marks for schematic variables, such as @{text ?x}.  Only the leading
   232   marks for schematic variables, such as @{text ?x}.  Only the leading
   238   question mark is affected, the remaining text is unchanged
   233   question mark is affected, the remaining text is unchanged
   239   (including proper markup for schematic variables that might be
   234   (including proper markup for schematic variables that might be
   240   relevant for user interfaces).
   235   relevant for user interfaces).
   241 
       
   242   \end{description}
       
   243 \<close>
   236 \<close>
   244 
   237 
   245 
   238 
   246 subsection \<open>Alternative print modes \label{sec:print-modes}\<close>
   239 subsection \<open>Alternative print modes \label{sec:print-modes}\<close>
   247 
   240 
   254   The \emph{print mode} facility allows to modify various operations
   247   The \emph{print mode} facility allows to modify various operations
   255   for printing.  Commands like @{command typ}, @{command term},
   248   for printing.  Commands like @{command typ}, @{command term},
   256   @{command thm} (see \secref{sec:print-diag}) take additional print
   249   @{command thm} (see \secref{sec:print-diag}) take additional print
   257   modes as optional argument.  The underlying ML operations are as
   250   modes as optional argument.  The underlying ML operations are as
   258   follows.
   251   follows.
   259 
       
   260   \begin{description}
       
   261 
   252 
   262   \<^descr> @{ML "print_mode_value ()"} yields the list of currently
   253   \<^descr> @{ML "print_mode_value ()"} yields the list of currently
   263   active print mode names.  This should be understood as symbolic
   254   active print mode names.  This should be understood as symbolic
   264   representation of certain individual features for printing (with
   255   representation of certain individual features for printing (with
   265   precedence from left to right).
   256   precedence from left to right).
   269   prepended by the given @{text "modes"}.  This provides a thread-safe
   260   prepended by the given @{text "modes"}.  This provides a thread-safe
   270   way to augment print modes.  It is also monotonic in the set of mode
   261   way to augment print modes.  It is also monotonic in the set of mode
   271   names: it retains the default print mode that certain
   262   names: it retains the default print mode that certain
   272   user-interfaces might have installed for their proper functioning!
   263   user-interfaces might have installed for their proper functioning!
   273 
   264 
   274   \end{description}
       
   275 
   265 
   276   \<^medskip>
   266   \<^medskip>
   277   The pretty printer for inner syntax maintains alternative
   267   The pretty printer for inner syntax maintains alternative
   278   mixfix productions for any print mode name invented by the user, say
   268   mixfix productions for any print mode name invented by the user, say
   279   in commands like @{command notation} or @{command abbreviation}.
   269   in commands like @{command notation} or @{command abbreviation}.
   280   Mode names can be arbitrary, but the following ones have a specific
   270   Mode names can be arbitrary, but the following ones have a specific
   281   meaning by convention:
   271   meaning by convention:
   282 
   272 
   283   \begin{itemize}
       
   284 
       
   285   \<^item> @{verbatim \<open>""\<close>} (the empty string): default mode;
   273   \<^item> @{verbatim \<open>""\<close>} (the empty string): default mode;
   286   implicitly active as last element in the list of modes.
   274   implicitly active as last element in the list of modes.
   287 
   275 
   288   \<^item> @{verbatim input}: dummy print mode that is never active; may
   276   \<^item> @{verbatim input}: dummy print mode that is never active; may
   289   be used to specify notation that is only available for input.
   277   be used to specify notation that is only available for input.
   300   alternative output notation.
   288   alternative output notation.
   301 
   289 
   302   \<^item> @{verbatim latex}: additional mode that is active in {\LaTeX}
   290   \<^item> @{verbatim latex}: additional mode that is active in {\LaTeX}
   303   document preparation of Isabelle theory sources; allows to provide
   291   document preparation of Isabelle theory sources; allows to provide
   304   alternative output notation.
   292   alternative output notation.
   305 
       
   306   \end{itemize}
       
   307 \<close>
   293 \<close>
   308 
   294 
   309 
   295 
   310 section \<open>Mixfix annotations \label{sec:mixfix}\<close>
   296 section \<open>Mixfix annotations \label{sec:mixfix}\<close>
   311 
   297 
   375   A mixfix template may also contain additional directives
   361   A mixfix template may also contain additional directives
   376   for pretty printing, notably spaces, blocks, and breaks.  The
   362   for pretty printing, notably spaces, blocks, and breaks.  The
   377   general template format is a sequence over any of the following
   363   general template format is a sequence over any of the following
   378   entities.
   364   entities.
   379 
   365 
   380   \begin{description}
       
   381 
       
   382   \<^descr> @{text "d"} is a delimiter, namely a non-empty sequence of
   366   \<^descr> @{text "d"} is a delimiter, namely a non-empty sequence of
   383   characters other than the following special characters:
   367   characters other than the following special characters:
   384 
   368 
   385   \<^medskip>
   369   \<^medskip>
   386   \begin{tabular}{ll}
   370   \begin{tabular}{ll}
   422 
   406 
   423   \<^descr> @{verbatim "/"}@{text s} allows a line break.  Here @{text s}
   407   \<^descr> @{verbatim "/"}@{text s} allows a line break.  Here @{text s}
   424   stands for the string of spaces (zero or more) right after the
   408   stands for the string of spaces (zero or more) right after the
   425   slash.  These spaces are printed if the break is \emph{not} taken.
   409   slash.  These spaces are printed if the break is \emph{not} taken.
   426 
   410 
   427   \end{description}
       
   428 
   411 
   429   The general idea of pretty printing with blocks and breaks is also
   412   The general idea of pretty printing with blocks and breaks is also
   430   described in @{cite "paulson-ml2"}; it goes back to @{cite "Oppen:1980"}.
   413   described in @{cite "paulson-ml2"}; it goes back to @{cite "Oppen:1980"}.
   431 \<close>
   414 \<close>
   432 
   415 
   530       (@{syntax nameref} @{syntax mixfix} + @'and')
   513       (@{syntax nameref} @{syntax mixfix} + @'and')
   531     ;
   514     ;
   532     @@{command write} @{syntax mode}? (@{syntax nameref} @{syntax mixfix} + @'and')
   515     @@{command write} @{syntax mode}? (@{syntax nameref} @{syntax mixfix} + @'and')
   533   \<close>}
   516   \<close>}
   534 
   517 
   535   \begin{description}
       
   536 
       
   537   \<^descr> @{command "type_notation"}~@{text "c (mx)"} associates mixfix
   518   \<^descr> @{command "type_notation"}~@{text "c (mx)"} associates mixfix
   538   syntax with an existing type constructor.  The arity of the
   519   syntax with an existing type constructor.  The arity of the
   539   constructor is retrieved from the context.
   520   constructor is retrieved from the context.
   540 
   521 
   541   \<^descr> @{command "no_type_notation"} is similar to @{command
   522   \<^descr> @{command "no_type_notation"} is similar to @{command
   550   but removes the specified syntax annotation from the present
   531   but removes the specified syntax annotation from the present
   551   context.
   532   context.
   552 
   533 
   553   \<^descr> @{command "write"} is similar to @{command "notation"}, but
   534   \<^descr> @{command "write"} is similar to @{command "notation"}, but
   554   works within an Isar proof body.
   535   works within an Isar proof body.
   555 
       
   556   \end{description}
       
   557 \<close>
   536 \<close>
   558 
   537 
   559 
   538 
   560 section \<open>The Pure syntax \label{sec:pure-syntax}\<close>
   539 section \<open>The Pure syntax \label{sec:pure-syntax}\<close>
   561 
   540 
   563 
   542 
   564 text \<open>The inner lexical syntax vaguely resembles the outer one
   543 text \<open>The inner lexical syntax vaguely resembles the outer one
   565   (\secref{sec:outer-lex}), but some details are different.  There are
   544   (\secref{sec:outer-lex}), but some details are different.  There are
   566   two main categories of inner syntax tokens:
   545   two main categories of inner syntax tokens:
   567 
   546 
   568   \begin{enumerate}
       
   569 
       
   570   \<^enum> \emph{delimiters} --- the literal tokens occurring in
   547   \<^enum> \emph{delimiters} --- the literal tokens occurring in
   571   productions of the given priority grammar (cf.\
   548   productions of the given priority grammar (cf.\
   572   \secref{sec:priority-grammar});
   549   \secref{sec:priority-grammar});
   573 
   550 
   574   \<^enum> \emph{named tokens} --- various categories of identifiers etc.
   551   \<^enum> \emph{named tokens} --- various categories of identifiers etc.
   575 
   552 
   576   \end{enumerate}
       
   577 
   553 
   578   Delimiters override named tokens and may thus render certain
   554   Delimiters override named tokens and may thus render certain
   579   identifiers inaccessible.  Sometimes the logical context admits
   555   identifiers inaccessible.  Sometimes the logical context admits
   580   alternative ways to refer to the same entity, potentially via
   556   alternative ways to refer to the same entity, potentially via
   581   qualified names.
   557   qualified names.
   657   "+"}.  Furthermore @{verbatim "+"} associates to the left and
   633   "+"}.  Furthermore @{verbatim "+"} associates to the left and
   658   @{verbatim "*"} to the right.
   634   @{verbatim "*"} to the right.
   659 
   635 
   660   \<^medskip>
   636   \<^medskip>
   661   For clarity, grammars obey these conventions:
   637   For clarity, grammars obey these conventions:
   662   \begin{itemize}
       
   663 
   638 
   664   \<^item> All priorities must lie between 0 and 1000.
   639   \<^item> All priorities must lie between 0 and 1000.
   665 
   640 
   666   \<^item> Priority 0 on the right-hand side and priority 1000 on the
   641   \<^item> Priority 0 on the right-hand side and priority 1000 on the
   667   left-hand side may be omitted.
   642   left-hand side may be omitted.
   673   \<^item> Alternatives are separated by @{text "|"}.
   648   \<^item> Alternatives are separated by @{text "|"}.
   674 
   649 
   675   \<^item> Repetition is indicated by dots @{text "(\<dots>)"} in an informal
   650   \<^item> Repetition is indicated by dots @{text "(\<dots>)"} in an informal
   676   but obvious way.
   651   but obvious way.
   677 
   652 
   678   \end{itemize}
       
   679 
   653 
   680   Using these conventions, the example grammar specification above
   654   Using these conventions, the example grammar specification above
   681   takes the form:
   655   takes the form:
   682   \begin{center}
   656   \begin{center}
   683   \begin{tabular}{rclc}
   657   \begin{tabular}{rclc}
   769   Here literal terminals are printed @{verbatim "verbatim"};
   743   Here literal terminals are printed @{verbatim "verbatim"};
   770   see also \secref{sec:inner-lex} for further token categories of the
   744   see also \secref{sec:inner-lex} for further token categories of the
   771   inner syntax.  The meaning of the nonterminals defined by the above
   745   inner syntax.  The meaning of the nonterminals defined by the above
   772   grammar is as follows:
   746   grammar is as follows:
   773 
   747 
   774   \begin{description}
       
   775 
       
   776   \<^descr> @{syntax_ref (inner) any} denotes any term.
   748   \<^descr> @{syntax_ref (inner) any} denotes any term.
   777 
   749 
   778   \<^descr> @{syntax_ref (inner) prop} denotes meta-level propositions,
   750   \<^descr> @{syntax_ref (inner) prop} denotes meta-level propositions,
   779   which are terms of type @{typ prop}.  The syntax of such formulae of
   751   which are terms of type @{typ prop}.  The syntax of such formulae of
   780   the meta-logic is carefully distinguished from usual conventions for
   752   the meta-logic is carefully distinguished from usual conventions for
   822 
   794 
   823   \<^descr> @{syntax_ref (inner) type} denotes types of the meta-logic.
   795   \<^descr> @{syntax_ref (inner) type} denotes types of the meta-logic.
   824 
   796 
   825   \<^descr> @{syntax_ref (inner) sort} denotes meta-level sorts.
   797   \<^descr> @{syntax_ref (inner) sort} denotes meta-level sorts.
   826 
   798 
   827   \end{description}
       
   828 
   799 
   829   Here are some further explanations of certain syntax features.
   800   Here are some further explanations of certain syntax features.
   830 
       
   831   \begin{itemize}
       
   832 
   801 
   833   \<^item> In @{syntax (inner) idts}, note that @{text "x :: nat y"} is
   802   \<^item> In @{syntax (inner) idts}, note that @{text "x :: nat y"} is
   834   parsed as @{text "x :: (nat y)"}, treating @{text y} like a type
   803   parsed as @{text "x :: (nat y)"}, treating @{text y} like a type
   835   constructor applied to @{text nat}.  To avoid this interpretation,
   804   constructor applied to @{text nat}.  To avoid this interpretation,
   836   write @{text "(x :: nat) y"} with explicit parentheses.
   805   write @{text "(x :: nat) y"} with explicit parentheses.
   849   \<^item> Dummy variables (written as underscore) may occur in different
   818   \<^item> Dummy variables (written as underscore) may occur in different
   850   roles.
   819   roles.
   851 
   820 
   852   \begin{description}
   821   \begin{description}
   853 
   822 
   854   \<^descr> A type ``@{text "_"}'' or ``@{text "_ :: sort"}'' acts like an
   823     \item A type ``@{text "_"}'' or ``@{text "_ :: sort"}'' acts like an
   855   anonymous inference parameter, which is filled-in according to the
   824     anonymous inference parameter, which is filled-in according to the
   856   most general type produced by the type-checking phase.
   825     most general type produced by the type-checking phase.
   857 
   826 
   858   \<^descr> A bound ``@{text "_"}'' refers to a vacuous abstraction, where
   827     \item A bound ``@{text "_"}'' refers to a vacuous abstraction, where
   859   the body does not refer to the binding introduced here.  As in the
   828     the body does not refer to the binding introduced here.  As in the
   860   term @{term "\<lambda>x _. x"}, which is @{text "\<alpha>"}-equivalent to @{text
   829     term @{term "\<lambda>x _. x"}, which is @{text "\<alpha>"}-equivalent to @{text
   861   "\<lambda>x y. x"}.
   830     "\<lambda>x y. x"}.
   862 
   831 
   863   \<^descr> A free ``@{text "_"}'' refers to an implicit outer binding.
   832     \item A free ``@{text "_"}'' refers to an implicit outer binding.
   864   Higher definitional packages usually allow forms like @{text "f x _
   833     Higher definitional packages usually allow forms like @{text "f x _
   865   = x"}.
   834     = x"}.
   866 
   835 
   867   \<^descr> A schematic ``@{text "_"}'' (within a term pattern, see
   836     \item A schematic ``@{text "_"}'' (within a term pattern, see
   868   \secref{sec:term-decls}) refers to an anonymous variable that is
   837     \secref{sec:term-decls}) refers to an anonymous variable that is
   869   implicitly abstracted over its context of locally bound variables.
   838     implicitly abstracted over its context of locally bound variables.
   870   For example, this allows pattern matching of @{text "{x. f x = g
   839     For example, this allows pattern matching of @{text "{x. f x = g
   871   x}"} against @{text "{x. _ = _}"}, or even @{text "{_. _ = _}"} by
   840     x}"} against @{text "{x. _ = _}"}, or even @{text "{_. _ = _}"} by
   872   using both bound and schematic dummies.
   841     using both bound and schematic dummies.
   873 
   842 
   874   \end{description}
   843   \end{description}
   875 
   844 
   876   \<^descr> The three literal dots ``@{verbatim "..."}'' may be also
   845   \<^descr> The three literal dots ``@{verbatim "..."}'' may be also
   877   written as ellipsis symbol @{verbatim "\<dots>"}.  In both cases this
   846   written as ellipsis symbol @{verbatim "\<dots>"}.  In both cases this
   885   rules (\secref{sec:syn-trans}), notably on the RHS.
   854   rules (\secref{sec:syn-trans}), notably on the RHS.
   886 
   855 
   887   \<^descr> @{verbatim XCONST} is similar to @{verbatim CONST}, but
   856   \<^descr> @{verbatim XCONST} is similar to @{verbatim CONST}, but
   888   retains the constant name as given.  This is only relevant to
   857   retains the constant name as given.  This is only relevant to
   889   translation rules (\secref{sec:syn-trans}), notably on the LHS.
   858   translation rules (\secref{sec:syn-trans}), notably on the LHS.
   890 
       
   891   \end{itemize}
       
   892 \<close>
   859 \<close>
   893 
   860 
   894 
   861 
   895 subsection \<open>Inspecting the syntax\<close>
   862 subsection \<open>Inspecting the syntax\<close>
   896 
   863 
   897 text \<open>
   864 text \<open>
   898   \begin{matharray}{rcl}
   865   \begin{matharray}{rcl}
   899     @{command_def "print_syntax"}@{text "\<^sup>*"} & : & @{text "context \<rightarrow>"} \\
   866     @{command_def "print_syntax"}@{text "\<^sup>*"} & : & @{text "context \<rightarrow>"} \\
   900   \end{matharray}
   867   \end{matharray}
   901 
   868 
   902   \begin{description}
       
   903 
       
   904   \<^descr> @{command "print_syntax"} prints the inner syntax of the
   869   \<^descr> @{command "print_syntax"} prints the inner syntax of the
   905   current context.  The output can be quite large; the most important
   870   current context.  The output can be quite large; the most important
   906   sections are explained below.
   871   sections are explained below.
   907 
   872 
   908   \begin{description}
   873   \begin{description}
   909 
   874 
   910   \<^descr> @{text "lexicon"} lists the delimiters of the inner token
   875     \item @{text "lexicon"} lists the delimiters of the inner token
   911   language; see \secref{sec:inner-lex}.
   876     language; see \secref{sec:inner-lex}.
   912 
   877 
   913   \<^descr> @{text "prods"} lists the productions of the underlying
   878     \item @{text "prods"} lists the productions of the underlying
   914   priority grammar; see \secref{sec:priority-grammar}.
   879     priority grammar; see \secref{sec:priority-grammar}.
   915 
   880 
   916   The nonterminal @{text "A\<^sup>(\<^sup>p\<^sup>)"} is rendered in plain text as @{text
   881     The nonterminal @{text "A\<^sup>(\<^sup>p\<^sup>)"} is rendered in plain text as @{text
   917   "A[p]"}; delimiters are quoted.  Many productions have an extra
   882     "A[p]"}; delimiters are quoted.  Many productions have an extra
   918   @{text "\<dots> => name"}.  These names later become the heads of parse
   883     @{text "\<dots> => name"}.  These names later become the heads of parse
   919   trees; they also guide the pretty printer.
   884     trees; they also guide the pretty printer.
   920 
   885 
   921   Productions without such parse tree names are called \emph{copy
   886     Productions without such parse tree names are called \emph{copy
   922   productions}.  Their right-hand side must have exactly one
   887     productions}.  Their right-hand side must have exactly one
   923   nonterminal symbol (or named token).  The parser does not create a
   888     nonterminal symbol (or named token).  The parser does not create a
   924   new parse tree node for copy productions, but simply returns the
   889     new parse tree node for copy productions, but simply returns the
   925   parse tree of the right-hand symbol.
   890     parse tree of the right-hand symbol.
   926 
   891 
   927   If the right-hand side of a copy production consists of a single
   892     If the right-hand side of a copy production consists of a single
   928   nonterminal without any delimiters, then it is called a \emph{chain
   893     nonterminal without any delimiters, then it is called a \emph{chain
   929   production}.  Chain productions act as abbreviations: conceptually,
   894     production}.  Chain productions act as abbreviations: conceptually,
   930   they are removed from the grammar by adding new productions.
   895     they are removed from the grammar by adding new productions.
   931   Priority information attached to chain productions is ignored; only
   896     Priority information attached to chain productions is ignored; only
   932   the dummy value @{text "-1"} is displayed.
   897     the dummy value @{text "-1"} is displayed.
   933 
   898 
   934   \<^descr> @{text "print modes"} lists the alternative print modes
   899     \item @{text "print modes"} lists the alternative print modes
   935   provided by this grammar; see \secref{sec:print-modes}.
   900     provided by this grammar; see \secref{sec:print-modes}.
   936 
   901 
   937   \<^descr> @{text "parse_rules"} and @{text "print_rules"} relate to
   902     \item @{text "parse_rules"} and @{text "print_rules"} relate to
   938   syntax translations (macros); see \secref{sec:syn-trans}.
   903     syntax translations (macros); see \secref{sec:syn-trans}.
   939 
   904 
   940   \<^descr> @{text "parse_ast_translation"} and @{text
   905     \item @{text "parse_ast_translation"} and @{text
   941   "print_ast_translation"} list sets of constants that invoke
   906     "print_ast_translation"} list sets of constants that invoke
   942   translation functions for abstract syntax trees, which are only
   907     translation functions for abstract syntax trees, which are only
   943   required in very special situations; see \secref{sec:tr-funs}.
   908     required in very special situations; see \secref{sec:tr-funs}.
   944 
   909 
   945   \<^descr> @{text "parse_translation"} and @{text "print_translation"}
   910     \item @{text "parse_translation"} and @{text "print_translation"}
   946   list the sets of constants that invoke regular translation
   911     list the sets of constants that invoke regular translation
   947   functions; see \secref{sec:tr-funs}.
   912     functions; see \secref{sec:tr-funs}.
   948 
       
   949   \end{description}
       
   950 
   913 
   951   \end{description}
   914   \end{description}
   952 \<close>
   915 \<close>
   953 
   916 
   954 
   917 
   972 
   935 
   973   Certain warning or error messages are printed, depending on the
   936   Certain warning or error messages are printed, depending on the
   974   situation and the given configuration options.  Parsing ultimately
   937   situation and the given configuration options.  Parsing ultimately
   975   fails, if multiple results remain after the filtering phase.
   938   fails, if multiple results remain after the filtering phase.
   976 
   939 
   977   \begin{description}
       
   978 
       
   979   \<^descr> @{attribute syntax_ambiguity_warning} controls output of
   940   \<^descr> @{attribute syntax_ambiguity_warning} controls output of
   980   explicit warning messages about syntax ambiguity.
   941   explicit warning messages about syntax ambiguity.
   981 
   942 
   982   \<^descr> @{attribute syntax_ambiguity_limit} determines the number of
   943   \<^descr> @{attribute syntax_ambiguity_limit} determines the number of
   983   resulting parse trees that are shown as part of the printed message
   944   resulting parse trees that are shown as part of the printed message
   984   in case of an ambiguity.
   945   in case of an ambiguity.
   985 
       
   986   \end{description}
       
   987 \<close>
   946 \<close>
   988 
   947 
   989 
   948 
   990 section \<open>Syntax transformations \label{sec:syntax-transformations}\<close>
   949 section \<open>Syntax transformations \label{sec:syntax-transformations}\<close>
   991 
   950 
  1131   "const"}, @{text "fixed"}) represents the information faithfully
  1090   "const"}, @{text "fixed"}) represents the information faithfully
  1132   within the untyped AST format.  Accidental overlap with free or
  1091   within the untyped AST format.  Accidental overlap with free or
  1133   bound variables is excluded as well.  Authentic syntax names work
  1092   bound variables is excluded as well.  Authentic syntax names work
  1134   implicitly in the following situations:
  1093   implicitly in the following situations:
  1135 
  1094 
  1136   \begin{itemize}
       
  1137 
       
  1138   \<^item> Input of term constants (or fixed variables) that are
  1095   \<^item> Input of term constants (or fixed variables) that are
  1139   introduced by concrete syntax via @{command notation}: the
  1096   introduced by concrete syntax via @{command notation}: the
  1140   correspondence of a particular grammar production to some known term
  1097   correspondence of a particular grammar production to some known term
  1141   entity is preserved.
  1098   entity is preserved.
  1142 
  1099 
  1146 
  1103 
  1147   \<^item> Output of term constants, type constants, type classes ---
  1104   \<^item> Output of term constants, type constants, type classes ---
  1148   this information is already available from the internal term to be
  1105   this information is already available from the internal term to be
  1149   printed.
  1106   printed.
  1150 
  1107 
  1151   \end{itemize}
       
  1152 
  1108 
  1153   In other words, syntax transformations that operate on input terms
  1109   In other words, syntax transformations that operate on input terms
  1154   written as prefix applications are difficult to make robust.
  1110   written as prefix applications are difficult to make robust.
  1155   Luckily, this case rarely occurs in practice, because syntax forms
  1111   Luckily, this case rarely occurs in practice, because syntax forms
  1156   to be translated usually correspond to some concrete notation.\<close>
  1112   to be translated usually correspond to some concrete notation.\<close>
  1193     mode: ('(' ( @{syntax name} | @'output' | @{syntax name} @'output' ) ')')
  1149     mode: ('(' ( @{syntax name} | @'output' | @{syntax name} @'output' ) ')')
  1194     ;
  1150     ;
  1195     transpat: ('(' @{syntax nameref} ')')? @{syntax string}
  1151     transpat: ('(' @{syntax nameref} ')')? @{syntax string}
  1196   \<close>}
  1152   \<close>}
  1197 
  1153 
  1198   \begin{description}
       
  1199 
       
  1200   \<^descr> @{command "nonterminal"}~@{text c} declares a type
  1154   \<^descr> @{command "nonterminal"}~@{text c} declares a type
  1201   constructor @{text c} (without arguments) to act as purely syntactic
  1155   constructor @{text c} (without arguments) to act as purely syntactic
  1202   type: a nonterminal symbol of the inner syntax.
  1156   type: a nonterminal symbol of the inner syntax.
  1203 
  1157 
  1204   \<^descr> @{command "syntax"}~@{text "(mode) c :: \<sigma> (mx)"} augments the
  1158   \<^descr> @{command "syntax"}~@{text "(mode) c :: \<sigma> (mx)"} augments the
  1213   (@{verbatim "_"}).  The latter correspond to nonterminal symbols
  1167   (@{verbatim "_"}).  The latter correspond to nonterminal symbols
  1214   @{text "A\<^sub>i"} derived from the argument types @{text "\<tau>\<^sub>i"} as
  1168   @{text "A\<^sub>i"} derived from the argument types @{text "\<tau>\<^sub>i"} as
  1215   follows:
  1169   follows:
  1216   \begin{itemize}
  1170   \begin{itemize}
  1217 
  1171 
  1218   \<^item> @{text "prop"} if @{text "\<tau>\<^sub>i = prop"}
  1172     \item @{text "prop"} if @{text "\<tau>\<^sub>i = prop"}
  1219 
  1173 
  1220   \<^item> @{text "logic"} if @{text "\<tau>\<^sub>i = (\<dots>)\<kappa>"} for logical type
  1174     \item @{text "logic"} if @{text "\<tau>\<^sub>i = (\<dots>)\<kappa>"} for logical type
  1221   constructor @{text "\<kappa> \<noteq> prop"}
  1175     constructor @{text "\<kappa> \<noteq> prop"}
  1222 
  1176 
  1223   \<^item> @{text any} if @{text "\<tau>\<^sub>i = \<alpha>"} for type variables
  1177     \item @{text any} if @{text "\<tau>\<^sub>i = \<alpha>"} for type variables
  1224 
  1178 
  1225   \<^item> @{text "\<kappa>"} if @{text "\<tau>\<^sub>i = \<kappa>"} for nonterminal @{text "\<kappa>"}
  1179     \item @{text "\<kappa>"} if @{text "\<tau>\<^sub>i = \<kappa>"} for nonterminal @{text "\<kappa>"}
  1226   (syntactic type constructor)
  1180     (syntactic type constructor)
  1227 
  1181 
  1228   \end{itemize}
  1182   \end{itemize}
  1229 
  1183 
  1230   Each @{text "A\<^sub>i"} is decorated by priority @{text "p\<^sub>i"} from the
  1184   Each @{text "A\<^sub>i"} is decorated by priority @{text "p\<^sub>i"} from the
  1231   given list @{text "ps"}; missing priorities default to 0.
  1185   given list @{text "ps"}; missing priorities default to 0.
  1289   AST rewrite rules @{text "(lhs, rhs)"} need to obey the following
  1243   AST rewrite rules @{text "(lhs, rhs)"} need to obey the following
  1290   side-conditions:
  1244   side-conditions:
  1291 
  1245 
  1292   \begin{itemize}
  1246   \begin{itemize}
  1293 
  1247 
  1294   \<^item> Rules must be left linear: @{text "lhs"} must not contain
  1248     \item Rules must be left linear: @{text "lhs"} must not contain
  1295   repeated variables.\footnote{The deeper reason for this is that AST
  1249     repeated variables.\footnote{The deeper reason for this is that AST
  1296   equality is not well-defined: different occurrences of the ``same''
  1250     equality is not well-defined: different occurrences of the ``same''
  1297   AST could be decorated differently by accidental type-constraints or
  1251     AST could be decorated differently by accidental type-constraints or
  1298   source position information, for example.}
  1252     source position information, for example.}
  1299 
  1253 
  1300   \<^item> Every variable in @{text "rhs"} must also occur in @{text
  1254     \item Every variable in @{text "rhs"} must also occur in @{text
  1301   "lhs"}.
  1255     "lhs"}.
  1302 
  1256 
  1303   \end{itemize}
  1257   \end{itemize}
  1304 
  1258 
  1305   \<^descr> @{command "no_translations"}~@{text rules} removes syntactic
  1259   \<^descr> @{command "no_translations"}~@{text rules} removes syntactic
  1306   translation rules, which are interpreted in the same manner as for
  1260   translation rules, which are interpreted in the same manner as for
  1309   \<^descr> @{attribute syntax_ast_trace} and @{attribute
  1263   \<^descr> @{attribute syntax_ast_trace} and @{attribute
  1310   syntax_ast_stats} control diagnostic output in the AST normalization
  1264   syntax_ast_stats} control diagnostic output in the AST normalization
  1311   process, when translation rules are applied to concrete input or
  1265   process, when translation rules are applied to concrete input or
  1312   output.
  1266   output.
  1313 
  1267 
  1314   \end{description}
       
  1315 
  1268 
  1316   Raw syntax and translations provides a slightly more low-level
  1269   Raw syntax and translations provides a slightly more low-level
  1317   access to the grammar and the form of resulting parse trees.  It is
  1270   access to the grammar and the form of resulting parse trees.  It is
  1318   often possible to avoid this untyped macro mechanism, and use
  1271   often possible to avoid this untyped macro mechanism, and use
  1319   type-safe @{command abbreviation} or @{command notation} instead.
  1272   type-safe @{command abbreviation} or @{command notation} instead.
  1320   Some important situations where @{command syntax} and @{command
  1273   Some important situations where @{command syntax} and @{command
  1321   translations} are really need are as follows:
  1274   translations} are really need are as follows:
  1322 
  1275 
  1323   \begin{itemize}
       
  1324 
       
  1325   \<^item> Iterated replacement via recursive @{command translations}.
  1276   \<^item> Iterated replacement via recursive @{command translations}.
  1326   For example, consider list enumeration @{term "[a, b, c, d]"} as
  1277   For example, consider list enumeration @{term "[a, b, c, d]"} as
  1327   defined in theory @{theory List} in Isabelle/HOL.
  1278   defined in theory @{theory List} in Isabelle/HOL.
  1328 
  1279 
  1329   \<^item> Change of binding status of variables: anything beyond the
  1280   \<^item> Change of binding status of variables: anything beyond the
  1330   built-in @{keyword "binder"} mixfix annotation requires explicit
  1281   built-in @{keyword "binder"} mixfix annotation requires explicit
  1331   syntax translations.  For example, consider list filter
  1282   syntax translations.  For example, consider list filter
  1332   comprehension @{term "[x \<leftarrow> xs . P]"} as defined in theory @{theory
  1283   comprehension @{term "[x \<leftarrow> xs . P]"} as defined in theory @{theory
  1333   List} in Isabelle/HOL.
  1284   List} in Isabelle/HOL.
  1334 
  1285 \<close>
  1335   \end{itemize}
  1286 
  1336 \<close>
       
  1337 
  1287 
  1338 subsubsection \<open>Applying translation rules\<close>
  1288 subsubsection \<open>Applying translation rules\<close>
  1339 
  1289 
  1340 text \<open>As a term is being parsed or printed, an AST is generated as
  1290 text \<open>As a term is being parsed or printed, an AST is generated as
  1341   an intermediate form according to \figref{fig:parse-print}.  The AST
  1291   an intermediate form according to \figref{fig:parse-print}.  The AST
  1354   this purpose.
  1304   this purpose.
  1355 
  1305 
  1356   More precisely, the matching of the object @{text "u"} against the
  1306   More precisely, the matching of the object @{text "u"} against the
  1357   pattern @{text "lhs"} is performed as follows:
  1307   pattern @{text "lhs"} is performed as follows:
  1358 
  1308 
  1359   \begin{itemize}
       
  1360 
       
  1361   \<^item> Objects of the form @{ML Ast.Variable}~@{text "x"} or @{ML
  1309   \<^item> Objects of the form @{ML Ast.Variable}~@{text "x"} or @{ML
  1362   Ast.Constant}~@{text "x"} are matched by pattern @{ML
  1310   Ast.Constant}~@{text "x"} are matched by pattern @{ML
  1363   Ast.Constant}~@{text "x"}.  Thus all atomic ASTs in the object are
  1311   Ast.Constant}~@{text "x"}.  Thus all atomic ASTs in the object are
  1364   treated as (potential) constants, and a successful match makes them
  1312   treated as (potential) constants, and a successful match makes them
  1365   actual constants even before name space resolution (see also
  1313   actual constants even before name space resolution (see also
  1372   Ast.Appl}~@{text "ts"} if @{text "us"} and @{text "ts"} have the
  1320   Ast.Appl}~@{text "ts"} if @{text "us"} and @{text "ts"} have the
  1373   same length and each corresponding subtree matches.
  1321   same length and each corresponding subtree matches.
  1374 
  1322 
  1375   \<^item> In every other case, matching fails.
  1323   \<^item> In every other case, matching fails.
  1376 
  1324 
  1377   \end{itemize}
       
  1378 
  1325 
  1379   A successful match yields a substitution that is applied to @{text
  1326   A successful match yields a substitution that is applied to @{text
  1380   "rhs"}, generating the instance that replaces @{text "u"}.
  1327   "rhs"}, generating the instance that replaces @{text "u"}.
  1381 
  1328 
  1382   Normalizing an AST involves repeatedly applying translation rules
  1329   Normalizing an AST involves repeatedly applying translation rules
  1436   (@@{ML_antiquotation class_syntax} |
  1383   (@@{ML_antiquotation class_syntax} |
  1437    @@{ML_antiquotation type_syntax} |
  1384    @@{ML_antiquotation type_syntax} |
  1438    @@{ML_antiquotation const_syntax} |
  1385    @@{ML_antiquotation const_syntax} |
  1439    @@{ML_antiquotation syntax_const}) name
  1386    @@{ML_antiquotation syntax_const}) name
  1440   \<close>}
  1387   \<close>}
  1441 
       
  1442   \begin{description}
       
  1443 
  1388 
  1444   \<^descr> @{command parse_translation} etc. declare syntax translation
  1389   \<^descr> @{command parse_translation} etc. declare syntax translation
  1445   functions to the theory.  Any of these commands have a single
  1390   functions to the theory.  Any of these commands have a single
  1446   @{syntax text} argument that refers to an ML expression of
  1391   @{syntax text} argument that refers to an ML expression of
  1447   appropriate type as follows:
  1392   appropriate type as follows:
  1484   the given syntax constant, having checked that it has been declared
  1429   the given syntax constant, having checked that it has been declared
  1485   via some @{command syntax} commands within the theory context.  Note
  1430   via some @{command syntax} commands within the theory context.  Note
  1486   that the usual naming convention makes syntax constants start with
  1431   that the usual naming convention makes syntax constants start with
  1487   underscore, to reduce the chance of accidental clashes with other
  1432   underscore, to reduce the chance of accidental clashes with other
  1488   names occurring in parse trees (unqualified constants etc.).
  1433   names occurring in parse trees (unqualified constants etc.).
  1489 
       
  1490   \end{description}
       
  1491 \<close>
  1434 \<close>
  1492 
  1435 
  1493 
  1436 
  1494 subsubsection \<open>The translation strategy\<close>
  1437 subsubsection \<open>The translation strategy\<close>
  1495 
  1438 
  1513 
  1456 
  1514   Regardless of whether they act on ASTs or terms, translation
  1457   Regardless of whether they act on ASTs or terms, translation
  1515   functions called during the parsing process differ from those for
  1458   functions called during the parsing process differ from those for
  1516   printing in their overall behaviour:
  1459   printing in their overall behaviour:
  1517 
  1460 
  1518   \begin{description}
       
  1519 
       
  1520   \<^descr>[Parse translations] are applied bottom-up.  The arguments are
  1461   \<^descr>[Parse translations] are applied bottom-up.  The arguments are
  1521   already in translated form.  The translations must not fail;
  1462   already in translated form.  The translations must not fail;
  1522   exceptions trigger an error message.  There may be at most one
  1463   exceptions trigger an error message.  There may be at most one
  1523   function associated with any syntactic name.
  1464   function associated with any syntactic name.
  1524 
  1465 
  1529   function may raise exception @{ML Match} to indicate failure; in
  1470   function may raise exception @{ML Match} to indicate failure; in
  1530   this event it has no effect.  Multiple functions associated with
  1471   this event it has no effect.  Multiple functions associated with
  1531   some syntactic name are tried in the order of declaration in the
  1472   some syntactic name are tried in the order of declaration in the
  1532   theory.
  1473   theory.
  1533 
  1474 
  1534   \end{description}
       
  1535 
  1475 
  1536   Only constant atoms --- constructor @{ML Ast.Constant} for ASTs and
  1476   Only constant atoms --- constructor @{ML Ast.Constant} for ASTs and
  1537   @{ML Const} for terms --- can invoke translation functions.  This
  1477   @{ML Const} for terms --- can invoke translation functions.  This
  1538   means that parse translations can only be associated with parse tree
  1478   means that parse translations can only be associated with parse tree
  1539   heads of concrete syntax, or syntactic constants introduced via
  1479   heads of concrete syntax, or syntactic constants introduced via