doc-src/Ref/syntax.tex
changeset 3485 f27a30a18a17
parent 3135 233aba197bf2
child 4375 ef2a7b589004
equal deleted inserted replaced
3484:1e93eb09ebb9 3485:f27a30a18a17
   112 by stripping out delimiters and copy productions.  More precisely, the
   112 by stripping out delimiters and copy productions.  More precisely, the
   113 mapping $\astofpt{-}$ is derived from the productions as follows:
   113 mapping $\astofpt{-}$ is derived from the productions as follows:
   114 \begin{itemize}
   114 \begin{itemize}
   115 \item Name tokens: $\astofpt{t} = \Variable s$, where $t$ is an \ndx{id},
   115 \item Name tokens: $\astofpt{t} = \Variable s$, where $t$ is an \ndx{id},
   116   \ndx{var}, \ndx{tid}, \ndx{tvar}, \ndx{xnum} or \ndx{xstr} token, and $s$
   116   \ndx{var}, \ndx{tid}, \ndx{tvar}, \ndx{xnum} or \ndx{xstr} token, and $s$
   117   its associated string. Note that for {\tt xstr} this does not include the
   117   its associated string.  Note that for {\tt xstr} this does not include the
   118   quotes.
   118   quotes.
   119 
   119 
   120 \item Copy productions:\index{productions!copy}
   120 \item Copy productions:\index{productions!copy}
   121   $\astofpt{\ldots P \ldots} = \astofpt{P}$.  Here $\ldots$ stands for
   121   $\astofpt{\ldots P \ldots} = \astofpt{P}$.  Here $\ldots$ stands for
   122   strings of delimiters, which are discarded.  $P$ stands for the single
   122   strings of delimiters, which are discarded.  $P$ stands for the single
   604 line break is required then a space is printed instead.
   604 line break is required then a space is printed instead.
   605 
   605 
   606 The nonterminal is declared as the type~{\tt is}, but with no {\tt arities}
   606 The nonterminal is declared as the type~{\tt is}, but with no {\tt arities}
   607 declaration.  Hence {\tt is} is not a logical type and may be used safely as
   607 declaration.  Hence {\tt is} is not a logical type and may be used safely as
   608 a new nonterminal for custom syntax.  The nonterminal~{\tt is} can later be
   608 a new nonterminal for custom syntax.  The nonterminal~{\tt is} can later be
   609 re-used for other enumerations of type~{\tt i} like lists or tuples. If we
   609 re-used for other enumerations of type~{\tt i} like lists or tuples.  If we
   610 had needed polymorphic enumerations, we could have used the predefined
   610 had needed polymorphic enumerations, we could have used the predefined
   611 nonterminal symbol \ndx{args} and skipped this part altogether.
   611 nonterminal symbol \ndx{args} and skipped this part altogether.
   612 
   612 
   613 \index{"@Finset@{\tt\at Finset} constant}
   613 \index{"@Finset@{\tt\at Finset} constant}
   614 Next follows {\tt empty}, which is already equipped with its syntax
   614 Next follows {\tt empty}, which is already equipped with its syntax
   848 type {\tt dummyT}.  Only the binding occurrence of~$x'$ is given the
   848 type {\tt dummyT}.  Only the binding occurrence of~$x'$ is given the
   849 correct type~{\tt T}, so this is the only place where a type
   849 correct type~{\tt T}, so this is the only place where a type
   850 constraint might appear.
   850 constraint might appear.
   851 
   851 
   852 Also note that we are responsible to mark free identifiers that
   852 Also note that we are responsible to mark free identifiers that
   853 actually represent bound variables. This is achieved by
   853 actually represent bound variables.  This is achieved by
   854 \ttindex{Syntax.variant_abs'} and \ttindex{Syntax.mark_boundT} above.
   854 \ttindex{Syntax.variant_abs'} and \ttindex{Syntax.mark_boundT} above.
   855 Failing to do so may cause these names to be printed in the wrong
   855 Failing to do so may cause these names to be printed in the wrong
   856 style.  \index{translations|)} \index{syntax!transformations|)}
   856 style.  \index{translations|)} \index{syntax!transformations|)}
   857 
   857 
   858 
   858 
   859 \section{Token translations} \label{sec:tok_tr}
   859 \section{Token translations} \label{sec:tok_tr}
   860 \index{token translations|(}
   860 \index{token translations|(}
   861 %
   861 %
   862 Isabelle's meta-logic features quite a lot of different kinds of
   862 Isabelle's meta-logic features quite a lot of different kinds of
   863 identifiers, namely {\em class}, {\em tfree}, {\em tvar}, {\em free},
   863 identifiers, namely {\em class}, {\em tfree}, {\em tvar}, {\em free},
   864 {\em bound}, {\em var}. One might want to have these printed in
   864 {\em bound}, {\em var}.  One might want to have these printed in
   865 different styles, e.g.\ in bold or italic, or even transcribed into
   865 different styles, e.g.\ in bold or italic, or even transcribed into
   866 something more readable like $\alpha, \alpha', \beta$ instead of {\tt
   866 something more readable like $\alpha, \alpha', \beta$ instead of {\tt
   867   'a}, {\tt 'aa}, {\tt 'b} for type variables. Token translations
   867   'a}, {\tt 'aa}, {\tt 'b} for type variables.  Token translations
   868 provide a means to such ends, enabling the user to install certain
   868 provide a means to such ends, enabling the user to install certain
   869 \ML{} functions associated with any logical \rmindex{token class} and
   869 \ML{} functions associated with any logical \rmindex{token class} and
   870 depending on some \rmindex{print mode}.
   870 depending on some \rmindex{print mode}.
   871 
   871 
   872 The logical class of identifiers can not necessarily be determined by
   872 The logical class of identifiers can not necessarily be determined by
   873 its syntactic category, though. For example, consider free vs.\ bound
   873 its syntactic category, though.  For example, consider free vs.\ bound
   874 variables. So Isabelle's pretty printing mechanism, starting from
   874 variables.  So Isabelle's pretty printing mechanism, starting from
   875 fully typed terms, has to be careful to preserve this additional
   875 fully typed terms, has to be careful to preserve this additional
   876 information\footnote{This is done by marking atoms in abstract syntax
   876 information\footnote{This is done by marking atoms in abstract syntax
   877   trees appropriately. The marks are actually visible by print
   877   trees appropriately.  The marks are actually visible by print
   878   translation functions -- they are just special constants applied to
   878   translation functions -- they are just special constants applied to
   879   atomic asts, for example \texttt{("_bound" x)}.}.  In particular,
   879   atomic asts, for example \texttt{("_bound" x)}.}.  In particular,
   880 user-supplied print translation functions operating on terms have to
   880 user-supplied print translation functions operating on terms have to
   881 be well-behaved in this respect. Free identifiers introduced to
   881 be well-behaved in this respect.  Free identifiers introduced to
   882 represent bound variables have to be marked appropriately (cf.\ the
   882 represent bound variables have to be marked appropriately (cf.\ the
   883 example at the end of \S\ref{sec:tr_funs}).
   883 example at the end of \S\ref{sec:tr_funs}).
   884 
   884 
   885 \medskip Token translations may be installed by declaring the
   885 \medskip Token translations may be installed by declaring the
   886 \ttindex{token_translation} value within the \texttt{ML} section of a
   886 \ttindex{token_translation} value within the \texttt{ML} section of a
   888 \begin{ttbox}
   888 \begin{ttbox}
   889 val token_translation: (string * string * (string -> string * int)) list
   889 val token_translation: (string * string * (string -> string * int)) list
   890 \end{ttbox}\index{token_translation}
   890 \end{ttbox}\index{token_translation}
   891 The elements of this list are of the form $(m, c, f)$, where $m$ is a
   891 The elements of this list are of the form $(m, c, f)$, where $m$ is a
   892 print mode identifier, $c$ a token class, and $f\colon string \to
   892 print mode identifier, $c$ a token class, and $f\colon string \to
   893 string \times int$ the actual translation function. Assuming that $x$
   893 string \times int$ the actual translation function.  Assuming that $x$
   894 is of identifier class $c$, and print mode $m$ is the first one of all
   894 is of identifier class $c$, and print mode $m$ is the first one of all
   895 currently active modes that provide some translation for $c$, then $x$
   895 currently active modes that provide some translation for $c$, then $x$
   896 is output according to $f(x) = (x', len)$. Thereby $x'$ is the
   896 is output according to $f(x) = (x', len)$.  Thereby $x'$ is the
   897 modified identifier name and $len$ its visual length approximated in
   897 modified identifier name and $len$ its visual length approximated in
   898 terms of whole characters. Thus $x'$ may include non-printing parts
   898 terms of whole characters.  Thus $x'$ may include non-printing parts
   899 like control sequences or markup information for typesetting systems.
   899 like control sequences or markup information for typesetting systems.
   900 
   900 
   901 %FIXME example (?)
   901 %FIXME example (?)
   902 
   902 
   903 \index{token translations|)}
   903 \index{token translations|)}