doc-src/Ref/defining.tex
author wenzelm
Sat, 04 Feb 2012 18:33:04 +0100
changeset 46290 465851ba524e
parent 46284 6233d0b74d71
child 46291 a1827b8b45ae
permissions -rw-r--r--
updated/unified section on mixfix annotations;


\chapter{Defining Logics} \label{Defining-Logics}

\section{Mixfix declarations} \label{sec:mixfix}

\subsection{The general mixfix form}
Here is a detailed account of mixfix declarations.  Suppose the following
line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy}
file:
\begin{center}
  {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}
\end{center}
This constant declaration and mixfix annotation are interpreted as follows:
\begin{itemize}\index{productions}
\item The string {\tt $c$} is the name of the constant associated with the
  production; unless it is a valid identifier, it must be enclosed in
  quotes.  If $c$ is empty (given as~{\tt ""}) then this is a copy
  production.\index{productions!copy} Otherwise, parsing an instance of the
  phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$
    $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th
  argument.

  \item The type $\sigma$ specifies the production's nonterminal symbols
    (or name tokens).  If $template$ is of the form above then $\sigma$
    must be a function type with at least~$n$ argument positions, say
    $\sigma = [\tau@1, \dots, \tau@n] \To \tau$.  Nonterminal symbols are
    derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described
    below.  Any of these may be function types.

\end{itemize}
%
The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\,
A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the
nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively.
The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if
this is a logical type (namely one of class {\tt logic} excluding {\tt
prop}).  Otherwise it is $ty$ (note that only the outermost type constructor
is taken into account).  Finally, the nonterminal of a type variable is {\tt
any}.

Associating nonterminals with types allows a constant's type to specify
syntax as well.  We can declare the function~$f$ to have type $[\tau@1,
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout
of the function's $n$ arguments.  The constant's name, in this case~$f$, will
also serve as the label in the abstract syntax tree.

You may also declare mixfix syntax without adding constants to the theory's
signature, by using a {\tt syntax} section instead of {\tt consts}.  Thus a
production need not map directly to a logical function (this typically
requires additional syntactic translations, see also
Chapter~\ref{chap:syntax}).


\medskip
Omitting priorities is prone to syntactic ambiguities unless
the production's right-hand side is fully bracketed, as in
\verb|"if _ then _ else _ fi"|.


\subsection{Example: arithmetic expressions}
\index{examples!of mixfix declarations}
This theory specification contains a {\tt syntax} section with mixfix
declarations encoding the priority grammar from
\S\ref{sec:priority_grammars}:
\begin{ttbox}
ExpSyntax = Pure +
types
  exp
syntax
  "0" :: exp                 ("0"      9)
  "+" :: [exp, exp] => exp   ("_ + _"  [0, 1] 0)
  "*" :: [exp, exp] => exp   ("_ * _"  [3, 2] 2)
  "-" :: exp => exp          ("- _"    [3] 3)
end
\end{ttbox}
Executing {\tt Syntax.print_gram} reveals the productions derived from the
above mixfix declarations (lots of additional information deleted):
\begin{ttbox}
Syntax.print_gram (syn_of ExpSyntax.thy);
{\out exp = "0"  => "0" (9)}
{\out exp = exp[0] "+" exp[1]  => "+" (0)}
{\out exp = exp[3] "*" exp[2]  => "*" (2)}
{\out exp = "-" exp[3]  => "-" (3)}
\end{ttbox}


\section{Ambiguity of parsed expressions} \label{sec:ambiguity}
\index{ambiguity!of parsed expressions}

To keep the grammar small and allow common productions to be shared
all logical types (except {\tt prop}) are internally represented
by one nonterminal, namely {\tt logic}.  This and omitted or too freely
chosen priorities may lead to ways of parsing an expression that were
not intended by the theory's maker.  In most cases Isabelle is able to
select one of multiple parse trees that an expression has lead
to by checking which of them can be typed correctly.  But this may not
work in every case and always slows down parsing.
The warning and error messages that can be produced during this process are
as follows:

If an ambiguity can be resolved by type inference the following
warning is shown to remind the user that parsing is (unnecessarily)
slowed down.  In cases where it's not easily possible to eliminate the
ambiguity the frequency of the warning can be controlled by changing
the value of {\tt Syntax.ambiguity_level} which has type {\tt int
ref}.  Its default value is 1 and by increasing it one can control how
many parse trees are necessary to generate the warning.

\begin{ttbox}
{\out Ambiguous input "\dots"}
{\out produces the following parse trees:}
{\out \dots}
{\out Fortunately, only one parse tree is type correct.}
{\out You may still want to disambiguate your grammar or your input.}
\end{ttbox}

The following message is normally caused by using the same
syntax in two different productions:

\begin{ttbox}
{\out Ambiguous input "..."}
{\out produces the following parse trees:}
{\out \dots}
{\out More than one term is type correct:}
{\out \dots}
\end{ttbox}

Ambiguities occuring in syntax translation rules cannot be resolved by
type inference because it is not necessary for these rules to be type
correct.  Therefore Isabelle always generates an error message and the
ambiguity should be eliminated by changing the grammar or the rule.


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ref"
%%% End: