doc-src/TutorialI/Documents/Documents.thy
author wenzelm
Sat, 05 Jan 2002 01:15:12 +0100
changeset 12635 e2d44df29c94
parent 12629 281aa36829d8
child 12642 40fbd988b59b
permissions -rw-r--r--
more on concrete syntax;

(*<*)
theory Documents = Main:
(*>*)

section {* Concrete syntax \label{sec:concrete-syntax} *}

text {*
  Concerning Isabelle's ``inner'' language of simply-typed @{text
  \<lambda>}-calculus, the core concept of Isabelle's elaborate infrastructure
  for concrete syntax is that of general \emph{mixfix
  annotations}\index{mixfix annotations|bold}.  Associated with any
  kind of name and type declaration, mixfixes give rise both to
  grammar productions for the parser and output templates for the
  pretty printer.

  In full generality, the whole affair of parser and pretty printer
  configuration is rather subtle.  Any syntax specifications given by
  end-users need to interact properly with the existing setup of
  Isabelle/Pure and Isabelle/HOL; see \cite{isabelle-ref} for further
  details.  It is particularly important to get the precedence of new
  syntactic constructs right, avoiding ambiguities with existing
  elements.

  \medskip Subsequently we introduce a few simple declaration forms
  that already cover the most common situations fairly well.
*}


subsection {* Infixes *}

text {*
  Syntax annotations may be included wherever constants are declared
  directly or indirectly, including \isacommand{consts},
  \isacommand{constdefs}, or \isacommand{datatype} (for the
  constructor operations).  Type-constructors may be annotated as
  well, although this is less frequently encountered in practice
  (@{text "*"} and @{text "+"} types may come to mind).

  Infix declarations\index{infix annotations|bold} provide a useful
  special case of mixfixes, where users need not care about the full
  details of priorities, nesting, spacing, etc.  The subsequent
  example of the exclusive-or operation on boolean values illustrates
  typical infix declarations.
*}

constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "[+]" 60)
  "A [+] B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"

text {*
  Any curried function with at least two arguments may be associated
  with infix syntax: @{text "xor A B"} and @{text "A [+] B"} refer to
  the same expression internally.  In partial applications with less
  than two operands there is a special notation with \isa{op} prefix:
  @{text xor} without arguments is represented as @{text "op [+]"};
  combined with plain prefix application this turns @{text "xor A"}
  into @{text "op [+] A"}.

  \medskip The string @{text [source] "[+]"} in the above declaration
  refers to the bit of concrete syntax to represent the operator,
  while the number @{text 60} determines the precedence of the whole
  construct.

  As it happens, Isabelle/HOL already spends many popular combinations
  of ASCII symbols for its own use, including both @{text "+"} and
  @{text "++"}.  Slightly more awkward combinations like the present
  @{text "[+]"} tend to be available for user extensions.  The current
  arrangement of inner syntax may be inspected via
  \commdx{print\protect\_syntax}, albeit its output is enormous.

  Operator precedence also needs some special considerations.  The
  admissible range is 0--1000.  Very low or high priorities are
  basically reserved for the meta-logic.  Syntax of Isabelle/HOL
  mainly uses the range of 10--100: the equality infix @{text "="} is
  centered at 50, logical connectives (like @{text "\<or>"} and @{text
  "\<and>"}) are below 50, and algebraic ones (like @{text "+"} and @{text
  "*"}) above 50.  User syntax should strive to coexist with common
  HOL forms, or use the mostly unused range 100--900.

  \medskip The keyword \isakeyword{infixl} specifies an operator that
  is nested to the \emph{left}: in iterated applications the more
  complex expression appears on the left-hand side: @{term "A [+] B
  [+] C"} stands for @{text "(A [+] B) [+] C"}.  Similarly,
  \isakeyword{infixr} refers to nesting to the \emph{right}, reading
  @{term "A [+] B [+] C"} as @{text "A [+] (B [+] C)"}.  In contrast,
  a \emph{non-oriented} declaration via \isakeyword{infix} would
  always demand explicit parentheses.
  
  Many binary operations observe the associative law, so the exact
  grouping does not matter.  Nevertheless, formal statements need be
  given in a particular format, associativity needs to be treated
  explicitly within the logic.  Exclusive-or is happens to be
  associative, as shown below.
*}

lemma xor_assoc: "(A [+] B) [+] C = A [+] (B [+] C)"
  by (auto simp add: xor_def)

text {*
  Such rules may be used in simplification to regroup nested
  expressions as required.  Note that the system would actually print
  the above statement as @{term "A [+] B [+] C = A [+] (B [+] C)"}
  (due to nesting to the left).  We have preferred to give the fully
  parenthesized form in the text for clarity.  Only in rare situations
  one may consider to force parentheses by use of non-oriented infix
  syntax; equality would probably be a typical candidate.
*}


subsection {* Mathematical symbols \label{sec:thy-present-symbols} *}

text {*
  Concrete syntax based on plain ASCII characters has its inherent
  limitations.  Rich mathematical notation demands a larger repertoire
  of symbols.  Several standards of extended character sets have been
  proposed over decades, but none has become universally available so
  far, not even Unicode\index{Unicode}.

  Isabelle supports a generic notion of
  \emph{symbols}\index{symbols|bold} as the smallest entities of
  source text, without referring to internal encodings.  Such
  ``generalized characters'' may be of one of the following three
  kinds:

  \begin{enumerate}

  \item Traditional 7-bit ASCII characters.

  \item Named symbols: \verb,\,\verb,<,$ident$\verb,>, (or
  \verb,\\,\verb,<,$ident$\verb,>,).

  \item Named control symbols: \verb,\,\verb,<^,$ident$\verb,>, (or
  \verb,\\,\verb,<^,$ident$\verb,>,).

  \end{enumerate}

  Here $ident$ may be any identifier according to the usual Isabelle
  conventions.  This results in an infinite store of symbols, whose
  interpretation is left to further front-end tools.  For example, the
  \verb,\,\verb,<forall>, symbol of Isabelle is really displayed as
  $\forall$ --- both by the user-interface of Proof~General + X-Symbol
  and the Isabelle document processor (see \S\ref{FIXME}).

  A list of standard Isabelle symbols is given in
  \cite[appendix~A]{isabelle-sys}.  Users may introduce their own
  interpretation of further symbols by configuring the appropriate
  front-end tool accordingly, e.g.\ defining appropriate {\LaTeX}
  macros for document preparation.  There are also a few predefined
  control symbols, such as \verb,\,\verb,<^sub>, and
  \verb,\,\verb,<^sup>, for sub- and superscript of the subsequent
  (printable) symbol, respectively.

  \medskip The following version of our @{text xor} definition uses a
  standard Isabelle symbol to achieve typographically pleasing output.
*}

(*<*)
hide const xor
ML_setup {* Context.>> (Theory.add_path "1") *}
(*>*)
constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>" 60)
  "A \<oplus> B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"
(*<*)
local
(*>*)

text {*
  The X-Symbol package within Proof~General provides several input
  methods to enter @{text \<oplus>} in the text.  If all fails one may just
  type \verb,\,\verb,<oplus>, by hand; the display is adapted
  immediately after continuing further input.

  \medskip A slightly more refined scheme is to provide alternative
  syntax via the \emph{print mode}\index{print mode} concept of
  Isabelle (see also \cite{isabelle-ref}).  By convention, the mode
  ``$xsymbols$'' is enabled whenever X-Symbol is active.  Consider the
  following hybrid declaration of @{text xor}.
*}

(*<*)
hide const xor
ML_setup {* Context.>> (Theory.add_path "2") *}
(*>*)
constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "[+]\<ignore>" 60)
  "A [+]\<ignore> B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"

syntax (xsymbols)
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>\<ignore>" 60)
(*<*)
local
(*>*)

text {*
  Here the \commdx{syntax} command acts like \isakeyword{consts}, but
  without declaring a logical constant, and with an optional print
  mode specification.  Note that the type declaration given here
  merely serves for syntactic purposes, and is not checked for
  consistency with the real constant.

  \medskip Now we may write either @{text "[+]"} or @{text "\<oplus>"} in
  input, while output uses the nicer syntax of $xsymbols$, provided
  that print mode is presently active.  This scheme is particularly
  useful for interactive development, with the user typing plain ASCII
  text, but gaining improved visual feedback from the system (say in
  current goal output).

  \begin{warn}
  Using alternative syntax declarations easily results in varying
  versions of input sources.  Isabelle provides no systematic way to
  convert alternative expressions back and forth.  Print modes only
  affect situations where formal entities are pretty printed by the
  Isabelle process (e.g.\ output of terms and types), but not the
  original theory text.
  \end{warn}

  \medskip The following variant makes the alternative @{text \<oplus>}
  notation only available for output.  Thus we may enforce input
  sources to refer to plain ASCII only.
*}

syntax (xsymbols output)
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>\<ignore>" 60)


subsection {* Prefixes *}

text {*
  Prefix syntax annotations\index{prefix annotation|bold} are just a
  very degenerate of the general mixfix form \cite{isabelle-ref},
  without any template arguments or priorities --- just some piece of
  literal syntax.

  The following example illustrates this idea idea by associating
  common symbols with the constructors of a currency datatype.
*}

datatype currency =
    Euro nat    ("\<euro>")
  | Pounds nat  ("\<pounds>")
  | Yen nat     ("\<yen>")
  | Dollar nat  ("$")

text {*
  Here the degenerate mixfix annotations on the rightmost column
  happen to consist of a single Isabelle symbol each:
  \verb,\,\verb,<euro>,, \verb,\,\verb,<pounds>,,
  \verb,\,\verb,<yen>,, and \verb,\,$,.

  Recall that a constructor like @{text Euro} actually is a function
  @{typ "nat \<Rightarrow> currency"}.  An expression like @{text "Euro 10"} will
  be printed as @{term "\<euro> 10"}.  Merely the head of the application is
  subject to our trivial concrete syntax; this form is sufficient to
  achieve fair conformance to EU~Commission standards of currency
  notation.

  \medskip Certainly, the same idea of prefix syntax also works for
  \isakeyword{consts}, \isakeyword{constdefs} etc.  For example, we
  might introduce a (slightly unrealistic) function to calculate an
  abstract currency value, by cases on the datatype constructors and
  fixed exchange rates.
*}

consts
  currency :: "currency \<Rightarrow> nat"    ("\<currency>")

text {*
  \noindent The funny symbol encountered here is that of
  \verb,\<currency>,.
*}


subsection {* Syntax translations \label{sec:def-translations} *}

text{*
  FIXME

\index{syntax translations|(}%
\index{translations@\isacommand {translations} (command)|(}
Isabelle offers an additional definitional facility,
\textbf{syntax translations}.
They resemble macros: upon parsing, the defined concept is immediately
replaced by its definition.  This effect is reversed upon printing.  For example,
the symbol @{text"\<noteq>"} is defined via a syntax translation:
*}

translations "x \<noteq> y" \<rightleftharpoons> "\<not>(x = y)"

text{*\index{$IsaEqTrans@\isasymrightleftharpoons}
\noindent
Internally, @{text"\<noteq>"} never appears.

In addition to @{text"\<rightleftharpoons>"} there are
@{text"\<rightharpoonup>"}\index{$IsaEqTrans1@\isasymrightharpoonup}
and @{text"\<leftharpoondown>"}\index{$IsaEqTrans2@\isasymleftharpoondown}
for uni-directional translations, which only affect
parsing or printing.  This tutorial will not cover the details of
translations.  We have mentioned the concept merely because it
crops up occasionally; a number of HOL's built-in constructs are defined
via translations.  Translations are preferable to definitions when the new 
concept is a trivial variation on an existing one.  For example, we
don't need to derive new theorems about @{text"\<noteq>"}, since existing theorems
about @{text"="} still apply.%
\index{syntax translations|)}%
\index{translations@\isacommand {translations} (command)|)}
*}


section {* Document preparation *}

subsection {* Batch-mode sessions *}

subsection {* {\LaTeX} macros *}

subsubsection {* Structure markup *}

subsubsection {* Symbols and characters *}

text {*
  FIXME

  
*}

(*<*)
end
(*>*)