doc-src/IsarImplementation/Thy/Syntax.thy
changeset 45258 97f8806c3ed6
parent 42510 b9c106763325
child 45260 48295059cef3
equal deleted inserted replaced
45257:12063e071d92 45258:97f8806c3ed6
     2 imports Base
     2 imports Base
     3 begin
     3 begin
     4 
     4 
     5 chapter {* Concrete syntax and type-checking *}
     5 chapter {* Concrete syntax and type-checking *}
     6 
     6 
     7 text FIXME
     7 text {* Pure @{text "\<lambda>"}-calculus as introduced in \chref{ch:logic} is
       
     8   an adequate foundation for logical languages --- in the tradition of
       
     9   \emph{higher-order abstract syntax} --- but end-users require
       
    10   additional means for reading and printing of terms and types.  This
       
    11   important add-on outside the logical core is called \emph{inner
       
    12   syntax} in Isabelle jargon, as opposed to the \emph{outer syntax} of
       
    13   the theory and proof language (cf.\ \chref{FIXME}).
       
    14 
       
    15   For example, according to \cite{church40} quantifiers are
       
    16   represented as higher-order constants @{text "All :: ('a \<Rightarrow> bool) \<Rightarrow>
       
    17   bool"} such that @{text "All (\<lambda>x::'a. B x)"} faithfully represents
       
    18   the idea that is displayed as @{text "\<forall>x::'a. B x"} via @{keyword
       
    19   "binder"} notation.  Moreover, type-inference in the style of
       
    20   Hindley-Milner \cite{hindleymilner} (and extensions) enables users
       
    21   to write @{text "\<forall>x. B x"} concisely, when the type @{text "'a"} is
       
    22   already clear from the context.\footnote{Type-inference taken to the
       
    23   extreme can easily confuse users, though.  Beginners often stumble
       
    24   over unexpectedly general types inferred by the system.}
       
    25 
       
    26   \medskip The main inner syntax operations are \emph{read} for
       
    27   parsing together with type-checking, and \emph{pretty} for formatted
       
    28   output.  See also \secref{sec:read-print}.
       
    29 
       
    30   Furthermore, the input and output syntax layers are sub-divided into
       
    31   separate phases for \emph{concrete syntax} versus \emph{abstract
       
    32   syntax}, see also \secref{sec:parse-unparse} and
       
    33   \secref{sec:term-check}, respectively.  This results in the
       
    34   following decomposition of the main operations:
       
    35 
       
    36   \begin{itemize}
       
    37 
       
    38   \item @{text "read = parse; check"}
       
    39 
       
    40   \item @{text "pretty = uncheck; unparse"}
       
    41 
       
    42   \end{itemize}
       
    43 
       
    44   Some specification package might thus intercept syntax processing at
       
    45   a well-defined stage after @{text "parse"}, to a augment the
       
    46   resulting pre-term before full type-reconstruction is performed by
       
    47   @{text "check"}, for example.  Note that the formal status of bound
       
    48   variables, versus free variables, versus constants must not be
       
    49   changed here! *}
       
    50 
     8 
    51 
     9 section {* Reading and pretty printing \label{sec:read-print} *}
    52 section {* Reading and pretty printing \label{sec:read-print} *}
    10 
    53 
    11 text FIXME
    54 text {* Read and print operations are roughly dual to each other, such
       
    55   that for the user @{text "s' = pretty (read s)"} looks similar to
       
    56   the original source text @{text "s"}, but the details depend on many
       
    57   side-conditions.  There are also explicit options to control
       
    58   suppressing of type information in the output.  The default
       
    59   configuration routinely looses information, so @{text "t' = read
       
    60   (pretty t)"} might fail, produce a differently typed term, or a
       
    61   completely different term in the face of syntactic overloading!  *}
    12 
    62 
    13 text %mlref {*
    63 text %mlref {*
    14   \begin{mldecls}
    64   \begin{mldecls}
    15   @{index_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\
    65   @{index_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\
    16   @{index_ML Syntax.read_term: "Proof.context -> string -> term"} \\
    66   @{index_ML Syntax.read_term: "Proof.context -> string -> term"} \\
    27 *}
    77 *}
    28 
    78 
    29 
    79 
    30 section {* Parsing and unparsing \label{sec:parse-unparse} *}
    80 section {* Parsing and unparsing \label{sec:parse-unparse} *}
    31 
    81 
    32 text FIXME
    82 text {* Parsing and unparsing converts between actual source text and
       
    83   a certain \emph{pre-term} format, where all bindings and scopes are
       
    84   resolved faithfully.  Thus the names of free variables or constants
       
    85   are already determined in the sense of the logical context, but type
       
    86   information might is still missing.  Pre-terms support an explicit
       
    87   language of \emph{type constraints} that may be augmented by user
       
    88   code to guide the later \emph{check} phase, for example.
       
    89 
       
    90   Actual parsing is based on traditional lexical analysis and Earley
       
    91   parsing for arbitrary context-free grammars.  The user can specify
       
    92   this via mixfix annotations.  Moreover, there are \emph{syntax
       
    93   translations} that can be augmented by the user, either
       
    94   declaratively via @{command translations} or programmatically via
       
    95   @{command parse_translation}, @{command print_translation} etc.  The
       
    96   final scope resolution is performed by the system, according to name
       
    97   spaces for types, constants etc.\ determined by the context.
       
    98 *}
    33 
    99 
    34 text %mlref {*
   100 text %mlref {*
    35   \begin{mldecls}
   101   \begin{mldecls}
    36   @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
   102   @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
    37   @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
   103   @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
    48 *}
   114 *}
    49 
   115 
    50 
   116 
    51 section {* Checking and unchecking \label{sec:term-check} *}
   117 section {* Checking and unchecking \label{sec:term-check} *}
    52 
   118 
    53 text FIXME
   119 text {* These operations define the transition from pre-terms and
       
   120   fully-annotated terms in the sense of the logical core
       
   121   (\chref{ch:logic}).
       
   122 
       
   123   The \emph{check} phase is meant to subsume a variety of mechanisms
       
   124   in the manner of ``type-inference'' or ``type-reconstruction'' or
       
   125   ``type-improvement'', not just type-checking in the narrow sense.
       
   126   The \emph{uncheck} phase is roughly dual, it prunes type-information
       
   127   before pretty printing.
       
   128 
       
   129   A typical add-on for the check/uncheck syntax layer is the @{command
       
   130   abbreviation} mechanism.  Here the user specifies syntactic
       
   131   definitions that are managed by the system as polymorphic @{text
       
   132   "let"} bindings.  These are expanded during the @{text "check"}
       
   133   phase, and contracted during the @{text "uncheck"} phase, without
       
   134   affecting the type-assignment of the given terms.
       
   135 
       
   136   \medskip The precise meaning of type checking depends on the context
       
   137   --- additional check/unckeck plugins might be defined in user space!
       
   138 
       
   139   For example, the @{command class} command defines a context where
       
   140   @{text "check"} treats certain type instances of overloaded
       
   141   constants according to the ``dictionary construction'' of its
       
   142   logical foundation.  This involves ``type improvement''
       
   143   (specialization of slightly too general types) and replacement by
       
   144   certain locale parameters.  See also \cite{Haftmann-Wenzel:2009}.
       
   145 *}
    54 
   146 
    55 text %mlref {*
   147 text %mlref {*
    56   \begin{mldecls}
   148   \begin{mldecls}
    57   @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
   149   @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
    58   @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
   150   @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\