src/Doc/Implementation/Syntax.thy
changeset 57496 94596c573b38
parent 57346 1d6d44a0583f
child 57846 7cbb28332896
equal deleted inserted replaced
57495:b627e76cc5cc 57496:94596c573b38
    10   an adequate foundation for logical languages --- in the tradition of
    10   an adequate foundation for logical languages --- in the tradition of
    11   \emph{higher-order abstract syntax} --- but end-users require
    11   \emph{higher-order abstract syntax} --- but end-users require
    12   additional means for reading and printing of terms and types.  This
    12   additional means for reading and printing of terms and types.  This
    13   important add-on outside the logical core is called \emph{inner
    13   important add-on outside the logical core is called \emph{inner
    14   syntax} in Isabelle jargon, as opposed to the \emph{outer syntax} of
    14   syntax} in Isabelle jargon, as opposed to the \emph{outer syntax} of
    15   the theory and proof language (cf.\ \cite{isabelle-isar-ref}).
    15   the theory and proof language \cite{isabelle-isar-ref}.
    16 
    16 
    17   For example, according to \cite{church40} quantifiers are
    17   For example, according to \cite{church40} quantifiers are represented as
    18   represented as higher-order constants @{text "All :: ('a \<Rightarrow> bool) \<Rightarrow>
    18   higher-order constants @{text "All :: ('a \<Rightarrow> bool) \<Rightarrow> bool"} such that @{text
    19   bool"} such that @{text "All (\<lambda>x::'a. B x)"} faithfully represents
    19   "All (\<lambda>x::'a. B x)"} faithfully represents the idea that is displayed in
    20   the idea that is displayed as @{text "\<forall>x::'a. B x"} via @{keyword
    20   Isabelle as @{text "\<forall>x::'a. B x"} via @{keyword "binder"} notation.
    21   "binder"} notation.  Moreover, type-inference in the style of
    21   Moreover, type-inference in the style of Hindley-Milner \cite{hindleymilner}
    22   Hindley-Milner \cite{hindleymilner} (and extensions) enables users
    22   (and extensions) enables users to write @{text "\<forall>x. B x"} concisely, when
    23   to write @{text "\<forall>x. B x"} concisely, when the type @{text "'a"} is
    23   the type @{text "'a"} is already clear from the
    24   already clear from the context.\footnote{Type-inference taken to the
    24   context.\footnote{Type-inference taken to the extreme can easily confuse
    25   extreme can easily confuse users, though.  Beginners often stumble
    25   users. Beginners often stumble over unexpectedly general types inferred by
    26   over unexpectedly general types inferred by the system.}
    26   the system.}
    27 
    27 
    28   \medskip The main inner syntax operations are \emph{read} for
    28   \medskip The main inner syntax operations are \emph{read} for
    29   parsing together with type-checking, and \emph{pretty} for formatted
    29   parsing together with type-checking, and \emph{pretty} for formatted
    30   output.  See also \secref{sec:read-print}.
    30   output.  See also \secref{sec:read-print}.
    31 
    31 
    41 
    41 
    42   \item @{text "pretty = uncheck; unparse"}
    42   \item @{text "pretty = uncheck; unparse"}
    43 
    43 
    44   \end{itemize}
    44   \end{itemize}
    45 
    45 
    46   Some specification package might thus intercept syntax processing at
    46   For example, some specification package might thus intercept syntax
    47   a well-defined stage after @{text "parse"}, to a augment the
    47   processing at a well-defined stage after @{text "parse"}, to a augment the
    48   resulting pre-term before full type-reconstruction is performed by
    48   resulting pre-term before full type-reconstruction is performed by @{text
    49   @{text "check"}, for example.  Note that the formal status of bound
    49   "check"}. Note that the formal status of bound variables, versus free
    50   variables, versus free variables, versus constants must not be
    50   variables, versus constants must not be changed between these phases.
    51   changed between these phases!
       
    52 
    51 
    53   \medskip In general, @{text check} and @{text uncheck} operate
    52   \medskip In general, @{text check} and @{text uncheck} operate
    54   simultaneously on a list of terms. This is particular important for
    53   simultaneously on a list of terms. This is particular important for
    55   type-checking, to reconstruct types for several terms of the same context
    54   type-checking, to reconstruct types for several terms of the same context
    56   and scope. In contrast, @{text parse} and @{text unparse} operate separately
    55   and scope. In contrast, @{text parse} and @{text unparse} operate separately
    57   in single terms.
    56   on single terms.
    58 
    57 
    59   There are analogous operations to read and print types, with the same
    58   There are analogous operations to read and print types, with the same
    60   sub-division into phases.
    59   sub-division into phases.
    61 *}
    60 *}
    62 
    61 
    63 
    62 
    64 section {* Reading and pretty printing \label{sec:read-print} *}
    63 section {* Reading and pretty printing \label{sec:read-print} *}
    65 
    64 
    66 text {* Read and print operations are roughly dual to each other, such
    65 text {*
    67   that for the user @{text "s' = pretty (read s)"} looks similar to
    66   Read and print operations are roughly dual to each other, such that for the
    68   the original source text @{text "s"}, but the details depend on many
    67   user @{text "s' = pretty (read s)"} looks similar to the original source
    69   side-conditions.  There are also explicit options to control
    68   text @{text "s"}, but the details depend on many side-conditions. There are
    70   suppressing of type information in the output.  The default
    69   also explicit options to control the removal of type information in the
    71   configuration routinely looses information, so @{text "t' = read
    70   output. The default configuration routinely looses information, so @{text
    72   (pretty t)"} might fail, or produce a differently typed term, or a
    71   "t' = read (pretty t)"} might fail, or produce a differently typed term, or
    73   completely different term in the face of syntactic overloading!  *}
    72   a completely different term in the face of syntactic overloading.
       
    73 *}
    74 
    74 
    75 text %mlref {*
    75 text %mlref {*
    76   \begin{mldecls}
    76   \begin{mldecls}
    77   @{index_ML Syntax.read_typs: "Proof.context -> string list -> typ list"} \\
    77   @{index_ML Syntax.read_typs: "Proof.context -> string list -> typ list"} \\
    78   @{index_ML Syntax.read_terms: "Proof.context -> string list -> term list"} \\
    78   @{index_ML Syntax.read_terms: "Proof.context -> string list -> term list"} \\
    96   Type-reconstruction puts all parsed terms into the same scope: types of
    96   Type-reconstruction puts all parsed terms into the same scope: types of
    97   free variables ultimately need to coincide.
    97   free variables ultimately need to coincide.
    98 
    98 
    99   If particular type-constraints are required for some of the arguments, the
    99   If particular type-constraints are required for some of the arguments, the
   100   read operations needs to be split into its parse and check phases. Then it
   100   read operations needs to be split into its parse and check phases. Then it
   101   is possible to use @{ML Type.constraint} on the intermediate pre-terms.
   101   is possible to use @{ML Type.constraint} on the intermediate pre-terms
       
   102   \secref{sec:term-check}.
   102 
   103 
   103   \item @{ML Syntax.read_props}~@{text "ctxt strs"} parses and checks a
   104   \item @{ML Syntax.read_props}~@{text "ctxt strs"} parses and checks a
   104   simultaneous list of source strings as terms of the logic, with an implicit
   105   simultaneous list of source strings as terms of the logic, with an implicit
   105   type-constraint for each argument to enforce type @{typ prop}; this also
   106   type-constraint for each argument to enforce type @{typ prop}; this also
   106   affects the inner syntax for parsing. The remaining type-reconstructions
   107   affects the inner syntax for parsing. The remaining type-reconstruction
   107   works as for @{ML Syntax.read_terms} above.
   108   works as for @{ML Syntax.read_terms}.
   108 
   109 
   109   \item @{ML Syntax.read_typ}, @{ML Syntax.read_term}, @{ML Syntax.read_prop}
   110   \item @{ML Syntax.read_typ}, @{ML Syntax.read_term}, @{ML Syntax.read_prop}
   110   are like the simultaneous versions above, but operate on a single argument
   111   are like the simultaneous versions, but operate on a single argument only.
   111   only. This convenient shorthand is adequate in situations where a single
   112   This convenient shorthand is adequate in situations where a single item in
   112   item in its own scope is processed. Do not use @{ML "map o
   113   its own scope is processed. Do not use @{ML "map o Syntax.read_term"} where
   113   Syntax.read_term"} where @{ML Syntax.read_terms} is actually intended!
   114   @{ML Syntax.read_terms} is actually intended!
   114 
   115 
   115   \item @{ML Syntax.pretty_typ}~@{text "ctxt T"} and @{ML
   116   \item @{ML Syntax.pretty_typ}~@{text "ctxt T"} and @{ML
   116   Syntax.pretty_term}~@{text "ctxt t"} uncheck and pretty-print the given type
   117   Syntax.pretty_term}~@{text "ctxt t"} uncheck and pretty-print the given type
   117   or term, respectively. Although the uncheck phase acts on a simultaneous
   118   or term, respectively. Although the uncheck phase acts on a simultaneous
   118   list as well, this is rarely relevant in practice, so only the singleton
   119   list as well, this is rarely used in practice, so only the singleton case is
   119   case is provided as combined pretty operation. There is no distinction of
   120   provided as combined pretty operation. There is no distinction of term vs.\
   120   term vs.\ proposition.
   121   proposition.
   121 
   122 
   122   \item @{ML Syntax.string_of_typ} and @{ML Syntax.string_of_term} are
   123   \item @{ML Syntax.string_of_typ} and @{ML Syntax.string_of_term} are
   123   convenient compositions of @{ML Syntax.pretty_typ} and @{ML
   124   convenient compositions of @{ML Syntax.pretty_typ} and @{ML
   124   Syntax.pretty_term} with @{ML Pretty.string_of} for output. The result may
   125   Syntax.pretty_term} with @{ML Pretty.string_of} for output. The result may
   125   be concatenated with other strings, as long as there is no further
   126   be concatenated with other strings, as long as there is no further
   128   \end{description}
   129   \end{description}
   129 
   130 
   130   @{ML Syntax.read_term}, @{ML Syntax.read_prop}, and @{ML
   131   @{ML Syntax.read_term}, @{ML Syntax.read_prop}, and @{ML
   131   Syntax.string_of_term} are the most important operations in practice.
   132   Syntax.string_of_term} are the most important operations in practice.
   132 
   133 
   133   \medskip Note that the string values that are passed in and out here are
   134   \medskip Note that the string values that are passed in and out are
   134   annotated by the system, to carry further markup that is relevant for the
   135   annotated by the system, to carry further markup that is relevant for the
   135   Prover IDE \cite{isabelle-jedit}. User code should neither compose its own
   136   Prover IDE \cite{isabelle-jedit}. User code should neither compose its own
   136   input strings, nor try to analyze the output strings. Conceptually this is
   137   input strings, nor try to analyze the output strings. Conceptually this is
   137   an abstract datatype, encoded into a concrete string for historical reasons.
   138   an abstract datatype, encoded as concrete string for historical reasons.
   138 
   139 
   139   The standard way to provide the required position markup for input works via
   140   The standard way to provide the required position markup for input works via
   140   the outer syntax parser wrapper @{ML Parse.inner_syntax}, which is already
   141   the outer syntax parser wrapper @{ML Parse.inner_syntax}, which is already
   141   part of @{ML Parse.typ}, @{ML Parse.term}, @{ML Parse.prop}. So a string
   142   part of @{ML Parse.typ}, @{ML Parse.term}, @{ML Parse.prop}. So a string
   142   obtained from one of the latter may be directly passed to the corresponding
   143   obtained from one of the latter may be directly passed to the corresponding
   145 *}
   146 *}
   146 
   147 
   147 
   148 
   148 section {* Parsing and unparsing \label{sec:parse-unparse} *}
   149 section {* Parsing and unparsing \label{sec:parse-unparse} *}
   149 
   150 
   150 text {* Parsing and unparsing converts between actual source text and
   151 text {*
   151   a certain \emph{pre-term} format, where all bindings and scopes are
   152   Parsing and unparsing converts between actual source text and a certain
   152   resolved faithfully.  Thus the names of free variables or constants
   153   \emph{pre-term} format, where all bindings and scopes are already resolved
   153   are already determined in the sense of the logical context, but type
   154   faithfully. Thus the names of free variables or constants are determined in
   154   information might be still missing.  Pre-terms support an explicit
   155   the sense of the logical context, but type information might be still
   155   language of \emph{type constraints} that may be augmented by user
   156   missing. Pre-terms support an explicit language of \emph{type constraints}
   156   code to guide the later \emph{check} phase.
   157   that may be augmented by user code to guide the later \emph{check} phase.
   157 
   158 
   158   Actual parsing is based on traditional lexical analysis and Earley
   159   Actual parsing is based on traditional lexical analysis and Earley parsing
   159   parsing for arbitrary context-free grammars.  The user can specify
   160   for arbitrary context-free grammars. The user can specify the grammar
   160   the grammar via mixfix annotations.  Moreover, there are \emph{syntax
   161   declaratively via mixfix annotations. Moreover, there are \emph{syntax
   161   translations} that can be augmented by the user, either
   162   translations} that can be augmented by the user, either declaratively via
   162   declaratively via @{command translations} or programmatically via
   163   @{command translations} or programmatically via @{command
   163   @{command parse_translation}, @{command print_translation} etc.  The
   164   parse_translation}, @{command print_translation} \cite{isabelle-isar-ref}.
   164   final scope-resolution is performed by the system, according to name
   165   The final scope-resolution is performed by the system, according to name
   165   spaces for types, term variables and constants etc.\ determined by
   166   spaces for types, term variables and constants determined by the context.
   166   the context.
       
   167 *}
   167 *}
   168 
   168 
   169 text %mlref {*
   169 text %mlref {*
   170   \begin{mldecls}
   170   \begin{mldecls}
   171   @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
   171   @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
   172   @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
   172   @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
   173   @{index_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\
   173   @{index_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\[0.5ex]
   174   @{index_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\
   174   @{index_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\
   175   @{index_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\
   175   @{index_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\
   176   \end{mldecls}
   176   \end{mldecls}
   177 
   177 
   178   \begin{description}
   178   \begin{description}
   179 
   179 
   180   \item @{ML Syntax.parse_typ}~@{text "ctxt str"} parses a source strings as
   180   \item @{ML Syntax.parse_typ}~@{text "ctxt str"} parses a source string as
   181   pre-type that is ready to be used with subsequent check operations.
   181   pre-type that is ready to be used with subsequent check operations.
   182 
   182 
   183   \item @{ML Syntax.parse_term}~@{text "ctxt str"} parses a source strings as
   183   \item @{ML Syntax.parse_term}~@{text "ctxt str"} parses a source string as
   184   pre-term that is ready to be used with subsequent check operations.
   184   pre-term that is ready to be used with subsequent check operations.
   185 
   185 
   186   \item @{ML Syntax.parse_prop}~@{text "ctxt str"} parses a source strings as
   186   \item @{ML Syntax.parse_prop}~@{text "ctxt str"} parses a source string as
   187   pre-term that is ready to be used with subsequent check operations. The
   187   pre-term that is ready to be used with subsequent check operations. The
   188   inner syntax category is @{typ prop} and a suitable type-constraint is
   188   inner syntax category is @{typ prop} and a suitable type-constraint is
   189   included to ensure that this information is preserved during the check
   189   included to ensure that this information is observed in subsequent type
   190   phase.
   190   reconstruction.
   191 
   191 
   192   \item @{ML Syntax.unparse_typ}~@{text "ctxt T"} unparses a type after
   192   \item @{ML Syntax.unparse_typ}~@{text "ctxt T"} unparses a type after
   193   uncheck operations, to turn it into a pretty tree.
   193   uncheck operations, to turn it into a pretty tree.
   194 
   194 
   195   \item @{ML Syntax.unparse_term}~@{text "ctxt T"} unparses a term after
   195   \item @{ML Syntax.unparse_term}~@{text "ctxt T"} unparses a term after
   196   uncheck operations, to turn it into a pretty tree. There is no distinction
   196   uncheck operations, to turn it into a pretty tree. There is no distinction
   197   for propositions here.
   197   for propositions here.
   198 
   198 
   199   \end{description}
   199   \end{description}
   200 
   200 
   201   These operations always operate on single items; use the combinator @{ML
   201   These operations always operate on a single item; use the combinator @{ML
   202   map} to apply them to a list of items.
   202   map} to apply them to a list.
   203 *}
   203 *}
   204 
   204 
   205 
   205 
   206 section {* Checking and unchecking \label{sec:term-check} *}
   206 section {* Checking and unchecking \label{sec:term-check} *}
   207 
   207 
   214   ``type-improvement'', not just type-checking in the narrow sense.
   214   ``type-improvement'', not just type-checking in the narrow sense.
   215   The \emph{uncheck} phase is roughly dual, it prunes type-information
   215   The \emph{uncheck} phase is roughly dual, it prunes type-information
   216   before pretty printing.
   216   before pretty printing.
   217 
   217 
   218   A typical add-on for the check/uncheck syntax layer is the @{command
   218   A typical add-on for the check/uncheck syntax layer is the @{command
   219   abbreviation} mechanism.  Here the user specifies syntactic
   219   abbreviation} mechanism \cite{isabelle-isar-ref}. Here the user specifies
   220   definitions that are managed by the system as polymorphic @{text
   220   syntactic definitions that are managed by the system as polymorphic @{text
   221   "let"} bindings.  These are expanded during the @{text "check"}
   221   "let"} bindings. These are expanded during the @{text "check"} phase, and
   222   phase, and contracted during the @{text "uncheck"} phase, without
   222   contracted during the @{text "uncheck"} phase, without affecting the
   223   affecting the type-assignment of the given terms.
   223   type-assignment of the given terms.
   224 
   224 
   225   \medskip The precise meaning of type checking depends on the context
   225   \medskip The precise meaning of type checking depends on the context ---
   226   --- additional check/uncheck plugins might be defined in user space.
   226   additional check/uncheck modules might be defined in user space.
   227 
   227 
   228   For example, the @{command class} command defines a context where
   228   For example, the @{command class} command defines a context where
   229   @{text "check"} treats certain type instances of overloaded
   229   @{text "check"} treats certain type instances of overloaded
   230   constants according to the ``dictionary construction'' of its
   230   constants according to the ``dictionary construction'' of its
   231   logical foundation.  This involves ``type improvement''
   231   logical foundation.  This involves ``type improvement''
   235 
   235 
   236 text %mlref {*
   236 text %mlref {*
   237   \begin{mldecls}
   237   \begin{mldecls}
   238   @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
   238   @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
   239   @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
   239   @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
   240   @{index_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\
   240   @{index_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\[0.5ex]
   241   @{index_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\
   241   @{index_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\
   242   @{index_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\
   242   @{index_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\
   243   \end{mldecls}
   243   \end{mldecls}
   244 
   244 
   245   \begin{description}
   245   \begin{description}
   254   treated in the same way as for @{ML Syntax.check_typs}.
   254   treated in the same way as for @{ML Syntax.check_typs}.
   255 
   255 
   256   Applications sometimes need to check several types and terms together. The
   256   Applications sometimes need to check several types and terms together. The
   257   standard approach uses @{ML Logic.mk_type} to embed the language of types
   257   standard approach uses @{ML Logic.mk_type} to embed the language of types
   258   into that of terms; all arguments are appended into one list of terms that
   258   into that of terms; all arguments are appended into one list of terms that
   259   is checked; afterwards the original type arguments are recovered with @{ML
   259   is checked; afterwards the type arguments are recovered with @{ML
   260   Logic.dest_type}.
   260   Logic.dest_type}.
   261 
   261 
   262   \item @{ML Syntax.check_props}~@{text "ctxt ts"} checks a simultaneous list
   262   \item @{ML Syntax.check_props}~@{text "ctxt ts"} checks a simultaneous list
   263   of pre-terms as terms of the logic, such that all terms are constrained by
   263   of pre-terms as terms of the logic, such that all terms are constrained by
   264   type @{typ prop}. The remaining check operation works as @{ML
   264   type @{typ prop}. The remaining check operation works as @{ML
   271   list of terms of the logic, in preparation of pretty printing. There is no
   271   list of terms of the logic, in preparation of pretty printing. There is no
   272   distinction for propositions here.
   272   distinction for propositions here.
   273 
   273 
   274   \end{description}
   274   \end{description}
   275 
   275 
   276   These operations always operate simultaneously on multiple items; use the
   276   These operations always operate simultaneously on a list; use the combinator
   277   combinator @{ML singleton} to apply them to a single item.
   277   @{ML singleton} to apply them to a single item.
   278 *}
   278 *}
   279 
   279 
   280 end
   280 end