| author | wenzelm | 
| Wed, 04 Nov 2015 18:32:47 +0100 | |
| changeset 61572 | ddb3ac3fef45 | 
| parent 61493 | 0debd22f0c0e | 
| child 61656 | cfabbc083977 | 
| permissions | -rw-r--r-- | 
| 57344 | 1 | (*:wrap=hard:maxLineLen=78:*) | 
| 2 | ||
| 30124 | 3 | theory Syntax | 
| 4 | imports Base | |
| 5 | begin | |
| 6 | ||
| 58618 | 7 | chapter \<open>Concrete syntax and type-checking\<close> | 
| 30124 | 8 | |
| 61493 | 9 | text \<open>Pure \<open>\<lambda>\<close>-calculus as introduced in \chref{ch:logic} is
 | 
| 45258 | 10 | an adequate foundation for logical languages --- in the tradition of | 
| 61477 | 11 | \<^emph>\<open>higher-order abstract syntax\<close> --- but end-users require | 
| 45258 | 12 | additional means for reading and printing of terms and types. This | 
| 61477 | 13 | important add-on outside the logical core is called \<^emph>\<open>inner | 
| 14 | syntax\<close> in Isabelle jargon, as opposed to the \<^emph>\<open>outer syntax\<close> of | |
| 58555 | 15 |   the theory and proof language @{cite "isabelle-isar-ref"}.
 | 
| 45258 | 16 | |
| 58555 | 17 |   For example, according to @{cite church40} quantifiers are represented as
 | 
| 61493 | 18 |   higher-order constants \<open>All :: ('a \<Rightarrow> bool) \<Rightarrow> bool\<close> such that \<open>All (\<lambda>x::'a. B x)\<close> faithfully represents the idea that is displayed in
 | 
| 19 |   Isabelle as \<open>\<forall>x::'a. B x\<close> via @{keyword "binder"} notation.
 | |
| 58555 | 20 |   Moreover, type-inference in the style of Hindley-Milner @{cite hindleymilner}
 | 
| 61493 | 21 | (and extensions) enables users to write \<open>\<forall>x. B x\<close> concisely, when | 
| 22 | the type \<open>'a\<close> is already clear from the | |
| 61572 | 23 | context.\<^footnote>\<open>Type-inference taken to the extreme can easily confuse | 
| 57496 | 24 | users. Beginners often stumble over unexpectedly general types inferred by | 
| 61572 | 25 | the system.\<close> | 
| 45258 | 26 | |
| 61416 | 27 | \<^medskip> | 
| 61477 | 28 | The main inner syntax operations are \<^emph>\<open>read\<close> for | 
| 29 | parsing together with type-checking, and \<^emph>\<open>pretty\<close> for formatted | |
| 45258 | 30 |   output.  See also \secref{sec:read-print}.
 | 
| 31 | ||
| 32 | Furthermore, the input and output syntax layers are sub-divided into | |
| 61477 | 33 | separate phases for \<^emph>\<open>concrete syntax\<close> versus \<^emph>\<open>abstract | 
| 34 |   syntax\<close>, see also \secref{sec:parse-unparse} and
 | |
| 45258 | 35 |   \secref{sec:term-check}, respectively.  This results in the
 | 
| 36 | following decomposition of the main operations: | |
| 37 | ||
| 61493 | 38 | \<^item> \<open>read = parse; check\<close> | 
| 45258 | 39 | |
| 61493 | 40 | \<^item> \<open>pretty = uncheck; unparse\<close> | 
| 45258 | 41 | |
| 42 | ||
| 57496 | 43 | For example, some specification package might thus intercept syntax | 
| 61493 | 44 | processing at a well-defined stage after \<open>parse\<close>, to a augment the | 
| 45 | resulting pre-term before full type-reconstruction is performed by \<open>check\<close>. Note that the formal status of bound variables, versus free | |
| 57496 | 46 | variables, versus constants must not be changed between these phases. | 
| 57345 | 47 | |
| 61416 | 48 | \<^medskip> | 
| 61493 | 49 | In general, \<open>check\<close> and \<open>uncheck\<close> operate | 
| 57345 | 50 | simultaneously on a list of terms. This is particular important for | 
| 51 | type-checking, to reconstruct types for several terms of the same context | |
| 61493 | 52 | and scope. In contrast, \<open>parse\<close> and \<open>unparse\<close> operate separately | 
| 57496 | 53 | on single terms. | 
| 57345 | 54 | |
| 55 | There are analogous operations to read and print types, with the same | |
| 56 | sub-division into phases. | |
| 58618 | 57 | \<close> | 
| 45258 | 58 | |
| 30272 | 59 | |
| 58618 | 60 | section \<open>Reading and pretty printing \label{sec:read-print}\<close>
 | 
| 34924 | 61 | |
| 58618 | 62 | text \<open> | 
| 57496 | 63 | Read and print operations are roughly dual to each other, such that for the | 
| 61493 | 64 | user \<open>s' = pretty (read s)\<close> looks similar to the original source | 
| 65 | text \<open>s\<close>, but the details depend on many side-conditions. There are | |
| 57496 | 66 | also explicit options to control the removal of type information in the | 
| 61493 | 67 | output. The default configuration routinely looses information, so \<open>t' = read (pretty t)\<close> might fail, or produce a differently typed term, or | 
| 57496 | 68 | a completely different term in the face of syntactic overloading. | 
| 58618 | 69 | \<close> | 
| 34924 | 70 | |
| 58618 | 71 | text %mlref \<open> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 72 |   \begin{mldecls}
 | 
| 57345 | 73 |   @{index_ML Syntax.read_typs: "Proof.context -> string list -> typ list"} \\
 | 
| 74 |   @{index_ML Syntax.read_terms: "Proof.context -> string list -> term list"} \\
 | |
| 75 |   @{index_ML Syntax.read_props: "Proof.context -> string list -> term list"} \\[0.5ex]
 | |
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 76 |   @{index_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 77 |   @{index_ML Syntax.read_term: "Proof.context -> string -> term"} \\
 | 
| 57345 | 78 |   @{index_ML Syntax.read_prop: "Proof.context -> string -> term"} \\[0.5ex]
 | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 79 |   @{index_ML Syntax.pretty_typ: "Proof.context -> typ -> Pretty.T"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 80 |   @{index_ML Syntax.pretty_term: "Proof.context -> term -> Pretty.T"} \\
 | 
| 57345 | 81 |   @{index_ML Syntax.string_of_typ: "Proof.context -> typ -> string"} \\
 | 
| 82 |   @{index_ML Syntax.string_of_term: "Proof.context -> term -> string"} \\
 | |
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 83 |   \end{mldecls}
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 84 | |
| 61493 | 85 |   \<^descr> @{ML Syntax.read_typs}~\<open>ctxt strs\<close> parses and checks a
 | 
| 57345 | 86 | simultaneous list of source strings as types of the logic. | 
| 87 | ||
| 61493 | 88 |   \<^descr> @{ML Syntax.read_terms}~\<open>ctxt strs\<close> parses and checks a
 | 
| 57345 | 89 | simultaneous list of source strings as terms of the logic. | 
| 57346 | 90 | Type-reconstruction puts all parsed terms into the same scope: types of | 
| 91 | free variables ultimately need to coincide. | |
| 57345 | 92 | |
| 93 | If particular type-constraints are required for some of the arguments, the | |
| 57346 | 94 | read operations needs to be split into its parse and check phases. Then it | 
| 57496 | 95 |   is possible to use @{ML Type.constraint} on the intermediate pre-terms
 | 
| 57846 | 96 |   (\secref{sec:term-check}).
 | 
| 57345 | 97 | |
| 61493 | 98 |   \<^descr> @{ML Syntax.read_props}~\<open>ctxt strs\<close> parses and checks a
 | 
| 57346 | 99 | simultaneous list of source strings as terms of the logic, with an implicit | 
| 100 |   type-constraint for each argument to enforce type @{typ prop}; this also
 | |
| 57496 | 101 | affects the inner syntax for parsing. The remaining type-reconstruction | 
| 102 |   works as for @{ML Syntax.read_terms}.
 | |
| 57345 | 103 | |
| 61439 | 104 |   \<^descr> @{ML Syntax.read_typ}, @{ML Syntax.read_term}, @{ML Syntax.read_prop}
 | 
| 57496 | 105 | are like the simultaneous versions, but operate on a single argument only. | 
| 106 | This convenient shorthand is adequate in situations where a single item in | |
| 107 |   its own scope is processed. Do not use @{ML "map o Syntax.read_term"} where
 | |
| 108 |   @{ML Syntax.read_terms} is actually intended!
 | |
| 57345 | 109 | |
| 61493 | 110 |   \<^descr> @{ML Syntax.pretty_typ}~\<open>ctxt T\<close> and @{ML
 | 
| 111 | Syntax.pretty_term}~\<open>ctxt t\<close> uncheck and pretty-print the given type | |
| 57345 | 112 | or term, respectively. Although the uncheck phase acts on a simultaneous | 
| 57496 | 113 | list as well, this is rarely used in practice, so only the singleton case is | 
| 114 | provided as combined pretty operation. There is no distinction of term vs.\ | |
| 115 | proposition. | |
| 57345 | 116 | |
| 61439 | 117 |   \<^descr> @{ML Syntax.string_of_typ} and @{ML Syntax.string_of_term} are
 | 
| 57345 | 118 |   convenient compositions of @{ML Syntax.pretty_typ} and @{ML
 | 
| 119 |   Syntax.pretty_term} with @{ML Pretty.string_of} for output. The result may
 | |
| 120 | be concatenated with other strings, as long as there is no further | |
| 121 | formatting and line-breaking involved. | |
| 122 | ||
| 123 | ||
| 57346 | 124 |   @{ML Syntax.read_term}, @{ML Syntax.read_prop}, and @{ML
 | 
| 125 | Syntax.string_of_term} are the most important operations in practice. | |
| 57345 | 126 | |
| 61416 | 127 | \<^medskip> | 
| 128 | Note that the string values that are passed in and out are | |
| 57346 | 129 | annotated by the system, to carry further markup that is relevant for the | 
| 58555 | 130 |   Prover IDE @{cite "isabelle-jedit"}. User code should neither compose its
 | 
| 131 | own input strings, nor try to analyze the output strings. Conceptually this | |
| 132 | is an abstract datatype, encoded as concrete string for historical reasons. | |
| 57345 | 133 | |
| 134 | The standard way to provide the required position markup for input works via | |
| 135 |   the outer syntax parser wrapper @{ML Parse.inner_syntax}, which is already
 | |
| 136 |   part of @{ML Parse.typ}, @{ML Parse.term}, @{ML Parse.prop}. So a string
 | |
| 137 | obtained from one of the latter may be directly passed to the corresponding | |
| 57346 | 138 | read operation: this yields PIDE markup of the input and precise positions | 
| 139 | for warning and error messages. | |
| 58618 | 140 | \<close> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 141 | |
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 142 | |
| 58618 | 143 | section \<open>Parsing and unparsing \label{sec:parse-unparse}\<close>
 | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 144 | |
| 58618 | 145 | text \<open> | 
| 57496 | 146 | Parsing and unparsing converts between actual source text and a certain | 
| 61477 | 147 | \<^emph>\<open>pre-term\<close> format, where all bindings and scopes are already resolved | 
| 57496 | 148 | faithfully. Thus the names of free variables or constants are determined in | 
| 149 | the sense of the logical context, but type information might be still | |
| 61477 | 150 | missing. Pre-terms support an explicit language of \<^emph>\<open>type constraints\<close> | 
| 151 | that may be augmented by user code to guide the later \<^emph>\<open>check\<close> phase. | |
| 45258 | 152 | |
| 57496 | 153 | Actual parsing is based on traditional lexical analysis and Earley parsing | 
| 154 | for arbitrary context-free grammars. The user can specify the grammar | |
| 61477 | 155 | declaratively via mixfix annotations. Moreover, there are \<^emph>\<open>syntax | 
| 156 | translations\<close> that can be augmented by the user, either declaratively via | |
| 57496 | 157 |   @{command translations} or programmatically via @{command
 | 
| 58555 | 158 |   parse_translation}, @{command print_translation} @{cite
 | 
| 159 | "isabelle-isar-ref"}. The final scope-resolution is performed by the system, | |
| 160 | according to name spaces for types, term variables and constants determined | |
| 161 | by the context. | |
| 58618 | 162 | \<close> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 163 | |
| 58618 | 164 | text %mlref \<open> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 165 |   \begin{mldecls}
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 166 |   @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 167 |   @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
 | 
| 57496 | 168 |   @{index_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\[0.5ex]
 | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 169 |   @{index_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 170 |   @{index_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 171 |   \end{mldecls}
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 172 | |
| 61493 | 173 |   \<^descr> @{ML Syntax.parse_typ}~\<open>ctxt str\<close> parses a source string as
 | 
| 57346 | 174 | pre-type that is ready to be used with subsequent check operations. | 
| 175 | ||
| 61493 | 176 |   \<^descr> @{ML Syntax.parse_term}~\<open>ctxt str\<close> parses a source string as
 | 
| 57346 | 177 | pre-term that is ready to be used with subsequent check operations. | 
| 178 | ||
| 61493 | 179 |   \<^descr> @{ML Syntax.parse_prop}~\<open>ctxt str\<close> parses a source string as
 | 
| 57346 | 180 | pre-term that is ready to be used with subsequent check operations. The | 
| 181 |   inner syntax category is @{typ prop} and a suitable type-constraint is
 | |
| 57496 | 182 | included to ensure that this information is observed in subsequent type | 
| 183 | reconstruction. | |
| 57346 | 184 | |
| 61493 | 185 |   \<^descr> @{ML Syntax.unparse_typ}~\<open>ctxt T\<close> unparses a type after
 | 
| 57346 | 186 | uncheck operations, to turn it into a pretty tree. | 
| 187 | ||
| 61493 | 188 |   \<^descr> @{ML Syntax.unparse_term}~\<open>ctxt T\<close> unparses a term after
 | 
| 57346 | 189 | uncheck operations, to turn it into a pretty tree. There is no distinction | 
| 190 | for propositions here. | |
| 191 | ||
| 192 | ||
| 57496 | 193 |   These operations always operate on a single item; use the combinator @{ML
 | 
| 194 | map} to apply them to a list. | |
| 58618 | 195 | \<close> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 196 | |
| 39852 | 197 | |
| 58618 | 198 | section \<open>Checking and unchecking \label{sec:term-check}\<close>
 | 
| 34924 | 199 | |
| 58618 | 200 | text \<open>These operations define the transition from pre-terms and | 
| 45258 | 201 | fully-annotated terms in the sense of the logical core | 
| 202 |   (\chref{ch:logic}).
 | |
| 203 | ||
| 61477 | 204 | The \<^emph>\<open>check\<close> phase is meant to subsume a variety of mechanisms | 
| 45258 | 205 | in the manner of ``type-inference'' or ``type-reconstruction'' or | 
| 206 | ``type-improvement'', not just type-checking in the narrow sense. | |
| 61477 | 207 | The \<^emph>\<open>uncheck\<close> phase is roughly dual, it prunes type-information | 
| 45258 | 208 | before pretty printing. | 
| 209 | ||
| 210 |   A typical add-on for the check/uncheck syntax layer is the @{command
 | |
| 58555 | 211 |   abbreviation} mechanism @{cite "isabelle-isar-ref"}. Here the user specifies
 | 
| 61493 | 212 | syntactic definitions that are managed by the system as polymorphic \<open>let\<close> bindings. These are expanded during the \<open>check\<close> phase, and | 
| 213 | contracted during the \<open>uncheck\<close> phase, without affecting the | |
| 57496 | 214 | type-assignment of the given terms. | 
| 45258 | 215 | |
| 61416 | 216 | \<^medskip> | 
| 217 | The precise meaning of type checking depends on the context --- | |
| 57496 | 218 | additional check/uncheck modules might be defined in user space. | 
| 45258 | 219 | |
| 220 |   For example, the @{command class} command defines a context where
 | |
| 61493 | 221 | \<open>check\<close> treats certain type instances of overloaded | 
| 45258 | 222 | constants according to the ``dictionary construction'' of its | 
| 223 | logical foundation. This involves ``type improvement'' | |
| 224 | (specialization of slightly too general types) and replacement by | |
| 58555 | 225 |   certain locale parameters.  See also @{cite "Haftmann-Wenzel:2009"}.
 | 
| 58618 | 226 | \<close> | 
| 39852 | 227 | |
| 58618 | 228 | text %mlref \<open> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 229 |   \begin{mldecls}
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 230 |   @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 231 |   @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
 | 
| 57496 | 232 |   @{index_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\[0.5ex]
 | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 233 |   @{index_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 234 |   @{index_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 235 |   \end{mldecls}
 | 
| 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 236 | |
| 61493 | 237 |   \<^descr> @{ML Syntax.check_typs}~\<open>ctxt Ts\<close> checks a simultaneous list
 | 
| 57346 | 238 | of pre-types as types of the logic. Typically, this involves normalization | 
| 239 | of type synonyms. | |
| 240 | ||
| 61493 | 241 |   \<^descr> @{ML Syntax.check_terms}~\<open>ctxt ts\<close> checks a simultaneous list
 | 
| 57346 | 242 | of pre-terms as terms of the logic. Typically, this involves type-inference | 
| 243 | and normalization term abbreviations. The types within the given terms are | |
| 244 |   treated in the same way as for @{ML Syntax.check_typs}.
 | |
| 245 | ||
| 246 | Applications sometimes need to check several types and terms together. The | |
| 247 |   standard approach uses @{ML Logic.mk_type} to embed the language of types
 | |
| 248 | into that of terms; all arguments are appended into one list of terms that | |
| 57496 | 249 |   is checked; afterwards the type arguments are recovered with @{ML
 | 
| 57346 | 250 | Logic.dest_type}. | 
| 251 | ||
| 61493 | 252 |   \<^descr> @{ML Syntax.check_props}~\<open>ctxt ts\<close> checks a simultaneous list
 | 
| 57346 | 253 | of pre-terms as terms of the logic, such that all terms are constrained by | 
| 254 |   type @{typ prop}. The remaining check operation works as @{ML
 | |
| 255 | Syntax.check_terms} above. | |
| 256 | ||
| 61493 | 257 |   \<^descr> @{ML Syntax.uncheck_typs}~\<open>ctxt Ts\<close> unchecks a simultaneous
 | 
| 57346 | 258 | list of types of the logic, in preparation of pretty printing. | 
| 259 | ||
| 61493 | 260 |   \<^descr> @{ML Syntax.uncheck_terms}~\<open>ctxt ts\<close> unchecks a simultaneous
 | 
| 57346 | 261 | list of terms of the logic, in preparation of pretty printing. There is no | 
| 262 | distinction for propositions here. | |
| 263 | ||
| 264 | ||
| 57496 | 265 | These operations always operate simultaneously on a list; use the combinator | 
| 266 |   @{ML singleton} to apply them to a single item.
 | |
| 58618 | 267 | \<close> | 
| 39876 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 wenzelm parents: 
39865diff
changeset | 268 | |
| 30124 | 269 | end |