| author | wenzelm | 
| Mon, 07 Jun 2021 11:42:05 +0200 | |
| changeset 73826 | 72900f34dbb3 | 
| parent 73764 | 35d8132633c6 | 
| child 76987 | 4c275405faae | 
| permissions | -rw-r--r-- | 
| 61656 | 1  | 
(*:maxLineLen=78:*)  | 
| 57344 | 2  | 
|
| 30124 | 3  | 
theory Syntax  | 
4  | 
imports Base  | 
|
5  | 
begin  | 
|
6  | 
||
| 58618 | 7  | 
chapter \<open>Concrete syntax and type-checking\<close>  | 
| 30124 | 8  | 
|
| 61854 | 9  | 
text \<open>  | 
10  | 
  Pure \<open>\<lambda>\<close>-calculus as introduced in \chref{ch:logic} is an adequate
 | 
|
11  | 
foundation for logical languages --- in the tradition of \<^emph>\<open>higher-order  | 
|
12  | 
abstract syntax\<close> --- but end-users require additional means for reading and  | 
|
13  | 
printing of terms and types. This important add-on outside the logical core  | 
|
14  | 
is called \<^emph>\<open>inner syntax\<close> in Isabelle jargon, as opposed to the \<^emph>\<open>outer  | 
|
15  | 
  syntax\<close> of the theory and proof language @{cite "isabelle-isar-ref"}.
 | 
|
| 45258 | 16  | 
|
| 58555 | 17  | 
  For example, according to @{cite church40} quantifiers are represented as
 | 
| 61854 | 18  | 
  higher-order constants \<open>All :: ('a \<Rightarrow> bool) \<Rightarrow> bool\<close> such that \<open>All (\<lambda>x::'a. B
 | 
19  | 
x)\<close> faithfully represents the idea that is displayed in Isabelle as \<open>\<forall>x::'a.  | 
|
20  | 
  B x\<close> via @{keyword "binder"} notation. Moreover, type-inference in the style
 | 
|
21  | 
  of Hindley-Milner @{cite hindleymilner} (and extensions) enables users to
 | 
|
22  | 
write \<open>\<forall>x. B x\<close> concisely, when the type \<open>'a\<close> is already clear from the  | 
|
23  | 
context.\<^footnote>\<open>Type-inference taken to the extreme can easily confuse users.  | 
|
24  | 
Beginners often stumble over unexpectedly general types inferred by the  | 
|
25  | 
system.\<close>  | 
|
| 45258 | 26  | 
|
| 61416 | 27  | 
\<^medskip>  | 
| 61854 | 28  | 
The main inner syntax operations are \<^emph>\<open>read\<close> for parsing together with  | 
29  | 
type-checking, and \<^emph>\<open>pretty\<close> for formatted output. See also  | 
|
30  | 
  \secref{sec:read-print}.
 | 
|
| 45258 | 31  | 
|
32  | 
Furthermore, the input and output syntax layers are sub-divided into  | 
|
| 61854 | 33  | 
separate phases for \<^emph>\<open>concrete syntax\<close> versus \<^emph>\<open>abstract syntax\<close>, see also  | 
34  | 
  \secref{sec:parse-unparse} and \secref{sec:term-check}, respectively. This
 | 
|
35  | 
results in the following decomposition of the main operations:  | 
|
| 45258 | 36  | 
|
| 61854 | 37  | 
\<^item> \<open>read = parse; check\<close>  | 
| 45258 | 38  | 
|
| 61854 | 39  | 
\<^item> \<open>pretty = uncheck; unparse\<close>  | 
| 45258 | 40  | 
|
| 57496 | 41  | 
For example, some specification package might thus intercept syntax  | 
| 61854 | 42  | 
processing at a well-defined stage after \<open>parse\<close>, to a augment the resulting  | 
43  | 
pre-term before full type-reconstruction is performed by \<open>check\<close>. Note that  | 
|
44  | 
the formal status of bound variables, versus free variables, versus  | 
|
45  | 
constants must not be changed between these phases.  | 
|
| 57345 | 46  | 
|
| 61416 | 47  | 
\<^medskip>  | 
| 61854 | 48  | 
In general, \<open>check\<close> and \<open>uncheck\<close> operate simultaneously on a list of terms.  | 
49  | 
This is particular important for type-checking, to reconstruct types for  | 
|
50  | 
several terms of the same context and scope. In contrast, \<open>parse\<close> and  | 
|
51  | 
\<open>unparse\<close> operate separately on single terms.  | 
|
| 57345 | 52  | 
|
53  | 
There are analogous operations to read and print types, with the same  | 
|
54  | 
sub-division into phases.  | 
|
| 58618 | 55  | 
\<close>  | 
| 45258 | 56  | 
|
| 30272 | 57  | 
|
| 58618 | 58  | 
section \<open>Reading and pretty printing \label{sec:read-print}\<close>
 | 
| 34924 | 59  | 
|
| 58618 | 60  | 
text \<open>  | 
| 57496 | 61  | 
Read and print operations are roughly dual to each other, such that for the  | 
| 61854 | 62  | 
user \<open>s' = pretty (read s)\<close> looks similar to the original source text \<open>s\<close>,  | 
63  | 
but the details depend on many side-conditions. There are also explicit  | 
|
64  | 
options to control the removal of type information in the output. The  | 
|
65  | 
default configuration routinely looses information, so \<open>t' = read (pretty  | 
|
66  | 
t)\<close> might fail, or produce a differently typed term, or a completely  | 
|
67  | 
different term in the face of syntactic overloading.  | 
|
| 58618 | 68  | 
\<close>  | 
| 34924 | 69  | 
|
| 58618 | 70  | 
text %mlref \<open>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
71  | 
  \begin{mldecls}
 | 
| 73764 | 72  | 
  @{define_ML Syntax.read_typs: "Proof.context -> string list -> typ list"} \\
 | 
73  | 
  @{define_ML Syntax.read_terms: "Proof.context -> string list -> term list"} \\
 | 
|
74  | 
  @{define_ML Syntax.read_props: "Proof.context -> string list -> term list"} \\[0.5ex]
 | 
|
75  | 
  @{define_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\
 | 
|
76  | 
  @{define_ML Syntax.read_term: "Proof.context -> string -> term"} \\
 | 
|
77  | 
  @{define_ML Syntax.read_prop: "Proof.context -> string -> term"} \\[0.5ex]
 | 
|
78  | 
  @{define_ML Syntax.pretty_typ: "Proof.context -> typ -> Pretty.T"} \\
 | 
|
79  | 
  @{define_ML Syntax.pretty_term: "Proof.context -> term -> Pretty.T"} \\
 | 
|
80  | 
  @{define_ML Syntax.string_of_typ: "Proof.context -> typ -> string"} \\
 | 
|
81  | 
  @{define_ML Syntax.string_of_term: "Proof.context -> term -> string"} \\
 | 
|
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
82  | 
  \end{mldecls}
 | 
| 
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
83  | 
|
| 69597 | 84  | 
\<^descr> \<^ML>\<open>Syntax.read_typs\<close>~\<open>ctxt strs\<close> parses and checks a simultaneous list  | 
| 61854 | 85  | 
of source strings as types of the logic.  | 
| 57345 | 86  | 
|
| 69597 | 87  | 
\<^descr> \<^ML>\<open>Syntax.read_terms\<close>~\<open>ctxt strs\<close> parses and checks a simultaneous list  | 
| 61854 | 88  | 
of source strings as terms of the logic. Type-reconstruction puts all parsed  | 
89  | 
terms into the same scope: types of free variables ultimately need to  | 
|
90  | 
coincide.  | 
|
| 57345 | 91  | 
|
92  | 
If particular type-constraints are required for some of the arguments, the  | 
|
| 57346 | 93  | 
read operations needs to be split into its parse and check phases. Then it  | 
| 69597 | 94  | 
is possible to use \<^ML>\<open>Type.constraint\<close> on the intermediate pre-terms  | 
| 57846 | 95  | 
  (\secref{sec:term-check}).
 | 
| 57345 | 96  | 
|
| 69597 | 97  | 
\<^descr> \<^ML>\<open>Syntax.read_props\<close>~\<open>ctxt strs\<close> parses and checks a simultaneous list  | 
| 61854 | 98  | 
of source strings as terms of the logic, with an implicit type-constraint  | 
| 69597 | 99  | 
for each argument to enforce type \<^typ>\<open>prop\<close>; this also affects the inner  | 
100  | 
syntax for parsing. The remaining type-reconstruction works as for \<^ML>\<open>Syntax.read_terms\<close>.  | 
|
| 57345 | 101  | 
|
| 69597 | 102  | 
\<^descr> \<^ML>\<open>Syntax.read_typ\<close>, \<^ML>\<open>Syntax.read_term\<close>, \<^ML>\<open>Syntax.read_prop\<close> are  | 
| 61854 | 103  | 
like the simultaneous versions, but operate on a single argument only. This  | 
104  | 
convenient shorthand is adequate in situations where a single item in its  | 
|
| 69597 | 105  | 
own scope is processed. Do not use \<^ML>\<open>map o Syntax.read_term\<close> where \<^ML>\<open>Syntax.read_terms\<close> is actually intended!  | 
| 57345 | 106  | 
|
| 69597 | 107  | 
\<^descr> \<^ML>\<open>Syntax.pretty_typ\<close>~\<open>ctxt T\<close> and \<^ML>\<open>Syntax.pretty_term\<close>~\<open>ctxt t\<close>  | 
| 61854 | 108  | 
uncheck and pretty-print the given type or term, respectively. Although the  | 
109  | 
uncheck phase acts on a simultaneous list as well, this is rarely used in  | 
|
110  | 
practice, so only the singleton case is provided as combined pretty  | 
|
111  | 
operation. There is no distinction of term vs.\ proposition.  | 
|
| 57345 | 112  | 
|
| 69597 | 113  | 
\<^descr> \<^ML>\<open>Syntax.string_of_typ\<close> and \<^ML>\<open>Syntax.string_of_term\<close> are convenient  | 
114  | 
compositions of \<^ML>\<open>Syntax.pretty_typ\<close> and \<^ML>\<open>Syntax.pretty_term\<close> with  | 
|
115  | 
\<^ML>\<open>Pretty.string_of\<close> for output. The result may be concatenated with other  | 
|
| 61854 | 116  | 
strings, as long as there is no further formatting and line-breaking  | 
117  | 
involved.  | 
|
| 57345 | 118  | 
|
119  | 
||
| 69597 | 120  | 
\<^ML>\<open>Syntax.read_term\<close>, \<^ML>\<open>Syntax.read_prop\<close>, and \<^ML>\<open>Syntax.string_of_term\<close> are the most important operations in practice.  | 
| 57345 | 121  | 
|
| 61416 | 122  | 
\<^medskip>  | 
| 61854 | 123  | 
Note that the string values that are passed in and out are annotated by the  | 
124  | 
  system, to carry further markup that is relevant for the Prover IDE @{cite
 | 
|
125  | 
"isabelle-jedit"}. User code should neither compose its own input strings,  | 
|
126  | 
nor try to analyze the output strings. Conceptually this is an abstract  | 
|
127  | 
datatype, encoded as concrete string for historical reasons.  | 
|
| 57345 | 128  | 
|
129  | 
The standard way to provide the required position markup for input works via  | 
|
| 69597 | 130  | 
the outer syntax parser wrapper \<^ML>\<open>Parse.inner_syntax\<close>, which is already  | 
131  | 
part of \<^ML>\<open>Parse.typ\<close>, \<^ML>\<open>Parse.term\<close>, \<^ML>\<open>Parse.prop\<close>. So a string  | 
|
| 57345 | 132  | 
obtained from one of the latter may be directly passed to the corresponding  | 
| 57346 | 133  | 
read operation: this yields PIDE markup of the input and precise positions  | 
134  | 
for warning and error messages.  | 
|
| 58618 | 135  | 
\<close>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
136  | 
|
| 
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
137  | 
|
| 58618 | 138  | 
section \<open>Parsing and unparsing \label{sec:parse-unparse}\<close>
 | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
139  | 
|
| 58618 | 140  | 
text \<open>  | 
| 57496 | 141  | 
Parsing and unparsing converts between actual source text and a certain  | 
| 61477 | 142  | 
\<^emph>\<open>pre-term\<close> format, where all bindings and scopes are already resolved  | 
| 57496 | 143  | 
faithfully. Thus the names of free variables or constants are determined in  | 
144  | 
the sense of the logical context, but type information might be still  | 
|
| 61854 | 145  | 
missing. Pre-terms support an explicit language of \<^emph>\<open>type constraints\<close> that  | 
146  | 
may be augmented by user code to guide the later \<^emph>\<open>check\<close> phase.  | 
|
| 45258 | 147  | 
|
| 57496 | 148  | 
Actual parsing is based on traditional lexical analysis and Earley parsing  | 
149  | 
for arbitrary context-free grammars. The user can specify the grammar  | 
|
| 61477 | 150  | 
declaratively via mixfix annotations. Moreover, there are \<^emph>\<open>syntax  | 
151  | 
translations\<close> that can be augmented by the user, either declaratively via  | 
|
| 57496 | 152  | 
  @{command translations} or programmatically via @{command
 | 
| 58555 | 153  | 
  parse_translation}, @{command print_translation} @{cite
 | 
154  | 
"isabelle-isar-ref"}. The final scope-resolution is performed by the system,  | 
|
155  | 
according to name spaces for types, term variables and constants determined  | 
|
156  | 
by the context.  | 
|
| 58618 | 157  | 
\<close>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
158  | 
|
| 58618 | 159  | 
text %mlref \<open>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
160  | 
  \begin{mldecls}
 | 
| 73764 | 161  | 
  @{define_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
 | 
162  | 
  @{define_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
 | 
|
163  | 
  @{define_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\[0.5ex]
 | 
|
164  | 
  @{define_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\
 | 
|
165  | 
  @{define_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\
 | 
|
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
166  | 
  \end{mldecls}
 | 
| 
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
167  | 
|
| 69597 | 168  | 
\<^descr> \<^ML>\<open>Syntax.parse_typ\<close>~\<open>ctxt str\<close> parses a source string as pre-type that  | 
| 61854 | 169  | 
is ready to be used with subsequent check operations.  | 
| 57346 | 170  | 
|
| 69597 | 171  | 
\<^descr> \<^ML>\<open>Syntax.parse_term\<close>~\<open>ctxt str\<close> parses a source string as pre-term that  | 
| 61854 | 172  | 
is ready to be used with subsequent check operations.  | 
| 57346 | 173  | 
|
| 69597 | 174  | 
\<^descr> \<^ML>\<open>Syntax.parse_prop\<close>~\<open>ctxt str\<close> parses a source string as pre-term that  | 
| 61854 | 175  | 
is ready to be used with subsequent check operations. The inner syntax  | 
| 69597 | 176  | 
category is \<^typ>\<open>prop\<close> and a suitable type-constraint is included to ensure  | 
| 61854 | 177  | 
that this information is observed in subsequent type reconstruction.  | 
| 57346 | 178  | 
|
| 69597 | 179  | 
\<^descr> \<^ML>\<open>Syntax.unparse_typ\<close>~\<open>ctxt T\<close> unparses a type after uncheck  | 
| 61854 | 180  | 
operations, to turn it into a pretty tree.  | 
| 57346 | 181  | 
|
| 69597 | 182  | 
\<^descr> \<^ML>\<open>Syntax.unparse_term\<close>~\<open>ctxt T\<close> unparses a term after uncheck  | 
| 61854 | 183  | 
operations, to turn it into a pretty tree. There is no distinction for  | 
184  | 
propositions here.  | 
|
| 57346 | 185  | 
|
186  | 
||
| 69597 | 187  | 
These operations always operate on a single item; use the combinator \<^ML>\<open>map\<close> to apply them to a list.  | 
| 58618 | 188  | 
\<close>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
189  | 
|
| 39852 | 190  | 
|
| 58618 | 191  | 
section \<open>Checking and unchecking \label{sec:term-check}\<close>
 | 
| 34924 | 192  | 
|
| 61854 | 193  | 
text \<open>  | 
194  | 
These operations define the transition from pre-terms and fully-annotated  | 
|
195  | 
  terms in the sense of the logical core (\chref{ch:logic}).
 | 
|
| 45258 | 196  | 
|
| 61854 | 197  | 
The \<^emph>\<open>check\<close> phase is meant to subsume a variety of mechanisms in the manner  | 
198  | 
of ``type-inference'' or ``type-reconstruction'' or ``type-improvement'',  | 
|
199  | 
not just type-checking in the narrow sense. The \<^emph>\<open>uncheck\<close> phase is roughly  | 
|
200  | 
dual, it prunes type-information before pretty printing.  | 
|
| 45258 | 201  | 
|
202  | 
  A typical add-on for the check/uncheck syntax layer is the @{command
 | 
|
| 58555 | 203  | 
  abbreviation} mechanism @{cite "isabelle-isar-ref"}. Here the user specifies
 | 
| 61854 | 204  | 
syntactic definitions that are managed by the system as polymorphic \<open>let\<close>  | 
205  | 
bindings. These are expanded during the \<open>check\<close> phase, and contracted during  | 
|
206  | 
the \<open>uncheck\<close> phase, without affecting the type-assignment of the given  | 
|
207  | 
terms.  | 
|
| 45258 | 208  | 
|
| 61416 | 209  | 
\<^medskip>  | 
| 61854 | 210  | 
The precise meaning of type checking depends on the context --- additional  | 
211  | 
check/uncheck modules might be defined in user space.  | 
|
| 45258 | 212  | 
|
| 61854 | 213  | 
  For example, the @{command class} command defines a context where \<open>check\<close>
 | 
214  | 
treats certain type instances of overloaded constants according to the  | 
|
215  | 
``dictionary construction'' of its logical foundation. This involves ``type  | 
|
216  | 
improvement'' (specialization of slightly too general types) and replacement  | 
|
217  | 
  by certain locale parameters. See also @{cite "Haftmann-Wenzel:2009"}.
 | 
|
| 58618 | 218  | 
\<close>  | 
| 39852 | 219  | 
|
| 58618 | 220  | 
text %mlref \<open>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
221  | 
  \begin{mldecls}
 | 
| 73764 | 222  | 
  @{define_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
 | 
223  | 
  @{define_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
 | 
|
224  | 
  @{define_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\[0.5ex]
 | 
|
225  | 
  @{define_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\
 | 
|
226  | 
  @{define_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\
 | 
|
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
227  | 
  \end{mldecls}
 | 
| 
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
228  | 
|
| 69597 | 229  | 
\<^descr> \<^ML>\<open>Syntax.check_typs\<close>~\<open>ctxt Ts\<close> checks a simultaneous list of pre-types  | 
| 61854 | 230  | 
as types of the logic. Typically, this involves normalization of type  | 
231  | 
synonyms.  | 
|
| 57346 | 232  | 
|
| 69597 | 233  | 
\<^descr> \<^ML>\<open>Syntax.check_terms\<close>~\<open>ctxt ts\<close> checks a simultaneous list of pre-terms  | 
| 61854 | 234  | 
as terms of the logic. Typically, this involves type-inference and  | 
235  | 
normalization term abbreviations. The types within the given terms are  | 
|
| 69597 | 236  | 
treated in the same way as for \<^ML>\<open>Syntax.check_typs\<close>.  | 
| 57346 | 237  | 
|
238  | 
Applications sometimes need to check several types and terms together. The  | 
|
| 69597 | 239  | 
standard approach uses \<^ML>\<open>Logic.mk_type\<close> to embed the language of types  | 
| 57346 | 240  | 
into that of terms; all arguments are appended into one list of terms that  | 
| 69597 | 241  | 
is checked; afterwards the type arguments are recovered with \<^ML>\<open>Logic.dest_type\<close>.  | 
| 57346 | 242  | 
|
| 69597 | 243  | 
\<^descr> \<^ML>\<open>Syntax.check_props\<close>~\<open>ctxt ts\<close> checks a simultaneous list of pre-terms  | 
244  | 
as terms of the logic, such that all terms are constrained by type \<^typ>\<open>prop\<close>. The remaining check operation works as \<^ML>\<open>Syntax.check_terms\<close>  | 
|
| 61854 | 245  | 
above.  | 
| 57346 | 246  | 
|
| 69597 | 247  | 
\<^descr> \<^ML>\<open>Syntax.uncheck_typs\<close>~\<open>ctxt Ts\<close> unchecks a simultaneous list of types  | 
| 61854 | 248  | 
of the logic, in preparation of pretty printing.  | 
| 57346 | 249  | 
|
| 69597 | 250  | 
\<^descr> \<^ML>\<open>Syntax.uncheck_terms\<close>~\<open>ctxt ts\<close> unchecks a simultaneous list of terms  | 
| 61854 | 251  | 
of the logic, in preparation of pretty printing. There is no distinction for  | 
252  | 
propositions here.  | 
|
| 57346 | 253  | 
|
254  | 
||
| 57496 | 255  | 
These operations always operate simultaneously on a list; use the combinator  | 
| 69597 | 256  | 
\<^ML>\<open>singleton\<close> to apply them to a single item.  | 
| 58618 | 257  | 
\<close>  | 
| 
39876
 
1ff9bce085bd
preliminary material on "Concrete syntax and type-checking";
 
wenzelm 
parents: 
39865 
diff
changeset
 | 
258  | 
|
| 30124 | 259  | 
end  |