1 |
1 |
2 \chapter{Isar Syntax Primitives} |
2 \chapter{Syntax primitives} |
3 |
3 |
4 We give a complete reference of all basic syntactic entities underlying the |
4 The rather generic framework of Isabelle/Isar syntax emerges from three main |
5 Isabelle/Isar document syntax. Actual theory and proof commands will be |
5 syntactic categories: \emph{commands} of the top-level Isar engine (covering |
6 introduced later on. |
6 theory and proof elements), \emph{methods} for general goal refinements |
|
7 (analogous to traditional ``tactics''), and \emph{attributes} for operations |
|
8 on facts (within a certain context). Here we give a reference of basic |
|
9 syntactic entities underlying Isabelle/Isar syntax in a bottom-up manner. |
|
10 Concrete theory and proof language elements will be introduced later on. |
7 |
11 |
8 \medskip |
12 \medskip |
9 |
13 |
10 In order to get started with writing well-formed Isabelle/Isar documents, the |
14 In order to get started with writing well-formed Isabelle/Isar documents, the |
11 most important aspect to be noted is the difference of \emph{inner} versus |
15 most important aspect to be noted is the difference of \emph{inner} versus |
12 \emph{outer} syntax. Inner syntax is that of Isabelle types and terms of the |
16 \emph{outer} syntax. Inner syntax is that of Isabelle types and terms of the |
13 logic, while outer syntax is that of Isabelle/Isar theories (including |
17 logic, while outer syntax is that of Isabelle/Isar theory sources (including |
14 proofs). As a general rule, inner syntax entities may occur only as |
18 proofs). As a general rule, inner syntax entities may occur only as |
15 \emph{atomic entities} within outer syntax. For example, the string |
19 \emph{atomic entities} within outer syntax. For example, the string |
16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications |
20 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications |
17 within a theory, while \texttt{x + y} is not. |
21 within a theory, while \texttt{x + y} is not. |
18 |
22 |
19 \begin{warn} |
23 \begin{warn} |
20 Note that classic Isabelle theories used to fake parts of the inner syntax |
24 Old-style Isabelle theories used to fake parts of the inner syntax of types, |
21 of types, with rather complicated rules when quotes may be omitted. Despite |
25 with rather complicated rules when quotes may be omitted. Despite the minor |
22 the minor drawback of requiring quotes more often, the syntax of |
26 drawback of requiring quotes more often, the syntax of Isabelle/Isar is |
23 Isabelle/Isar is much simpler and more robust in that respect. |
27 somewhat simpler and more robust in that respect. |
24 \end{warn} |
28 \end{warn} |
25 |
29 |
|
30 Printed theory documents usually omit quotes to gain readability (this is a |
|
31 matter of {\LaTeX} macro setup, say via \verb,\isabellestyle,, see also |
|
32 \cite{isabelle-sys}). Experienced users of Isabelle/Isar may easily |
|
33 reconstruct the lost technical information, while mere readers need not care |
|
34 about quotes at all. |
|
35 |
26 \medskip |
36 \medskip |
27 |
37 |
28 Isabelle/Isar input may contain any number of input termination characters |
38 Isabelle/Isar input may contain any number of input termination characters |
29 ``\texttt{;}'' (semicolon) to separate commands explicitly. This may be |
39 ``\texttt{;}'' (semicolon) to separate commands explicitly. This is |
30 particularly useful in interactive shell sessions to make clear where the |
40 particularly useful in interactive shell sessions to make clear where the |
31 current command is intended to end. Otherwise, the interactive loop will |
41 current command is intended to end. Otherwise, the interpreter loop will |
32 continue until end-of-command is clearly indicated from the input syntax, |
42 continue to issue a secondary prompt ``\verb,#,'' until an end-of-command is |
33 e.g.\ encounter of the next command keyword. |
43 clearly indicated from the input syntax, e.g.\ encounter of the next command |
|
44 keyword. |
34 |
45 |
35 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require |
46 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require |
36 explicit semicolons, the amount of input text is determined automatically by |
47 explicit semicolons, the amount of input text is determined automatically by |
37 inspecting the Emacs text buffer. Also note that in the presentation of |
48 inspecting the present content of the Emacs text buffer. In the printed |
38 Isabelle/Isar documents, semicolons are omitted in any case to gain |
49 presentation of Isabelle/Isar documents semicolons are omitted altogether for |
39 readability. |
50 readability. |
40 |
51 |
|
52 \begin{warn} |
|
53 Proof~General requires certain syntax classification tables in order to |
|
54 achieve properly synchronized interaction with the Isabelle/Isar process. |
|
55 These tables need to be consistent with the Isabelle version and particular |
|
56 logic image to be used in a running session (common object-logics may well |
|
57 change the outer syntax). The standard setup should work correctly with any |
|
58 of the ``official'' logic images derived from Isabelle/HOL (including HOLCF |
|
59 etc.). Users of alternative logics may need to tell Proof~General |
|
60 explicitly, e.g.\ by giving an option \verb,-k ZF, (in conjunction with |
|
61 \verb,-l ZF, to specify the default logic image). |
|
62 \end{warn} |
41 |
63 |
42 \section{Lexical matters}\label{sec:lex-syntax} |
64 \section{Lexical matters}\label{sec:lex-syntax} |
43 |
65 |
44 The Isabelle/Isar outer syntax provides token classes as presented below. |
66 The Isabelle/Isar outer syntax provides token classes as presented below. |
45 Note that some of these coincide (by full intention) with the inner lexical |
67 Note that some of these coincide (by full intention) with the inner lexical |
46 syntax as presented in \cite{isabelle-ref}. These different levels of syntax |
68 syntax as presented in \cite{isabelle-ref}. |
47 should not be confused, though. |
69 |
48 |
|
49 %FIXME keyword, command |
|
50 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident} |
70 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident} |
51 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree} |
71 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree} |
52 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim} |
72 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim} |
53 \begin{matharray}{rcl} |
73 \begin{matharray}{rcl} |
54 ident & = & letter~quasiletter^* \\ |
74 ident & = & letter~quasiletter^* \\ |
67 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\ |
87 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\ |
68 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$ |
88 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$ |
69 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\ |
89 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\ |
70 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~ |
90 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~ |
71 \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\ |
91 \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\ |
72 symbol & = & {\forall} ~|~ {\exists} ~|~ \dots |
92 symbol & = & {\forall} ~|~ {\exists} ~|~ {\land} ~|~ {\lor} ~|~ \dots |
73 \end{matharray} |
93 \end{matharray} |
74 |
94 |
75 The syntax of \texttt{string} admits any characters, including newlines; |
95 The syntax of \railtoken{string} admits any characters, including newlines; |
76 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) have to be escaped by |
96 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) need to be escaped by |
77 a backslash; newlines need not be escaped. Note that ML-style control |
97 a backslash. Note that ML-style control characters are \emph{not} supported. |
78 characters are \emph{not} supported. The body of \texttt{verbatim} may |
98 The body of \railtoken{verbatim} may consist of any text not containing |
79 consist of any text not containing ``\verb|*}|''. |
99 ``\verb|*}|''; this allows handsome inclusion of quotes without further |
80 |
100 escapes. |
81 Comments take the form \texttt{(*~\dots~*)} and may be |
101 |
82 nested\footnote{Proof~General may occasionally get confused by nested |
102 Comments take the form \texttt{(*~\dots~*)} and may in principle be nested, |
83 comments.}, just as in ML. Note that these are \emph{source} comments only, |
103 just as in ML. Note that these are \emph{source} comments only, which are |
84 which are stripped after lexical analysis of the input. The Isar document |
104 stripped after lexical analysis of the input. The Isar document syntax also |
85 syntax also provides \emph{formal comments} that are actually part of the text |
105 provides \emph{formal comments} that are actually part of the text (see |
86 (see \S\ref{sec:comments}). |
106 \S\ref{sec:comments}). |
|
107 |
|
108 \begin{warn} |
|
109 Proof~General does not handle nested comments properly; it is also unable to |
|
110 keep \verb,(*,\,/\,\verb,{*, and \verb,*),\,/\,\verb,*}, apart, despite |
|
111 their rather different meaning. These are inherent problems of Emacs |
|
112 legacy. |
|
113 \end{warn} |
|
114 |
|
115 \medskip |
87 |
116 |
88 Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as |
117 Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as |
89 ``\verb,\<forall>,''. |
118 ``\verb,\<forall>,''. Concerning Isabelle itself, any sequence of the form |
|
119 \verb,\<,$ident$\verb,>, (or \verb,\\<,$ident$\verb,>,) is a legal symbol. |
|
120 Display of appropriate glyphs is a matter of front-end tools, say the |
|
121 user-interface of Proof~General plus the X-Symbol package, or the {\LaTeX} |
|
122 macro setup of document output. A list of predefined Isabelle symbols is |
|
123 given in \cite[Appendix~A]{isabelle-sys}. |
90 |
124 |
91 |
125 |
92 \section{Common syntax entities} |
126 \section{Common syntax entities} |
93 |
127 |
94 Subsequently, we introduce several basic syntactic entities, such as names, |
128 Subsequently, we introduce several basic syntactic entities, such as names, |
129 \subsection{Comments}\label{sec:comments} |
163 \subsection{Comments}\label{sec:comments} |
130 |
164 |
131 Large chunks of plain \railqtoken{text} are usually given |
165 Large chunks of plain \railqtoken{text} are usually given |
132 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|. For |
166 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|. For |
133 convenience, any of the smaller text units conforming to \railqtoken{nameref} |
167 convenience, any of the smaller text units conforming to \railqtoken{nameref} |
134 are admitted as well. Almost any of the Isar commands may be annotated by |
168 are admitted as well. Almost any of the Isar commands may be annotated by a |
135 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}. |
169 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}. |
136 Note that the latter kind of comment is actually part of the language, while |
170 Note that the latter kind of comment is actually part of the language, while |
137 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical |
171 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical |
138 level. A few commands such as $\PROOFNAME$ admit additional markup with a |
172 level. |
139 ``level of interest'': \texttt{\%} followed by an optional number $n$ (default |
173 |
140 $n = 1$) indicates that the respective part of the document becomes $n$ levels |
174 A few commands such as $\PROOFNAME$ admit additional markup with a ``level of |
141 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon |
175 interest'': \texttt{\%} followed by an optional number $n$ (default $n = 1$) |
142 every hope, who enter here. |
176 indicates that the respective part of the document becomes $n$ levels more |
|
177 obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon every |
|
178 hope, who enter here. So far the Isabelle tool-chain (for document output |
|
179 etc.) does not yet treat interest levels specifically. |
143 |
180 |
144 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest} |
181 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest} |
145 \begin{rail} |
182 \begin{rail} |
146 text: verbatim | nameref |
183 text: verbatim | nameref |
147 ; |
184 ; |
182 too sophisticated in order to be modelled explicitly at the outer theory |
219 too sophisticated in order to be modelled explicitly at the outer theory |
183 level. Basically, any such entity has to be quoted to turn it into a single |
220 level. Basically, any such entity has to be quoted to turn it into a single |
184 token (the parsing and type-checking is performed internally later). For |
221 token (the parsing and type-checking is performed internally later). For |
185 convenience, a slightly more liberal convention is adopted: quotes may be |
222 convenience, a slightly more liberal convention is adopted: quotes may be |
186 omitted for any type or term that is already \emph{atomic} at the outer level. |
223 omitted for any type or term that is already \emph{atomic} at the outer level. |
187 For example, one may write just \texttt{x} instead of \texttt{"x"}. Note that |
224 For example, one may just write \texttt{x} instead of \texttt{"x"}. Note that |
188 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well, |
225 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well, |
189 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}). |
226 provided these have not been superseded by commands or other keywords already |
|
227 (e.g.\ \texttt{=} or \texttt{+}). |
190 |
228 |
191 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop} |
229 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop} |
192 \begin{rail} |
230 \begin{rail} |
193 type: nameref | typefree | typevar |
231 type: nameref | typefree | typevar |
194 ; |
232 ; |
219 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name |
257 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name |
220 ; |
258 ; |
221 \end{rail} |
259 \end{rail} |
222 |
260 |
223 |
261 |
224 \subsection{Term patterns}\label{sec:term-pats} |
|
225 |
|
226 Assumptions and goal statements usually admit casual binding of schematic term |
|
227 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$. |
|
228 There are separate versions available for \railqtoken{term}s and |
|
229 \railqtoken{prop}s. The latter provides a $\CONCLNAME$ part with patterns |
|
230 referring the (atomic) conclusion of a rule. |
|
231 |
|
232 \indexouternonterm{termpat}\indexouternonterm{proppat} |
|
233 \begin{rail} |
|
234 termpat: '(' ('is' term +) ')' |
|
235 ; |
|
236 proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')' |
|
237 ; |
|
238 \end{rail} |
|
239 |
|
240 |
|
241 \subsection{Mixfix annotations} |
262 \subsection{Mixfix annotations} |
242 |
263 |
243 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and |
264 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and |
244 terms (see also \cite{isabelle-ref}). Some commands such as $\TYPES$ (see |
265 terms. Some commands such as $\TYPES$ (see \S\ref{sec:types-pure}) admit |
245 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see |
266 infixes only, while $\CONSTS$ (see \S\ref{sec:consts}) and |
246 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans}) |
267 $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans}) support the full range of |
247 support the full range of general mixfixes and binders. |
268 general mixfixes and binders. |
248 |
269 |
249 \indexouternonterm{infix}\indexouternonterm{mixfix} |
270 \indexouternonterm{infix}\indexouternonterm{mixfix} |
250 \begin{rail} |
271 \begin{rail} |
251 infix: '(' ('infix' | 'infixl' | 'infixr') string? nat ')' |
272 infix: '(' ('infix' | 'infixl' | 'infixr') string? nat ')' |
252 ; |
273 ; |
253 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')' |
274 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')' |
254 ; |
275 ; |
255 |
276 |
256 prios: '[' (nat + ',') ']' |
277 prios: '[' (nat + ',') ']' |
|
278 ; |
|
279 \end{rail} |
|
280 |
|
281 Here the \railtoken{string} specifications refer to the actual mixfix template |
|
282 (see also \cite{isabelle-ref}), which may include literal text, spacing, |
|
283 blocks, and arguments (denoted by ``$_$''); the special symbol \verb,\<index>, |
|
284 (printed as ``\i'') represents an index argument that specifies an implicit |
|
285 structure reference (see also \S\ref{sec:locale}). Infix and binder |
|
286 declarations provide common abbreviations for particular mixfix declarations. |
|
287 So in practice, mixfix templates mostly degenerate to literal text for |
|
288 concrete syntax, such as ``\verb,++,'' for an infix symbol, or ``\verb,++,\i'' |
|
289 for an infix of an implicit structure. |
|
290 |
|
291 |
|
292 |
|
293 \subsection{Proof methods}\label{sec:syn-meth} |
|
294 |
|
295 Proof methods are either basic ones, or expressions composed of methods via |
|
296 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices), |
|
297 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once). In practice, |
|
298 proof methods are usually just a comma separated list of |
|
299 \railqtoken{nameref}~\railnonterm{args} specifications. Note that parentheses |
|
300 may be dropped for single method specifications (with no arguments). |
|
301 |
|
302 \indexouternonterm{method} |
|
303 \begin{rail} |
|
304 method: (nameref | '(' methods ')') (() | '?' | '+') |
|
305 ; |
|
306 methods: (nameref args | method) + (',' | '|') |
|
307 ; |
|
308 \end{rail} |
|
309 |
|
310 Proper use of Isar proof methods does \emph{not} involve goal addressing. |
|
311 Nevertheless, specifying goal ranges may occasionally come in handy in |
|
312 emulating tactic scripts. Note that $[n-]$ refers to all goals, starting from |
|
313 $n$. All goals may be specified by $[!]$, which is the same as $[1-]$. |
|
314 |
|
315 \indexouternonterm{goalspec} |
|
316 \begin{rail} |
|
317 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']' |
257 ; |
318 ; |
258 \end{rail} |
319 \end{rail} |
259 |
320 |
260 |
321 |
261 \subsection{Attributes and theorems}\label{sec:syn-att} |
322 \subsection{Attributes and theorems}\label{sec:syn-att} |
307 thmbind: name attributes | name | attributes |
368 thmbind: name attributes | name | attributes |
308 ; |
369 ; |
309 \end{rail} |
370 \end{rail} |
310 |
371 |
311 |
372 |
312 \subsection{Proof methods}\label{sec:syn-meth} |
373 \subsection{Term patterns and declarations}\label{sec:term-decls} |
313 |
374 |
314 Proof methods are either basic ones, or expressions composed of methods via |
375 Wherever explicit propositions (or term fragments) occur in a proof text, |
315 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices), |
376 casual binding of schematic term variables may be given specified via patterns |
316 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once). In practice, |
377 of the form $\ISS{p@1\;\dots}{p@n}$. There are separate versions available |
317 proof methods are usually just a comma separated list of |
378 for \railqtoken{term}s and \railqtoken{prop}s. The latter provides a |
318 \railqtoken{nameref}~\railnonterm{args} specifications. Note that parentheses |
379 $\CONCLNAME$ part with patterns referring the (atomic) conclusion of a rule. |
319 may be dropped for single method specifications (with no arguments). |
380 |
320 |
381 \indexouternonterm{termpat}\indexouternonterm{proppat} |
321 \indexouternonterm{method} |
382 \begin{rail} |
322 \begin{rail} |
383 termpat: '(' ('is' term +) ')' |
323 method: (nameref | '(' methods ')') (() | '?' | '+') |
384 ; |
324 ; |
385 proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')' |
325 methods: (nameref args | method) + (',' | '|') |
386 ; |
326 ; |
387 \end{rail} |
327 \end{rail} |
388 |
328 |
389 Declarations of local variables $x :: \tau$ and logical propositions $a : |
329 Proper use of Isar proof methods does \emph{not} involve goal addressing. |
390 \phi$ represent different views on the same principle of introducing a local |
330 Nevertheless, specifying goal ranges may occasionally come in handy in |
391 scope. In practice, one may usually omit the typing of $vars$ (due to |
331 emulating tactic scripts. Note that $[n-]$ refers to all goals, starting from |
392 type-inference), and the naming of propositions (due to implicit chaining of |
332 $n$. All goals may be specified by $[!]$, which is the same as $[1-]$. |
393 emerging facts). In any case, Isar proof elements usually admit to introduce |
333 |
394 multiple such items simultaneously. |
334 \indexouternonterm{goalspec} |
395 |
335 \begin{rail} |
396 \indexouternonterm{vars}\indexouternonterm{props} |
336 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']' |
397 \begin{rail} |
337 ; |
398 vars: (name+) ('::' type)? |
338 \end{rail} |
399 ; |
|
400 props: thmdecl? (prop proppat? +) |
|
401 ; |
|
402 \end{rail} |
|
403 |
|
404 The treatment of multiple declarations corresponds to the complementary focus |
|
405 of $vars$ versus $props$: in ``$x@1~\dots~x@n :: \tau$'' the typing refers to |
|
406 all variables, while in $a\colon \phi@1~\dots~\phi@n$ the naming refers to all |
|
407 propositions collectively. Isar language elements that refer to $vars$ or |
|
408 $props$ typically admit separate typings or namings via another level of |
|
409 iteration, with explicit $\AND$ separators; e.g.\ see $\FIXNAME$ and |
|
410 $\ASSUMENAME$ in \S\ref{sec:proof-context}. |
339 |
411 |
340 |
412 |
341 \subsection{Antiquotations}\label{sec:antiq} |
413 \subsection{Antiquotations}\label{sec:antiq} |
342 |
414 |
343 \begin{matharray}{rcl} |
415 \begin{matharray}{rcl} |