| author | wenzelm | 
| Mon, 15 May 2023 20:55:17 +0200 | |
| changeset 78060 | b6c886b7184f | 
| parent 76987 | 4c275405faae | 
| child 80773 | 83d21da2bc59 | 
| permissions | -rw-r--r-- | 
| 61656 | 1 | (*:maxLineLen=78:*) | 
| 2 | ||
| 28762 | 3 | theory Inner_Syntax | 
| 63531 | 4 | imports Main Base | 
| 28762 | 5 | begin | 
| 6 | ||
| 58618 | 7 | chapter \<open>Inner syntax --- the term language \label{ch:inner-syntax}\<close>
 | 
| 28762 | 8 | |
| 62106 | 9 | text \<open> | 
| 10 | The inner syntax of Isabelle provides concrete notation for the main | |
| 11 | entities of the logical framework, notably \<open>\<lambda>\<close>-terms with types and type | |
| 12 | classes. Applications may either extend existing syntactic categories by | |
| 13 | additional notation, or define new sub-languages that are linked to the | |
| 14 | standard term language via some explicit markers. For example \<^verbatim>\<open>FOO\<close>~\<open>foo\<close> | |
| 15 | could embed the syntax corresponding for some user-defined nonterminal \<open>foo\<close> | |
| 16 | --- within the bounds of the given lexical syntax of Isabelle/Pure. | |
| 46282 | 17 | |
| 62106 | 18 | The most basic way to specify concrete syntax for logical entities works via | 
| 19 |   mixfix annotations (\secref{sec:mixfix}), which may be usually given as part
 | |
| 20 | of the original declaration or via explicit notation commands later on | |
| 21 |   (\secref{sec:notation}). This already covers many needs of concrete syntax
 | |
| 22 | without having to understand the full complexity of inner syntax layers. | |
| 46282 | 23 | |
| 62106 | 24 | Further details of the syntax engine involves the classical distinction of | 
| 25 |   lexical language versus context-free grammar (see \secref{sec:pure-syntax}),
 | |
| 26 | and various mechanisms for \<^emph>\<open>syntax transformations\<close> (see | |
| 27 |   \secref{sec:syntax-transformations}).
 | |
| 58618 | 28 | \<close> | 
| 46282 | 29 | |
| 30 | ||
| 58618 | 31 | section \<open>Printing logical entities\<close> | 
| 28762 | 32 | |
| 58618 | 33 | subsection \<open>Diagnostic commands \label{sec:print-diag}\<close>
 | 
| 28762 | 34 | |
| 58618 | 35 | text \<open> | 
| 28762 | 36 |   \begin{matharray}{rcl}
 | 
| 61493 | 37 |     @{command_def "typ"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | 
| 38 |     @{command_def "term"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | |
| 39 |     @{command_def "prop"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | |
| 40 |     @{command_def "thm"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | |
| 41 |     @{command_def "prf"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | |
| 42 |     @{command_def "full_prf"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | |
| 43 |     @{command_def "print_state"}\<open>\<^sup>*\<close> & : & \<open>any \<rightarrow>\<close> \\
 | |
| 28762 | 44 |   \end{matharray}
 | 
| 45 | ||
| 46 | These diagnostic commands assist interactive development by printing | |
| 47 | internal logical entities in a human-readable fashion. | |
| 48 | ||
| 69597 | 49 | \<^rail>\<open> | 
| 48792 | 50 |     @@{command typ} @{syntax modes}? @{syntax type} ('::' @{syntax sort})?
 | 
| 28762 | 51 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 52 |     @@{command term} @{syntax modes}? @{syntax term}
 | 
| 28762 | 53 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 54 |     @@{command prop} @{syntax modes}? @{syntax prop}
 | 
| 28762 | 55 | ; | 
| 62969 | 56 |     @@{command thm} @{syntax modes}? @{syntax thms}
 | 
| 28762 | 57 | ; | 
| 62969 | 58 |     ( @@{command prf} | @@{command full_prf} ) @{syntax modes}? @{syntax thms}?
 | 
| 28762 | 59 | ; | 
| 52430 | 60 |     @@{command print_state} @{syntax modes}?
 | 
| 28762 | 61 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 62 |     @{syntax_def modes}: '(' (@{syntax name} + ) ')'
 | 
| 69597 | 63 | \<close> | 
| 28762 | 64 | |
| 62106 | 65 |   \<^descr> @{command "typ"}~\<open>\<tau>\<close> reads and prints a type expression according to the
 | 
| 66 | current context. | |
| 48792 | 67 | |
| 62106 | 68 |   \<^descr> @{command "typ"}~\<open>\<tau> :: s\<close> uses type-inference to determine the most
 | 
| 69 | general way to make \<open>\<tau>\<close> conform to sort \<open>s\<close>. For concrete \<open>\<tau>\<close> this checks if | |
| 70 | the type belongs to that sort. Dummy type parameters ``\<open>_\<close>'' (underscore) | |
| 71 | are assigned to fresh type variables with most general sorts, according the | |
| 72 | the principles of type-inference. | |
| 28766 
accab7594b8e
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28765diff
changeset | 73 | |
| 62106 | 74 |     \<^descr> @{command "term"}~\<open>t\<close> and @{command "prop"}~\<open>\<phi>\<close> read, type-check and
 | 
| 75 | print terms or propositions according to the current theory or proof | |
| 76 | context; the inferred type of \<open>t\<close> is output as well. Note that these | |
| 77 | commands are also useful in inspecting the current environment of term | |
| 78 | abbreviations. | |
| 28762 | 79 | |
| 62106 | 80 |     \<^descr> @{command "thm"}~\<open>a\<^sub>1 \<dots> a\<^sub>n\<close> retrieves theorems from the current theory
 | 
| 81 | or proof context. Note that any attributes included in the theorem | |
| 82 | specifications are applied to a temporary context derived from the current | |
| 83 | theory or proof; the result is discarded, i.e.\ attributes involved in | |
| 84 | \<open>a\<^sub>1, \<dots>, a\<^sub>n\<close> do not have any permanent effect. | |
| 28762 | 85 | |
| 62106 | 86 |     \<^descr> @{command "prf"} displays the (compact) proof term of the current proof
 | 
| 63624 
994d1a1105ef
more informative 'prf' and 'full_prf', based on HOL/Proofs/ex/XML_Data.thy;
 wenzelm parents: 
63531diff
changeset | 87 | state (if present), or of the given theorems. Note that this requires an | 
| 
994d1a1105ef
more informative 'prf' and 'full_prf', based on HOL/Proofs/ex/XML_Data.thy;
 wenzelm parents: 
63531diff
changeset | 88 | underlying logic image with proof terms enabled, e.g. \<open>HOL-Proofs\<close>. | 
| 28762 | 89 | |
| 62106 | 90 |     \<^descr> @{command "full_prf"} is like @{command "prf"}, but displays the full
 | 
| 91 | proof term, i.e.\ also displays information omitted in the compact proof | |
| 92 | term, which is denoted by ``\<open>_\<close>'' placeholders there. | |
| 28762 | 93 | |
| 62106 | 94 |     \<^descr> @{command "print_state"} prints the current proof state (if present),
 | 
| 95 | including current facts and goals. | |
| 28762 | 96 | |
| 61997 | 97 | All of the diagnostic commands above admit a list of \<open>modes\<close> to be | 
| 98 | specified, which is appended to the current print mode; see also | |
| 99 |   \secref{sec:print-modes}. Thus the output behavior may be modified according
 | |
| 100 |   particular print mode features. For example, @{command
 | |
| 101 | "print_state"}~\<open>(latex)\<close> prints the current proof state with mathematical | |
| 102 |   symbols and special characters represented in {\LaTeX} source, according to
 | |
| 76987 | 103 | the Isabelle style \<^cite>\<open>"isabelle-system"\<close>. | 
| 28762 | 104 | |
| 62106 | 105 |   Note that antiquotations (cf.\ \secref{sec:antiq}) provide a more systematic
 | 
| 106 | way to include formal items into the printed text document. | |
| 58618 | 107 | \<close> | 
| 28762 | 108 | |
| 109 | ||
| 58618 | 110 | subsection \<open>Details of printed content\<close> | 
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 111 | |
| 58618 | 112 | text \<open> | 
| 42655 | 113 |   \begin{tabular}{rcll}
 | 
| 61493 | 114 |     @{attribute_def show_markup} & : & \<open>attribute\<close> \\
 | 
| 115 |     @{attribute_def show_types} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 116 |     @{attribute_def show_sorts} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 117 |     @{attribute_def show_consts} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 118 |     @{attribute_def show_abbrevs} & : & \<open>attribute\<close> & default \<open>true\<close> \\
 | |
| 119 |     @{attribute_def show_brackets} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 120 |     @{attribute_def names_long} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 121 |     @{attribute_def names_short} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 122 |     @{attribute_def names_unique} & : & \<open>attribute\<close> & default \<open>true\<close> \\
 | |
| 123 |     @{attribute_def eta_contract} & : & \<open>attribute\<close> & default \<open>true\<close> \\
 | |
| 124 |     @{attribute_def goals_limit} & : & \<open>attribute\<close> & default \<open>10\<close> \\
 | |
| 125 |     @{attribute_def show_main_goal} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 126 |     @{attribute_def show_hyps} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 127 |     @{attribute_def show_tags} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 128 |     @{attribute_def show_question_marks} & : & \<open>attribute\<close> & default \<open>true\<close> \\
 | |
| 42655 | 129 |   \end{tabular}
 | 
| 61421 | 130 | \<^medskip> | 
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 131 | |
| 62106 | 132 | These configuration options control the detail of information that is | 
| 133 | displayed for types, terms, theorems, goals etc. See also | |
| 42655 | 134 |   \secref{sec:config}.
 | 
| 28765 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 135 | |
| 62106 | 136 |   \<^descr> @{attribute show_markup} controls direct inlining of markup into the
 | 
| 137 | printed representation of formal entities --- notably type and sort | |
| 138 | constraints. This enables Prover IDE users to retrieve that information via | |
| 139 | tooltips or popups while hovering with the mouse over the output window, for | |
| 140 | example. Consequently, this option is enabled by default for Isabelle/jEdit. | |
| 49699 | 141 | |
| 62106 | 142 |   \<^descr> @{attribute show_types} and @{attribute show_sorts} control printing of
 | 
| 143 | type constraints for term variables, and sort constraints for type | |
| 144 |   variables. By default, neither of these are shown in output. If @{attribute
 | |
| 145 | show_sorts} is enabled, types are always shown as well. In Isabelle/jEdit, | |
| 146 | manual setting of these options is normally not required thanks to | |
| 147 |   @{attribute show_markup} above.
 | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 148 | |
| 62106 | 149 | Note that displaying types and sorts may explain why a polymorphic inference | 
| 150 | rule fails to resolve with some goal, or why a rewrite rule does not apply | |
| 151 | as expected. | |
| 28765 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 152 | |
| 62106 | 153 |   \<^descr> @{attribute show_consts} controls printing of types of constants when
 | 
| 154 | displaying a goal state. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 155 | |
| 62106 | 156 | Note that the output can be enormous, because polymorphic constants often | 
| 157 | occur at several different type instances. | |
| 158 | ||
| 159 |   \<^descr> @{attribute show_abbrevs} controls folding of constant abbreviations.
 | |
| 40879 
ca132ef44944
configuration option "show_abbrevs" supersedes print mode "no_abbrevs", with inverted meaning;
 wenzelm parents: 
40255diff
changeset | 160 | |
| 62106 | 161 |   \<^descr> @{attribute show_brackets} controls bracketing in pretty printed output.
 | 
| 162 | If enabled, all sub-expressions of the pretty printing tree will be | |
| 163 | parenthesized, even if this produces malformed term syntax! This crude way | |
| 164 | of showing the internal structure of pretty printed entities may | |
| 165 | occasionally help to diagnose problems with operator priorities, for | |
| 166 | example. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 167 | |
| 62106 | 168 |   \<^descr> @{attribute names_long}, @{attribute names_short}, and @{attribute
 | 
| 169 | names_unique} control the way of printing fully qualified internal names in | |
| 170 |   external form. See also \secref{sec:antiq} for the document antiquotation
 | |
| 171 | options of the same names. | |
| 42358 
b47d41d9f4b5
Name_Space: proper configuration options long_names, short_names, unique_names instead of former unsynchronized references;
 wenzelm parents: 
42279diff
changeset | 172 | |
| 62106 | 173 |   \<^descr> @{attribute eta_contract} controls \<open>\<eta>\<close>-contracted printing of terms.
 | 
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 174 | |
| 69597 | 175 | The \<open>\<eta>\<close>-contraction law asserts \<^prop>\<open>(\<lambda>x. f x) \<equiv> f\<close>, provided \<open>x\<close> is not | 
| 176 | free in \<open>f\<close>. It asserts \<^emph>\<open>extensionality\<close> of functions: \<^prop>\<open>f \<equiv> g\<close> if | |
| 177 | \<^prop>\<open>f x \<equiv> g x\<close> for all \<open>x\<close>. Higher-order unification frequently puts | |
| 62106 | 178 | terms into a fully \<open>\<eta>\<close>-expanded form. For example, if \<open>F\<close> has type \<open>(\<tau> \<Rightarrow> \<tau>) | 
| 69597 | 179 | \<Rightarrow> \<tau>\<close> then its expanded form is \<^term>\<open>\<lambda>h. F (\<lambda>x. h x)\<close>. | 
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 180 | |
| 62106 | 181 |   Enabling @{attribute eta_contract} makes Isabelle perform \<open>\<eta>\<close>-contractions
 | 
| 69597 | 182 | before printing, so that \<^term>\<open>\<lambda>h. F (\<lambda>x. h x)\<close> appears simply as \<open>F\<close>. | 
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 183 | |
| 62106 | 184 | Note that the distinction between a term and its \<open>\<eta>\<close>-expanded form | 
| 185 | occasionally matters. While higher-order resolution and rewriting operate | |
| 186 | modulo \<open>\<alpha>\<beta>\<eta>\<close>-conversion, some other tools might look at terms more | |
| 187 | discretely. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 188 | |
| 62106 | 189 |   \<^descr> @{attribute goals_limit} controls the maximum number of subgoals to be
 | 
| 190 | printed. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 191 | |
| 62106 | 192 |   \<^descr> @{attribute show_main_goal} controls whether the main result to be proven
 | 
| 193 | should be displayed. This information might be relevant for schematic goals, | |
| 194 | to inspect the current claim that has been synthesized so far. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 195 | |
| 62106 | 196 |   \<^descr> @{attribute show_hyps} controls printing of implicit hypotheses of local
 | 
| 197 | facts. Normally, only those hypotheses are displayed that are \<^emph>\<open>not\<close> covered | |
| 198 | by the assumptions of the current context: this situation indicates a fault | |
| 199 | in some tool being used. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 200 | |
| 62106 | 201 |   By enabling @{attribute show_hyps}, output of \<^emph>\<open>all\<close> hypotheses can be
 | 
| 202 | enforced, which is occasionally useful for diagnostic purposes. | |
| 28763 
b5e6122ff575
added pretty printing options (from old ref manual);
 wenzelm parents: 
28762diff
changeset | 203 | |
| 62106 | 204 |   \<^descr> @{attribute show_tags} controls printing of extra annotations within
 | 
| 205 | theorems, such as internal position information, or the case names being | |
| 206 |   attached by the attribute @{attribute case_names}.
 | |
| 28765 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 207 | |
| 62106 | 208 |   Note that the @{attribute tagged} and @{attribute untagged} attributes
 | 
| 209 | provide low-level access to the collection of tags associated with a | |
| 210 | theorem. | |
| 28765 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 211 | |
| 62106 | 212 |   \<^descr> @{attribute show_question_marks} controls printing of question marks for
 | 
| 213 | schematic variables, such as \<open>?x\<close>. Only the leading question mark is | |
| 214 | affected, the remaining text is unchanged (including proper markup for | |
| 215 | schematic variables that might be relevant for user interfaces). | |
| 58618 | 216 | \<close> | 
| 28765 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 217 | |
| 
da8f6f4a74be
misc tuning and rearrangement of section "Printing logical entities";
 wenzelm parents: 
28763diff
changeset | 218 | |
| 58618 | 219 | subsection \<open>Alternative print modes \label{sec:print-modes}\<close>
 | 
| 46284 | 220 | |
| 58618 | 221 | text \<open> | 
| 46284 | 222 |   \begin{mldecls}
 | 
| 73764 | 223 |     @{define_ML print_mode_value: "unit -> string list"} \\
 | 
| 224 |     @{define_ML Print_Mode.with_modes: "string list -> ('a -> 'b) -> 'a -> 'b"} \\
 | |
| 46284 | 225 |   \end{mldecls}
 | 
| 226 | ||
| 62106 | 227 | The \<^emph>\<open>print mode\<close> facility allows to modify various operations for printing. | 
| 228 |   Commands like @{command typ}, @{command term}, @{command thm} (see
 | |
| 229 |   \secref{sec:print-diag}) take additional print modes as optional argument.
 | |
| 230 | The underlying ML operations are as follows. | |
| 46284 | 231 | |
| 69597 | 232 | \<^descr> \<^ML>\<open>print_mode_value ()\<close> yields the list of currently active print | 
| 62106 | 233 | mode names. This should be understood as symbolic representation of | 
| 234 | certain individual features for printing (with precedence from left to | |
| 235 | right). | |
| 46284 | 236 | |
| 69597 | 237 | \<^descr> \<^ML>\<open>Print_Mode.with_modes\<close>~\<open>modes f x\<close> evaluates \<open>f x\<close> in an execution | 
| 62106 | 238 | context where the print mode is prepended by the given \<open>modes\<close>. This | 
| 239 | provides a thread-safe way to augment print modes. It is also monotonic in | |
| 240 | the set of mode names: it retains the default print mode that certain | |
| 241 | user-interfaces might have installed for their proper functioning! | |
| 46284 | 242 | |
| 61421 | 243 | \<^medskip> | 
| 62106 | 244 | The pretty printer for inner syntax maintains alternative mixfix productions | 
| 245 |   for any print mode name invented by the user, say in commands like @{command
 | |
| 246 |   notation} or @{command abbreviation}. Mode names can be arbitrary, but the
 | |
| 247 | following ones have a specific meaning by convention: | |
| 46284 | 248 | |
| 62106 | 249 | \<^item> \<^verbatim>\<open>""\<close> (the empty string): default mode; implicitly active as last | 
| 250 | element in the list of modes. | |
| 46284 | 251 | |
| 62106 | 252 | \<^item> \<^verbatim>\<open>input\<close>: dummy print mode that is never active; may be used to specify | 
| 253 | notation that is only available for input. | |
| 46284 | 254 | |
| 62106 | 255 | \<^item> \<^verbatim>\<open>internal\<close> dummy print mode that is never active; used internally in | 
| 256 | Isabelle/Pure. | |
| 46284 | 257 | |
| 62106 | 258 | \<^item> \<^verbatim>\<open>ASCII\<close>: prefer ASCII art over mathematical symbols. | 
| 46284 | 259 | |
| 62106 | 260 |     \<^item> \<^verbatim>\<open>latex\<close>: additional mode that is active in {\LaTeX} document
 | 
| 261 | preparation of Isabelle theory sources; allows to provide alternative | |
| 262 | output notation. | |
| 58618 | 263 | \<close> | 
| 46284 | 264 | |
| 265 | ||
| 58618 | 266 | section \<open>Mixfix annotations \label{sec:mixfix}\<close>
 | 
| 28762 | 267 | |
| 62106 | 268 | text \<open> | 
| 269 | Mixfix annotations specify concrete \<^emph>\<open>inner syntax\<close> of Isabelle types and | |
| 270 | terms. Locally fixed parameters in toplevel theorem statements, locale and | |
| 271 | class specifications also admit mixfix annotations in a fairly uniform | |
| 272 | manner. A mixfix annotation describes the concrete syntax, the translation | |
| 273 | to abstract syntax, and the pretty printing. Special case annotations | |
| 274 | provide a simple means of specifying infix operators and binders. | |
| 46290 | 275 | |
| 76987 | 276 |   Isabelle mixfix syntax is inspired by {\OBJ} \<^cite>\<open>OBJ\<close>. It allows to
 | 
| 62106 | 277 | specify any context-free priority grammar, which is more general than the | 
| 278 | fixity declarations of ML and Prolog. | |
| 28762 | 279 | |
| 69597 | 280 | \<^rail>\<open> | 
| 51654 
8450b944e58a
just one syntax category "mixfix" -- check structure annotation semantically;
 wenzelm parents: 
50636diff
changeset | 281 |     @{syntax_def mixfix}: '('
 | 
| 58761 | 282 |       (@{syntax template} prios? @{syntax nat}? |
 | 
| 283 |         (@'infix' | @'infixl' | @'infixr') @{syntax template} @{syntax nat} |
 | |
| 68272 | 284 |         @'binder' @{syntax template} prio? @{syntax nat} |
 | 
| 58761 | 285 | @'structure') ')' | 
| 46290 | 286 | ; | 
| 69580 | 287 |     @{syntax template}: (string | cartouche)
 | 
| 46289 | 288 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 289 |     prios: '[' (@{syntax nat} + ',') ']'
 | 
| 68272 | 290 | ; | 
| 291 |     prio: '[' @{syntax nat} ']'
 | |
| 69597 | 292 | \<close> | 
| 28762 | 293 | |
| 69580 | 294 | The mixfix \<open>template\<close> may include literal text, spacing, blocks, and | 
| 295 | arguments (denoted by ``\<open>_\<close>''); the special symbol ``\<^verbatim>\<open>\<index>\<close>'' (printed as | |
| 62106 | 296 |   ``\<open>\<index>\<close>'') represents an index argument that specifies an implicit @{keyword
 | 
| 297 |   "structure"} reference (see also \secref{sec:locale}). Only locally fixed
 | |
| 298 |   variables may be declared as @{keyword "structure"}.
 | |
| 51657 
3db1bbc82d8d
more accurate documentation of "(structure)" mixfix;
 wenzelm parents: 
51654diff
changeset | 299 | |
| 62106 | 300 | Infix and binder declarations provide common abbreviations for particular | 
| 301 | mixfix declarations. So in practice, mixfix templates mostly degenerate to | |
| 302 | literal text for concrete syntax, such as ``\<^verbatim>\<open>++\<close>'' for an infix symbol. | |
| 61503 | 303 | \<close> | 
| 28762 | 304 | |
| 46290 | 305 | |
| 58618 | 306 | subsection \<open>The general mixfix form\<close> | 
| 46290 | 307 | |
| 62106 | 308 | text \<open> | 
| 309 | In full generality, mixfix declarations work as follows. Suppose a constant | |
| 310 | \<open>c :: \<tau>\<^sub>1 \<Rightarrow> \<dots> \<tau>\<^sub>n \<Rightarrow> \<tau>\<close> is annotated by \<open>(mixfix [p\<^sub>1, \<dots>, p\<^sub>n] p)\<close>, where | |
| 311 | \<open>mixfix\<close> is a string \<open>d\<^sub>0 _ d\<^sub>1 _ \<dots> _ d\<^sub>n\<close> consisting of delimiters that | |
| 312 | surround argument positions as indicated by underscores. | |
| 28762 | 313 | |
| 62106 | 314 | Altogether this determines a production for a context-free priority grammar, | 
| 315 | where for each argument \<open>i\<close> the syntactic category is determined by \<open>\<tau>\<^sub>i\<close> | |
| 316 | (with priority \<open>p\<^sub>i\<close>), and the result category is determined from \<open>\<tau>\<close> (with | |
| 317 | priority \<open>p\<close>). Priority specifications are optional, with default 0 for | |
| 318 | arguments and 1000 for the result.\<^footnote>\<open>Omitting priorities is prone to | |
| 319 | syntactic ambiguities unless the delimiter tokens determine fully bracketed | |
| 320 | notation, as in \<open>if _ then _ else _ fi\<close>.\<close> | |
| 28762 | 321 | |
| 62106 | 322 | Since \<open>\<tau>\<close> may be again a function type, the constant type scheme may have | 
| 323 | more argument positions than the mixfix pattern. Printing a nested | |
| 324 | application \<open>c t\<^sub>1 \<dots> t\<^sub>m\<close> for \<open>m > n\<close> works by attaching concrete notation | |
| 325 | only to the innermost part, essentially by printing \<open>(c t\<^sub>1 \<dots> t\<^sub>n) \<dots> t\<^sub>m\<close> | |
| 326 | instead. If a term has fewer arguments than specified in the mixfix | |
| 28762 | 327 | template, the concrete syntax is ignored. | 
| 328 | ||
| 61421 | 329 | \<^medskip> | 
| 62106 | 330 | A mixfix template may also contain additional directives for pretty | 
| 331 | printing, notably spaces, blocks, and breaks. The general template format is | |
| 332 | a sequence over any of the following entities. | |
| 28762 | 333 | |
| 63933 | 334 | \<^descr> \<open>d\<close> is a delimiter, namely a non-empty sequence delimiter items of the | 
| 335 | following form: | |
| 336 | \<^enum> a control symbol followed by a cartouche | |
| 337 | \<^enum> a single symbol, excluding the following special characters: | |
| 338 | \<^medskip> | |
| 339 |       \begin{tabular}{ll}
 | |
| 340 | \<^verbatim>\<open>'\<close> & single quote \\ | |
| 341 | \<^verbatim>\<open>_\<close> & underscore \\ | |
| 342 | \<open>\<index>\<close> & index symbol \\ | |
| 343 | \<^verbatim>\<open>(\<close> & open parenthesis \\ | |
| 344 | \<^verbatim>\<open>)\<close> & close parenthesis \\ | |
| 345 | \<^verbatim>\<open>/\<close> & slash \\ | |
| 346 | \<open>\<open> \<close>\<close> & cartouche delimiters \\ | |
| 347 |       \end{tabular}
 | |
| 348 | \<^medskip> | |
| 28762 | 349 | |
| 62106 | 350 | \<^descr> \<^verbatim>\<open>'\<close> escapes the special meaning of these meta-characters, producing a | 
| 351 | literal version of the following character, unless that is a blank. | |
| 28771 
4510201c6aaf
mixfix annotations: verbatim for special symbols;
 wenzelm parents: 
28770diff
changeset | 352 | |
| 62106 | 353 | A single quote followed by a blank separates delimiters, without affecting | 
| 354 | printing, but input tokens may have additional white space here. | |
| 28771 
4510201c6aaf
mixfix annotations: verbatim for special symbols;
 wenzelm parents: 
28770diff
changeset | 355 | |
| 62106 | 356 | \<^descr> \<^verbatim>\<open>_\<close> is an argument position, which stands for a certain syntactic | 
| 357 | category in the underlying grammar. | |
| 28762 | 358 | |
| 62106 | 359 | \<^descr> \<open>\<index>\<close> is an indexed argument position; this is the place where implicit | 
| 360 | structure arguments can be attached. | |
| 28762 | 361 | |
| 62106 | 362 | \<^descr> \<open>s\<close> is a non-empty sequence of spaces for printing. This and the following | 
| 363 | specifications do not affect parsing at all. | |
| 364 | ||
| 62807 | 365 | \<^descr> \<^verbatim>\<open>(\<close>\<open>n\<close> opens a pretty printing block. The optional natural number | 
| 366 | specifies the block indentation, i.e. how much spaces to add when a line | |
| 367 | break occurs within the block. The default indentation is 0. | |
| 368 | ||
| 369 | \<^descr> \<^verbatim>\<open>(\<close>\<open>\<open>properties\<close>\<close> opens a pretty printing block, with properties | |
| 370 | specified within the given text cartouche. The syntax and semantics of | |
| 371 |   the category @{syntax_ref mixfix_properties} is described below.
 | |
| 28762 | 372 | |
| 61503 | 373 | \<^descr> \<^verbatim>\<open>)\<close> closes a pretty printing block. | 
| 28762 | 374 | |
| 61503 | 375 | \<^descr> \<^verbatim>\<open>//\<close> forces a line break. | 
| 28762 | 376 | |
| 62106 | 377 | \<^descr> \<^verbatim>\<open>/\<close>\<open>s\<close> allows a line break. Here \<open>s\<close> stands for the string of spaces | 
| 378 | (zero or more) right after the slash. These spaces are printed if the break | |
| 379 | is \<^emph>\<open>not\<close> taken. | |
| 28762 | 380 | |
| 381 | ||
| 62807 | 382 | \<^medskip> | 
| 383 | Block properties allow more control over the details of pretty-printed | |
| 384 | output. The concrete syntax is defined as follows. | |
| 385 | ||
| 69597 | 386 | \<^rail>\<open> | 
| 62807 | 387 |     @{syntax_def "mixfix_properties"}: (entry *)
 | 
| 388 | ; | |
| 389 |     entry: atom ('=' atom)?
 | |
| 390 | ; | |
| 63138 
70f4d67235a0
clarified syntax category names according to Isabelle/ML/Scala;
 wenzelm parents: 
62969diff
changeset | 391 |     atom: @{syntax short_ident} | @{syntax int} | @{syntax float} | @{syntax cartouche}
 | 
| 69597 | 392 | \<close> | 
| 62807 | 393 | |
| 68503 | 394 |   Each @{syntax entry} is a name-value pair: if the value is omitted, it
 | 
| 62807 | 395 | defaults to \<^verbatim>\<open>true\<close> (intended for Boolean properties). The following | 
| 396 | standard block properties are supported: | |
| 397 | ||
| 398 | \<^item> \<open>indent\<close> (natural number): the block indentation --- the same as for the | |
| 399 | simple syntax without block properties. | |
| 400 | ||
| 401 | \<^item> \<open>consistent\<close> (Boolean): this block has consistent breaks (if one break | |
| 402 | is taken, all breaks are taken). | |
| 403 | ||
| 404 | \<^item> \<open>unbreakable\<close> (Boolean): all possible breaks of the block are disabled | |
| 405 | (turned into spaces). | |
| 406 | ||
| 407 | \<^item> \<open>markup\<close> (string): the optional name of the markup node. If this is | |
| 408 | provided, all remaining properties are turned into its XML attributes. | |
| 409 | This allows to specify free-form PIDE markup, e.g.\ for specialized | |
| 410 | output. | |
| 411 | ||
| 412 | \<^medskip> | |
| 413 | Note that the general idea of pretty printing with blocks and breaks is | |
| 76987 | 414 | described in \<^cite>\<open>"paulson-ml2"\<close>; it goes back to \<^cite>\<open>"Oppen:1980"\<close>. | 
| 58618 | 415 | \<close> | 
| 28762 | 416 | |
| 417 | ||
| 58618 | 418 | subsection \<open>Infixes\<close> | 
| 46290 | 419 | |
| 62106 | 420 | text \<open> | 
| 421 | Infix operators are specified by convenient short forms that abbreviate | |
| 422 | general mixfix annotations as follows: | |
| 46290 | 423 | |
| 424 |   \begin{center}
 | |
| 425 |   \begin{tabular}{lll}
 | |
| 426 | ||
| 61503 | 427 |   \<^verbatim>\<open>(\<close>@{keyword_def "infix"}~\<^verbatim>\<open>"\<close>\<open>sy\<close>\<^verbatim>\<open>"\<close> \<open>p\<close>\<^verbatim>\<open>)\<close>
 | 
| 61493 | 428 | & \<open>\<mapsto>\<close> & | 
| 61503 | 429 |   \<^verbatim>\<open>("(_\<close>~\<open>sy\<close>\<^verbatim>\<open>/ _)" [\<close>\<open>p + 1\<close>\<^verbatim>\<open>,\<close>~\<open>p + 1\<close>\<^verbatim>\<open>]\<close>~\<open>p\<close>\<^verbatim>\<open>)\<close> \\
 | 
| 430 |   \<^verbatim>\<open>(\<close>@{keyword_def "infixl"}~\<^verbatim>\<open>"\<close>\<open>sy\<close>\<^verbatim>\<open>"\<close> \<open>p\<close>\<^verbatim>\<open>)\<close>
 | |
| 61493 | 431 | & \<open>\<mapsto>\<close> & | 
| 61503 | 432 |   \<^verbatim>\<open>("(_\<close>~\<open>sy\<close>\<^verbatim>\<open>/ _)" [\<close>\<open>p\<close>\<^verbatim>\<open>,\<close>~\<open>p + 1\<close>\<^verbatim>\<open>]\<close>~\<open>p\<close>\<^verbatim>\<open>)\<close> \\
 | 
| 433 |   \<^verbatim>\<open>(\<close>@{keyword_def "infixr"}~\<^verbatim>\<open>"\<close>\<open>sy\<close>\<^verbatim>\<open>"\<close>~\<open>p\<close>\<^verbatim>\<open>)\<close>
 | |
| 61493 | 434 | & \<open>\<mapsto>\<close> & | 
| 61503 | 435 |   \<^verbatim>\<open>("(_\<close>~\<open>sy\<close>\<^verbatim>\<open>/ _)" [\<close>\<open>p + 1\<close>\<^verbatim>\<open>,\<close>~\<open>p\<close>\<^verbatim>\<open>]\<close>~\<open>p\<close>\<^verbatim>\<open>)\<close> \\
 | 
| 46290 | 436 | |
| 437 |   \end{tabular}
 | |
| 438 |   \end{center}
 | |
| 439 | ||
| 62106 | 440 | The mixfix template \<^verbatim>\<open>"(_\<close>~\<open>sy\<close>\<^verbatim>\<open>/ _)"\<close> specifies two argument positions; | 
| 441 | the delimiter is preceded by a space and followed by a space or line break; | |
| 442 | the entire phrase is a pretty printing block. | |
| 46290 | 443 | |
| 67398 | 444 | The alternative notation \<^verbatim>\<open>(\<close>\<open>sy\<close>\<^verbatim>\<open>)\<close> is introduced in addition. Thus any | 
| 445 | infix operator may be written in prefix form (as in Haskell), independently of | |
| 69065 | 446 | the number of arguments. | 
| 58618 | 447 | \<close> | 
| 46290 | 448 | |
| 449 | ||
| 58618 | 450 | subsection \<open>Binders\<close> | 
| 46290 | 451 | |
| 62106 | 452 | text \<open> | 
| 453 | A \<^emph>\<open>binder\<close> is a variable-binding construct such as a quantifier. The idea | |
| 454 |   to formalize \<open>\<forall>x. b\<close> as \<open>All (\<lambda>x. b)\<close> for \<open>All :: ('a \<Rightarrow> bool) \<Rightarrow> bool\<close>
 | |
| 76987 | 455 | already goes back to \<^cite>\<open>church40\<close>. Isabelle declarations of certain | 
| 62106 | 456 |   higher-order operators may be annotated with @{keyword_def "binder"}
 | 
| 457 | annotations as follows: | |
| 46290 | 458 | |
| 459 |   \begin{center}
 | |
| 61503 | 460 |   \<open>c ::\<close>~\<^verbatim>\<open>"\<close>\<open>(\<tau>\<^sub>1 \<Rightarrow> \<tau>\<^sub>2) \<Rightarrow> \<tau>\<^sub>3\<close>\<^verbatim>\<open>"  (\<close>@{keyword "binder"}~\<^verbatim>\<open>"\<close>\<open>sy\<close>\<^verbatim>\<open>" [\<close>\<open>p\<close>\<^verbatim>\<open>]\<close>~\<open>q\<close>\<^verbatim>\<open>)\<close>
 | 
| 46290 | 461 |   \end{center}
 | 
| 462 | ||
| 62106 | 463 | This introduces concrete binder syntax \<open>sy x. b\<close>, where \<open>x\<close> is a bound | 
| 464 | variable of type \<open>\<tau>\<^sub>1\<close>, the body \<open>b\<close> has type \<open>\<tau>\<^sub>2\<close> and the whole term has | |
| 465 | type \<open>\<tau>\<^sub>3\<close>. The optional integer \<open>p\<close> specifies the syntactic priority of the | |
| 466 | body; the default is \<open>q\<close>, which is also the priority of the whole construct. | |
| 46290 | 467 | |
| 468 | Internally, the binder syntax is expanded to something like this: | |
| 469 |   \begin{center}
 | |
| 61503 | 470 |   \<open>c_binder ::\<close>~\<^verbatim>\<open>"\<close>\<open>idts \<Rightarrow> \<tau>\<^sub>2 \<Rightarrow> \<tau>\<^sub>3\<close>\<^verbatim>\<open>"  ("(3\<close>\<open>sy\<close>\<^verbatim>\<open>_./ _)" [0,\<close>~\<open>p\<close>\<^verbatim>\<open>]\<close>~\<open>q\<close>\<^verbatim>\<open>)\<close>
 | 
| 46290 | 471 |   \end{center}
 | 
| 472 | ||
| 473 |   Here @{syntax (inner) idts} is the nonterminal symbol for a list of
 | |
| 474 | identifiers with optional type constraints (see also | |
| 62106 | 475 |   \secref{sec:pure-grammar}). The mixfix template \<^verbatim>\<open>"(3\<close>\<open>sy\<close>\<^verbatim>\<open>_./ _)"\<close> defines
 | 
| 476 | argument positions for the bound identifiers and the body, separated by a | |
| 477 | dot with optional line break; the entire phrase is a pretty printing block | |
| 478 | of indentation level 3. Note that there is no extra space after \<open>sy\<close>, so it | |
| 479 | needs to be included user specification if the binder syntax ends with a | |
| 480 |   token that may be continued by an identifier token at the start of @{syntax
 | |
| 481 | (inner) idts}. | |
| 46290 | 482 | |
| 62106 | 483 | Furthermore, a syntax translation to transforms \<open>c_binder x\<^sub>1 \<dots> x\<^sub>n b\<close> into | 
| 484 | iterated application \<open>c (\<lambda>x\<^sub>1. \<dots> c (\<lambda>x\<^sub>n. b)\<dots>)\<close>. This works in both | |
| 485 | directions, for parsing and printing. | |
| 486 | \<close> | |
| 46290 | 487 | |
| 488 | ||
| 58618 | 489 | section \<open>Explicit notation \label{sec:notation}\<close>
 | 
| 28762 | 490 | |
| 58618 | 491 | text \<open> | 
| 28762 | 492 |   \begin{matharray}{rcll}
 | 
| 61493 | 493 |     @{command_def "type_notation"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | 
| 494 |     @{command_def "no_type_notation"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | |
| 495 |     @{command_def "notation"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | |
| 496 |     @{command_def "no_notation"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | |
| 497 |     @{command_def "write"} & : & \<open>proof(state) \<rightarrow> proof(state)\<close> \\
 | |
| 28762 | 498 |   \end{matharray}
 | 
| 499 | ||
| 62106 | 500 | Commands that introduce new logical entities (terms or types) usually allow | 
| 501 | to provide mixfix annotations on the spot, which is convenient for default | |
| 502 | notation. Nonetheless, the syntax may be modified later on by declarations | |
| 503 | for explicit notation. This allows to add or delete mixfix annotations for | |
| 504 | of existing logical entities within the current context. | |
| 46288 | 505 | |
| 69597 | 506 | \<^rail>\<open> | 
| 59783 
00b62aa9f430
tuned syntax diagrams -- no duplication of "target";
 wenzelm parents: 
58842diff
changeset | 507 |     (@@{command type_notation} | @@{command no_type_notation}) @{syntax mode}? \<newline>
 | 
| 62969 | 508 |       (@{syntax name} @{syntax mixfix} + @'and')
 | 
| 35413 | 509 | ; | 
| 59783 
00b62aa9f430
tuned syntax diagrams -- no duplication of "target";
 wenzelm parents: 
58842diff
changeset | 510 |     (@@{command notation} | @@{command no_notation}) @{syntax mode}? \<newline>
 | 
| 62969 | 511 |       (@{syntax name} @{syntax mixfix} + @'and')
 | 
| 28762 | 512 | ; | 
| 62969 | 513 |     @@{command write} @{syntax mode}? (@{syntax name} @{syntax mixfix} + @'and')
 | 
| 69597 | 514 | \<close> | 
| 28762 | 515 | |
| 62106 | 516 |   \<^descr> @{command "type_notation"}~\<open>c (mx)\<close> associates mixfix syntax with an
 | 
| 517 | existing type constructor. The arity of the constructor is retrieved from | |
| 518 | the context. | |
| 46282 | 519 | |
| 62106 | 520 |   \<^descr> @{command "no_type_notation"} is similar to @{command "type_notation"},
 | 
| 521 | but removes the specified syntax annotation from the present context. | |
| 35413 | 522 | |
| 62106 | 523 |   \<^descr> @{command "notation"}~\<open>c (mx)\<close> associates mixfix syntax with an existing
 | 
| 524 | constant or fixed variable. The type declaration of the given entity is | |
| 525 | retrieved from the context. | |
| 46282 | 526 | |
| 62106 | 527 |   \<^descr> @{command "no_notation"} is similar to @{command "notation"}, but removes
 | 
| 528 | the specified syntax annotation from the present context. | |
| 28762 | 529 | |
| 62106 | 530 |   \<^descr> @{command "write"} is similar to @{command "notation"}, but works within
 | 
| 531 | an Isar proof body. | |
| 58618 | 532 | \<close> | 
| 28762 | 533 | |
| 28778 | 534 | |
| 58618 | 535 | section \<open>The Pure syntax \label{sec:pure-syntax}\<close>
 | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 536 | |
| 58618 | 537 | subsection \<open>Lexical matters \label{sec:inner-lex}\<close>
 | 
| 46282 | 538 | |
| 62106 | 539 | text \<open> | 
| 540 | The inner lexical syntax vaguely resembles the outer one | |
| 541 |   (\secref{sec:outer-lex}), but some details are different. There are two main
 | |
| 542 | categories of inner syntax tokens: | |
| 46282 | 543 | |
| 62106 | 544 | \<^enum> \<^emph>\<open>delimiters\<close> --- the literal tokens occurring in productions of the given | 
| 545 |   priority grammar (cf.\ \secref{sec:priority-grammar});
 | |
| 46282 | 546 | |
| 61477 | 547 | \<^enum> \<^emph>\<open>named tokens\<close> --- various categories of identifiers etc. | 
| 46282 | 548 | |
| 549 | ||
| 62106 | 550 | Delimiters override named tokens and may thus render certain identifiers | 
| 551 | inaccessible. Sometimes the logical context admits alternative ways to refer | |
| 552 | to the same entity, potentially via qualified names. | |
| 46282 | 553 | |
| 61421 | 554 | \<^medskip> | 
| 62106 | 555 | The categories for named tokens are defined once and for all as follows, | 
| 556 |   reusing some categories of the outer token syntax (\secref{sec:outer-lex}).
 | |
| 46282 | 557 | |
| 558 |   \begin{center}
 | |
| 559 |   \begin{supertabular}{rcl}
 | |
| 63138 
70f4d67235a0
clarified syntax category names according to Isabelle/ML/Scala;
 wenzelm parents: 
62969diff
changeset | 560 |     @{syntax_def (inner) id} & = & @{syntax_ref short_ident} \\
 | 
| 
70f4d67235a0
clarified syntax category names according to Isabelle/ML/Scala;
 wenzelm parents: 
62969diff
changeset | 561 |     @{syntax_def (inner) longid} & = & @{syntax_ref long_ident} \\
 | 
| 46282 | 562 |     @{syntax_def (inner) var} & = & @{syntax_ref var} \\
 | 
| 63138 
70f4d67235a0
clarified syntax category names according to Isabelle/ML/Scala;
 wenzelm parents: 
62969diff
changeset | 563 |     @{syntax_def (inner) tid} & = & @{syntax_ref type_ident} \\
 | 
| 
70f4d67235a0
clarified syntax category names according to Isabelle/ML/Scala;
 wenzelm parents: 
62969diff
changeset | 564 |     @{syntax_def (inner) tvar} & = & @{syntax_ref type_var} \\
 | 
| 58410 
6d46ad54a2ab
explicit separation of signed and unsigned numerals using existing lexical categories num and xnum
 haftmann parents: 
58409diff
changeset | 565 |     @{syntax_def (inner) num_token} & = & @{syntax_ref nat} \\
 | 
| 61503 | 566 |     @{syntax_def (inner) float_token} & = & @{syntax_ref nat}\<^verbatim>\<open>.\<close>@{syntax_ref nat} \\
 | 
| 567 |     @{syntax_def (inner) str_token} & = & \<^verbatim>\<open>''\<close> \<open>\<dots>\<close> \<^verbatim>\<open>''\<close> \\
 | |
| 568 |     @{syntax_def (inner) string_token} & = & \<^verbatim>\<open>"\<close> \<open>\<dots>\<close> \<^verbatim>\<open>"\<close> \\
 | |
| 75211 | 569 |     @{syntax_def (inner) cartouche} & = & @{verbatim "\<open>"} \<open>\<dots>\<close> @{verbatim "\<close>"} \\
 | 
| 46282 | 570 |   \end{supertabular}
 | 
| 571 |   \end{center}
 | |
| 572 | ||
| 573 |   The token categories @{syntax (inner) num_token}, @{syntax (inner)
 | |
| 58421 | 574 |   float_token}, @{syntax (inner) str_token}, @{syntax (inner) string_token},
 | 
| 575 |   and @{syntax (inner) cartouche} are not used in Pure. Object-logics may
 | |
| 576 | implement numerals and string literals by adding appropriate syntax | |
| 63680 | 577 | declarations, together with some translation functions (e.g.\ see | 
| 578 | \<^file>\<open>~~/src/HOL/Tools/string_syntax.ML\<close>). | |
| 46282 | 579 | |
| 58421 | 580 |   The derived categories @{syntax_def (inner) num_const}, and @{syntax_def
 | 
| 581 | (inner) float_const}, provide robust access to the respective tokens: the | |
| 582 | syntax tree holds a syntactic constant instead of a free variable. | |
| 67352 
5f7f339f3d7e
inner syntax comments may be written as "\<comment> \<open>text\<close>";
 wenzelm parents: 
67146diff
changeset | 583 | |
| 67448 | 584 |   Formal document comments (\secref{sec:comments}) may be also used within the
 | 
| 585 | inner syntax. | |
| 58618 | 586 | \<close> | 
| 46282 | 587 | |
| 588 | ||
| 58618 | 589 | subsection \<open>Priority grammars \label{sec:priority-grammar}\<close>
 | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 590 | |
| 62106 | 591 | text \<open> | 
| 592 | A context-free grammar consists of a set of \<^emph>\<open>terminal symbols\<close>, a set of | |
| 593 | \<^emph>\<open>nonterminal symbols\<close> and a set of \<^emph>\<open>productions\<close>. Productions have the | |
| 594 | form \<open>A = \<gamma>\<close>, where \<open>A\<close> is a nonterminal and \<open>\<gamma>\<close> is a string of terminals | |
| 595 | and nonterminals. One designated nonterminal is called the \<^emph>\<open>root symbol\<close>. | |
| 596 | The language defined by the grammar consists of all strings of terminals | |
| 597 | that can be derived from the root symbol by applying productions as rewrite | |
| 598 | rules. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 599 | |
| 62106 | 600 | The standard Isabelle parser for inner syntax uses a \<^emph>\<open>priority grammar\<close>. | 
| 601 | Each nonterminal is decorated by an integer priority: \<open>A\<^sup>(\<^sup>p\<^sup>)\<close>. In a | |
| 602 | derivation, \<open>A\<^sup>(\<^sup>p\<^sup>)\<close> may be rewritten using a production \<open>A\<^sup>(\<^sup>q\<^sup>) = \<gamma>\<close> only | |
| 603 | if \<open>p \<le> q\<close>. Any priority grammar can be translated into a normal | |
| 604 | context-free grammar by introducing new nonterminals and productions. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 605 | |
| 61421 | 606 | \<^medskip> | 
| 62106 | 607 | Formally, a set of context free productions \<open>G\<close> induces a derivation | 
| 608 | relation \<open>\<longrightarrow>\<^sub>G\<close> as follows. Let \<open>\<alpha>\<close> and \<open>\<beta>\<close> denote strings of terminal or | |
| 609 | nonterminal symbols. Then \<open>\<alpha> A\<^sup>(\<^sup>p\<^sup>) \<beta> \<longrightarrow>\<^sub>G \<alpha> \<gamma> \<beta>\<close> holds if and only if \<open>G\<close> | |
| 61493 | 610 | contains some production \<open>A\<^sup>(\<^sup>q\<^sup>) = \<gamma>\<close> for \<open>p \<le> q\<close>. | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 611 | |
| 61421 | 612 | \<^medskip> | 
| 62106 | 613 | The following grammar for arithmetic expressions demonstrates how binding | 
| 614 | power and associativity of operators can be enforced by priorities. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 615 | |
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 616 |   \begin{center}
 | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 617 |   \begin{tabular}{rclr}
 | 
| 61503 | 618 | \<open>A\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>)\<close> & \<open>=\<close> & \<^verbatim>\<open>(\<close> \<open>A\<^sup>(\<^sup>0\<^sup>)\<close> \<^verbatim>\<open>)\<close> \\ | 
| 619 | \<open>A\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>)\<close> & \<open>=\<close> & \<^verbatim>\<open>0\<close> \\ | |
| 620 | \<open>A\<^sup>(\<^sup>0\<^sup>)\<close> & \<open>=\<close> & \<open>A\<^sup>(\<^sup>0\<^sup>)\<close> \<^verbatim>\<open>+\<close> \<open>A\<^sup>(\<^sup>1\<^sup>)\<close> \\ | |
| 621 | \<open>A\<^sup>(\<^sup>2\<^sup>)\<close> & \<open>=\<close> & \<open>A\<^sup>(\<^sup>3\<^sup>)\<close> \<^verbatim>\<open>*\<close> \<open>A\<^sup>(\<^sup>2\<^sup>)\<close> \\ | |
| 622 | \<open>A\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>=\<close> & \<^verbatim>\<open>-\<close> \<open>A\<^sup>(\<^sup>3\<^sup>)\<close> \\ | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 623 |   \end{tabular}
 | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 624 |   \end{center}
 | 
| 62106 | 625 | The choice of priorities determines that \<^verbatim>\<open>-\<close> binds tighter than \<^verbatim>\<open>*\<close>, which | 
| 626 | binds tighter than \<^verbatim>\<open>+\<close>. Furthermore \<^verbatim>\<open>+\<close> associates to the left and \<^verbatim>\<open>*\<close> to | |
| 627 | the right. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 628 | |
| 61421 | 629 | \<^medskip> | 
| 630 | For clarity, grammars obey these conventions: | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 631 | |
| 62106 | 632 | \<^item> All priorities must lie between 0 and 1000. | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 633 | |
| 62106 | 634 | \<^item> Priority 0 on the right-hand side and priority 1000 on the left-hand | 
| 635 | side may be omitted. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 636 | |
| 62106 | 637 | \<^item> The production \<open>A\<^sup>(\<^sup>p\<^sup>) = \<alpha>\<close> is written as \<open>A = \<alpha> (p)\<close>, i.e.\ the | 
| 638 | priority of the left-hand side actually appears in a column on the far | |
| 639 | right. | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 640 | |
| 62106 | 641 | \<^item> Alternatives are separated by \<open>|\<close>. | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 642 | |
| 62106 | 643 | \<^item> Repetition is indicated by dots \<open>(\<dots>)\<close> in an informal but obvious way. | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 644 | |
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 645 | Using these conventions, the example grammar specification above | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 646 | takes the form: | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 647 |   \begin{center}
 | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 648 |   \begin{tabular}{rclc}
 | 
| 61503 | 649 | \<open>A\<close> & \<open>=\<close> & \<^verbatim>\<open>(\<close> \<open>A\<close> \<^verbatim>\<open>)\<close> \\ | 
| 650 | & \<open>|\<close> & \<^verbatim>\<open>0\<close> & \qquad\qquad \\ | |
| 651 | & \<open>|\<close> & \<open>A\<close> \<^verbatim>\<open>+\<close> \<open>A\<^sup>(\<^sup>1\<^sup>)\<close> & \<open>(0)\<close> \\ | |
| 652 | & \<open>|\<close> & \<open>A\<^sup>(\<^sup>3\<^sup>)\<close> \<^verbatim>\<open>*\<close> \<open>A\<^sup>(\<^sup>2\<^sup>)\<close> & \<open>(2)\<close> \\ | |
| 653 | & \<open>|\<close> & \<^verbatim>\<open>-\<close> \<open>A\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>(3)\<close> \\ | |
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 654 |   \end{tabular}
 | 
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 655 |   \end{center}
 | 
| 58618 | 656 | \<close> | 
| 28769 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 657 | |
| 
8fc228f21861
added section "Priority grammars" (variant from old ref manual);
 wenzelm parents: 
28767diff
changeset | 658 | |
| 58618 | 659 | subsection \<open>The Pure grammar \label{sec:pure-grammar}\<close>
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 660 | |
| 62106 | 661 | text \<open> | 
| 662 | The priority grammar of the \<open>Pure\<close> theory is defined approximately like | |
| 663 | this: | |
| 28774 | 664 | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 665 |   \begin{center}
 | 
| 28773 | 666 |   \begin{supertabular}{rclr}
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 667 | |
| 61493 | 668 |   @{syntax_def (inner) any} & = & \<open>prop  |  logic\<close> \\\\
 | 
| 28772 | 669 | |
| 61503 | 670 |   @{syntax_def (inner) prop} & = & \<^verbatim>\<open>(\<close> \<open>prop\<close> \<^verbatim>\<open>)\<close> \\
 | 
| 671 | & \<open>|\<close> & \<open>prop\<^sup>(\<^sup>4\<^sup>)\<close> \<^verbatim>\<open>::\<close> \<open>type\<close> & \<open>(3)\<close> \\ | |
| 672 | & \<open>|\<close> & \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> \<^verbatim>\<open>==\<close> \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>(2)\<close> \\ | |
| 61493 | 673 | & \<open>|\<close> & \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> \<open>\<equiv>\<close> \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>(2)\<close> \\ | 
| 61503 | 674 | & \<open>|\<close> & \<open>prop\<^sup>(\<^sup>3\<^sup>)\<close> \<^verbatim>\<open>&&&\<close> \<open>prop\<^sup>(\<^sup>2\<^sup>)\<close> & \<open>(2)\<close> \\ | 
| 675 | & \<open>|\<close> & \<open>prop\<^sup>(\<^sup>2\<^sup>)\<close> \<^verbatim>\<open>==>\<close> \<open>prop\<^sup>(\<^sup>1\<^sup>)\<close> & \<open>(1)\<close> \\ | |
| 61493 | 676 | & \<open>|\<close> & \<open>prop\<^sup>(\<^sup>2\<^sup>)\<close> \<open>\<Longrightarrow>\<close> \<open>prop\<^sup>(\<^sup>1\<^sup>)\<close> & \<open>(1)\<close> \\ | 
| 61503 | 677 | & \<open>|\<close> & \<^verbatim>\<open>[|\<close> \<open>prop\<close> \<^verbatim>\<open>;\<close> \<open>\<dots>\<close> \<^verbatim>\<open>;\<close> \<open>prop\<close> \<^verbatim>\<open>|]\<close> \<^verbatim>\<open>==>\<close> \<open>prop\<^sup>(\<^sup>1\<^sup>)\<close> & \<open>(1)\<close> \\ | 
| 678 | & \<open>|\<close> & \<open>\<lbrakk>\<close> \<open>prop\<close> \<^verbatim>\<open>;\<close> \<open>\<dots>\<close> \<^verbatim>\<open>;\<close> \<open>prop\<close> \<open>\<rbrakk>\<close> \<open>\<Longrightarrow>\<close> \<open>prop\<^sup>(\<^sup>1\<^sup>)\<close> & \<open>(1)\<close> \\ | |
| 679 | & \<open>|\<close> & \<^verbatim>\<open>!!\<close> \<open>idts\<close> \<^verbatim>\<open>.\<close> \<open>prop\<close> & \<open>(0)\<close> \\ | |
| 680 | & \<open>|\<close> & \<open>\<And>\<close> \<open>idts\<close> \<^verbatim>\<open>.\<close> \<open>prop\<close> & \<open>(0)\<close> \\ | |
| 681 | & \<open>|\<close> & \<^verbatim>\<open>OFCLASS\<close> \<^verbatim>\<open>(\<close> \<open>type\<close> \<^verbatim>\<open>,\<close> \<open>logic\<close> \<^verbatim>\<open>)\<close> \\ | |
| 682 | & \<open>|\<close> & \<^verbatim>\<open>SORT_CONSTRAINT\<close> \<^verbatim>\<open>(\<close> \<open>type\<close> \<^verbatim>\<open>)\<close> \\ | |
| 683 | & \<open>|\<close> & \<^verbatim>\<open>TERM\<close> \<open>logic\<close> \\ | |
| 684 | & \<open>|\<close> & \<^verbatim>\<open>PROP\<close> \<open>aprop\<close> \\\\ | |
| 28772 | 685 | |
| 61503 | 686 |   @{syntax_def (inner) aprop} & = & \<^verbatim>\<open>(\<close> \<open>aprop\<close> \<^verbatim>\<open>)\<close> \\
 | 
| 687 | & \<open>|\<close> & \<open>id | longid | var |\<close>~~\<^verbatim>\<open>_\<close>~~\<open>|\<close>~~\<^verbatim>\<open>...\<close> \\ | |
| 688 | & \<open>|\<close> & \<^verbatim>\<open>CONST\<close> \<open>id |\<close>~~\<^verbatim>\<open>CONST\<close> \<open>longid\<close> \\ | |
| 689 | & \<open>|\<close> & \<^verbatim>\<open>XCONST\<close> \<open>id |\<close>~~\<^verbatim>\<open>XCONST\<close> \<open>longid\<close> \\ | |
| 61493 | 690 | & \<open>|\<close> & \<open>logic\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>) any\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>) \<dots> any\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>)\<close> & \<open>(999)\<close> \\\\ | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 691 | |
| 61503 | 692 |   @{syntax_def (inner) logic} & = & \<^verbatim>\<open>(\<close> \<open>logic\<close> \<^verbatim>\<open>)\<close> \\
 | 
| 693 | & \<open>|\<close> & \<open>logic\<^sup>(\<^sup>4\<^sup>)\<close> \<^verbatim>\<open>::\<close> \<open>type\<close> & \<open>(3)\<close> \\ | |
| 694 | & \<open>|\<close> & \<open>id | longid | var |\<close>~~\<^verbatim>\<open>_\<close>~~\<open>|\<close>~~\<^verbatim>\<open>...\<close> \\ | |
| 695 | & \<open>|\<close> & \<^verbatim>\<open>CONST\<close> \<open>id |\<close>~~\<^verbatim>\<open>CONST\<close> \<open>longid\<close> \\ | |
| 696 | & \<open>|\<close> & \<^verbatim>\<open>XCONST\<close> \<open>id |\<close>~~\<^verbatim>\<open>XCONST\<close> \<open>longid\<close> \\ | |
| 61493 | 697 | & \<open>|\<close> & \<open>logic\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>) any\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>) \<dots> any\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>)\<close> & \<open>(999)\<close> \\ | 
| 61503 | 698 | & \<open>|\<close> & \<^verbatim>\<open>%\<close> \<open>pttrns\<close> \<^verbatim>\<open>.\<close> \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>(3)\<close> \\ | 
| 699 | & \<open>|\<close> & \<open>\<lambda>\<close> \<open>pttrns\<close> \<^verbatim>\<open>.\<close> \<open>any\<^sup>(\<^sup>3\<^sup>)\<close> & \<open>(3)\<close> \\ | |
| 67398 | 700 | & \<open>|\<close> & \<^verbatim>\<open>(==)\<close>~~\<open>|\<close>~~\<^verbatim>\<open>(\<close>\<open>\<equiv>\<close>\<^verbatim>\<open>)\<close>~~\<open>|\<close>~~\<^verbatim>\<open>(&&&)\<close> \\ | 
| 701 | & \<open>|\<close> & \<^verbatim>\<open>(==>)\<close>~~\<open>|\<close>~~\<^verbatim>\<open>(\<close>\<open>\<Longrightarrow>\<close>\<^verbatim>\<open>)\<close> \\ | |
| 61503 | 702 | & \<open>|\<close> & \<^verbatim>\<open>TYPE\<close> \<^verbatim>\<open>(\<close> \<open>type\<close> \<^verbatim>\<open>)\<close> \\\\ | 
| 28772 | 703 | |
| 61503 | 704 |   @{syntax_def (inner) idt} & = & \<^verbatim>\<open>(\<close> \<open>idt\<close> \<^verbatim>\<open>)\<close>~~\<open>|  id  |\<close>~~\<^verbatim>\<open>_\<close> \\
 | 
| 705 | & \<open>|\<close> & \<open>id\<close> \<^verbatim>\<open>::\<close> \<open>type\<close> & \<open>(0)\<close> \\ | |
| 706 | & \<open>|\<close> & \<^verbatim>\<open>_\<close> \<^verbatim>\<open>::\<close> \<open>type\<close> & \<open>(0)\<close> \\\\ | |
| 28772 | 707 | |
| 61503 | 708 |   @{syntax_def (inner) index} & = & \<^verbatim>\<open>\<^bsub>\<close> \<open>logic\<^sup>(\<^sup>0\<^sup>)\<close> \<^verbatim>\<open>\<^esub>\<close>~~\<open>|  |  \<index>\<close> \\\\
 | 
| 46287 | 709 | |
| 61493 | 710 |   @{syntax_def (inner) idts} & = & \<open>idt  |  idt\<^sup>(\<^sup>1\<^sup>) idts\<close> & \<open>(0)\<close> \\\\
 | 
| 28772 | 711 | |
| 61493 | 712 |   @{syntax_def (inner) pttrn} & = & \<open>idt\<close> \\\\
 | 
| 28772 | 713 | |
| 61493 | 714 |   @{syntax_def (inner) pttrns} & = & \<open>pttrn  |  pttrn\<^sup>(\<^sup>1\<^sup>) pttrns\<close> & \<open>(0)\<close> \\\\
 | 
| 28774 | 715 | |
| 61503 | 716 |   @{syntax_def (inner) type} & = & \<^verbatim>\<open>(\<close> \<open>type\<close> \<^verbatim>\<open>)\<close> \\
 | 
| 717 | & \<open>|\<close> & \<open>tid | tvar |\<close>~~\<^verbatim>\<open>_\<close> \\ | |
| 718 | & \<open>|\<close> & \<open>tid\<close> \<^verbatim>\<open>::\<close> \<open>sort | tvar\<close>~~\<^verbatim>\<open>::\<close> \<open>sort |\<close>~~\<^verbatim>\<open>_\<close> \<^verbatim>\<open>::\<close> \<open>sort\<close> \\ | |
| 61493 | 719 | & \<open>|\<close> & \<open>type_name | type\<^sup>(\<^sup>1\<^sup>0\<^sup>0\<^sup>0\<^sup>) type_name\<close> \\ | 
| 61503 | 720 | & \<open>|\<close> & \<^verbatim>\<open>(\<close> \<open>type\<close> \<^verbatim>\<open>,\<close> \<open>\<dots>\<close> \<^verbatim>\<open>,\<close> \<open>type\<close> \<^verbatim>\<open>)\<close> \<open>type_name\<close> \\ | 
| 721 | & \<open>|\<close> & \<open>type\<^sup>(\<^sup>1\<^sup>)\<close> \<^verbatim>\<open>=>\<close> \<open>type\<close> & \<open>(0)\<close> \\ | |
| 61493 | 722 | & \<open>|\<close> & \<open>type\<^sup>(\<^sup>1\<^sup>)\<close> \<open>\<Rightarrow>\<close> \<open>type\<close> & \<open>(0)\<close> \\ | 
| 61503 | 723 | & \<open>|\<close> & \<^verbatim>\<open>[\<close> \<open>type\<close> \<^verbatim>\<open>,\<close> \<open>\<dots>\<close> \<^verbatim>\<open>,\<close> \<open>type\<close> \<^verbatim>\<open>]\<close> \<^verbatim>\<open>=>\<close> \<open>type\<close> & \<open>(0)\<close> \\ | 
| 724 | & \<open>|\<close> & \<^verbatim>\<open>[\<close> \<open>type\<close> \<^verbatim>\<open>,\<close> \<open>\<dots>\<close> \<^verbatim>\<open>,\<close> \<open>type\<close> \<^verbatim>\<open>]\<close> \<open>\<Rightarrow>\<close> \<open>type\<close> & \<open>(0)\<close> \\ | |
| 61493 | 725 |   @{syntax_def (inner) type_name} & = & \<open>id  |  longid\<close> \\\\
 | 
| 28772 | 726 | |
| 67718 | 727 |   @{syntax_def (inner) sort} & = & @{syntax class_name}~~\<open>|\<close>~~\<^verbatim>\<open>_\<close>~~\<open>|\<close>~~\<^verbatim>\<open>{}\<close> \\
 | 
| 61503 | 728 |     & \<open>|\<close> & \<^verbatim>\<open>{\<close> @{syntax class_name} \<^verbatim>\<open>,\<close> \<open>\<dots>\<close> \<^verbatim>\<open>,\<close> @{syntax class_name} \<^verbatim>\<open>}\<close> \\
 | 
| 61493 | 729 |   @{syntax_def (inner) class_name} & = & \<open>id  |  longid\<close> \\
 | 
| 28773 | 730 |   \end{supertabular}
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 731 |   \end{center}
 | 
| 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 732 | |
| 61421 | 733 | \<^medskip> | 
| 62106 | 734 | Here literal terminals are printed \<^verbatim>\<open>verbatim\<close>; see also | 
| 735 |   \secref{sec:inner-lex} for further token categories of the inner syntax. The
 | |
| 736 | meaning of the nonterminals defined by the above grammar is as follows: | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 737 | |
| 61439 | 738 |   \<^descr> @{syntax_ref (inner) any} denotes any term.
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 739 | |
| 62106 | 740 |   \<^descr> @{syntax_ref (inner) prop} denotes meta-level propositions, which are
 | 
| 69597 | 741 | terms of type \<^typ>\<open>prop\<close>. The syntax of such formulae of the meta-logic is | 
| 62106 | 742 | carefully distinguished from usual conventions for object-logics. In | 
| 743 |   particular, plain \<open>\<lambda>\<close>-term notation is \<^emph>\<open>not\<close> recognized as @{syntax (inner)
 | |
| 744 | prop}. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 745 | |
| 62106 | 746 |   \<^descr> @{syntax_ref (inner) aprop} denotes atomic propositions, which are
 | 
| 747 |   embedded into regular @{syntax (inner) prop} by means of an explicit \<^verbatim>\<open>PROP\<close>
 | |
| 748 | token. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 749 | |
| 69597 | 750 | Terms of type \<^typ>\<open>prop\<close> with non-constant head, e.g.\ a plain variable, | 
| 751 | are printed in this form. Constants that yield type \<^typ>\<open>prop\<close> are expected | |
| 62106 | 752 | to provide their own concrete syntax; otherwise the printed version will | 
| 753 |   appear like @{syntax (inner) logic} and cannot be parsed again as @{syntax
 | |
| 754 | (inner) prop}. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 755 | |
| 62106 | 756 |   \<^descr> @{syntax_ref (inner) logic} denotes arbitrary terms of a logical type,
 | 
| 69597 | 757 | excluding type \<^typ>\<open>prop\<close>. This is the main syntactic category of | 
| 62106 | 758 | object-logic entities, covering plain \<open>\<lambda>\<close>-term notation (variables, | 
| 759 | abstraction, application), plus anything defined by the user. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 760 | |
| 62106 | 761 | When specifying notation for logical entities, all logical types (excluding | 
| 69597 | 762 |   \<^typ>\<open>prop\<close>) are \<^emph>\<open>collapsed\<close> to this single category of @{syntax (inner)
 | 
| 62106 | 763 | logic}. | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 764 | |
| 62106 | 765 |   \<^descr> @{syntax_ref (inner) index} denotes an optional index term for indexed
 | 
| 766 |   syntax. If omitted, it refers to the first @{keyword_ref "structure"}
 | |
| 767 | variable in the context. The special dummy ``\<open>\<index>\<close>'' serves as pattern | |
| 768 | variable in mixfix annotations that introduce indexed notation. | |
| 46287 | 769 | |
| 62106 | 770 |   \<^descr> @{syntax_ref (inner) idt} denotes identifiers, possibly constrained by
 | 
| 771 | types. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 772 | |
| 62106 | 773 |   \<^descr> @{syntax_ref (inner) idts} denotes a sequence of @{syntax_ref (inner)
 | 
| 774 | idt}. This is the most basic category for variables in iterated binders, | |
| 775 | such as \<open>\<lambda>\<close> or \<open>\<And>\<close>. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 776 | |
| 62106 | 777 |   \<^descr> @{syntax_ref (inner) pttrn} and @{syntax_ref (inner) pttrns} denote
 | 
| 778 | patterns for abstraction, cases bindings etc. In Pure, these categories | |
| 779 |   start as a merely copy of @{syntax (inner) idt} and @{syntax (inner) idts},
 | |
| 780 | respectively. Object-logics may add additional productions for binding | |
| 781 | forms. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 782 | |
| 61439 | 783 |   \<^descr> @{syntax_ref (inner) type} denotes types of the meta-logic.
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 784 | |
| 61439 | 785 |   \<^descr> @{syntax_ref (inner) sort} denotes meta-level sorts.
 | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 786 | |
| 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 787 | |
| 28774 | 788 | Here are some further explanations of certain syntax features. | 
| 28773 | 789 | |
| 62106 | 790 |   \<^item> In @{syntax (inner) idts}, note that \<open>x :: nat y\<close> is parsed as \<open>x :: (nat
 | 
| 791 | y)\<close>, treating \<open>y\<close> like a type constructor applied to \<open>nat\<close>. To avoid this | |
| 792 | interpretation, write \<open>(x :: nat) y\<close> with explicit parentheses. | |
| 28773 | 793 | |
| 62106 | 794 | \<^item> Similarly, \<open>x :: nat y :: nat\<close> is parsed as \<open>x :: (nat y :: nat)\<close>. The | 
| 795 | correct form is \<open>(x :: nat) (y :: nat)\<close>, or \<open>(x :: nat) y :: nat\<close> if \<open>y\<close> is | |
| 796 | last in the sequence of identifiers. | |
| 28773 | 797 | |
| 62106 | 798 | \<^item> Type constraints for terms bind very weakly. For example, \<open>x < y :: nat\<close> | 
| 799 | is normally parsed as \<open>(x < y) :: nat\<close>, unless \<open><\<close> has a very low priority, | |
| 800 | in which case the input is likely to be ambiguous. The correct form is \<open>x < | |
| 801 | (y :: nat)\<close>. | |
| 28773 | 802 | |
| 61421 | 803 | \<^item> Dummy variables (written as underscore) may occur in different | 
| 28774 | 804 | roles. | 
| 28773 | 805 | |
| 67718 | 806 | \<^descr> A sort ``\<open>_\<close>'' refers to a vacuous constraint for type variables, which | 
| 807 | is effectively ignored in type-inference. | |
| 808 | ||
| 62106 | 809 | \<^descr> A type ``\<open>_\<close>'' or ``\<open>_ :: sort\<close>'' acts like an anonymous inference | 
| 810 | parameter, which is filled-in according to the most general type produced | |
| 811 | by the type-checking phase. | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 812 | |
| 62106 | 813 | \<^descr> A bound ``\<open>_\<close>'' refers to a vacuous abstraction, where the body does not | 
| 69597 | 814 | refer to the binding introduced here. As in the term \<^term>\<open>\<lambda>x _. x\<close>, | 
| 62106 | 815 | which is \<open>\<alpha>\<close>-equivalent to \<open>\<lambda>x y. x\<close>. | 
| 28773 | 816 | |
| 62106 | 817 | \<^descr> A free ``\<open>_\<close>'' refers to an implicit outer binding. Higher definitional | 
| 818 | packages usually allow forms like \<open>f x _ = x\<close>. | |
| 28773 | 819 | |
| 62106 | 820 |     \<^descr> A schematic ``\<open>_\<close>'' (within a term pattern, see \secref{sec:term-decls})
 | 
| 821 | refers to an anonymous variable that is implicitly abstracted over its | |
| 822 | context of locally bound variables. For example, this allows pattern | |
| 823 |     matching of \<open>{x. f x = g x}\<close> against \<open>{x. _ = _}\<close>, or even \<open>{_. _ = _}\<close> by
 | |
| 61458 | 824 | using both bound and schematic dummies. | 
| 28773 | 825 | |
| 62106 | 826 | \<^descr> The three literal dots ``\<^verbatim>\<open>...\<close>'' may be also written as ellipsis symbol | 
| 827 | \<^verbatim>\<open>\<dots>\<close>. In both cases this refers to a special schematic variable, which is | |
| 828 | bound in the context. This special term abbreviation works nicely with | |
| 28774 | 829 |   calculational reasoning (\secref{sec:calculation}).
 | 
| 830 | ||
| 62106 | 831 | \<^descr> \<^verbatim>\<open>CONST\<close> ensures that the given identifier is treated as constant term, | 
| 832 | and passed through the parse tree in fully internalized form. This is | |
| 833 |   particularly relevant for translation rules (\secref{sec:syn-trans}),
 | |
| 834 | notably on the RHS. | |
| 46287 | 835 | |
| 62106 | 836 | \<^descr> \<^verbatim>\<open>XCONST\<close> is similar to \<^verbatim>\<open>CONST\<close>, but retains the constant name as given. | 
| 837 |   This is only relevant to translation rules (\secref{sec:syn-trans}), notably
 | |
| 838 | on the LHS. | |
| 58618 | 839 | \<close> | 
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 840 | |
| 28777 | 841 | |
| 58618 | 842 | subsection \<open>Inspecting the syntax\<close> | 
| 28777 | 843 | |
| 58618 | 844 | text \<open> | 
| 46282 | 845 |   \begin{matharray}{rcl}
 | 
| 61493 | 846 |     @{command_def "print_syntax"}\<open>\<^sup>*\<close> & : & \<open>context \<rightarrow>\<close> \\
 | 
| 46282 | 847 |   \end{matharray}
 | 
| 28777 | 848 | |
| 62106 | 849 |   \<^descr> @{command "print_syntax"} prints the inner syntax of the current context.
 | 
| 850 | The output can be quite large; the most important sections are explained | |
| 851 | below. | |
| 28777 | 852 | |
| 62106 | 853 | \<^descr> \<open>lexicon\<close> lists the delimiters of the inner token language; see | 
| 854 |     \secref{sec:inner-lex}.
 | |
| 28777 | 855 | |
| 67513 | 856 | \<^descr> \<open>productions\<close> lists the productions of the underlying priority grammar; | 
| 857 |     see \secref{sec:priority-grammar}.
 | |
| 28777 | 858 | |
| 67513 | 859 | Many productions have an extra \<open>\<dots> \<^bold>\<Rightarrow> name\<close>. These names later become the | 
| 860 | heads of parse trees; they also guide the pretty printer. | |
| 62106 | 861 | |
| 862 | Productions without such parse tree names are called \<^emph>\<open>copy productions\<close>. | |
| 863 | Their right-hand side must have exactly one nonterminal symbol (or named | |
| 864 | token). The parser does not create a new parse tree node for copy | |
| 865 | productions, but simply returns the parse tree of the right-hand symbol. | |
| 46282 | 866 | |
| 61458 | 867 | If the right-hand side of a copy production consists of a single | 
| 61477 | 868 | nonterminal without any delimiters, then it is called a \<^emph>\<open>chain | 
| 62106 | 869 | production\<close>. Chain productions act as abbreviations: conceptually, they | 
| 870 | are removed from the grammar by adding new productions. Priority | |
| 67513 | 871 | information attached to chain productions is ignored. | 
| 46282 | 872 | |
| 62106 | 873 | \<^descr> \<open>print modes\<close> lists the alternative print modes provided by this | 
| 874 |     grammar; see \secref{sec:print-modes}.
 | |
| 28777 | 875 | |
| 62106 | 876 | \<^descr> \<open>parse_rules\<close> and \<open>print_rules\<close> relate to syntax translations (macros); | 
| 877 |     see \secref{sec:syn-trans}.
 | |
| 46282 | 878 | |
| 62106 | 879 | \<^descr> \<open>parse_ast_translation\<close> and \<open>print_ast_translation\<close> list sets of | 
| 880 | constants that invoke translation functions for abstract syntax trees, | |
| 881 | which are only required in very special situations; see | |
| 882 |     \secref{sec:tr-funs}.
 | |
| 28777 | 883 | |
| 62106 | 884 | \<^descr> \<open>parse_translation\<close> and \<open>print_translation\<close> list the sets of constants | 
| 885 |     that invoke regular translation functions; see \secref{sec:tr-funs}.
 | |
| 58618 | 886 | \<close> | 
| 28774 | 887 | |
| 28770 
93a372e2dc7a
added section "The Pure grammar" (incomplete version, based on old ref manual);
 wenzelm parents: 
28769diff
changeset | 888 | |
| 58618 | 889 | subsection \<open>Ambiguity of parsed expressions\<close> | 
| 46291 | 890 | |
| 58618 | 891 | text \<open> | 
| 46291 | 892 |   \begin{tabular}{rcll}
 | 
| 61493 | 893 |     @{attribute_def syntax_ambiguity_warning} & : & \<open>attribute\<close> & default \<open>true\<close> \\
 | 
| 894 |     @{attribute_def syntax_ambiguity_limit} & : & \<open>attribute\<close> & default \<open>10\<close> \\
 | |
| 46291 | 895 |   \end{tabular}
 | 
| 896 | ||
| 62106 | 897 | Depending on the grammar and the given input, parsing may be ambiguous. | 
| 898 | Isabelle lets the Earley parser enumerate all possible parse trees, and then | |
| 899 | tries to make the best out of the situation. Terms that cannot be | |
| 900 | type-checked are filtered out, which often leads to a unique result in the | |
| 901 | end. Unlike regular type reconstruction, which is applied to the whole | |
| 902 | collection of input terms simultaneously, the filtering stage only treats | |
| 903 | each given term in isolation. Filtering is also not attempted for individual | |
| 46291 | 904 |   types or raw ASTs (as required for @{command translations}).
 | 
| 905 | ||
| 62106 | 906 | Certain warning or error messages are printed, depending on the situation | 
| 907 | and the given configuration options. Parsing ultimately fails, if multiple | |
| 908 | results remain after the filtering phase. | |
| 46291 | 909 | |
| 62106 | 910 |   \<^descr> @{attribute syntax_ambiguity_warning} controls output of explicit warning
 | 
| 911 | messages about syntax ambiguity. | |
| 46291 | 912 | |
| 62106 | 913 |   \<^descr> @{attribute syntax_ambiguity_limit} determines the number of resulting
 | 
| 914 | parse trees that are shown as part of the printed message in case of an | |
| 915 | ambiguity. | |
| 58618 | 916 | \<close> | 
| 46291 | 917 | |
| 918 | ||
| 58618 | 919 | section \<open>Syntax transformations \label{sec:syntax-transformations}\<close>
 | 
| 48113 | 920 | |
| 62106 | 921 | text \<open> | 
| 922 | The inner syntax engine of Isabelle provides separate mechanisms to | |
| 923 | transform parse trees either via rewrite systems on first-order ASTs | |
| 924 |   (\secref{sec:syn-trans}), or ML functions on ASTs or syntactic \<open>\<lambda>\<close>-terms
 | |
| 925 |   (\secref{sec:tr-funs}). This works both for parsing and printing, as
 | |
| 926 |   outlined in \figref{fig:parse-print}.
 | |
| 48113 | 927 | |
| 928 |   \begin{figure}[htbp]
 | |
| 929 |   \begin{center}
 | |
| 930 |   \begin{tabular}{cl}
 | |
| 931 | string & \\ | |
| 61493 | 932 | \<open>\<down>\<close> & lexer + parser \\ | 
| 48113 | 933 | parse tree & \\ | 
| 61493 | 934 | \<open>\<down>\<close> & parse AST translation \\ | 
| 48113 | 935 | AST & \\ | 
| 61493 | 936 | \<open>\<down>\<close> & AST rewriting (macros) \\ | 
| 48113 | 937 | AST & \\ | 
| 61493 | 938 | \<open>\<down>\<close> & parse translation \\ | 
| 48113 | 939 | --- pre-term --- & \\ | 
| 61493 | 940 | \<open>\<down>\<close> & print translation \\ | 
| 48113 | 941 | AST & \\ | 
| 61493 | 942 | \<open>\<down>\<close> & AST rewriting (macros) \\ | 
| 48113 | 943 | AST & \\ | 
| 61493 | 944 | \<open>\<down>\<close> & print AST translation \\ | 
| 48113 | 945 | string & | 
| 946 |   \end{tabular}
 | |
| 947 |   \end{center}
 | |
| 948 |   \caption{Parsing and printing with translations}\label{fig:parse-print}
 | |
| 949 |   \end{figure}
 | |
| 950 | ||
| 62106 | 951 | These intermediate syntax tree formats eventually lead to a pre-term with | 
| 952 | all names and binding scopes resolved, but most type information still | |
| 953 | missing. Explicit type constraints might be given by the user, or implicit | |
| 954 | position information by the system --- both need to be passed-through | |
| 955 | carefully by syntax transformations. | |
| 48113 | 956 | |
| 62106 | 957 | Pre-terms are further processed by the so-called \<^emph>\<open>check\<close> and \<^emph>\<open>uncheck\<close> | 
| 76987 | 958 | phases that are intertwined with type-inference (see also \<^cite>\<open>"isabelle-implementation"\<close>). The latter allows to operate on higher-order | 
| 62106 | 959 | abstract syntax with proper binding and type information already available. | 
| 48113 | 960 | |
| 62106 | 961 | As a rule of thumb, anything that manipulates bindings of variables or | 
| 962 | constants needs to be implemented as syntax transformation (see below). | |
| 963 | Anything else is better done via check/uncheck: a prominent example | |
| 964 |   application is the @{command abbreviation} concept of Isabelle/Pure.
 | |
| 965 | \<close> | |
| 48113 | 966 | |
| 967 | ||
| 58618 | 968 | subsection \<open>Abstract syntax trees \label{sec:ast}\<close>
 | 
| 48113 | 969 | |
| 62106 | 970 | text \<open> | 
| 69597 | 971 | The ML datatype \<^ML_type>\<open>Ast.ast\<close> explicitly represents the intermediate | 
| 62106 | 972 |   AST format that is used for syntax rewriting (\secref{sec:syn-trans}). It is
 | 
| 973 | defined in ML as follows: | |
| 61408 
9020a3ba6c9a
@{verbatim [display]} supersedes old alltt/ttbox;
 wenzelm parents: 
61143diff
changeset | 974 |   @{verbatim [display]
 | 
| 
9020a3ba6c9a
@{verbatim [display]} supersedes old alltt/ttbox;
 wenzelm parents: 
61143diff
changeset | 975 | \<open>datatype ast = | 
| 
9020a3ba6c9a
@{verbatim [display]} supersedes old alltt/ttbox;
 wenzelm parents: 
61143diff
changeset | 976 | Constant of string | | 
| 
9020a3ba6c9a
@{verbatim [display]} supersedes old alltt/ttbox;
 wenzelm parents: 
61143diff
changeset | 977 | Variable of string | | 
| 
9020a3ba6c9a
@{verbatim [display]} supersedes old alltt/ttbox;
 wenzelm parents: 
61143diff
changeset | 978 | Appl of ast list\<close>} | 
| 48114 | 979 | |
| 62106 | 980 | An AST is either an atom (constant or variable) or a list of (at least two) | 
| 981 | subtrees. Occasional diagnostic output of ASTs uses notation that resembles | |
| 982 | S-expression of LISP. Constant atoms are shown as quoted strings, variable | |
| 983 | atoms as non-quoted strings and applications as a parenthesized list of | |
| 984 | subtrees. For example, the AST | |
| 58724 | 985 |   @{ML [display] \<open>Ast.Appl [Ast.Constant "_abs", Ast.Variable "x", Ast.Variable "t"]\<close>}
 | 
| 62106 | 986 |   is pretty-printed as \<^verbatim>\<open>("_abs" x t)\<close>. Note that \<^verbatim>\<open>()\<close> and \<^verbatim>\<open>(x)\<close> are
 | 
| 987 | excluded as ASTs, because they have too few subtrees. | |
| 48114 | 988 | |
| 61421 | 989 | \<^medskip> | 
| 62106 | 990 | AST application is merely a pro-forma mechanism to indicate certain | 
| 991 | syntactic structures. Thus \<^verbatim>\<open>(c a b)\<close> could mean either term application or | |
| 992 | type application, depending on the syntactic context. | |
| 48114 | 993 | |
| 62106 | 994 |   Nested application like \<^verbatim>\<open>(("_abs" x t) u)\<close> is also possible, but ASTs are
 | 
| 995 | definitely first-order: the syntax constant \<^verbatim>\<open>"_abs"\<close> does not bind the \<^verbatim>\<open>x\<close> | |
| 996 | in any way. Proper bindings are introduced in later stages of the term | |
| 69597 | 997 |   syntax, where \<^verbatim>\<open>("_abs" x t)\<close> becomes an \<^ML>\<open>Abs\<close> node and occurrences of
 | 
| 62106 | 998 | \<^verbatim>\<open>x\<close> in \<^verbatim>\<open>t\<close> are replaced by bound variables (represented as de-Bruijn | 
| 999 | indices). | |
| 58618 | 1000 | \<close> | 
| 48113 | 1001 | |
| 1002 | ||
| 58618 | 1003 | subsubsection \<open>AST constants versus variables\<close> | 
| 48114 | 1004 | |
| 62106 | 1005 | text \<open> | 
| 1006 | Depending on the situation --- input syntax, output syntax, translation | |
| 69597 | 1007 | patterns --- the distinction of atomic ASTs as \<^ML>\<open>Ast.Constant\<close> versus | 
| 1008 | \<^ML>\<open>Ast.Variable\<close> serves slightly different purposes. | |
| 48114 | 1009 | |
| 62106 | 1010 | Input syntax of a term such as \<open>f a b = c\<close> does not yet indicate the scopes | 
| 1011 | of atomic entities \<open>f, a, b, c\<close>: they could be global constants or local | |
| 69597 | 1012 | variables, even bound ones depending on the context of the term. \<^ML>\<open>Ast.Variable\<close> leaves this choice still open: later syntax layers (or | 
| 62106 | 1013 | translation functions) may capture such a variable to determine its role | 
| 1014 | specifically, to make it a constant, bound variable, free variable etc. In | |
| 1015 | contrast, syntax translations that introduce already known constants would | |
| 69597 | 1016 | rather do it via \<^ML>\<open>Ast.Constant\<close> to prevent accidental re-interpretation | 
| 62106 | 1017 | later on. | 
| 48114 | 1018 | |
| 69597 | 1019 | Output syntax turns term constants into \<^ML>\<open>Ast.Constant\<close> and variables | 
| 1020 | (free or schematic) into \<^ML>\<open>Ast.Variable\<close>. This information is precise | |
| 62106 | 1021 | when printing fully formal \<open>\<lambda>\<close>-terms. | 
| 48114 | 1022 | |
| 61421 | 1023 | \<^medskip> | 
| 62106 | 1024 |   AST translation patterns (\secref{sec:syn-trans}) that represent terms
 | 
| 1025 | cannot distinguish constants and variables syntactically. Explicit | |
| 1026 | indication of \<open>CONST c\<close> inside the term language is required, unless \<open>c\<close> is | |
| 1027 |   known as special \<^emph>\<open>syntax constant\<close> (see also @{command syntax}). It is also
 | |
| 1028 |   possible to use @{command syntax} declarations (without mixfix annotation)
 | |
| 1029 | to enforce that certain unqualified names are always treated as constant | |
| 1030 | within the syntax machinery. | |
| 48114 | 1031 | |
| 62106 | 1032 | The situation is simpler for ASTs that represent types or sorts, since the | 
| 1033 | concrete syntax already distinguishes type variables from type constants | |
| 1034 |   (constructors). So \<open>('a, 'b) foo\<close> corresponds to an AST application of some
 | |
| 1035 | constant for \<open>foo\<close> and variable arguments for \<open>'a\<close> and \<open>'b\<close>. Note that the | |
| 1036 | postfix application is merely a feature of the concrete syntax, while in the | |
| 1037 | AST the constructor occurs in head position. | |
| 1038 | \<close> | |
| 48114 | 1039 | |
| 1040 | ||
| 58618 | 1041 | subsubsection \<open>Authentic syntax names\<close> | 
| 48114 | 1042 | |
| 62106 | 1043 | text \<open> | 
| 1044 | Naming constant entities within ASTs is another delicate issue. Unqualified | |
| 1045 | names are resolved in the name space tables in the last stage of parsing, | |
| 1046 | after all translations have been applied. Since syntax transformations do | |
| 1047 | not know about this later name resolution, there can be surprises in | |
| 1048 | boundary cases. | |
| 48114 | 1049 | |
| 69597 | 1050 | \<^emph>\<open>Authentic syntax names\<close> for \<^ML>\<open>Ast.Constant\<close> avoid this problem: the | 
| 62106 | 1051 | fully-qualified constant name with a special prefix for its formal category | 
| 1052 | (\<open>class\<close>, \<open>type\<close>, \<open>const\<close>, \<open>fixed\<close>) represents the information faithfully | |
| 1053 | within the untyped AST format. Accidental overlap with free or bound | |
| 1054 | variables is excluded as well. Authentic syntax names work implicitly in the | |
| 1055 | following situations: | |
| 48114 | 1056 | |
| 62106 | 1057 | \<^item> Input of term constants (or fixed variables) that are introduced by | 
| 1058 |     concrete syntax via @{command notation}: the correspondence of a
 | |
| 1059 | particular grammar production to some known term entity is preserved. | |
| 48114 | 1060 | |
| 62106 | 1061 | \<^item> Input of type constants (constructors) and type classes --- thanks to | 
| 1062 | explicit syntactic distinction independently on the context. | |
| 48114 | 1063 | |
| 62106 | 1064 | \<^item> Output of term constants, type constants, type classes --- this | 
| 1065 | information is already available from the internal term to be printed. | |
| 48114 | 1066 | |
| 62106 | 1067 | In other words, syntax transformations that operate on input terms written | 
| 1068 | as prefix applications are difficult to make robust. Luckily, this case | |
| 1069 | rarely occurs in practice, because syntax forms to be translated usually | |
| 1070 | correspond to some concrete notation. | |
| 1071 | \<close> | |
| 48114 | 1072 | |
| 1073 | ||
| 58618 | 1074 | subsection \<open>Raw syntax and translations \label{sec:syn-trans}\<close>
 | 
| 28762 | 1075 | |
| 58618 | 1076 | text \<open> | 
| 48117 | 1077 |   \begin{tabular}{rcll}
 | 
| 61493 | 1078 |     @{command_def "nonterminal"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | 
| 74333 | 1079 |     @{command_def "syntax"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | 
| 1080 |     @{command_def "no_syntax"} & : & \<open>local_theory \<rightarrow> local_theory\<close> \\
 | |
| 61493 | 1081 |     @{command_def "translations"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | 
| 1082 |     @{command_def "no_translations"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | |
| 1083 |     @{attribute_def syntax_ast_trace} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 1084 |     @{attribute_def syntax_ast_stats} & : & \<open>attribute\<close> & default \<open>false\<close> \\
 | |
| 48117 | 1085 |   \end{tabular}
 | 
| 61421 | 1086 | \<^medskip> | 
| 59783 
00b62aa9f430
tuned syntax diagrams -- no duplication of "target";
 wenzelm parents: 
58842diff
changeset | 1087 | |
| 62106 | 1088 |   Unlike mixfix notation for existing formal entities (\secref{sec:notation}),
 | 
| 1089 | raw syntax declarations provide full access to the priority grammar of the | |
| 1090 | inner syntax, without any sanity checks. This includes additional syntactic | |
| 1091 |   categories (via @{command nonterminal}) and free-form grammar productions
 | |
| 1092 |   (via @{command syntax}). Additional syntax translations (or macros, via
 | |
| 1093 |   @{command translations}) are required to turn resulting parse trees into
 | |
| 1094 | proper representations of formal entities again. | |
| 46292 | 1095 | |
| 69597 | 1096 | \<^rail>\<open> | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1097 |     @@{command nonterminal} (@{syntax name} + @'and')
 | 
| 28762 | 1098 | ; | 
| 46494 
ea2ae63336f3
clarified outer syntax "constdecl", which is only local to some rail diagrams;
 wenzelm parents: 
46483diff
changeset | 1099 |     (@@{command syntax} | @@{command no_syntax}) @{syntax mode}? (constdecl +)
 | 
| 28762 | 1100 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1101 |     (@@{command translations} | @@{command no_translations})
 | 
| 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1102 |       (transpat ('==' | '=>' | '<=' | '\<rightleftharpoons>' | '\<rightharpoonup>' | '\<leftharpoondown>') transpat +)
 | 
| 28762 | 1103 | ; | 
| 1104 | ||
| 46494 
ea2ae63336f3
clarified outer syntax "constdecl", which is only local to some rail diagrams;
 wenzelm parents: 
46483diff
changeset | 1105 |     constdecl: @{syntax name} '::' @{syntax type} @{syntax mixfix}?
 | 
| 
ea2ae63336f3
clarified outer syntax "constdecl", which is only local to some rail diagrams;
 wenzelm parents: 
46483diff
changeset | 1106 | ; | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1107 |     mode: ('(' ( @{syntax name} | @'output' | @{syntax name} @'output' ) ')')
 | 
| 28762 | 1108 | ; | 
| 62969 | 1109 |     transpat: ('(' @{syntax name} ')')? @{syntax string}
 | 
| 69597 | 1110 | \<close> | 
| 28762 | 1111 | |
| 62106 | 1112 |   \<^descr> @{command "nonterminal"}~\<open>c\<close> declares a type constructor \<open>c\<close> (without
 | 
| 1113 | arguments) to act as purely syntactic type: a nonterminal symbol of the | |
| 1114 | inner syntax. | |
| 28762 | 1115 | |
| 62106 | 1116 |   \<^descr> @{command "syntax"}~\<open>(mode) c :: \<sigma> (mx)\<close> augments the priority grammar and
 | 
| 1117 | the pretty printer table for the given print mode (default \<^verbatim>\<open>""\<close>). An | |
| 1118 |   optional keyword @{keyword_ref "output"} means that only the pretty printer
 | |
| 1119 | table is affected. | |
| 46292 | 1120 | |
| 62106 | 1121 |   Following \secref{sec:mixfix}, the mixfix annotation \<open>mx = template ps q\<close>
 | 
| 1122 | together with type \<open>\<sigma> = \<tau>\<^sub>1 \<Rightarrow> \<dots> \<tau>\<^sub>n \<Rightarrow> \<tau>\<close> and specify a grammar production. | |
| 1123 | The \<open>template\<close> contains delimiter tokens that surround \<open>n\<close> argument | |
| 1124 | positions (\<^verbatim>\<open>_\<close>). The latter correspond to nonterminal symbols \<open>A\<^sub>i\<close> derived | |
| 1125 | from the argument types \<open>\<tau>\<^sub>i\<close> as follows: | |
| 46292 | 1126 | |
| 61493 | 1127 | \<^item> \<open>prop\<close> if \<open>\<tau>\<^sub>i = prop\<close> | 
| 46292 | 1128 | |
| 62106 | 1129 | \<^item> \<open>logic\<close> if \<open>\<tau>\<^sub>i = (\<dots>)\<kappa>\<close> for logical type constructor \<open>\<kappa> \<noteq> prop\<close> | 
| 46292 | 1130 | |
| 61493 | 1131 | \<^item> \<open>any\<close> if \<open>\<tau>\<^sub>i = \<alpha>\<close> for type variables | 
| 46292 | 1132 | |
| 62106 | 1133 | \<^item> \<open>\<kappa>\<close> if \<open>\<tau>\<^sub>i = \<kappa>\<close> for nonterminal \<open>\<kappa>\<close> (syntactic type constructor) | 
| 46292 | 1134 | |
| 62106 | 1135 | Each \<open>A\<^sub>i\<close> is decorated by priority \<open>p\<^sub>i\<close> from the given list \<open>ps\<close>; missing | 
| 1136 | priorities default to 0. | |
| 46292 | 1137 | |
| 62106 | 1138 | The resulting nonterminal of the production is determined similarly from | 
| 1139 | type \<open>\<tau>\<close>, with priority \<open>q\<close> and default 1000. | |
| 46292 | 1140 | |
| 61421 | 1141 | \<^medskip> | 
| 62106 | 1142 | Parsing via this production produces parse trees \<open>t\<^sub>1, \<dots>, t\<^sub>n\<close> for the | 
| 1143 | argument slots. The resulting parse tree is composed as \<open>c t\<^sub>1 \<dots> t\<^sub>n\<close>, by | |
| 1144 | using the syntax constant \<open>c\<close> of the syntax declaration. | |
| 46292 | 1145 | |
| 62106 | 1146 | Such syntactic constants are invented on the spot, without formal check | 
| 1147 | wrt.\ existing declarations. It is conventional to use plain identifiers | |
| 1148 | prefixed by a single underscore (e.g.\ \<open>_foobar\<close>). Names should be chosen | |
| 1149 | with care, to avoid clashes with other syntax declarations. | |
| 46292 | 1150 | |
| 61421 | 1151 | \<^medskip> | 
| 62106 | 1152 | The special case of copy production is specified by \<open>c =\<close>~\<^verbatim>\<open>""\<close> (empty | 
| 1153 | string). It means that the resulting parse tree \<open>t\<close> is copied directly, | |
| 1154 | without any further decoration. | |
| 46282 | 1155 | |
| 62106 | 1156 |   \<^descr> @{command "no_syntax"}~\<open>(mode) decls\<close> removes grammar declarations (and
 | 
| 1157 | translations) resulting from \<open>decls\<close>, which are interpreted in the same | |
| 1158 |   manner as for @{command "syntax"} above.
 | |
| 1159 | ||
| 1160 |   \<^descr> @{command "translations"}~\<open>rules\<close> specifies syntactic translation rules
 | |
| 1161 |   (i.e.\ macros) as first-order rewrite rules on ASTs (\secref{sec:ast}). The
 | |
| 1162 | theory context maintains two independent lists translation rules: parse | |
| 1163 | rules (\<^verbatim>\<open>=>\<close> or \<open>\<rightharpoonup>\<close>) and print rules (\<^verbatim>\<open><=\<close> or \<open>\<leftharpoondown>\<close>). For convenience, both | |
| 1164 | can be specified simultaneously as parse~/ print rules (\<^verbatim>\<open>==\<close> or \<open>\<rightleftharpoons>\<close>). | |
| 46282 | 1165 | |
| 62106 | 1166 | Translation patterns may be prefixed by the syntactic category to be used | 
| 1167 | for parsing; the default is \<open>logic\<close> which means that regular term syntax is | |
| 1168 | used. Both sides of the syntax translation rule undergo parsing and parse | |
| 1169 |   AST translations \secref{sec:tr-funs}, in order to perform some fundamental
 | |
| 1170 | normalization like \<open>\<lambda>x y. b \<leadsto> \<lambda>x. \<lambda>y. b\<close>, but other AST translation rules | |
| 1171 | are \<^emph>\<open>not\<close> applied recursively here. | |
| 48115 | 1172 | |
| 62106 | 1173 | When processing AST patterns, the inner syntax lexer runs in a different | 
| 1174 | mode that allows identifiers to start with underscore. This accommodates the | |
| 1175 | usual naming convention for auxiliary syntax constants --- those that do not | |
| 1176 | have a logical counter part --- by allowing to specify arbitrary AST | |
| 1177 | applications within the term syntax, independently of the corresponding | |
| 1178 | concrete syntax. | |
| 48115 | 1179 | |
| 69597 | 1180 | Atomic ASTs are distinguished as \<^ML>\<open>Ast.Constant\<close> versus \<^ML>\<open>Ast.Variable\<close> as follows: a qualified name or syntax constant declared via | 
| 1181 |   @{command syntax}, or parse tree head of concrete notation becomes \<^ML>\<open>Ast.Constant\<close>, anything else \<^ML>\<open>Ast.Variable\<close>. Note that \<open>CONST\<close> and
 | |
| 62106 | 1182 |   \<open>XCONST\<close> within the term language (\secref{sec:pure-grammar}) allow to
 | 
| 1183 | enforce treatment as constants. | |
| 48115 | 1184 | |
| 62106 | 1185 | AST rewrite rules \<open>(lhs, rhs)\<close> need to obey the following side-conditions: | 
| 48115 | 1186 | |
| 62106 | 1187 | \<^item> Rules must be left linear: \<open>lhs\<close> must not contain repeated | 
| 1188 | variables.\<^footnote>\<open>The deeper reason for this is that AST equality is not | |
| 1189 | well-defined: different occurrences of the ``same'' AST could be decorated | |
| 1190 | differently by accidental type-constraints or source position information, | |
| 1191 | for example.\<close> | |
| 48115 | 1192 | |
| 61493 | 1193 | \<^item> Every variable in \<open>rhs\<close> must also occur in \<open>lhs\<close>. | 
| 48115 | 1194 | |
| 62106 | 1195 |   \<^descr> @{command "no_translations"}~\<open>rules\<close> removes syntactic translation rules,
 | 
| 1196 |   which are interpreted in the same manner as for @{command "translations"}
 | |
| 1197 | above. | |
| 28762 | 1198 | |
| 62106 | 1199 |   \<^descr> @{attribute syntax_ast_trace} and @{attribute syntax_ast_stats} control
 | 
| 1200 | diagnostic output in the AST normalization process, when translation rules | |
| 1201 | are applied to concrete input or output. | |
| 48117 | 1202 | |
| 46293 | 1203 | |
| 62106 | 1204 | Raw syntax and translations provides a slightly more low-level access to the | 
| 1205 | grammar and the form of resulting parse trees. It is often possible to avoid | |
| 1206 |   this untyped macro mechanism, and use type-safe @{command abbreviation} or
 | |
| 1207 |   @{command notation} instead. Some important situations where @{command
 | |
| 1208 |   syntax} and @{command translations} are really need are as follows:
 | |
| 46293 | 1209 | |
| 62106 | 1210 |   \<^item> Iterated replacement via recursive @{command translations}. For example,
 | 
| 69597 | 1211 | consider list enumeration \<^term>\<open>[a, b, c, d]\<close> as defined in theory | 
| 1212 | \<^theory>\<open>HOL.List\<close>. | |
| 46293 | 1213 | |
| 62106 | 1214 | \<^item> Change of binding status of variables: anything beyond the built-in | 
| 1215 |   @{keyword "binder"} mixfix annotation requires explicit syntax translations.
 | |
| 69597 | 1216 |   For example, consider the set comprehension syntax \<^term>\<open>{x. P}\<close> as
 | 
| 1217 | defined in theory \<^theory>\<open>HOL.Set\<close>. | |
| 61458 | 1218 | \<close> | 
| 46293 | 1219 | |
| 28762 | 1220 | |
| 58618 | 1221 | subsubsection \<open>Applying translation rules\<close> | 
| 48117 | 1222 | |
| 62106 | 1223 | text \<open> | 
| 1224 | As a term is being parsed or printed, an AST is generated as an intermediate | |
| 1225 |   form according to \figref{fig:parse-print}. The AST is normalized by
 | |
| 1226 | applying translation rules in the manner of a first-order term rewriting | |
| 1227 | system. We first examine how a single rule is applied. | |
| 48117 | 1228 | |
| 62106 | 1229 | Let \<open>t\<close> be the abstract syntax tree to be normalized and \<open>(lhs, rhs)\<close> some | 
| 1230 | translation rule. A subtree \<open>u\<close> of \<open>t\<close> is called \<^emph>\<open>redex\<close> if it is an | |
| 1231 | instance of \<open>lhs\<close>; in this case the pattern \<open>lhs\<close> is said to match the | |
| 1232 | object \<open>u\<close>. A redex matched by \<open>lhs\<close> may be replaced by the corresponding | |
| 1233 | instance of \<open>rhs\<close>, thus \<^emph>\<open>rewriting\<close> the AST \<open>t\<close>. Matching requires some | |
| 69597 | 1234 | notion of \<^emph>\<open>place-holders\<close> in rule patterns: \<^ML>\<open>Ast.Variable\<close> serves this | 
| 62106 | 1235 | purpose. | 
| 48117 | 1236 | |
| 62106 | 1237 | More precisely, the matching of the object \<open>u\<close> against the pattern \<open>lhs\<close> is | 
| 1238 | performed as follows: | |
| 48117 | 1239 | |
| 69597 | 1240 | \<^item> Objects of the form \<^ML>\<open>Ast.Variable\<close>~\<open>x\<close> or \<^ML>\<open>Ast.Constant\<close>~\<open>x\<close> are | 
| 1241 | matched by pattern \<^ML>\<open>Ast.Constant\<close>~\<open>x\<close>. Thus all atomic ASTs in the | |
| 62106 | 1242 | object are treated as (potential) constants, and a successful match makes | 
| 1243 | them actual constants even before name space resolution (see also | |
| 1244 |     \secref{sec:ast}).
 | |
| 48117 | 1245 | |
| 69597 | 1246 | \<^item> Object \<open>u\<close> is matched by pattern \<^ML>\<open>Ast.Variable\<close>~\<open>x\<close>, binding \<open>x\<close> to | 
| 62106 | 1247 | \<open>u\<close>. | 
| 48117 | 1248 | |
| 69597 | 1249 | \<^item> Object \<^ML>\<open>Ast.Appl\<close>~\<open>us\<close> is matched by \<^ML>\<open>Ast.Appl\<close>~\<open>ts\<close> if \<open>us\<close> and | 
| 62106 | 1250 | \<open>ts\<close> have the same length and each corresponding subtree matches. | 
| 48117 | 1251 | |
| 62106 | 1252 | \<^item> In every other case, matching fails. | 
| 48117 | 1253 | |
| 62106 | 1254 | A successful match yields a substitution that is applied to \<open>rhs\<close>, | 
| 1255 | generating the instance that replaces \<open>u\<close>. | |
| 48117 | 1256 | |
| 62106 | 1257 | Normalizing an AST involves repeatedly applying translation rules until none | 
| 1258 | are applicable. This works yoyo-like: top-down, bottom-up, top-down, etc. At | |
| 1259 | each subtree position, rules are chosen in order of appearance in the theory | |
| 1260 | definitions. | |
| 48117 | 1261 | |
| 62106 | 1262 |   The configuration options @{attribute syntax_ast_trace} and @{attribute
 | 
| 1263 | syntax_ast_stats} might help to understand this process and diagnose | |
| 1264 | problems. | |
| 48117 | 1265 | |
| 1266 |   \begin{warn}
 | |
| 62106 | 1267 |   If syntax translation rules work incorrectly, the output of @{command_ref
 | 
| 1268 | print_syntax} with its \<^emph>\<open>rules\<close> sections reveals the actual internal forms | |
| 1269 | of AST pattern, without potentially confusing concrete syntax. Recall that | |
| 1270 | AST constants appear as quoted strings and variables without quotes. | |
| 48117 | 1271 |   \end{warn}
 | 
| 1272 | ||
| 1273 |   \begin{warn}
 | |
| 62106 | 1274 |   If @{attribute_ref eta_contract} is set to \<open>true\<close>, terms will be
 | 
| 1275 | \<open>\<eta>\<close>-contracted \<^emph>\<open>before\<close> the AST rewriter sees them. Thus some abstraction | |
| 1276 | nodes needed for print rules to match may vanish. For example, \<open>Ball A (\<lambda>x. | |
| 1277 | P x)\<close> would contract to \<open>Ball A P\<close> and the standard print rule would fail to | |
| 1278 | apply. This problem can be avoided by hand-written ML translation functions | |
| 1279 |   (see also \secref{sec:tr-funs}), which is in fact the same mechanism used in
 | |
| 1280 |   built-in @{keyword "binder"} declarations.
 | |
| 48117 | 1281 |   \end{warn}
 | 
| 58618 | 1282 | \<close> | 
| 48117 | 1283 | |
| 28762 | 1284 | |
| 58618 | 1285 | subsection \<open>Syntax translation functions \label{sec:tr-funs}\<close>
 | 
| 28762 | 1286 | |
| 58618 | 1287 | text \<open> | 
| 28762 | 1288 |   \begin{matharray}{rcl}
 | 
| 61493 | 1289 |     @{command_def "parse_ast_translation"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | 
| 1290 |     @{command_def "parse_translation"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | |
| 1291 |     @{command_def "print_translation"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | |
| 1292 |     @{command_def "typed_print_translation"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | |
| 1293 |     @{command_def "print_ast_translation"} & : & \<open>theory \<rightarrow> theory\<close> \\
 | |
| 1294 |     @{ML_antiquotation_def "class_syntax"} & : & \<open>ML antiquotation\<close> \\
 | |
| 1295 |     @{ML_antiquotation_def "type_syntax"} & : & \<open>ML antiquotation\<close> \\
 | |
| 1296 |     @{ML_antiquotation_def "const_syntax"} & : & \<open>ML antiquotation\<close> \\
 | |
| 1297 |     @{ML_antiquotation_def "syntax_const"} & : & \<open>ML antiquotation\<close> \\
 | |
| 28762 | 1298 |   \end{matharray}
 | 
| 1299 | ||
| 48118 | 1300 | Syntax translation functions written in ML admit almost arbitrary | 
| 1301 | manipulations of inner syntax, at the expense of some complexity and | |
| 1302 | obscurity in the implementation. | |
| 1303 | ||
| 69597 | 1304 | \<^rail>\<open> | 
| 42596 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1305 |   ( @@{command parse_ast_translation} | @@{command parse_translation} |
 | 
| 
6c621a9d612a
modernized rail diagrams using @{rail} antiquotation;
 wenzelm parents: 
42358diff
changeset | 1306 |     @@{command print_translation} | @@{command typed_print_translation} |
 | 
| 52143 | 1307 |     @@{command print_ast_translation}) @{syntax text}
 | 
| 48119 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1308 | ; | 
| 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1309 |   (@@{ML_antiquotation class_syntax} |
 | 
| 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1310 |    @@{ML_antiquotation type_syntax} |
 | 
| 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1311 |    @@{ML_antiquotation const_syntax} |
 | 
| 67146 | 1312 |    @@{ML_antiquotation syntax_const}) embedded
 | 
| 69597 | 1313 | \<close> | 
| 28762 | 1314 | |
| 62106 | 1315 |   \<^descr> @{command parse_translation} etc. declare syntax translation functions to
 | 
| 1316 |   the theory. Any of these commands have a single @{syntax text} argument that
 | |
| 1317 | refers to an ML expression of appropriate type as follows: | |
| 48118 | 1318 | |
| 61421 | 1319 | \<^medskip> | 
| 48119 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1320 |   {\footnotesize
 | 
| 52143 | 1321 |   \begin{tabular}{l}
 | 
| 1322 |   @{command parse_ast_translation} : \\
 | |
| 69597 | 1323 | \quad \<^ML_type>\<open>(string * (Proof.context -> Ast.ast list -> Ast.ast)) list\<close> \\ | 
| 52143 | 1324 |   @{command parse_translation} : \\
 | 
| 69597 | 1325 | \quad \<^ML_type>\<open>(string * (Proof.context -> term list -> term)) list\<close> \\ | 
| 52143 | 1326 |   @{command print_translation} : \\
 | 
| 69597 | 1327 | \quad \<^ML_type>\<open>(string * (Proof.context -> term list -> term)) list\<close> \\ | 
| 52143 | 1328 |   @{command typed_print_translation} : \\
 | 
| 69597 | 1329 | \quad \<^ML_type>\<open>(string * (Proof.context -> typ -> term list -> term)) list\<close> \\ | 
| 52143 | 1330 |   @{command print_ast_translation} : \\
 | 
| 69597 | 1331 | \quad \<^ML_type>\<open>(string * (Proof.context -> Ast.ast list -> Ast.ast)) list\<close> \\ | 
| 48118 | 1332 |   \end{tabular}}
 | 
| 61421 | 1333 | \<^medskip> | 
| 28762 | 1334 | |
| 62106 | 1335 | The argument list consists of \<open>(c, tr)\<close> pairs, where \<open>c\<close> is the syntax name | 
| 1336 | of the formal entity involved, and \<open>tr\<close> a function that translates a syntax | |
| 1337 | form \<open>c args\<close> into \<open>tr ctxt args\<close> (depending on the context). The | |
| 1338 | Isabelle/ML naming convention for parse translations is \<open>c_tr\<close> and for print | |
| 1339 | translations \<open>c_tr'\<close>. | |
| 48118 | 1340 | |
| 1341 |   The @{command_ref print_syntax} command displays the sets of names
 | |
| 62106 | 1342 | associated with the translation functions of a theory under | 
| 1343 | \<open>parse_ast_translation\<close> etc. | |
| 48118 | 1344 | |
| 62106 | 1345 |   \<^descr> \<open>@{class_syntax c}\<close>, \<open>@{type_syntax c}\<close>, \<open>@{const_syntax c}\<close> inline the
 | 
| 1346 | authentic syntax name of the given formal entities into the ML source. This | |
| 1347 | is the fully-qualified logical name prefixed by a special marker to indicate | |
| 1348 | its kind: thus different logical name spaces are properly distinguished | |
| 1349 | within parse trees. | |
| 48119 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1350 | |
| 62106 | 1351 |   \<^descr> \<open>@{const_syntax c}\<close> inlines the name \<open>c\<close> of the given syntax constant,
 | 
| 1352 |   having checked that it has been declared via some @{command syntax} commands
 | |
| 1353 | within the theory context. Note that the usual naming convention makes | |
| 1354 | syntax constants start with underscore, to reduce the chance of accidental | |
| 1355 | clashes with other names occurring in parse trees (unqualified constants | |
| 1356 | etc.). | |
| 58618 | 1357 | \<close> | 
| 48118 | 1358 | |
| 48119 
55c305e29f4b
cover @{class_syntax}, @{type_syntax}, @{const_syntax}, @{syntax_const} in isar-ref, in contrast to other ML antiquotations in implementation manual;
 wenzelm parents: 
48118diff
changeset | 1359 | |
| 58618 | 1360 | subsubsection \<open>The translation strategy\<close> | 
| 28762 | 1361 | |
| 62106 | 1362 | text \<open> | 
| 1363 | The different kinds of translation functions are invoked during the | |
| 1364 | transformations between parse trees, ASTs and syntactic terms (cf.\ | |
| 1365 |   \figref{fig:parse-print}). Whenever a combination of the form \<open>c x\<^sub>1 \<dots> x\<^sub>n\<close>
 | |
| 1366 | is encountered, and a translation function \<open>f\<close> of appropriate kind is | |
| 1367 | declared for \<open>c\<close>, the result is produced by evaluation of \<open>f [x\<^sub>1, \<dots>, x\<^sub>n]\<close> | |
| 1368 | in ML. | |
| 48118 | 1369 | |
| 62106 | 1370 | For AST translations, the arguments \<open>x\<^sub>1, \<dots>, x\<^sub>n\<close> are ASTs. A combination | 
| 69597 | 1371 | has the form \<^ML>\<open>Ast.Constant\<close>~\<open>c\<close> or \<^ML>\<open>Ast.Appl\<close>~\<open>[\<close>\<^ML>\<open>Ast.Constant\<close>~\<open>c, x\<^sub>1, \<dots>, x\<^sub>n]\<close>. For term translations, the arguments are | 
| 1372 | terms and a combination has the form \<^ML>\<open>Const\<close>~\<open>(c, \<tau>)\<close> or \<^ML>\<open>Const\<close>~\<open>(c, \<tau>) $ x\<^sub>1 $ \<dots> $ x\<^sub>n\<close>. Terms allow more sophisticated | |
| 62106 | 1373 | transformations than ASTs do, typically involving abstractions and bound | 
| 1374 | variables. \<^emph>\<open>Typed\<close> print translations may even peek at the type \<open>\<tau>\<close> of the | |
| 1375 | constant they are invoked on, although some information might have been | |
| 1376 | suppressed for term output already. | |
| 48118 | 1377 | |
| 62106 | 1378 | Regardless of whether they act on ASTs or terms, translation functions | 
| 1379 | called during the parsing process differ from those for printing in their | |
| 1380 | overall behaviour: | |
| 48118 | 1381 | |
| 62106 | 1382 | \<^descr>[Parse translations] are applied bottom-up. The arguments are already in | 
| 1383 | translated form. The translations must not fail; exceptions trigger an | |
| 1384 | error message. There may be at most one function associated with any | |
| 1385 | syntactic name. | |
| 46294 | 1386 | |
| 62106 | 1387 | \<^descr>[Print translations] are applied top-down. They are supplied with | 
| 1388 | arguments that are partly still in internal form. The result again | |
| 1389 | undergoes translation; therefore a print translation should not introduce | |
| 1390 | as head the very constant that invoked it. The function may raise | |
| 69597 | 1391 | exception \<^ML>\<open>Match\<close> to indicate failure; in this event it has no effect. | 
| 62106 | 1392 | Multiple functions associated with some syntactic name are tried in the | 
| 1393 | order of declaration in the theory. | |
| 48118 | 1394 | |
| 69597 | 1395 | Only constant atoms --- constructor \<^ML>\<open>Ast.Constant\<close> for ASTs and \<^ML>\<open>Const\<close> for terms --- can invoke translation functions. This means that parse | 
| 62106 | 1396 | translations can only be associated with parse tree heads of concrete | 
| 1397 | syntax, or syntactic constants introduced via other translations. For plain | |
| 1398 | identifiers within the term language, the status of constant versus variable | |
| 1399 | is not yet know during parsing. This is in contrast to print translations, | |
| 1400 | where constants are explicitly known from the given term in its fully | |
| 1401 | internal form. | |
| 58618 | 1402 | \<close> | 
| 28762 | 1403 | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1404 | |
| 58618 | 1405 | subsection \<open>Built-in syntax transformations\<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1406 | |
| 58618 | 1407 | text \<open> | 
| 62106 | 1408 | Here are some further details of the main syntax transformation phases of | 
| 1409 |   \figref{fig:parse-print}.
 | |
| 58618 | 1410 | \<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1411 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1412 | |
| 58618 | 1413 | subsubsection \<open>Transforming parse trees to ASTs\<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1414 | |
| 62106 | 1415 | text \<open> | 
| 1416 | The parse tree is the raw output of the parser. It is transformed into an | |
| 1417 | AST according to some basic scheme that may be augmented by AST translation | |
| 1418 |   functions as explained in \secref{sec:tr-funs}.
 | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1419 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1420 | The parse tree is constructed by nesting the right-hand sides of the | 
| 62106 | 1421 | productions used to recognize the input. Such parse trees are simply lists | 
| 1422 | of tokens and constituent parse trees, the latter representing the | |
| 1423 | nonterminals of the productions. Ignoring AST translation functions, parse | |
| 1424 | trees are transformed to ASTs by stripping out delimiters and copy | |
| 1425 | productions, while retaining some source position information from input | |
| 1426 | tokens. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1427 | |
| 62106 | 1428 | The Pure syntax provides predefined AST translations to make the basic | 
| 1429 | \<open>\<lambda>\<close>-term structure more apparent within the (first-order) AST | |
| 1430 |   representation, and thus facilitate the use of @{command translations} (see
 | |
| 1431 |   also \secref{sec:syn-trans}). This covers ordinary term application, type
 | |
| 1432 | application, nested abstraction, iterated meta implications and function | |
| 1433 | types. The effect is illustrated on some representative input strings is as | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1434 | follows: | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1435 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1436 |   \begin{center}
 | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1437 |   \begin{tabular}{ll}
 | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1438 | input source & AST \\ | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1439 | \hline | 
| 61503 | 1440 | \<open>f x y z\<close> & \<^verbatim>\<open>(f x y z)\<close> \\ | 
| 1441 | \<open>'a ty\<close> & \<^verbatim>\<open>(ty 'a)\<close> \\ | |
| 1442 |   \<open>('a, 'b)ty\<close> & \<^verbatim>\<open>(ty 'a 'b)\<close> \\
 | |
| 1443 |   \<open>\<lambda>x y z. t\<close> & \<^verbatim>\<open>("_abs" x ("_abs" y ("_abs" z t)))\<close> \\
 | |
| 1444 |   \<open>\<lambda>x :: 'a. t\<close> & \<^verbatim>\<open>("_abs" ("_constrain" x 'a) t)\<close> \\
 | |
| 1445 |   \<open>\<lbrakk>P; Q; R\<rbrakk> \<Longrightarrow> S\<close> & \<^verbatim>\<open>("Pure.imp" P ("Pure.imp" Q ("Pure.imp" R S)))\<close> \\
 | |
| 1446 |    \<open>['a, 'b, 'c] \<Rightarrow> 'd\<close> & \<^verbatim>\<open>("fun" 'a ("fun" 'b ("fun" 'c 'd)))\<close> \\
 | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1447 |   \end{tabular}
 | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1448 |   \end{center}
 | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1449 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1450 | Note that type and sort constraints may occur in further places --- | 
| 62106 | 1451 | translations need to be ready to cope with them. The built-in syntax | 
| 1452 | transformation from parse trees to ASTs insert additional constraints that | |
| 1453 | represent source positions. | |
| 58618 | 1454 | \<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1455 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1456 | |
| 58618 | 1457 | subsubsection \<open>Transforming ASTs to terms\<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1458 | |
| 62106 | 1459 | text \<open> | 
| 1460 |   After application of macros (\secref{sec:syn-trans}), the AST is transformed
 | |
| 1461 | into a term. This term still lacks proper type information, but it might | |
| 1462 | contain some constraints consisting of applications with head \<^verbatim>\<open>_constrain\<close>, | |
| 1463 | where the second argument is a type encoded as a pre-term within the syntax. | |
| 1464 | Type inference later introduces correct types, or indicates type errors in | |
| 1465 | the input. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1466 | |
| 62106 | 1467 | Ignoring parse translations, ASTs are transformed to terms by mapping AST | 
| 1468 | constants to term constants, AST variables to term variables or constants | |
| 1469 | (according to the name space), and AST applications to iterated term | |
| 1470 | applications. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1471 | |
| 62106 | 1472 | The outcome is still a first-order term. Proper abstractions and bound | 
| 1473 | variables are introduced by parse translations associated with certain | |
| 1474 |   syntax constants. Thus \<^verbatim>\<open>("_abs" x x)\<close> eventually becomes a de-Bruijn term
 | |
| 1475 |   \<^verbatim>\<open>Abs ("x", _, Bound 0)\<close>.
 | |
| 58618 | 1476 | \<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1477 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1478 | |
| 58618 | 1479 | subsubsection \<open>Printing of terms\<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1480 | |
| 62106 | 1481 | text \<open> | 
| 1482 | The output phase is essentially the inverse of the input phase. Terms are | |
| 1483 | translated via abstract syntax trees into pretty-printed text. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1484 | |
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1485 | Ignoring print translations, the transformation maps term constants, | 
| 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1486 | variables and applications to the corresponding constructs on ASTs. | 
| 62106 | 1487 | Abstractions are mapped to applications of the special constant \<^verbatim>\<open>_abs\<close> as | 
| 1488 | seen before. Type constraints are represented via special \<^verbatim>\<open>_constrain\<close> | |
| 1489 | forms, according to various policies of type annotation determined | |
| 1490 | elsewhere. Sort constraints of type variables are handled in a similar | |
| 1491 | fashion. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1492 | |
| 62106 | 1493 |   After application of macros (\secref{sec:syn-trans}), the AST is finally
 | 
| 1494 | pretty-printed. The built-in print AST translations reverse the | |
| 1495 | corresponding parse AST translations. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1496 | |
| 61421 | 1497 | \<^medskip> | 
| 1498 | For the actual printing process, the priority grammar | |
| 62106 | 1499 |   (\secref{sec:priority-grammar}) plays a vital role: productions are used as
 | 
| 1500 | templates for pretty printing, with argument slots stemming from | |
| 1501 | nonterminals, and syntactic sugar stemming from literal tokens. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1502 | |
| 62106 | 1503 | Each AST application with constant head \<open>c\<close> and arguments \<open>t\<^sub>1\<close>, \dots, | 
| 1504 | \<open>t\<^sub>n\<close> (for \<open>n = 0\<close> the AST is just the constant \<open>c\<close> itself) is printed | |
| 1505 | according to the first grammar production of result name \<open>c\<close>. The required | |
| 1506 | syntax priority of the argument slot is given by its nonterminal \<open>A\<^sup>(\<^sup>p\<^sup>)\<close>. | |
| 1507 | The argument \<open>t\<^sub>i\<close> that corresponds to the position of \<open>A\<^sup>(\<^sup>p\<^sup>)\<close> is printed | |
| 1508 | recursively, and then put in parentheses \<^emph>\<open>if\<close> its priority \<open>p\<close> requires | |
| 1509 | this. The resulting output is concatenated with the syntactic sugar | |
| 1510 | according to the grammar production. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1511 | |
| 62106 | 1512 | If an AST application \<open>(c x\<^sub>1 \<dots> x\<^sub>m)\<close> has more arguments than the | 
| 1513 | corresponding production, it is first split into \<open>((c x\<^sub>1 \<dots> x\<^sub>n) x\<^sub>n\<^sub>+\<^sub>1 \<dots> | |
| 1514 | x\<^sub>m)\<close> and then printed recursively as above. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1515 | |
| 62106 | 1516 | Applications with too few arguments or with non-constant head or without a | 
| 1517 | corresponding production are printed in prefix-form like \<open>f t\<^sub>1 \<dots> t\<^sub>n\<close> for | |
| 1518 | terms. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1519 | |
| 62106 | 1520 | Multiple productions associated with some name \<open>c\<close> are tried in order of | 
| 1521 | appearance within the grammar. An occurrence of some AST variable \<open>x\<close> is | |
| 1522 | printed as \<open>x\<close> outright. | |
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1523 | |
| 61421 | 1524 | \<^medskip> | 
| 62106 | 1525 | White space is \<^emph>\<open>not\<close> inserted automatically. If blanks (or breaks) are | 
| 1526 | required to separate tokens, they need to be specified in the mixfix | |
| 1527 |   declaration (\secref{sec:mixfix}).
 | |
| 58618 | 1528 | \<close> | 
| 52414 
8429123bc58a
more on built-in syntax transformations, based on reduced version of old material;
 wenzelm parents: 
52413diff
changeset | 1529 | |
| 28762 | 1530 | end |