1211 \<close> |
1211 \<close> |
1212 |
1212 |
1213 |
1213 |
1214 section \<open>Strings of symbols \label{sec:symbols}\<close> |
1214 section \<open>Strings of symbols \label{sec:symbols}\<close> |
1215 |
1215 |
1216 text \<open>A \<^emph>\<open>symbol\<close> constitutes the smallest textual unit in |
1216 text \<open>A \<^emph>\<open>symbol\<close> constitutes the smallest textual unit in Isabelle/ML --- raw |
1217 Isabelle/ML --- raw ML characters are normally not encountered at |
1217 ML characters are normally not encountered at all. Isabelle strings consist |
1218 all. Isabelle strings consist of a sequence of symbols, represented |
1218 of a sequence of symbols, represented as a packed string or an exploded list |
1219 as a packed string or an exploded list of strings. Each symbol is |
1219 of strings. Each symbol is in itself a small string, which has either one of |
1220 in itself a small string, which has either one of the following |
1220 the following forms: |
1221 forms: |
1221 |
1222 |
1222 \<^enum> a single ASCII character ``\<open>c\<close>'', for example ``\<^verbatim>\<open>a\<close>'', |
1223 \<^enum> a single ASCII character ``\<open>c\<close>'', for example ``\<^verbatim>\<open>a\<close>'', |
1223 |
1224 |
1224 \<^enum> a codepoint according to UTF-8 (non-ASCII byte sequence), |
1225 \<^enum> a codepoint according to UTF-8 (non-ASCII byte sequence), |
1225 |
1226 |
1226 \<^enum> a regular symbol ``\<^verbatim>\<open>\<ident>\<close>'', for example ``\<^verbatim>\<open>\<alpha>\<close>'', |
1227 \<^enum> a regular symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><\<close>\<open>ident\<close>\<^verbatim>\<open>>\<close>'', for example ``\<^verbatim>\<open>\<alpha>\<close>'', |
1227 |
1228 |
1228 \<^enum> a control symbol ``\<^verbatim>\<open>\<^ident>\<close>'', for example ``\<^verbatim>\<open>\<^bold>\<close>'', |
1229 \<^enum> a control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^\<close>\<open>ident\<close>\<^verbatim>\<open>>\<close>'', for example ``\<^verbatim>\<open>\<^bold>\<close>'', |
1229 |
1230 |
1230 \<^enum> a raw symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw:\<close>\<open>text\<close>\<^verbatim>\<open>>\<close>'' where \<open>text\<close> consists of |
1231 \<^enum> a raw symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw:\<close>\<open>text\<close>\<^verbatim>\<open>>\<close>'' where \<open>text\<close> consists of printable characters |
1231 printable characters excluding ``\<^verbatim>\<open>.\<close>'' and ``\<^verbatim>\<open>>\<close>'', for example |
1232 excluding ``\<^verbatim>\<open>.\<close>'' and ``\<^verbatim>\<open>>\<close>'', for example |
1232 ``\<^verbatim>\<open>\<^raw:$\sum_{i = 1}^n$>\<close>'', |
1233 ``\<^verbatim>\<open>\<^raw:$\sum_{i = 1}^n$>\<close>'', |
1233 |
1234 |
1234 \<^enum> a numbered raw control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw\<close>\<open>n\<close>\<^verbatim>\<open>>\<close>, where \<open>n\<close> consists |
1235 \<^enum> a numbered raw control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw\<close>\<open>n\<close>\<^verbatim>\<open>>\<close>, where \<open>n\<close> consists |
1235 of digits, for example ``\<^verbatim>\<open>\<^raw42>\<close>''. |
1236 of digits, for example ``\<^verbatim>\<open>\<^raw42>\<close>''. |
1236 |
1237 |
1237 The \<open>ident\<close> syntax for symbol names is \<open>letter (letter | digit)\<^sup>*\<close>, where |
1238 |
1238 \<open>letter = A..Za..z\<close> and \<open>digit = 0..9\<close>. There are infinitely many regular |
1239 The \<open>ident\<close> syntax for symbol names is \<open>letter |
1239 symbols and control symbols, but a fixed collection of standard symbols is |
1240 (letter | digit)\<^sup>*\<close>, where \<open>letter = A..Za..z\<close> and \<open>digit = 0..9\<close>. There are infinitely many regular symbols and |
1240 treated specifically. For example, ``\<^verbatim>\<open>\<alpha>\<close>'' is classified as a letter, which |
1241 control symbols, but a fixed collection of standard symbols is |
1241 means it may occur within regular Isabelle identifiers. |
1242 treated specifically. For example, ``\<^verbatim>\<open>\<alpha>\<close>'' is |
|
1243 classified as a letter, which means it may occur within regular |
|
1244 Isabelle identifiers. |
|
1245 |
1242 |
1246 The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit |
1243 The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit |
1247 character sequences are passed-through unchanged. Unicode/UCS data in UTF-8 |
1244 character sequences are passed-through unchanged. Unicode/UCS data in UTF-8 |
1248 encoding is processed in a non-strict fashion, such that well-formed code |
1245 encoding is processed in a non-strict fashion, such that well-formed code |
1249 sequences are recognized accordingly. Unicode provides its own collection of |
1246 sequences are recognized accordingly. Unicode provides its own collection of |
1250 mathematical symbols, but within the core Isabelle/ML world there is no link |
1247 mathematical symbols, but within the core Isabelle/ML world there is no link |
1251 to the standard collection of Isabelle regular symbols. |
1248 to the standard collection of Isabelle regular symbols. |
1252 |
1249 |
1253 \<^medskip> |
1250 \<^medskip> |
1254 Output of Isabelle symbols depends on the print mode. For example, |
1251 Output of Isabelle symbols depends on the print mode. For example, the |
1255 the standard {\LaTeX} setup of the Isabelle document preparation system |
1252 standard {\LaTeX} setup of the Isabelle document preparation system would |
1256 would present ``\<^verbatim>\<open>\<alpha>\<close>'' as \<open>\<alpha>\<close>, and ``\<^verbatim>\<open>\<^bold>\<alpha>\<close>'' as \<open>\<^bold>\<alpha>\<close>. On-screen rendering |
1253 present ``\<^verbatim>\<open>\<alpha>\<close>'' as \<open>\<alpha>\<close>, and ``\<^verbatim>\<open>\<^bold>\<alpha>\<close>'' as \<open>\<^bold>\<alpha>\<close>. On-screen rendering usually |
1257 usually works by mapping a finite subset of Isabelle symbols to suitable |
1254 works by mapping a finite subset of Isabelle symbols to suitable Unicode |
1258 Unicode characters. |
1255 characters. |
1259 \<close> |
1256 \<close> |
1260 |
1257 |
1261 text %mlref \<open> |
1258 text %mlref \<open> |
1262 \begin{mldecls} |
1259 \begin{mldecls} |
1263 @{index_ML_type "Symbol.symbol": string} \\ |
1260 @{index_ML_type "Symbol.symbol": string} \\ |