1250 forms: |
1249 forms: |
1251 |
1250 |
1252 \begin{enumerate} |
1251 \begin{enumerate} |
1253 |
1252 |
1254 \item a single ASCII character ``@{text "c"}'', for example |
1253 \item a single ASCII character ``@{text "c"}'', for example |
1255 ``\verb,a,'', |
1254 ``@{verbatim a}'', |
1256 |
1255 |
1257 \item a codepoint according to UTF-8 (non-ASCII byte sequence), |
1256 \item a codepoint according to UTF-8 (non-ASCII byte sequence), |
1258 |
1257 |
1259 \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'', |
1258 \item a regular symbol ``@{verbatim \<open>\\<close>}@{verbatim "<"}@{text |
1260 for example ``\verb,\,\verb,<alpha>,'', |
1259 "ident"}@{verbatim ">"}'', for example ``@{verbatim "\<alpha>"}'', |
1261 |
1260 |
1262 \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'', |
1261 \item a control symbol ``@{verbatim \<open>\\<close>}@{verbatim "<^"}@{text |
1263 for example ``\verb,\,\verb,<^bold>,'', |
1262 "ident"}@{verbatim ">"}'', for example ``@{verbatim "\<^bold>"}'', |
1264 |
1263 |
1265 \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,'' |
1264 \item a raw symbol ``@{verbatim \<open>\\<close>}@{verbatim "<^raw:"}@{text |
1266 where @{text text} consists of printable characters excluding |
1265 text}@{verbatim ">"}'' where @{text text} consists of printable characters |
1267 ``\verb,.,'' and ``\verb,>,'', for example |
1266 excluding ``@{verbatim "."}'' and ``@{verbatim ">"}'', for example |
1268 ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'', |
1267 ``@{verbatim "\<^raw:$\sum_{i = 1}^n$>"}'', |
1269 |
1268 |
1270 \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text |
1269 \item a numbered raw control symbol ``@{verbatim \<open>\\<close>}@{verbatim |
1271 n}\verb,>, where @{text n} consists of digits, for example |
1270 "<^raw"}@{text n}@{verbatim ">"}, where @{text n} consists of digits, for |
1272 ``\verb,\,\verb,<^raw42>,''. |
1271 example ``@{verbatim "\<^raw42>"}''. |
1273 |
1272 |
1274 \end{enumerate} |
1273 \end{enumerate} |
1275 |
1274 |
1276 The @{text "ident"} syntax for symbol names is @{text "letter |
1275 The @{text "ident"} syntax for symbol names is @{text "letter |
1277 (letter | digit)\<^sup>*"}, where @{text "letter = A..Za..z"} and @{text |
1276 (letter | digit)\<^sup>*"}, where @{text "letter = A..Za..z"} and @{text |
1278 "digit = 0..9"}. There are infinitely many regular symbols and |
1277 "digit = 0..9"}. There are infinitely many regular symbols and |
1279 control symbols, but a fixed collection of standard symbols is |
1278 control symbols, but a fixed collection of standard symbols is |
1280 treated specifically. For example, ``\verb,\,\verb,<alpha>,'' is |
1279 treated specifically. For example, ``@{verbatim "\<alpha>"}'' is |
1281 classified as a letter, which means it may occur within regular |
1280 classified as a letter, which means it may occur within regular |
1282 Isabelle identifiers. |
1281 Isabelle identifiers. |
1283 |
1282 |
1284 The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit |
1283 The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit |
1285 character sequences are passed-through unchanged. Unicode/UCS data in UTF-8 |
1284 character sequences are passed-through unchanged. Unicode/UCS data in UTF-8 |
1288 mathematical symbols, but within the core Isabelle/ML world there is no link |
1287 mathematical symbols, but within the core Isabelle/ML world there is no link |
1289 to the standard collection of Isabelle regular symbols. |
1288 to the standard collection of Isabelle regular symbols. |
1290 |
1289 |
1291 \medskip Output of Isabelle symbols depends on the print mode. For example, |
1290 \medskip Output of Isabelle symbols depends on the print mode. For example, |
1292 the standard {\LaTeX} setup of the Isabelle document preparation system |
1291 the standard {\LaTeX} setup of the Isabelle document preparation system |
1293 would present ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and |
1292 would present ``@{verbatim "\<alpha>"}'' as @{text "\<alpha>"}, and ``@{verbatim |
1294 ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text "\<^bold>\<alpha>"}. On-screen |
1293 "\<^bold>\<alpha>"}'' as @{text "\<^bold>\<alpha>"}. On-screen rendering usually works by mapping a |
1295 rendering usually works by mapping a finite subset of Isabelle symbols to |
1294 finite subset of Isabelle symbols to suitable Unicode characters. |
1296 suitable Unicode characters. |
|
1297 \<close> |
1295 \<close> |
1298 |
1296 |
1299 text %mlref \<open> |
1297 text %mlref \<open> |
1300 \begin{mldecls} |
1298 \begin{mldecls} |
1301 @{index_ML_type "Symbol.symbol": string} \\ |
1299 @{index_ML_type "Symbol.symbol": string} \\ |
1407 \item XML tree structure via YXML (see also @{cite "isabelle-sys"}), |
1405 \item XML tree structure via YXML (see also @{cite "isabelle-sys"}), |
1408 with @{ML YXML.parse_body} as key operation. |
1406 with @{ML YXML.parse_body} as key operation. |
1409 |
1407 |
1410 \end{enumerate} |
1408 \end{enumerate} |
1411 |
1409 |
1412 Note that Isabelle/ML string literals may refer Isabelle symbols |
1410 Note that Isabelle/ML string literals may refer Isabelle symbols like |
1413 like ``\verb,\,\verb,<alpha>,'' natively, \emph{without} escaping |
1411 ``@{verbatim \<alpha>}'' natively, \emph{without} escaping the backslash. This is a |
1414 the backslash. This is a consequence of Isabelle treating all |
1412 consequence of Isabelle treating all source text as strings of symbols, |
1415 source text as strings of symbols, instead of raw characters. |
1413 instead of raw characters. |
1416 |
1414 |
1417 \end{description} |
1415 \end{description} |
1418 \<close> |
1416 \<close> |
1419 |
1417 |
1420 text %mlex \<open>The subsequent example illustrates the difference of |
1418 text %mlex \<open>The subsequent example illustrates the difference of |
1431 |
1429 |
1432 text \<open>Note that in Unicode renderings of the symbol @{text "\<A>"}, |
1430 text \<open>Note that in Unicode renderings of the symbol @{text "\<A>"}, |
1433 variations of encodings like UTF-8 or UTF-16 pose delicate questions |
1431 variations of encodings like UTF-8 or UTF-16 pose delicate questions |
1434 about the multi-byte representations of its codepoint, which is outside |
1432 about the multi-byte representations of its codepoint, which is outside |
1435 of the 16-bit address space of the original Unicode standard from |
1433 of the 16-bit address space of the original Unicode standard from |
1436 the 1990-ies. In Isabelle/ML it is just ``\verb,\,\verb,<A>,'' |
1434 the 1990-ies. In Isabelle/ML it is just ``@{verbatim \<A>}'' |
1437 literally, using plain ASCII characters beyond any doubts.\<close> |
1435 literally, using plain ASCII characters beyond any doubts.\<close> |
1438 |
1436 |
1439 |
1437 |
1440 subsection \<open>Integers\<close> |
1438 subsection \<open>Integers\<close> |
1441 |
1439 |