src/Doc/Implementation/ML.thy
changeset 61504 a7ae3ef886a9
parent 61503 28e788ca2c5d
child 61506 436b7fe89cdc
equal deleted inserted replaced
61503:28e788ca2c5d 61504:a7ae3ef886a9
  1211 \<close>
  1211 \<close>
  1212 
  1212 
  1213 
  1213 
  1214 section \<open>Strings of symbols \label{sec:symbols}\<close>
  1214 section \<open>Strings of symbols \label{sec:symbols}\<close>
  1215 
  1215 
  1216 text \<open>A \<^emph>\<open>symbol\<close> constitutes the smallest textual unit in
  1216 text \<open>A \<^emph>\<open>symbol\<close> constitutes the smallest textual unit in Isabelle/ML --- raw
  1217   Isabelle/ML --- raw ML characters are normally not encountered at
  1217   ML characters are normally not encountered at all. Isabelle strings consist
  1218   all.  Isabelle strings consist of a sequence of symbols, represented
  1218   of a sequence of symbols, represented as a packed string or an exploded list
  1219   as a packed string or an exploded list of strings.  Each symbol is
  1219   of strings. Each symbol is in itself a small string, which has either one of
  1220   in itself a small string, which has either one of the following
  1220   the following forms:
  1221   forms:
  1221 
  1222 
  1222     \<^enum> a single ASCII character ``\<open>c\<close>'', for example ``\<^verbatim>\<open>a\<close>'',
  1223   \<^enum> a single ASCII character ``\<open>c\<close>'', for example ``\<^verbatim>\<open>a\<close>'',
  1223 
  1224 
  1224     \<^enum> a codepoint according to UTF-8 (non-ASCII byte sequence),
  1225   \<^enum> a codepoint according to UTF-8 (non-ASCII byte sequence),
  1225 
  1226 
  1226     \<^enum> a regular symbol ``\<^verbatim>\<open>\<ident>\<close>'', for example ``\<^verbatim>\<open>\<alpha>\<close>'',
  1227   \<^enum> a regular symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><\<close>\<open>ident\<close>\<^verbatim>\<open>>\<close>'', for example ``\<^verbatim>\<open>\<alpha>\<close>'',
  1227 
  1228 
  1228     \<^enum> a control symbol ``\<^verbatim>\<open>\<^ident>\<close>'', for example ``\<^verbatim>\<open>\<^bold>\<close>'',
  1229   \<^enum> a control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^\<close>\<open>ident\<close>\<^verbatim>\<open>>\<close>'', for example ``\<^verbatim>\<open>\<^bold>\<close>'',
  1229 
  1230 
  1230     \<^enum> a raw symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw:\<close>\<open>text\<close>\<^verbatim>\<open>>\<close>'' where \<open>text\<close> consists of
  1231   \<^enum> a raw symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw:\<close>\<open>text\<close>\<^verbatim>\<open>>\<close>'' where \<open>text\<close> consists of printable characters
  1231     printable characters excluding ``\<^verbatim>\<open>.\<close>'' and ``\<^verbatim>\<open>>\<close>'', for example
  1232   excluding ``\<^verbatim>\<open>.\<close>'' and ``\<^verbatim>\<open>>\<close>'', for example
  1232     ``\<^verbatim>\<open>\<^raw:$\sum_{i = 1}^n$>\<close>'',
  1233   ``\<^verbatim>\<open>\<^raw:$\sum_{i = 1}^n$>\<close>'',
  1233 
  1234 
  1234     \<^enum> a numbered raw control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw\<close>\<open>n\<close>\<^verbatim>\<open>>\<close>, where \<open>n\<close> consists
  1235   \<^enum> a numbered raw control symbol ``\<^verbatim>\<open>\\<close>\<^verbatim>\<open><^raw\<close>\<open>n\<close>\<^verbatim>\<open>>\<close>, where \<open>n\<close> consists
  1235     of digits, for example ``\<^verbatim>\<open>\<^raw42>\<close>''.
  1236   of digits, for example ``\<^verbatim>\<open>\<^raw42>\<close>''.
  1236 
  1237 
  1237   The \<open>ident\<close> syntax for symbol names is \<open>letter (letter | digit)\<^sup>*\<close>, where
  1238 
  1238   \<open>letter = A..Za..z\<close> and \<open>digit = 0..9\<close>. There are infinitely many regular
  1239   The \<open>ident\<close> syntax for symbol names is \<open>letter
  1239   symbols and control symbols, but a fixed collection of standard symbols is
  1240   (letter | digit)\<^sup>*\<close>, where \<open>letter = A..Za..z\<close> and \<open>digit = 0..9\<close>.  There are infinitely many regular symbols and
  1240   treated specifically. For example, ``\<^verbatim>\<open>\<alpha>\<close>'' is classified as a letter, which
  1241   control symbols, but a fixed collection of standard symbols is
  1241   means it may occur within regular Isabelle identifiers.
  1242   treated specifically.  For example, ``\<^verbatim>\<open>\<alpha>\<close>'' is
       
  1243   classified as a letter, which means it may occur within regular
       
  1244   Isabelle identifiers.
       
  1245 
  1242 
  1246   The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit
  1243   The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit
  1247   character sequences are passed-through unchanged. Unicode/UCS data in UTF-8
  1244   character sequences are passed-through unchanged. Unicode/UCS data in UTF-8
  1248   encoding is processed in a non-strict fashion, such that well-formed code
  1245   encoding is processed in a non-strict fashion, such that well-formed code
  1249   sequences are recognized accordingly. Unicode provides its own collection of
  1246   sequences are recognized accordingly. Unicode provides its own collection of
  1250   mathematical symbols, but within the core Isabelle/ML world there is no link
  1247   mathematical symbols, but within the core Isabelle/ML world there is no link
  1251   to the standard collection of Isabelle regular symbols.
  1248   to the standard collection of Isabelle regular symbols.
  1252 
  1249 
  1253   \<^medskip>
  1250   \<^medskip>
  1254   Output of Isabelle symbols depends on the print mode. For example,
  1251   Output of Isabelle symbols depends on the print mode. For example, the
  1255   the standard {\LaTeX} setup of the Isabelle document preparation system
  1252   standard {\LaTeX} setup of the Isabelle document preparation system would
  1256   would present ``\<^verbatim>\<open>\<alpha>\<close>'' as \<open>\<alpha>\<close>, and ``\<^verbatim>\<open>\<^bold>\<alpha>\<close>'' as \<open>\<^bold>\<alpha>\<close>. On-screen rendering
  1253   present ``\<^verbatim>\<open>\<alpha>\<close>'' as \<open>\<alpha>\<close>, and ``\<^verbatim>\<open>\<^bold>\<alpha>\<close>'' as \<open>\<^bold>\<alpha>\<close>. On-screen rendering usually
  1257   usually works by mapping a finite subset of Isabelle symbols to suitable
  1254   works by mapping a finite subset of Isabelle symbols to suitable Unicode
  1258   Unicode characters.
  1255   characters.
  1259 \<close>
  1256 \<close>
  1260 
  1257 
  1261 text %mlref \<open>
  1258 text %mlref \<open>
  1262   \begin{mldecls}
  1259   \begin{mldecls}
  1263   @{index_ML_type "Symbol.symbol": string} \\
  1260   @{index_ML_type "Symbol.symbol": string} \\