| author | blanchet | 
| Sun, 01 Mar 2009 18:40:16 +0100 | |
| changeset 30238 | d8944fd4365e | 
| parent 29646 | 5941c156902d | 
| child 29755 | d66b34e46bdf | 
| permissions | -rw-r--r-- | 
| 18538 | 1 | (* $Id$ *) | 
| 2 | ||
| 3 | theory "ML" imports base begin | |
| 4 | ||
| 26437 | 5 | chapter {* Advanced ML programming *}
 | 
| 18538 | 6 | |
| 24089 | 7 | section {* Style *}
 | 
| 8 | ||
| 26459 | 9 | text {*
 | 
| 10 | Like any style guide, also this one should not be interpreted dogmatically, but | |
| 11 | with care and discernment. It consists of a collection of | |
| 12 | recommendations which have been turned out useful over long years of | |
| 13 | Isabelle system development and are supposed to support writing of readable | |
| 14 | and managable code. Special cases encourage derivations, | |
| 15 | as far as you know what you are doing. | |
| 16 |   \footnote{This style guide is loosely based on
 | |
| 17 |   \url{http://caml.inria.fr/resources/doc/guides/guidelines.en.html}}
 | |
| 21148 | 18 | |
| 19 |   \begin{description}
 | |
| 20 | ||
| 21 | \item[fundamental law of programming] | |
| 22 | Whenever writing code, keep in mind: A program is | |
| 23 | written once, modified ten times, and read | |
| 26459 | 24 | hundred times. So simplify its writing, | 
| 21148 | 25 | always keep future modifications in mind, | 
| 26 | and never jeopardize readability. Every second you hesitate | |
| 27 | to spend on making your code more clear you will | |
| 28 | have to spend ten times understanding what you have | |
| 29 | written later on. | |
| 30 | ||
| 31 | \item[white space matters] | |
| 32 | Treat white space in your code as if it determines | |
| 33 | the meaning of code. | |
| 34 | ||
| 35 |       \begin{itemize}
 | |
| 36 | ||
| 37 | \item The space bar is the easiest key to find on the keyboard, | |
| 24089 | 38 |           press it as often as necessary. @{verbatim "2 + 2"} is better
 | 
| 39 |           than @{verbatim "2+2"}, likewise @{verbatim "f (x, y)"} is
 | |
| 40 |           better than @{verbatim "f(x,y)"}.
 | |
| 21148 | 41 | |
| 24089 | 42 | \item Restrict your lines to 80 characters. This will allow | 
| 43 | you to keep the beginning of a line in view while watching | |
| 44 |           its end.\footnote{To acknowledge the lax practice of
 | |
| 45 | text editing these days, we tolerate as much as 100 | |
| 46 | characters per line, but anything beyond 120 is not | |
| 47 | considered proper source text.} | |
| 21148 | 48 | |
| 24089 | 49 | \item Ban tabulators; they are a context-sensitive formatting | 
| 50 | feature and likely to confuse anyone not using your favorite | |
| 51 |           editor.\footnote{Some modern programming language even
 | |
| 52 | forbid tabulators altogether according to the formal syntax | |
| 53 | definition.} | |
| 21148 | 54 | |
| 55 | \item Get rid of trailing whitespace. Instead, do not | |
| 24089 | 56 | suppress a trailing newline at the end of your files. | 
| 21148 | 57 | |
| 58 | \item Choose a generally accepted style of indentation, | |
| 59 | then use it systematically throughout the whole | |
| 60 | application. An indentation of two spaces is appropriate. | |
| 61 | Avoid dangling indentation. | |
| 62 | ||
| 63 |       \end{itemize}
 | |
| 64 | ||
| 65 | \item[cut-and-paste succeeds over copy-and-paste] | |
| 24089 | 66 |        \emph{Never} copy-and-paste code when programming.  If you
 | 
| 21148 | 67 | need the same piece of code twice, introduce a | 
| 68 | reasonable auxiliary function (if there is no | |
| 69 | such function, very likely you got something wrong). | |
| 70 | Any copy-and-paste will turn out to be painful | |
| 26459 | 71 | when something has to be changed later on. | 
| 21148 | 72 | |
| 73 | \item[comments] | |
| 74 | are a device which requires careful thinking before using | |
| 75 | it. The best comment for your code should be the code itself. | |
| 76 | Prefer efforts to write clear, understandable code | |
| 77 | over efforts to explain nasty code. | |
| 78 | ||
| 79 | \item[functional programming is based on functions] | |
| 26459 | 80 | Model things as abstract as appropriate. Avoid unnecessarrily | 
| 81 | concrete datatype representations. For example, consider a function | |
| 82 | taking some table data structure as argument and performing | |
| 83 | lookups on it. Instead of taking a table, it could likewise | |
| 84 | take just a lookup function. Accustom | |
| 24089 | 85 | your way of coding to the level of expressiveness a functional | 
| 86 | programming language is giving onto you. | |
| 21148 | 87 | |
| 88 | \item[tuples] | |
| 89 | are often in the way. When there is no striking argument | |
| 90 | to tuple function arguments, just write your function curried. | |
| 91 | ||
| 92 | \item[telling names] | |
| 24089 | 93 | Any name should tell its purpose as exactly as possible, while | 
| 94 | keeping its length to the absolutely necessary minimum. Always | |
| 95 | give the same name to function arguments which have the same | |
| 96 |       meaning. Separate words by underscores (@{verbatim
 | |
| 97 |       int_of_string}, not @{verbatim intOfString}).\footnote{Some
 | |
| 98 | recent tools for Emacs include special precautions to cope with | |
| 99 |       bumpy names in @{text "camelCase"}, e.g.\ for improved on-screen
 | |
| 100 | readability. It is easier to abstain from using such names in the | |
| 101 | first place.} | |
| 102 | ||
| 103 |   \end{description}
 | |
| 104 | *} | |
| 105 | ||
| 106 | ||
| 107 | section {* Thread-safe programming *}
 | |
| 108 | ||
| 109 | text {*
 | |
| 29159 | 110 | Recent versions of Poly/ML (5.2.1 or later) support robust | 
| 111 | multithreaded execution, based on native operating system threads of | |
| 112 | the underlying platform. Thus threads will actually be executed in | |
| 113 | parallel on multi-core systems. A speedup-factor of approximately | |
| 114 |   1.5--3 can be expected on a regular 4-core machine.\footnote{There
 | |
| 115 | is some inherent limitation of the speedup factor due to garbage | |
| 116 | collection, which is still sequential. It helps to provide initial | |
| 117 |   heap space generously, using the \texttt{-H} option of Poly/ML.}
 | |
| 118 | Threads also help to organize advanced operations of the system, | |
| 119 | with explicit communication between sub-components, real-time | |
| 120 | conditions, time-outs etc. | |
| 24089 | 121 | |
| 29159 | 122 | Threads lack the memory protection of separate processes, and | 
| 24089 | 123 | operate concurrently on shared heap memory. This has the advantage | 
| 124 | that results of independent computations are immediately available | |
| 29159 | 125 | to other threads, without requiring untyped character streams, | 
| 126 | awkward serialization etc. | |
| 24089 | 127 | |
| 128 | On the other hand, some programming guidelines need to be observed | |
| 129 | in order to make unprotected parallelism work out smoothly. While | |
| 130 | the ML system implementation is responsible to maintain basic | |
| 131 | integrity of the representation of ML values in memory, the | |
| 132 | application programmer needs to ensure that multithreaded execution | |
| 133 | does not break the intended semantics. | |
| 134 | ||
| 135 |   \medskip \paragraph{Critical shared resources.} Actually only those
 | |
| 136 | parts outside the purely functional world of ML are critical. In | |
| 137 | particular, this covers | |
| 138 | ||
| 139 |   \begin{itemize}
 | |
| 140 | ||
| 141 | \item global references (or arrays), i.e.\ those that persist over | |
| 142 |   several invocations of associated operations,\footnote{This is
 | |
| 143 | independent of the visibility of such mutable values in the toplevel | |
| 144 | scope.} | |
| 145 | ||
| 146 |   \item direct I/O on shared channels, notably @{text "stdin"}, @{text
 | |
| 147 |   "stdout"}, @{text "stderr"}.
 | |
| 148 | ||
| 149 |   \end{itemize}
 | |
| 150 | ||
| 151 | The majority of tools implemented within the Isabelle/Isar framework | |
| 152 | will not require any of these critical elements: nothing special | |
| 153 | needs to be observed when staying in the purely functional fragment | |
| 154 | of ML. Note that output via the official Isabelle channels does not | |
| 29159 | 155 |   count as direct I/O, so the operations @{ML "writeln"}, @{ML
 | 
| 156 |   "warning"}, @{ML "tracing"} etc.\ are safe.
 | |
| 157 | ||
| 158 |   Moreover, ML bindings within the toplevel environment (@{verbatim
 | |
| 159 |   "type"}, @{verbatim val}, @{verbatim "structure"} etc.) due to
 | |
| 160 | run-time invocation of the compiler are also safe, because | |
| 161 | Isabelle/Isar manages this as part of the theory or proof context. | |
| 24089 | 162 | |
| 29159 | 163 |   \paragraph{Multithreading in Isabelle/Isar.}  The theory loader
 | 
| 164 | automatically exploits the overall parallelism of independent nodes | |
| 165 | in the development graph, as well as the inherent irrelevance of | |
| 166 | proofs for goals being fully specified in advance. This means, | |
| 167 | checking of individual Isar proofs is parallelized by default. | |
| 168 | Beyond that, very sophisticated proof tools may use local | |
| 169 | parallelism internally, via the general programming model of | |
| 170 |   ``future values'' (see also @{"file"
 | |
| 171 | "~~/src/Pure/Concurrent/future.ML"}). | |
| 24089 | 172 | |
| 29159 | 173 | Any ML code that works relatively to the present background theory | 
| 24089 | 174 | is already safe. Contextual data may be easily stored within the | 
| 24090 | 175 | theory or proof context, thanks to the generic data concept of | 
| 176 |   Isabelle/Isar (see \secref{sec:context-data}).  This greatly
 | |
| 24089 | 177 | diminishes the demand for global state information in the first | 
| 178 | place. | |
| 179 | ||
| 180 | \medskip In rare situations where actual mutable content needs to be | |
| 181 |   manipulated, Isabelle provides a single \emph{critical section} that
 | |
| 182 | may be entered while preventing any other thread from doing the | |
| 183 | same. Entering the critical section without contention is very | |
| 184 | fast, and several basic system operations do so frequently. This | |
| 185 | also means that each thread should leave the critical section | |
| 186 | quickly, otherwise parallel execution performance may degrade | |
| 187 | significantly. | |
| 188 | ||
| 29159 | 189 | Despite this potential bottle-neck, centralized locking is | 
| 190 | convenient, because it prevents deadlocks without demanding special | |
| 191 | precautions. Explicit communication demands other means, though. | |
| 192 |   The high-level abstraction of synchronized variables @{"file"
 | |
| 193 | "~~/src/Pure/Concurrent/synchronized.ML"} enables parallel | |
| 194 |   components to communicate via shared state; see also @{"file"
 | |
| 195 | "~~/src/Pure/Concurrent/mailbox.ML"} as canonical example. | |
| 24089 | 196 | |
| 197 |   \paragraph{Good conduct of impure programs.} The following
 | |
| 198 | guidelines enable non-functional programs to participate in | |
| 199 | multithreading. | |
| 200 | ||
| 201 |   \begin{itemize}
 | |
| 202 | ||
| 203 | \item Minimize global state information. Using proper theory and | |
| 204 | proof context data will actually return to functional update of | |
| 205 | values, without any special precautions for multithreading. Apart | |
| 206 | from the fully general functors for theory and proof data (see | |
| 207 |   \secref{sec:context-data}) there are drop-in replacements that
 | |
| 24090 | 208 |   emulate primitive references for common cases of \emph{configuration
 | 
| 209 |   options} for type @{ML_type "bool"}/@{ML_type "int"}/@{ML_type
 | |
| 24110 
4ab3084e311c
tuned config options: eliminated separate attribute "option";
 wenzelm parents: 
24090diff
changeset | 210 |   "string"} (see structure @{ML_struct Config} and @{ML
 | 
| 
4ab3084e311c
tuned config options: eliminated separate attribute "option";
 wenzelm parents: 
24090diff
changeset | 211 | Attrib.config_bool} etc.), and lists of theorems (see functor | 
| 
4ab3084e311c
tuned config options: eliminated separate attribute "option";
 wenzelm parents: 
24090diff
changeset | 212 |   @{ML_functor NamedThmsFun}).
 | 
| 24089 | 213 | |
| 214 | \item Keep components with local state information | |
| 215 |   \emph{re-entrant}.  Instead of poking initial values into (private)
 | |
| 216 | global references, create a new state record on each invocation, and | |
| 217 | pass that through any auxiliary functions of the component. The | |
| 218 | state record may well contain mutable references, without requiring | |
| 219 | any special synchronizations, as long as each invocation sees its | |
| 220 | own copy. Occasionally, one might even return to plain functional | |
| 221 | updates on non-mutable record values here. | |
| 222 | ||
| 223 | \item Isolate process configuration flags. The main legitimate | |
| 224 | application of global references is to configure the whole process | |
| 225 | in a certain way, essentially affecting all threads. A typical | |
| 226 |   example is the @{ML show_types} flag, which tells the pretty printer
 | |
| 227 | to output explicit type information for terms. Such flags usually | |
| 228 | do not affect the functionality of the core system, but only the | |
| 24090 | 229 | view being presented to the user. | 
| 24089 | 230 | |
| 231 | Occasionally, such global process flags are treated like implicit | |
| 232 |   arguments to certain operations, by using the @{ML setmp} combinator
 | |
| 24090 | 233 | for safe temporary assignment. Its traditional purpose was to | 
| 234 | ensure proper recovery of the original value when exceptions are | |
| 235 | raised in the body, now the functionality is extended to enter the | |
| 236 |   \emph{critical section} (with its usual potential of degrading
 | |
| 237 | parallelism). | |
| 24089 | 238 | |
| 239 |   Note that recovery of plain value passing semantics via @{ML
 | |
| 240 |   setmp}~@{text "ref value"} assumes that this @{text "ref"} is
 | |
| 241 | exclusively manipulated within the critical section. In particular, | |
| 242 |   any persistent global assignment of @{text "ref := value"} needs to
 | |
| 243 | be marked critical as well, to prevent intruding another threads | |
| 244 | local view, and a lost-update in the global scope, too. | |
| 245 | ||
| 246 |   \end{itemize}
 | |
| 247 | ||
| 248 | Recall that in an open ``LCF-style'' system like Isabelle/Isar, the | |
| 249 | user participates in constructing the overall environment. This | |
| 24090 | 250 | means that state-based facilities offered by one component will | 
| 251 | require special caution later on. So minimizing critical elements, | |
| 252 | by staying within the plain value-oriented view relative to theory | |
| 253 | or proof contexts most of the time, will also reduce the chance of | |
| 24089 | 254 | mishaps occurring to end-users. | 
| 255 | *} | |
| 256 | ||
| 257 | text %mlref {*
 | |
| 258 |   \begin{mldecls}
 | |
| 259 |   @{index_ML NAMED_CRITICAL: "string -> (unit -> 'a) -> 'a"} \\
 | |
| 260 |   @{index_ML CRITICAL: "(unit -> 'a) -> 'a"} \\
 | |
| 261 |   @{index_ML setmp: "'a ref -> 'a -> ('b -> 'c) -> 'b -> 'c"} \\
 | |
| 262 |   \end{mldecls}
 | |
| 263 | ||
| 264 |   \begin{description}
 | |
| 265 | ||
| 266 |   \item @{ML NAMED_CRITICAL}~@{text "name f"} evaluates @{text "f ()"}
 | |
| 24090 | 267 | while staying within the critical section of Isabelle/Isar. No | 
| 268 | other thread may do so at the same time, but non-critical parallel | |
| 269 |   execution will continue.  The @{text "name"} argument serves for
 | |
| 270 | diagnostic purposes and might help to spot sources of congestion. | |
| 24089 | 271 | |
| 272 |   \item @{ML CRITICAL} is the same as @{ML NAMED_CRITICAL} with empty
 | |
| 273 | name argument. | |
| 274 | ||
| 275 |   \item @{ML setmp}~@{text "ref value f x"} evaluates @{text "f x"}
 | |
| 276 |   while staying within the critical section and having @{text "ref :=
 | |
| 277 | value"} assigned temporarily. This recovers a value-passing | |
| 278 | semantics involving global references, regardless of exceptions or | |
| 279 | concurrency. | |
| 21148 | 280 | |
| 281 |   \end{description}
 | |
| 18554 | 282 | *} | 
| 18538 | 283 | |
| 284 | ||
| 285 | chapter {* Basic library functions *}
 | |
| 286 | ||
| 22293 | 287 | text {*
 | 
| 288 | Beyond the proposal of the SML/NJ basis library, Isabelle comes | |
| 289 | with its own library, from which selected parts are given here. | |
| 26459 | 290 | These should encourage a study of the Isabelle sources, | 
| 291 |   in particular files \emph{Pure/library.ML} and \emph{Pure/General/*.ML}.
 | |
| 22293 | 292 | *} | 
| 293 | ||
| 29646 | 294 | section {* Linear transformations \label{sec:ML-linear-trans} *}
 | 
| 22293 | 295 | |
| 296 | text %mlref {*
 | |
| 297 |   \begin{mldecls}
 | |
| 24089 | 298 |   @{index_ML "op |> ": "'a * ('a -> 'b) -> 'b"} \\
 | 
| 22293 | 299 |   \end{mldecls}
 | 
| 300 | *} | |
| 301 | ||
| 22322 | 302 | (*<*) | 
| 303 | typedecl foo | |
| 304 | consts foo :: foo | |
| 305 | ML {*
 | |
| 306 | val thy = Theory.copy @{theory}
 | |
| 307 | *} | |
| 308 | (*>*) | |
| 309 | ||
| 310 | text {*
 | |
| 26460 | 311 | \noindent Many problems in functional programming can be thought of | 
| 22322 | 312 | as linear transformations, i.e.~a caluclation starts with a | 
| 26459 | 313 |   particular value @{ML_text "x : foo"} which is then transformed
 | 
| 314 |   by application of a function @{ML_text "f : foo -> foo"},
 | |
| 315 |   continued by an application of a function @{ML_text "g : foo -> bar"},
 | |
| 25151 | 316 | and so on. As a canoncial example, take functions enriching | 
| 317 | a theory by constant declararion and primitive definitions: | |
| 318 | ||
| 26459 | 319 |   \smallskip\begin{mldecls}
 | 
| 29581 | 320 |   @{ML "Sign.declare_const: Properties.T -> (binding * typ) * mixfix
 | 
| 26459 | 321 | -> theory -> term * theory"} \\ | 
| 29612 | 322 |   @{ML "Thm.add_def: bool -> bool -> binding * term -> theory -> thm * theory"}
 | 
| 26459 | 323 |   \end{mldecls}
 | 
| 25151 | 324 | |
| 26460 | 325 | \noindent Written with naive application, an addition of constant | 
| 26459 | 326 |   @{term bar} with type @{typ "foo \<Rightarrow> foo"} and
 | 
| 25151 | 327 |   a corresponding definition @{term "bar \<equiv> \<lambda>x. x"} would look like:
 | 
| 328 | ||
| 26459 | 329 |   \smallskip\begin{mldecls}
 | 
| 330 |   @{ML "(fn (t, thy) => Thm.add_def false false
 | |
| 29612 | 331 |   (Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})) thy)
 | 
| 25151 | 332 | (Sign.declare_const [] | 
| 29008 | 333 |       ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) thy)"}
 | 
| 26459 | 334 |   \end{mldecls}
 | 
| 25151 | 335 | |
| 26460 | 336 | \noindent With increasing numbers of applications, this code gets quite | 
| 25151 | 337 | unreadable. Further, it is unintuitive that things are | 
| 338 | written down in the opposite order as they actually ``happen''. | |
| 22322 | 339 | *} | 
| 340 | ||
| 341 | text {*
 | |
| 25151 | 342 | \noindent At this stage, Isabelle offers some combinators which allow | 
| 343 | for more convenient notation, most notably reverse application: | |
| 26459 | 344 | |
| 345 |   \smallskip\begin{mldecls}
 | |
| 25151 | 346 | @{ML "thy
 | 
| 29008 | 347 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 25151 | 348 | |> (fn (t, thy) => thy | 
| 25608 | 349 | |> Thm.add_def false false | 
| 29612 | 350 |      (Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})))"}
 | 
| 26459 | 351 |   \end{mldecls}
 | 
| 22322 | 352 | *} | 
| 22293 | 353 | |
| 354 | text %mlref {*
 | |
| 355 |   \begin{mldecls}
 | |
| 24089 | 356 |   @{index_ML "op |-> ": "('c * 'a) * ('c -> 'a -> 'b) -> 'b"} \\
 | 
| 357 |   @{index_ML "op |>> ": "('a * 'c) * ('a -> 'b) -> 'b * 'c"} \\
 | |
| 358 |   @{index_ML "op ||> ": "('c * 'a) * ('a -> 'b) -> 'c * 'b"} \\
 | |
| 359 |   @{index_ML "op ||>> ": "('c * 'a) * ('a -> 'd * 'b) -> ('c * 'd) * 'b"} \\
 | |
| 25151 | 360 |   \end{mldecls}
 | 
| 361 | *} | |
| 362 | ||
| 363 | text {*
 | |
| 364 | \noindent Usually, functions involving theory updates also return | |
| 365 | side results; to use these conveniently, yet another | |
| 366 |   set of combinators is at hand, most notably @{ML "op |->"}
 | |
| 367 | which allows curried access to side results: | |
| 26459 | 368 | |
| 369 |   \smallskip\begin{mldecls}
 | |
| 25151 | 370 | @{ML "thy
 | 
| 29008 | 371 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 25608 | 372 | |-> (fn t => Thm.add_def false false | 
| 29612 | 373 |       (Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})))
 | 
| 25151 | 374 | "} | 
| 26459 | 375 |   \end{mldecls}
 | 
| 25151 | 376 | |
| 377 |   \noindent @{ML "op |>>"} allows for processing side results on their own:
 | |
| 26459 | 378 | |
| 379 |   \smallskip\begin{mldecls}
 | |
| 25151 | 380 | @{ML "thy
 | 
| 29008 | 381 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 25151 | 382 | |>> (fn t => Logic.mk_equals (t, @{term \"%x. x\"}))
 | 
| 29612 | 383 | |-> (fn def => Thm.add_def false false (Binding.name \"bar_def\", def)) | 
| 25151 | 384 | "} | 
| 26459 | 385 |   \end{mldecls}
 | 
| 25151 | 386 | |
| 387 |   \noindent Orthogonally, @{ML "op ||>"} applies a transformation
 | |
| 388 | in the presence of side results which are left unchanged: | |
| 26459 | 389 | |
| 390 |   \smallskip\begin{mldecls}
 | |
| 25151 | 391 | @{ML "thy
 | 
| 29008 | 392 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 25151 | 393 | ||> Sign.add_path \"foobar\" | 
| 25608 | 394 | |-> (fn t => Thm.add_def false false | 
| 29612 | 395 |       (Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})))
 | 
| 25151 | 396 | ||> Sign.restore_naming thy | 
| 397 | "} | |
| 26459 | 398 |   \end{mldecls}
 | 
| 25151 | 399 | |
| 26459 | 400 |   \noindent @{ML "op ||>>"} accumulates side results:
 | 
| 401 | ||
| 402 |   \smallskip\begin{mldecls}
 | |
| 25151 | 403 | @{ML "thy
 | 
| 29008 | 404 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 405 | ||>> Sign.declare_const [] ((Binding.name \"foobar\", @{typ \"foo => foo\"}), NoSyn)
 | |
| 25608 | 406 | |-> (fn (t1, t2) => Thm.add_def false false | 
| 29612 | 407 | (Binding.name \"bar_def\", Logic.mk_equals (t1, t2))) | 
| 25151 | 408 | "} | 
| 26459 | 409 |   \end{mldecls}
 | 
| 25151 | 410 | *} | 
| 411 | ||
| 412 | text %mlref {*
 | |
| 413 |   \begin{mldecls}
 | |
| 414 |   @{index_ML fold: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\
 | |
| 22293 | 415 |   @{index_ML fold_map: "('a -> 'b -> 'c * 'b) -> 'a list -> 'b -> 'c list * 'b"} \\
 | 
| 416 |   \end{mldecls}
 | |
| 417 | *} | |
| 418 | ||
| 22322 | 419 | text {*
 | 
| 26459 | 420 |   \noindent This principles naturally lift to \emph{lists} using
 | 
| 25151 | 421 |   the @{ML fold} and @{ML fold_map} combinators.
 | 
| 422 | The first lifts a single function | |
| 26459 | 423 |   \begin{quote}\footnotesize
 | 
| 424 |     @{ML_text "f : 'a -> 'b -> 'b"} to @{ML_text "'a list -> 'b -> 'b"}
 | |
| 425 |   \end{quote}
 | |
| 25151 | 426 | such that | 
| 26459 | 427 |   \begin{quote}\footnotesize
 | 
| 428 |     @{ML_text "y |> fold f [x1, x2, ..., x_n]"} \\
 | |
| 429 |     \hspace*{2ex}@{text "\<leadsto>"} @{ML_text "y |> f x1 |> f x2 |> ... |> f x_n"}
 | |
| 430 |   \end{quote}
 | |
| 26460 | 431 | \noindent The second accumulates side results in a list by lifting | 
| 25151 | 432 | a single function | 
| 26459 | 433 |   \begin{quote}\footnotesize
 | 
| 434 |     @{ML_text "f : 'a -> 'b -> 'c * 'b"} to @{ML_text "'a list -> 'b -> 'c list * 'b"}
 | |
| 435 |   \end{quote}
 | |
| 25151 | 436 | such that | 
| 26459 | 437 |   \begin{quote}\footnotesize
 | 
| 438 |     @{ML_text "y |> fold_map f [x1, x2, ..., x_n]"} \\
 | |
| 439 |     \hspace*{2ex}@{text "\<leadsto>"} @{ML_text "y |> f x1 ||>> f x2 ||>> ... ||>> f x_n"} \\
 | |
| 440 |     \hspace*{6ex}@{ML_text "||> (fn ((z1, z2), ..., z_n) => [z1, z2, ..., z_n])"}
 | |
| 441 |   \end{quote}
 | |
| 25185 | 442 | |
| 26459 | 443 | \noindent Example: | 
| 444 | ||
| 445 |   \smallskip\begin{mldecls}
 | |
| 25185 | 446 | @{ML "let
 | 
| 447 | val consts = [\"foo\", \"bar\"]; | |
| 448 | in | |
| 449 | thy | |
| 450 | |> fold_map (fn const => Sign.declare_const [] | |
| 29008 | 451 |        ((Binding.name const, @{typ \"foo => foo\"}), NoSyn)) consts
 | 
| 25185 | 452 |   |>> map (fn t => Logic.mk_equals (t, @{term \"%x. x\"}))
 | 
| 453 | |-> (fn defs => fold_map (fn def => | |
| 29612 | 454 | Thm.add_def false false (Binding.empty, def)) defs) | 
| 26459 | 455 | end"} | 
| 456 |   \end{mldecls}
 | |
| 22322 | 457 | *} | 
| 458 | ||
| 22293 | 459 | text %mlref {*
 | 
| 460 |   \begin{mldecls}
 | |
| 24089 | 461 |   @{index_ML "op #> ": "('a -> 'b) * ('b -> 'c) -> 'a -> 'c"} \\
 | 
| 462 |   @{index_ML "op #-> ": "('a -> 'c * 'b) * ('c -> 'b -> 'd) -> 'a -> 'd"} \\
 | |
| 463 |   @{index_ML "op #>> ": "('a -> 'c * 'b) * ('c -> 'd) -> 'a -> 'd * 'b"} \\
 | |
| 464 |   @{index_ML "op ##> ": "('a -> 'c * 'b) * ('b -> 'd) -> 'a -> 'c * 'd"} \\
 | |
| 465 |   @{index_ML "op ##>> ": "('a -> 'c * 'b) * ('b -> 'e * 'd) -> 'a -> ('c * 'e) * 'd"} \\
 | |
| 22293 | 466 |   \end{mldecls}
 | 
| 467 | *} | |
| 468 | ||
| 22322 | 469 | text {*
 | 
| 22550 | 470 | \noindent All those linear combinators also exist in higher-order | 
| 471 | variants which do not expect a value on the left hand side | |
| 472 | but a function. | |
| 22322 | 473 | *} | 
| 474 | ||
| 22293 | 475 | text %mlref {*
 | 
| 476 |   \begin{mldecls}
 | |
| 24089 | 477 |   @{index_ML "op ` ": "('b -> 'a) -> 'b -> 'a * 'b"} \\
 | 
| 22293 | 478 |   @{index_ML tap: "('b -> 'a) -> 'b -> 'b"} \\
 | 
| 479 |   \end{mldecls}
 | |
| 480 | *} | |
| 481 | ||
| 22550 | 482 | text {*
 | 
| 25185 | 483 | \noindent These operators allow to ``query'' a context | 
| 484 | in a series of context transformations: | |
| 485 | ||
| 26459 | 486 |   \smallskip\begin{mldecls}
 | 
| 25185 | 487 | @{ML "thy
 | 
| 488 | |> tap (fn _ => writeln \"now adding constant\") | |
| 29008 | 489 | |> Sign.declare_const [] ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn)
 | 
| 25185 | 490 | ||>> `(fn thy => Sign.declared_const thy | 
| 29008 | 491 | (Sign.full_name thy (Binding.name \"foobar\"))) | 
| 25185 | 492 | |-> (fn (t, b) => if b then I | 
| 493 | else Sign.declare_const [] | |
| 29008 | 494 |          ((Binding.name \"foobar\", @{typ \"foo => foo\"}), NoSyn) #> snd)
 | 
| 25185 | 495 | "} | 
| 26459 | 496 |   \end{mldecls}
 | 
| 22550 | 497 | *} | 
| 498 | ||
| 22293 | 499 | section {* Options and partiality *}
 | 
| 500 | ||
| 501 | text %mlref {*
 | |
| 502 |   \begin{mldecls}
 | |
| 503 |   @{index_ML is_some: "'a option -> bool"} \\
 | |
| 504 |   @{index_ML is_none: "'a option -> bool"} \\
 | |
| 505 |   @{index_ML the: "'a option -> 'a"} \\
 | |
| 506 |   @{index_ML these: "'a list option -> 'a list"} \\
 | |
| 507 |   @{index_ML the_list: "'a option -> 'a list"} \\
 | |
| 508 |   @{index_ML the_default: "'a -> 'a option -> 'a"} \\
 | |
| 509 |   @{index_ML try: "('a -> 'b) -> 'a -> 'b option"} \\
 | |
| 510 |   @{index_ML can: "('a -> 'b) -> 'a -> bool"} \\
 | |
| 511 |   \end{mldecls}
 | |
| 512 | *} | |
| 513 | ||
| 22550 | 514 | text {*
 | 
| 24089 | 515 |   Standard selector functions on @{text option}s are provided.  The
 | 
| 516 |   @{ML try} and @{ML can} functions provide a convenient interface for
 | |
| 26459 | 517 |   handling exceptions -- both take as arguments a function @{ML_text f}
 | 
| 518 |   together with a parameter @{ML_text x} and handle any exception during
 | |
| 519 |   the evaluation of the application of @{ML_text f} to @{ML_text x}, either
 | |
| 24089 | 520 |   return a lifted result (@{ML NONE} on failure) or a boolean value
 | 
| 521 |   (@{ML false} on failure).
 | |
| 22550 | 522 | *} | 
| 22293 | 523 | |
| 24089 | 524 | |
| 22293 | 525 | section {* Common data structures *}
 | 
| 526 | ||
| 527 | subsection {* Lists (as set-like data structures) *}
 | |
| 18538 | 528 | |
| 22293 | 529 | text {*
 | 
| 530 |   \begin{mldecls}
 | |
| 531 |   @{index_ML member: "('b * 'a -> bool) -> 'a list -> 'b -> bool"} \\
 | |
| 532 |   @{index_ML insert: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\
 | |
| 533 |   @{index_ML remove: "('b * 'a -> bool) -> 'b -> 'a list -> 'a list"} \\
 | |
| 534 |   @{index_ML merge: "('a * 'a -> bool) -> 'a list * 'a list -> 'a list"} \\
 | |
| 535 |   \end{mldecls}
 | |
| 536 | *} | |
| 537 | ||
| 22503 | 538 | text {*
 | 
| 539 | Lists are often used as set-like data structures -- set-like in | |
| 25151 | 540 |   the sense that they support a notion of @{ML member}-ship,
 | 
| 22503 | 541 |   @{ML insert}-ing and @{ML remove}-ing, but are order-sensitive.
 | 
| 542 | This is convenient when implementing a history-like mechanism: | |
| 543 |   @{ML insert} adds an element \emph{to the front} of a list,
 | |
| 544 |   if not yet present; @{ML remove} removes \emph{all} occurences
 | |
| 545 |   of a particular element.  Correspondingly @{ML merge} implements a 
 | |
| 546 | a merge on two lists suitable for merges of context data | |
| 547 |   (\secref{sec:context-theory}).
 | |
| 548 | ||
| 549 | Functions are parametrized by an explicit equality function | |
| 550 | to accomplish overloaded equality; in most cases of monomorphic | |
| 24089 | 551 |   equality, writing @{ML "op ="} should suffice.
 | 
| 22503 | 552 | *} | 
| 22293 | 553 | |
| 554 | subsection {* Association lists *}
 | |
| 555 | ||
| 556 | text {*
 | |
| 557 |   \begin{mldecls}
 | |
| 23652 | 558 |   @{index_ML_exn AList.DUP} \\
 | 
| 22293 | 559 |   @{index_ML AList.lookup: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> 'c option"} \\
 | 
| 560 |   @{index_ML AList.defined: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> bool"} \\
 | |
| 561 |   @{index_ML AList.update: "('a * 'a -> bool) -> ('a * 'b) -> ('a * 'b) list -> ('a * 'b) list"} \\
 | |
| 562 |   @{index_ML AList.default: "('a * 'a -> bool) -> ('a * 'b) -> ('a * 'b) list -> ('a * 'b) list"} \\
 | |
| 563 |   @{index_ML AList.delete: "('a * 'b -> bool) -> 'a -> ('b * 'c) list -> ('b * 'c) list"} \\
 | |
| 564 |   @{index_ML AList.map_entry: "('a * 'b -> bool) -> 'a
 | |
| 565 |     -> ('c -> 'c) -> ('b * 'c) list -> ('b * 'c) list"} \\
 | |
| 566 |   @{index_ML AList.map_default: "('a * 'a -> bool) -> 'a * 'b -> ('b -> 'b)
 | |
| 567 |     -> ('a * 'b) list -> ('a * 'b) list"} \\
 | |
| 568 |   @{index_ML AList.join: "('a * 'a -> bool) -> ('a -> 'b * 'b -> 'b) (*exception DUP*)
 | |
| 569 |     -> ('a * 'b) list * ('a * 'b) list -> ('a * 'b) list (*exception AList.DUP*)"} \\
 | |
| 570 |   @{index_ML AList.merge: "('a * 'a -> bool) -> ('b * 'b -> bool)
 | |
| 571 |     -> ('a * 'b) list * ('a * 'b) list -> ('a * 'b) list (*exception AList.DUP*)"}
 | |
| 572 |   \end{mldecls}
 | |
| 573 | *} | |
| 574 | ||
| 22503 | 575 | text {*
 | 
| 576 | Association lists can be seens as an extension of set-like lists: | |
| 577 | on the one hand, they may be used to implement finite mappings, | |
| 578 | on the other hand, they remain order-sensitive and allow for | |
| 579 |   multiple key-value-pair with the same key: @{ML AList.lookup}
 | |
| 580 |   returns the \emph{first} value corresponding to a particular
 | |
| 581 |   key, if present.  @{ML AList.update} updates
 | |
| 582 |   the \emph{first} occurence of a particular key; if no such
 | |
| 583 |   key exists yet, the key-value-pair is added \emph{to the front}.
 | |
| 584 |   @{ML AList.delete} only deletes the \emph{first} occurence of a key.
 | |
| 585 |   @{ML AList.merge} provides an operation suitable for merges of context data
 | |
| 586 |   (\secref{sec:context-theory}), where an equality parameter on
 | |
| 587 | values determines whether a merge should be considered a conflict. | |
| 588 |   A slightly generalized operation if implementend by the @{ML AList.join}
 | |
| 589 | function which allows for explicit conflict resolution. | |
| 590 | *} | |
| 22293 | 591 | |
| 592 | subsection {* Tables *}
 | |
| 593 | ||
| 594 | text {*
 | |
| 595 |   \begin{mldecls}
 | |
| 596 |   @{index_ML_type "'a Symtab.table"} \\
 | |
| 23652 | 597 |   @{index_ML_exn Symtab.DUP: string} \\
 | 
| 598 |   @{index_ML_exn Symtab.SAME} \\
 | |
| 599 |   @{index_ML_exn Symtab.UNDEF: string} \\
 | |
| 22293 | 600 |   @{index_ML Symtab.empty: "'a Symtab.table"} \\
 | 
| 23652 | 601 |   @{index_ML Symtab.lookup: "'a Symtab.table -> string -> 'a option"} \\
 | 
| 602 |   @{index_ML Symtab.defined: "'a Symtab.table -> string -> bool"} \\
 | |
| 603 |   @{index_ML Symtab.update: "(string * 'a) -> 'a Symtab.table -> 'a Symtab.table"} \\
 | |
| 604 |   @{index_ML Symtab.default: "string * 'a -> 'a Symtab.table -> 'a Symtab.table"} \\
 | |
| 605 |   @{index_ML Symtab.delete: "string
 | |
| 22293 | 606 | -> 'a Symtab.table -> 'a Symtab.table (*exception Symtab.UNDEF*)"} \\ | 
| 23652 | 607 |   @{index_ML Symtab.map_entry: "string -> ('a -> 'a)
 | 
| 22293 | 608 | -> 'a Symtab.table -> 'a Symtab.table"} \\ | 
| 23652 | 609 |   @{index_ML Symtab.map_default: "(string * 'a) -> ('a -> 'a)
 | 
| 22293 | 610 | -> 'a Symtab.table -> 'a Symtab.table"} \\ | 
| 23652 | 611 |   @{index_ML Symtab.join: "(string -> 'a * 'a -> 'a) (*exception Symtab.DUP/Symtab.SAME*)
 | 
| 22293 | 612 | -> 'a Symtab.table * 'a Symtab.table | 
| 23652 | 613 | -> 'a Symtab.table (*exception Symtab.DUP*)"} \\ | 
| 22293 | 614 |   @{index_ML Symtab.merge: "('a * 'a -> bool)
 | 
| 615 | -> 'a Symtab.table * 'a Symtab.table | |
| 23652 | 616 | -> 'a Symtab.table (*exception Symtab.DUP*)"} | 
| 22293 | 617 |   \end{mldecls}
 | 
| 618 | *} | |
| 619 | ||
| 22503 | 620 | text {*
 | 
| 621 | Tables are an efficient representation of finite mappings without | |
| 622 | any notion of order; due to their efficiency they should be used | |
| 623 | whenever such pure finite mappings are neccessary. | |
| 624 | ||
| 625 | The key type of tables must be given explicitly by instantiating | |
| 626 |   the @{ML_functor TableFun} functor which takes the key type
 | |
| 627 |   together with its @{ML_type order}; for convience, we restrict
 | |
| 628 |   here to the @{ML_struct Symtab} instance with @{ML_type string}
 | |
| 629 | as key type. | |
| 630 | ||
| 631 | Most table functions correspond to those of association lists. | |
| 632 | *} | |
| 20489 | 633 | |
| 18538 | 634 | end |