author | wenzelm |
Sun, 17 Oct 2010 20:25:36 +0100 | |
changeset 39861 | b8d89db3e238 |
parent 39859 | 381e16bb5f46 |
child 39867 | a8363532cd4d |
permissions | -rw-r--r-- |
39822 | 1 |
theory ML_old |
29755 | 2 |
imports Base |
3 |
begin |
|
18538 | 4 |
|
26437 | 5 |
chapter {* Advanced ML programming *} |
18538 | 6 |
|
24089 | 7 |
section {* Style *} |
8 |
||
26459 | 9 |
text {* |
10 |
Like any style guide, also this one should not be interpreted dogmatically, but |
|
11 |
with care and discernment. It consists of a collection of |
|
12 |
recommendations which have been turned out useful over long years of |
|
13 |
Isabelle system development and are supposed to support writing of readable |
|
14 |
and managable code. Special cases encourage derivations, |
|
15 |
as far as you know what you are doing. |
|
16 |
\footnote{This style guide is loosely based on |
|
17 |
\url{http://caml.inria.fr/resources/doc/guides/guidelines.en.html}} |
|
21148 | 18 |
|
19 |
\begin{description} |
|
20 |
||
21 |
\item[fundamental law of programming] |
|
22 |
Whenever writing code, keep in mind: A program is |
|
23 |
written once, modified ten times, and read |
|
26459 | 24 |
hundred times. So simplify its writing, |
21148 | 25 |
always keep future modifications in mind, |
26 |
and never jeopardize readability. Every second you hesitate |
|
27 |
to spend on making your code more clear you will |
|
28 |
have to spend ten times understanding what you have |
|
29 |
written later on. |
|
30 |
||
31 |
\item[white space matters] |
|
32 |
Treat white space in your code as if it determines |
|
33 |
the meaning of code. |
|
34 |
||
35 |
\begin{itemize} |
|
36 |
||
37 |
\item The space bar is the easiest key to find on the keyboard, |
|
24089 | 38 |
press it as often as necessary. @{verbatim "2 + 2"} is better |
39 |
than @{verbatim "2+2"}, likewise @{verbatim "f (x, y)"} is |
|
40 |
better than @{verbatim "f(x,y)"}. |
|
21148 | 41 |
|
24089 | 42 |
\item Restrict your lines to 80 characters. This will allow |
43 |
you to keep the beginning of a line in view while watching |
|
44 |
its end.\footnote{To acknowledge the lax practice of |
|
45 |
text editing these days, we tolerate as much as 100 |
|
46 |
characters per line, but anything beyond 120 is not |
|
47 |
considered proper source text.} |
|
21148 | 48 |
|
24089 | 49 |
\item Ban tabulators; they are a context-sensitive formatting |
50 |
feature and likely to confuse anyone not using your favorite |
|
51 |
editor.\footnote{Some modern programming language even |
|
52 |
forbid tabulators altogether according to the formal syntax |
|
53 |
definition.} |
|
21148 | 54 |
|
55 |
\item Get rid of trailing whitespace. Instead, do not |
|
24089 | 56 |
suppress a trailing newline at the end of your files. |
21148 | 57 |
|
58 |
\item Choose a generally accepted style of indentation, |
|
59 |
then use it systematically throughout the whole |
|
60 |
application. An indentation of two spaces is appropriate. |
|
61 |
Avoid dangling indentation. |
|
62 |
||
63 |
\end{itemize} |
|
64 |
||
65 |
\item[cut-and-paste succeeds over copy-and-paste] |
|
24089 | 66 |
\emph{Never} copy-and-paste code when programming. If you |
21148 | 67 |
need the same piece of code twice, introduce a |
68 |
reasonable auxiliary function (if there is no |
|
69 |
such function, very likely you got something wrong). |
|
70 |
Any copy-and-paste will turn out to be painful |
|
26459 | 71 |
when something has to be changed later on. |
21148 | 72 |
|
73 |
\item[comments] |
|
74 |
are a device which requires careful thinking before using |
|
75 |
it. The best comment for your code should be the code itself. |
|
76 |
Prefer efforts to write clear, understandable code |
|
77 |
over efforts to explain nasty code. |
|
78 |
||
79 |
\item[functional programming is based on functions] |
|
26459 | 80 |
Model things as abstract as appropriate. Avoid unnecessarrily |
81 |
concrete datatype representations. For example, consider a function |
|
82 |
taking some table data structure as argument and performing |
|
83 |
lookups on it. Instead of taking a table, it could likewise |
|
84 |
take just a lookup function. Accustom |
|
24089 | 85 |
your way of coding to the level of expressiveness a functional |
86 |
programming language is giving onto you. |
|
21148 | 87 |
|
88 |
\item[tuples] |
|
89 |
are often in the way. When there is no striking argument |
|
90 |
to tuple function arguments, just write your function curried. |
|
91 |
||
92 |
\item[telling names] |
|
24089 | 93 |
Any name should tell its purpose as exactly as possible, while |
94 |
keeping its length to the absolutely necessary minimum. Always |
|
95 |
give the same name to function arguments which have the same |
|
96 |
meaning. Separate words by underscores (@{verbatim |
|
97 |
int_of_string}, not @{verbatim intOfString}).\footnote{Some |
|
98 |
recent tools for Emacs include special precautions to cope with |
|
99 |
bumpy names in @{text "camelCase"}, e.g.\ for improved on-screen |
|
100 |
readability. It is easier to abstain from using such names in the |
|
101 |
first place.} |
|
102 |
||
103 |
\end{description} |
|
104 |
*} |
|
105 |
||
106 |
||
107 |
section {* Thread-safe programming *} |
|
108 |
||
109 |
text {* |
|
29159 | 110 |
Recent versions of Poly/ML (5.2.1 or later) support robust |
111 |
multithreaded execution, based on native operating system threads of |
|
112 |
the underlying platform. Thus threads will actually be executed in |
|
113 |
parallel on multi-core systems. A speedup-factor of approximately |
|
114 |
1.5--3 can be expected on a regular 4-core machine.\footnote{There |
|
115 |
is some inherent limitation of the speedup factor due to garbage |
|
116 |
collection, which is still sequential. It helps to provide initial |
|
117 |
heap space generously, using the \texttt{-H} option of Poly/ML.} |
|
118 |
Threads also help to organize advanced operations of the system, |
|
119 |
with explicit communication between sub-components, real-time |
|
120 |
conditions, time-outs etc. |
|
24089 | 121 |
|
29159 | 122 |
Threads lack the memory protection of separate processes, and |
24089 | 123 |
operate concurrently on shared heap memory. This has the advantage |
124 |
that results of independent computations are immediately available |
|
29159 | 125 |
to other threads, without requiring untyped character streams, |
126 |
awkward serialization etc. |
|
24089 | 127 |
|
128 |
On the other hand, some programming guidelines need to be observed |
|
129 |
in order to make unprotected parallelism work out smoothly. While |
|
130 |
the ML system implementation is responsible to maintain basic |
|
131 |
integrity of the representation of ML values in memory, the |
|
132 |
application programmer needs to ensure that multithreaded execution |
|
133 |
does not break the intended semantics. |
|
134 |
||
135 |
\medskip \paragraph{Critical shared resources.} Actually only those |
|
136 |
parts outside the purely functional world of ML are critical. In |
|
137 |
particular, this covers |
|
138 |
||
139 |
\begin{itemize} |
|
140 |
||
141 |
\item global references (or arrays), i.e.\ those that persist over |
|
142 |
several invocations of associated operations,\footnote{This is |
|
143 |
independent of the visibility of such mutable values in the toplevel |
|
144 |
scope.} |
|
145 |
||
146 |
\item direct I/O on shared channels, notably @{text "stdin"}, @{text |
|
147 |
"stdout"}, @{text "stderr"}. |
|
148 |
||
149 |
\end{itemize} |
|
150 |
||
151 |
The majority of tools implemented within the Isabelle/Isar framework |
|
152 |
will not require any of these critical elements: nothing special |
|
153 |
needs to be observed when staying in the purely functional fragment |
|
154 |
of ML. Note that output via the official Isabelle channels does not |
|
29159 | 155 |
count as direct I/O, so the operations @{ML "writeln"}, @{ML |
156 |
"warning"}, @{ML "tracing"} etc.\ are safe. |
|
157 |
||
158 |
Moreover, ML bindings within the toplevel environment (@{verbatim |
|
159 |
"type"}, @{verbatim val}, @{verbatim "structure"} etc.) due to |
|
160 |
run-time invocation of the compiler are also safe, because |
|
161 |
Isabelle/Isar manages this as part of the theory or proof context. |
|
24089 | 162 |
|
29159 | 163 |
\paragraph{Multithreading in Isabelle/Isar.} The theory loader |
164 |
automatically exploits the overall parallelism of independent nodes |
|
165 |
in the development graph, as well as the inherent irrelevance of |
|
166 |
proofs for goals being fully specified in advance. This means, |
|
167 |
checking of individual Isar proofs is parallelized by default. |
|
168 |
Beyond that, very sophisticated proof tools may use local |
|
169 |
parallelism internally, via the general programming model of |
|
170 |
``future values'' (see also @{"file" |
|
171 |
"~~/src/Pure/Concurrent/future.ML"}). |
|
24089 | 172 |
|
29159 | 173 |
Any ML code that works relatively to the present background theory |
24089 | 174 |
is already safe. Contextual data may be easily stored within the |
24090 | 175 |
theory or proof context, thanks to the generic data concept of |
176 |
Isabelle/Isar (see \secref{sec:context-data}). This greatly |
|
24089 | 177 |
diminishes the demand for global state information in the first |
178 |
place. |
|
179 |
||
180 |
\medskip In rare situations where actual mutable content needs to be |
|
181 |
manipulated, Isabelle provides a single \emph{critical section} that |
|
182 |
may be entered while preventing any other thread from doing the |
|
183 |
same. Entering the critical section without contention is very |
|
184 |
fast, and several basic system operations do so frequently. This |
|
185 |
also means that each thread should leave the critical section |
|
186 |
quickly, otherwise parallel execution performance may degrade |
|
187 |
significantly. |
|
188 |
||
29159 | 189 |
Despite this potential bottle-neck, centralized locking is |
190 |
convenient, because it prevents deadlocks without demanding special |
|
191 |
precautions. Explicit communication demands other means, though. |
|
192 |
The high-level abstraction of synchronized variables @{"file" |
|
193 |
"~~/src/Pure/Concurrent/synchronized.ML"} enables parallel |
|
194 |
components to communicate via shared state; see also @{"file" |
|
195 |
"~~/src/Pure/Concurrent/mailbox.ML"} as canonical example. |
|
24089 | 196 |
|
197 |
\paragraph{Good conduct of impure programs.} The following |
|
198 |
guidelines enable non-functional programs to participate in |
|
199 |
multithreading. |
|
200 |
||
201 |
\begin{itemize} |
|
202 |
||
203 |
\item Minimize global state information. Using proper theory and |
|
204 |
proof context data will actually return to functional update of |
|
205 |
values, without any special precautions for multithreading. Apart |
|
206 |
from the fully general functors for theory and proof data (see |
|
207 |
\secref{sec:context-data}) there are drop-in replacements that |
|
24090 | 208 |
emulate primitive references for common cases of \emph{configuration |
209 |
options} for type @{ML_type "bool"}/@{ML_type "int"}/@{ML_type |
|
24110
4ab3084e311c
tuned config options: eliminated separate attribute "option";
wenzelm
parents:
24090
diff
changeset
|
210 |
"string"} (see structure @{ML_struct Config} and @{ML |
4ab3084e311c
tuned config options: eliminated separate attribute "option";
wenzelm
parents:
24090
diff
changeset
|
211 |
Attrib.config_bool} etc.), and lists of theorems (see functor |
36164
532f4d1cb0fc
salvaged some ML functors from decay, which is the natural consequence of lack of formal checking;
wenzelm
parents:
36134
diff
changeset
|
212 |
@{ML_functor Named_Thms}). |
24089 | 213 |
|
214 |
\item Keep components with local state information |
|
215 |
\emph{re-entrant}. Instead of poking initial values into (private) |
|
216 |
global references, create a new state record on each invocation, and |
|
217 |
pass that through any auxiliary functions of the component. The |
|
218 |
state record may well contain mutable references, without requiring |
|
219 |
any special synchronizations, as long as each invocation sees its |
|
220 |
own copy. Occasionally, one might even return to plain functional |
|
221 |
updates on non-mutable record values here. |
|
222 |
||
223 |
\item Isolate process configuration flags. The main legitimate |
|
224 |
application of global references is to configure the whole process |
|
225 |
in a certain way, essentially affecting all threads. A typical |
|
226 |
example is the @{ML show_types} flag, which tells the pretty printer |
|
227 |
to output explicit type information for terms. Such flags usually |
|
228 |
do not affect the functionality of the core system, but only the |
|
24090 | 229 |
view being presented to the user. |
24089 | 230 |
|
231 |
Occasionally, such global process flags are treated like implicit |
|
32966
5b21661fe618
indicate CRITICAL nature of various setmp combinators;
wenzelm
parents:
32833
diff
changeset
|
232 |
arguments to certain operations, by using the @{ML setmp_CRITICAL} combinator |
24090 | 233 |
for safe temporary assignment. Its traditional purpose was to |
234 |
ensure proper recovery of the original value when exceptions are |
|
235 |
raised in the body, now the functionality is extended to enter the |
|
236 |
\emph{critical section} (with its usual potential of degrading |
|
237 |
parallelism). |
|
24089 | 238 |
|
239 |
Note that recovery of plain value passing semantics via @{ML |
|
32966
5b21661fe618
indicate CRITICAL nature of various setmp combinators;
wenzelm
parents:
32833
diff
changeset
|
240 |
setmp_CRITICAL}~@{text "ref value"} assumes that this @{text "ref"} is |
24089 | 241 |
exclusively manipulated within the critical section. In particular, |
242 |
any persistent global assignment of @{text "ref := value"} needs to |
|
243 |
be marked critical as well, to prevent intruding another threads |
|
244 |
local view, and a lost-update in the global scope, too. |
|
245 |
||
246 |
\end{itemize} |
|
247 |
||
248 |
Recall that in an open ``LCF-style'' system like Isabelle/Isar, the |
|
249 |
user participates in constructing the overall environment. This |
|
24090 | 250 |
means that state-based facilities offered by one component will |
251 |
require special caution later on. So minimizing critical elements, |
|
252 |
by staying within the plain value-oriented view relative to theory |
|
253 |
or proof contexts most of the time, will also reduce the chance of |
|
24089 | 254 |
mishaps occurring to end-users. |
255 |
*} |
|
256 |
||
257 |
text %mlref {* |
|
258 |
\begin{mldecls} |
|
259 |
@{index_ML NAMED_CRITICAL: "string -> (unit -> 'a) -> 'a"} \\ |
|
260 |
@{index_ML CRITICAL: "(unit -> 'a) -> 'a"} \\ |
|
32966
5b21661fe618
indicate CRITICAL nature of various setmp combinators;
wenzelm
parents:
32833
diff
changeset
|
261 |
@{index_ML setmp_CRITICAL: "'a Unsynchronized.ref -> 'a -> ('b -> 'c) -> 'b -> 'c"} \\ |
24089 | 262 |
\end{mldecls} |
263 |
||
264 |
\begin{description} |
|
265 |
||
266 |
\item @{ML NAMED_CRITICAL}~@{text "name f"} evaluates @{text "f ()"} |
|
24090 | 267 |
while staying within the critical section of Isabelle/Isar. No |
268 |
other thread may do so at the same time, but non-critical parallel |
|
269 |
execution will continue. The @{text "name"} argument serves for |
|
270 |
diagnostic purposes and might help to spot sources of congestion. |
|
24089 | 271 |
|
272 |
\item @{ML CRITICAL} is the same as @{ML NAMED_CRITICAL} with empty |
|
273 |
name argument. |
|
274 |
||
32966
5b21661fe618
indicate CRITICAL nature of various setmp combinators;
wenzelm
parents:
32833
diff
changeset
|
275 |
\item @{ML setmp_CRITICAL}~@{text "ref value f x"} evaluates @{text "f x"} |
24089 | 276 |
while staying within the critical section and having @{text "ref := |
277 |
value"} assigned temporarily. This recovers a value-passing |
|
278 |
semantics involving global references, regardless of exceptions or |
|
279 |
concurrency. |
|
21148 | 280 |
|
281 |
\end{description} |
|
18554 | 282 |
*} |
18538 | 283 |
|
284 |
||
285 |
chapter {* Basic library functions *} |
|
286 |
||
22293 | 287 |
text {* |
288 |
Beyond the proposal of the SML/NJ basis library, Isabelle comes |
|
289 |
with its own library, from which selected parts are given here. |
|
26459 | 290 |
These should encourage a study of the Isabelle sources, |
291 |
in particular files \emph{Pure/library.ML} and \emph{Pure/General/*.ML}. |
|
22293 | 292 |
*} |
293 |
||
29646 | 294 |
section {* Linear transformations \label{sec:ML-linear-trans} *} |
22293 | 295 |
|
296 |
text %mlref {* |
|
297 |
\begin{mldecls} |
|
24089 | 298 |
@{index_ML "op |> ": "'a * ('a -> 'b) -> 'b"} \\ |
22293 | 299 |
\end{mldecls} |
300 |
*} |
|
301 |
||
22322 | 302 |
(*<*) |
303 |
typedecl foo |
|
304 |
consts foo :: foo |
|
305 |
ML {* |
|
306 |
val thy = Theory.copy @{theory} |
|
307 |
*} |
|
308 |
(*>*) |
|
309 |
||
310 |
text {* |
|
26460 | 311 |
\noindent Many problems in functional programming can be thought of |
22322 | 312 |
as linear transformations, i.e.~a caluclation starts with a |
26459 | 313 |
particular value @{ML_text "x : foo"} which is then transformed |
314 |
by application of a function @{ML_text "f : foo -> foo"}, |
|
315 |
continued by an application of a function @{ML_text "g : foo -> bar"}, |
|
25151 | 316 |
and so on. As a canoncial example, take functions enriching |
317 |
a theory by constant declararion and primitive definitions: |
|
318 |
||
26459 | 319 |
\smallskip\begin{mldecls} |
33174 | 320 |
@{ML "Sign.declare_const: (binding * typ) * mixfix |
26459 | 321 |
-> theory -> term * theory"} \\ |
36134 | 322 |
@{ML "Thm.add_def: bool -> bool -> binding * term -> theory -> (string * thm) * theory"} |
26459 | 323 |
\end{mldecls} |
25151 | 324 |
|
26460 | 325 |
\noindent Written with naive application, an addition of constant |
26459 | 326 |
@{term bar} with type @{typ "foo \<Rightarrow> foo"} and |
25151 | 327 |
a corresponding definition @{term "bar \<equiv> \<lambda>x. x"} would look like: |
328 |
||
26459 | 329 |
\smallskip\begin{mldecls} |
330 |
@{ML "(fn (t, thy) => Thm.add_def false false |
|
29612 | 331 |
(Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})) thy) |
33174 | 332 |
(Sign.declare_const |
29008 | 333 |
((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) thy)"} |
26459 | 334 |
\end{mldecls} |
25151 | 335 |
|
26460 | 336 |
\noindent With increasing numbers of applications, this code gets quite |
25151 | 337 |
unreadable. Further, it is unintuitive that things are |
338 |
written down in the opposite order as they actually ``happen''. |
|
22322 | 339 |
*} |
340 |
||
341 |
text {* |
|
25151 | 342 |
\noindent At this stage, Isabelle offers some combinators which allow |
343 |
for more convenient notation, most notably reverse application: |
|
26459 | 344 |
|
345 |
\smallskip\begin{mldecls} |
|
25151 | 346 |
@{ML "thy |
33174 | 347 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
25151 | 348 |
|> (fn (t, thy) => thy |
25608 | 349 |
|> Thm.add_def false false |
29612 | 350 |
(Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"})))"} |
26459 | 351 |
\end{mldecls} |
22322 | 352 |
*} |
22293 | 353 |
|
354 |
text %mlref {* |
|
355 |
\begin{mldecls} |
|
24089 | 356 |
@{index_ML "op |-> ": "('c * 'a) * ('c -> 'a -> 'b) -> 'b"} \\ |
357 |
@{index_ML "op |>> ": "('a * 'c) * ('a -> 'b) -> 'b * 'c"} \\ |
|
358 |
@{index_ML "op ||> ": "('c * 'a) * ('a -> 'b) -> 'c * 'b"} \\ |
|
359 |
@{index_ML "op ||>> ": "('c * 'a) * ('a -> 'd * 'b) -> ('c * 'd) * 'b"} \\ |
|
25151 | 360 |
\end{mldecls} |
361 |
*} |
|
362 |
||
363 |
text {* |
|
364 |
\noindent Usually, functions involving theory updates also return |
|
365 |
side results; to use these conveniently, yet another |
|
366 |
set of combinators is at hand, most notably @{ML "op |->"} |
|
367 |
which allows curried access to side results: |
|
26459 | 368 |
|
369 |
\smallskip\begin{mldecls} |
|
25151 | 370 |
@{ML "thy |
33174 | 371 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
25608 | 372 |
|-> (fn t => Thm.add_def false false |
29612 | 373 |
(Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"}))) |
25151 | 374 |
"} |
26459 | 375 |
\end{mldecls} |
25151 | 376 |
|
377 |
\noindent @{ML "op |>>"} allows for processing side results on their own: |
|
26459 | 378 |
|
379 |
\smallskip\begin{mldecls} |
|
25151 | 380 |
@{ML "thy |
33174 | 381 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
25151 | 382 |
|>> (fn t => Logic.mk_equals (t, @{term \"%x. x\"})) |
29612 | 383 |
|-> (fn def => Thm.add_def false false (Binding.name \"bar_def\", def)) |
25151 | 384 |
"} |
26459 | 385 |
\end{mldecls} |
25151 | 386 |
|
387 |
\noindent Orthogonally, @{ML "op ||>"} applies a transformation |
|
388 |
in the presence of side results which are left unchanged: |
|
26459 | 389 |
|
390 |
\smallskip\begin{mldecls} |
|
25151 | 391 |
@{ML "thy |
33174 | 392 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
25151 | 393 |
||> Sign.add_path \"foobar\" |
25608 | 394 |
|-> (fn t => Thm.add_def false false |
29612 | 395 |
(Binding.name \"bar_def\", Logic.mk_equals (t, @{term \"%x. x\"}))) |
25151 | 396 |
||> Sign.restore_naming thy |
397 |
"} |
|
26459 | 398 |
\end{mldecls} |
25151 | 399 |
|
26459 | 400 |
\noindent @{ML "op ||>>"} accumulates side results: |
401 |
||
402 |
\smallskip\begin{mldecls} |
|
25151 | 403 |
@{ML "thy |
33174 | 404 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
405 |
||>> Sign.declare_const ((Binding.name \"foobar\", @{typ \"foo => foo\"}), NoSyn) |
|
25608 | 406 |
|-> (fn (t1, t2) => Thm.add_def false false |
29612 | 407 |
(Binding.name \"bar_def\", Logic.mk_equals (t1, t2))) |
25151 | 408 |
"} |
26459 | 409 |
\end{mldecls} |
25151 | 410 |
*} |
411 |
||
412 |
text %mlref {* |
|
413 |
\begin{mldecls} |
|
414 |
@{index_ML fold: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\ |
|
22293 | 415 |
@{index_ML fold_map: "('a -> 'b -> 'c * 'b) -> 'a list -> 'b -> 'c list * 'b"} \\ |
416 |
\end{mldecls} |
|
417 |
*} |
|
418 |
||
22322 | 419 |
text {* |
26459 | 420 |
\noindent This principles naturally lift to \emph{lists} using |
25151 | 421 |
the @{ML fold} and @{ML fold_map} combinators. |
422 |
The first lifts a single function |
|
26459 | 423 |
\begin{quote}\footnotesize |
424 |
@{ML_text "f : 'a -> 'b -> 'b"} to @{ML_text "'a list -> 'b -> 'b"} |
|
425 |
\end{quote} |
|
25151 | 426 |
such that |
26459 | 427 |
\begin{quote}\footnotesize |
428 |
@{ML_text "y |> fold f [x1, x2, ..., x_n]"} \\ |
|
429 |
\hspace*{2ex}@{text "\<leadsto>"} @{ML_text "y |> f x1 |> f x2 |> ... |> f x_n"} |
|
430 |
\end{quote} |
|
26460 | 431 |
\noindent The second accumulates side results in a list by lifting |
25151 | 432 |
a single function |
26459 | 433 |
\begin{quote}\footnotesize |
434 |
@{ML_text "f : 'a -> 'b -> 'c * 'b"} to @{ML_text "'a list -> 'b -> 'c list * 'b"} |
|
435 |
\end{quote} |
|
25151 | 436 |
such that |
26459 | 437 |
\begin{quote}\footnotesize |
438 |
@{ML_text "y |> fold_map f [x1, x2, ..., x_n]"} \\ |
|
439 |
\hspace*{2ex}@{text "\<leadsto>"} @{ML_text "y |> f x1 ||>> f x2 ||>> ... ||>> f x_n"} \\ |
|
440 |
\hspace*{6ex}@{ML_text "||> (fn ((z1, z2), ..., z_n) => [z1, z2, ..., z_n])"} |
|
441 |
\end{quote} |
|
25185 | 442 |
|
26459 | 443 |
\noindent Example: |
444 |
||
445 |
\smallskip\begin{mldecls} |
|
25185 | 446 |
@{ML "let |
447 |
val consts = [\"foo\", \"bar\"]; |
|
448 |
in |
|
449 |
thy |
|
33174 | 450 |
|> fold_map (fn const => Sign.declare_const |
29008 | 451 |
((Binding.name const, @{typ \"foo => foo\"}), NoSyn)) consts |
25185 | 452 |
|>> map (fn t => Logic.mk_equals (t, @{term \"%x. x\"})) |
453 |
|-> (fn defs => fold_map (fn def => |
|
29612 | 454 |
Thm.add_def false false (Binding.empty, def)) defs) |
26459 | 455 |
end"} |
456 |
\end{mldecls} |
|
22322 | 457 |
*} |
458 |
||
22293 | 459 |
text %mlref {* |
460 |
\begin{mldecls} |
|
24089 | 461 |
@{index_ML "op #> ": "('a -> 'b) * ('b -> 'c) -> 'a -> 'c"} \\ |
462 |
@{index_ML "op #-> ": "('a -> 'c * 'b) * ('c -> 'b -> 'd) -> 'a -> 'd"} \\ |
|
463 |
@{index_ML "op #>> ": "('a -> 'c * 'b) * ('c -> 'd) -> 'a -> 'd * 'b"} \\ |
|
464 |
@{index_ML "op ##> ": "('a -> 'c * 'b) * ('b -> 'd) -> 'a -> 'c * 'd"} \\ |
|
465 |
@{index_ML "op ##>> ": "('a -> 'c * 'b) * ('b -> 'e * 'd) -> 'a -> ('c * 'e) * 'd"} \\ |
|
22293 | 466 |
\end{mldecls} |
467 |
*} |
|
468 |
||
22322 | 469 |
text {* |
22550 | 470 |
\noindent All those linear combinators also exist in higher-order |
471 |
variants which do not expect a value on the left hand side |
|
472 |
but a function. |
|
22322 | 473 |
*} |
474 |
||
22293 | 475 |
text %mlref {* |
476 |
\begin{mldecls} |
|
24089 | 477 |
@{index_ML "op ` ": "('b -> 'a) -> 'b -> 'a * 'b"} \\ |
22293 | 478 |
@{index_ML tap: "('b -> 'a) -> 'b -> 'b"} \\ |
479 |
\end{mldecls} |
|
480 |
*} |
|
481 |
||
22550 | 482 |
text {* |
25185 | 483 |
\noindent These operators allow to ``query'' a context |
484 |
in a series of context transformations: |
|
485 |
||
26459 | 486 |
\smallskip\begin{mldecls} |
25185 | 487 |
@{ML "thy |
488 |
|> tap (fn _ => writeln \"now adding constant\") |
|
33174 | 489 |
|> Sign.declare_const ((Binding.name \"bar\", @{typ \"foo => foo\"}), NoSyn) |
25185 | 490 |
||>> `(fn thy => Sign.declared_const thy |
29008 | 491 |
(Sign.full_name thy (Binding.name \"foobar\"))) |
25185 | 492 |
|-> (fn (t, b) => if b then I |
33174 | 493 |
else Sign.declare_const |
29008 | 494 |
((Binding.name \"foobar\", @{typ \"foo => foo\"}), NoSyn) #> snd) |
25185 | 495 |
"} |
26459 | 496 |
\end{mldecls} |
22550 | 497 |
*} |
498 |
||
24089 | 499 |
|
22293 | 500 |
section {* Common data structures *} |
501 |
||
502 |
subsection {* Lists (as set-like data structures) *} |
|
18538 | 503 |
|
22293 | 504 |
text {* |
505 |
\begin{mldecls} |
|
506 |
@{index_ML member: "('b * 'a -> bool) -> 'a list -> 'b -> bool"} \\ |
|
507 |
@{index_ML insert: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\ |
|
508 |
@{index_ML remove: "('b * 'a -> bool) -> 'b -> 'a list -> 'a list"} \\ |
|
509 |
@{index_ML merge: "('a * 'a -> bool) -> 'a list * 'a list -> 'a list"} \\ |
|
510 |
\end{mldecls} |
|
511 |
*} |
|
512 |
||
22503 | 513 |
text {* |
514 |
Lists are often used as set-like data structures -- set-like in |
|
25151 | 515 |
the sense that they support a notion of @{ML member}-ship, |
22503 | 516 |
@{ML insert}-ing and @{ML remove}-ing, but are order-sensitive. |
517 |
This is convenient when implementing a history-like mechanism: |
|
518 |
@{ML insert} adds an element \emph{to the front} of a list, |
|
519 |
if not yet present; @{ML remove} removes \emph{all} occurences |
|
520 |
of a particular element. Correspondingly @{ML merge} implements a |
|
521 |
a merge on two lists suitable for merges of context data |
|
522 |
(\secref{sec:context-theory}). |
|
523 |
||
524 |
Functions are parametrized by an explicit equality function |
|
525 |
to accomplish overloaded equality; in most cases of monomorphic |
|
24089 | 526 |
equality, writing @{ML "op ="} should suffice. |
22503 | 527 |
*} |
22293 | 528 |
|
529 |
subsection {* Association lists *} |
|
530 |
||
531 |
text {* |
|
532 |
\begin{mldecls} |
|
23652 | 533 |
@{index_ML_exn AList.DUP} \\ |
22293 | 534 |
@{index_ML AList.lookup: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> 'c option"} \\ |
535 |
@{index_ML AList.defined: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> bool"} \\ |
|
536 |
@{index_ML AList.update: "('a * 'a -> bool) -> ('a * 'b) -> ('a * 'b) list -> ('a * 'b) list"} \\ |
|
537 |
@{index_ML AList.default: "('a * 'a -> bool) -> ('a * 'b) -> ('a * 'b) list -> ('a * 'b) list"} \\ |
|
538 |
@{index_ML AList.delete: "('a * 'b -> bool) -> 'a -> ('b * 'c) list -> ('b * 'c) list"} \\ |
|
539 |
@{index_ML AList.map_entry: "('a * 'b -> bool) -> 'a |
|
540 |
-> ('c -> 'c) -> ('b * 'c) list -> ('b * 'c) list"} \\ |
|
541 |
@{index_ML AList.map_default: "('a * 'a -> bool) -> 'a * 'b -> ('b -> 'b) |
|
542 |
-> ('a * 'b) list -> ('a * 'b) list"} \\ |
|
543 |
@{index_ML AList.join: "('a * 'a -> bool) -> ('a -> 'b * 'b -> 'b) (*exception DUP*) |
|
544 |
-> ('a * 'b) list * ('a * 'b) list -> ('a * 'b) list (*exception AList.DUP*)"} \\ |
|
545 |
@{index_ML AList.merge: "('a * 'a -> bool) -> ('b * 'b -> bool) |
|
546 |
-> ('a * 'b) list * ('a * 'b) list -> ('a * 'b) list (*exception AList.DUP*)"} |
|
547 |
\end{mldecls} |
|
548 |
*} |
|
549 |
||
22503 | 550 |
text {* |
551 |
Association lists can be seens as an extension of set-like lists: |
|
552 |
on the one hand, they may be used to implement finite mappings, |
|
553 |
on the other hand, they remain order-sensitive and allow for |
|
554 |
multiple key-value-pair with the same key: @{ML AList.lookup} |
|
555 |
returns the \emph{first} value corresponding to a particular |
|
556 |
key, if present. @{ML AList.update} updates |
|
557 |
the \emph{first} occurence of a particular key; if no such |
|
558 |
key exists yet, the key-value-pair is added \emph{to the front}. |
|
559 |
@{ML AList.delete} only deletes the \emph{first} occurence of a key. |
|
560 |
@{ML AList.merge} provides an operation suitable for merges of context data |
|
561 |
(\secref{sec:context-theory}), where an equality parameter on |
|
562 |
values determines whether a merge should be considered a conflict. |
|
563 |
A slightly generalized operation if implementend by the @{ML AList.join} |
|
564 |
function which allows for explicit conflict resolution. |
|
565 |
*} |
|
22293 | 566 |
|
567 |
subsection {* Tables *} |
|
568 |
||
569 |
text {* |
|
570 |
\begin{mldecls} |
|
571 |
@{index_ML_type "'a Symtab.table"} \\ |
|
23652 | 572 |
@{index_ML_exn Symtab.DUP: string} \\ |
573 |
@{index_ML_exn Symtab.SAME} \\ |
|
574 |
@{index_ML_exn Symtab.UNDEF: string} \\ |
|
22293 | 575 |
@{index_ML Symtab.empty: "'a Symtab.table"} \\ |
23652 | 576 |
@{index_ML Symtab.lookup: "'a Symtab.table -> string -> 'a option"} \\ |
577 |
@{index_ML Symtab.defined: "'a Symtab.table -> string -> bool"} \\ |
|
578 |
@{index_ML Symtab.update: "(string * 'a) -> 'a Symtab.table -> 'a Symtab.table"} \\ |
|
579 |
@{index_ML Symtab.default: "string * 'a -> 'a Symtab.table -> 'a Symtab.table"} \\ |
|
580 |
@{index_ML Symtab.delete: "string |
|
22293 | 581 |
-> 'a Symtab.table -> 'a Symtab.table (*exception Symtab.UNDEF*)"} \\ |
23652 | 582 |
@{index_ML Symtab.map_entry: "string -> ('a -> 'a) |
22293 | 583 |
-> 'a Symtab.table -> 'a Symtab.table"} \\ |
23652 | 584 |
@{index_ML Symtab.map_default: "(string * 'a) -> ('a -> 'a) |
22293 | 585 |
-> 'a Symtab.table -> 'a Symtab.table"} \\ |
23652 | 586 |
@{index_ML Symtab.join: "(string -> 'a * 'a -> 'a) (*exception Symtab.DUP/Symtab.SAME*) |
22293 | 587 |
-> 'a Symtab.table * 'a Symtab.table |
23652 | 588 |
-> 'a Symtab.table (*exception Symtab.DUP*)"} \\ |
22293 | 589 |
@{index_ML Symtab.merge: "('a * 'a -> bool) |
590 |
-> 'a Symtab.table * 'a Symtab.table |
|
23652 | 591 |
-> 'a Symtab.table (*exception Symtab.DUP*)"} |
22293 | 592 |
\end{mldecls} |
593 |
*} |
|
594 |
||
22503 | 595 |
text {* |
596 |
Tables are an efficient representation of finite mappings without |
|
597 |
any notion of order; due to their efficiency they should be used |
|
598 |
whenever such pure finite mappings are neccessary. |
|
599 |
||
600 |
The key type of tables must be given explicitly by instantiating |
|
36164
532f4d1cb0fc
salvaged some ML functors from decay, which is the natural consequence of lack of formal checking;
wenzelm
parents:
36134
diff
changeset
|
601 |
the @{ML_functor Table} functor which takes the key type |
22503 | 602 |
together with its @{ML_type order}; for convience, we restrict |
603 |
here to the @{ML_struct Symtab} instance with @{ML_type string} |
|
604 |
as key type. |
|
605 |
||
606 |
Most table functions correspond to those of association lists. |
|
607 |
*} |
|
20489 | 608 |
|
30270
61811c9224a6
dummy changes to produce a new changeset of these files;
wenzelm
parents:
29755
diff
changeset
|
609 |
end |