|
1 \chapter{Functional Programming in HOL} |
|
2 |
|
3 This chapter describes how to write |
|
4 functional programs in HOL and how to verify them. However, |
|
5 most of the constructs and |
|
6 proof procedures introduced are general and recur in any specification |
|
7 or verification task. We really should speak of functional |
|
8 \emph{modelling} rather than functional \emph{programming}: |
|
9 our primary aim is not |
|
10 to write programs but to design abstract models of systems. HOL is |
|
11 a specification language that goes well beyond what can be expressed as a |
|
12 program. However, for the time being we concentrate on the computable. |
|
13 |
|
14 If you are a purist functional programmer, please note that all functions |
|
15 in HOL must be total: |
|
16 they must terminate for all inputs. Lazy data structures are not |
|
17 directly available. |
|
18 |
|
19 \section{An Introductory Theory} |
|
20 \label{sec:intro-theory} |
|
21 |
|
22 Functional programming needs datatypes and functions. Both of them can be |
|
23 defined in a theory with a syntax reminiscent of languages like ML or |
|
24 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}. |
|
25 We will now examine it line by line. |
|
26 |
|
27 \begin{figure}[htbp] |
|
28 \begin{ttbox}\makeatother |
|
29 \input{ToyList1}\end{ttbox} |
|
30 \caption{A Theory of Lists} |
|
31 \label{fig:ToyList} |
|
32 \end{figure} |
|
33 |
|
34 \index{*ToyList example|(} |
|
35 {\makeatother\medskip\input{ToyList.tex}} |
|
36 |
|
37 The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The |
|
38 concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs} |
|
39 constitutes the complete theory \texttt{ToyList} and should reside in file |
|
40 \texttt{ToyList.thy}. |
|
41 % It is good practice to present all declarations and |
|
42 %definitions at the beginning of a theory to facilitate browsing.% |
|
43 \index{*ToyList example|)} |
|
44 |
|
45 \begin{figure}[htbp] |
|
46 \begin{ttbox}\makeatother |
|
47 \input{ToyList2}\end{ttbox} |
|
48 \caption{Proofs about Lists} |
|
49 \label{fig:ToyList-proofs} |
|
50 \end{figure} |
|
51 |
|
52 \subsubsection*{Review} |
|
53 |
|
54 This is the end of our toy proof. It should have familiarized you with |
|
55 \begin{itemize} |
|
56 \item the standard theorem proving procedure: |
|
57 state a goal (lemma or theorem); proceed with proof until a separate lemma is |
|
58 required; prove that lemma; come back to the original goal. |
|
59 \item a specific procedure that works well for functional programs: |
|
60 induction followed by all-out simplification via \isa{auto}. |
|
61 \item a basic repertoire of proof commands. |
|
62 \end{itemize} |
|
63 |
|
64 \begin{warn} |
|
65 It is tempting to think that all lemmas should have the \isa{simp} attribute |
|
66 just because this was the case in the example above. However, in that example |
|
67 all lemmas were equations, and the right-hand side was simpler than the |
|
68 left-hand side --- an ideal situation for simplification purposes. Unless |
|
69 this is clearly the case, novices should refrain from awarding a lemma the |
|
70 \isa{simp} attribute, which has a global effect. Instead, lemmas can be |
|
71 applied locally where they are needed, which is discussed in the following |
|
72 chapter. |
|
73 \end{warn} |
|
74 |
|
75 \section{Some Helpful Commands} |
|
76 \label{sec:commands-and-hints} |
|
77 |
|
78 This section discusses a few basic commands for manipulating the proof state |
|
79 and can be skipped by casual readers. |
|
80 |
|
81 There are two kinds of commands used during a proof: the actual proof |
|
82 commands and auxiliary commands for examining the proof state and controlling |
|
83 the display. Simple proof commands are of the form |
|
84 \commdx{apply}(\textit{method}), where \textit{method} is typically |
|
85 \isa{induct_tac} or \isa{auto}. All such theorem proving operations |
|
86 are referred to as \bfindex{methods}, and further ones are |
|
87 introduced throughout the tutorial. Unless stated otherwise, you may |
|
88 assume that a method attacks merely the first subgoal. An exception is |
|
89 \isa{auto}, which tries to solve all subgoals. |
|
90 |
|
91 The most useful auxiliary commands are as follows: |
|
92 \begin{description} |
|
93 \item[Modifying the order of subgoals:] |
|
94 \commdx{defer} moves the first subgoal to the end and |
|
95 \commdx{prefer}~$n$ moves subgoal $n$ to the front. |
|
96 \item[Printing theorems:] |
|
97 \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$ |
|
98 prints the named theorems. |
|
99 \item[Reading terms and types:] \commdx{term} |
|
100 \textit{string} reads, type-checks and prints the given string as a term in |
|
101 the current context; the inferred type is output as well. |
|
102 \commdx{typ} \textit{string} reads and prints the given |
|
103 string as a type in the current context. |
|
104 \end{description} |
|
105 Further commands are found in the Isabelle/Isar Reference |
|
106 Manual~\cite{isabelle-isar-ref}. |
|
107 |
|
108 \begin{pgnote} |
|
109 Clicking on the \pgmenu{State} button redisplays the current proof state. |
|
110 This is helpful in case commands like \isacommand{thm} have overwritten it. |
|
111 \end{pgnote} |
|
112 |
|
113 We now examine Isabelle's functional programming constructs systematically, |
|
114 starting with inductive datatypes. |
|
115 |
|
116 |
|
117 \section{Datatypes} |
|
118 \label{sec:datatype} |
|
119 |
|
120 \index{datatypes|(}% |
|
121 Inductive datatypes are part of almost every non-trivial application of HOL. |
|
122 First we take another look at an important example, the datatype of |
|
123 lists, before we turn to datatypes in general. The section closes with a |
|
124 case study. |
|
125 |
|
126 |
|
127 \subsection{Lists} |
|
128 |
|
129 \index{*list (type)}% |
|
130 Lists are one of the essential datatypes in computing. We expect that you |
|
131 are already familiar with their basic operations. |
|
132 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory |
|
133 \thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}. |
|
134 The latter contains many further operations. For example, the functions |
|
135 \cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first |
|
136 element and the remainder of a list. (However, pattern matching is usually |
|
137 preferable to \isa{hd} and \isa{tl}.) |
|
138 Also available are higher-order functions like \isa{map} and \isa{filter}. |
|
139 Theory \isa{List} also contains |
|
140 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates |
|
141 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we |
|
142 always use HOL's predefined lists by building on theory \isa{Main}. |
|
143 \begin{warn} |
|
144 Looking ahead to sets and quanifiers in Part II: |
|
145 The best way to express that some element \isa{x} is in a list \isa{xs} is |
|
146 \isa{x $\in$ set xs}, where \isa{set} is a function that turns a list into the |
|
147 set of its elements. |
|
148 By the same device you can also write bounded quantifiers like |
|
149 \isa{$\forall$x $\in$ set xs} or embed lists in other set expressions. |
|
150 \end{warn} |
|
151 |
|
152 |
|
153 \subsection{The General Format} |
|
154 \label{sec:general-datatype} |
|
155 |
|
156 The general HOL \isacommand{datatype} definition is of the form |
|
157 \[ |
|
158 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~ |
|
159 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~ |
|
160 C@m~\tau@{m1}~\dots~\tau@{mk@m} |
|
161 \] |
|
162 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct |
|
163 constructor names and $\tau@{ij}$ are types; it is customary to capitalize |
|
164 the first letter in constructor names. There are a number of |
|
165 restrictions (such as that the type should not be empty) detailed |
|
166 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them. |
|
167 |
|
168 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and |
|
169 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically |
|
170 during proofs by simplification. The same is true for the equations in |
|
171 primitive recursive function definitions. |
|
172 |
|
173 Every\footnote{Except for advanced datatypes where the recursion involves |
|
174 ``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$ |
|
175 comes equipped with a \isa{size} function from $t$ into the natural numbers |
|
176 (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\ |
|
177 \isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}. In general, |
|
178 \cdx{size} returns |
|
179 \begin{itemize} |
|
180 \item zero for all constructors that do not have an argument of type $t$, |
|
181 \item one plus the sum of the sizes of all arguments of type~$t$, |
|
182 for all other constructors. |
|
183 \end{itemize} |
|
184 Note that because |
|
185 \isa{size} is defined on every datatype, it is overloaded; on lists |
|
186 \isa{size} is also called \sdx{length}, which is not overloaded. |
|
187 Isabelle will always show \isa{size} on lists as \isa{length}. |
|
188 |
|
189 |
|
190 \subsection{Primitive Recursion} |
|
191 |
|
192 \index{recursion!primitive}% |
|
193 Functions on datatypes are usually defined by recursion. In fact, most of the |
|
194 time they are defined by what is called \textbf{primitive recursion} over some |
|
195 datatype $t$. This means that the recursion equations must be of the form |
|
196 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \] |
|
197 such that $C$ is a constructor of $t$ and all recursive calls of |
|
198 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus |
|
199 Isabelle immediately sees that $f$ terminates because one (fixed!) argument |
|
200 becomes smaller with every recursive call. There must be at most one equation |
|
201 for each constructor. Their order is immaterial. |
|
202 A more general method for defining total recursive functions is introduced in |
|
203 {\S}\ref{sec:fun}. |
|
204 |
|
205 \begin{exercise}\label{ex:Tree} |
|
206 \input{Tree.tex}% |
|
207 \end{exercise} |
|
208 |
|
209 \input{case_exprs.tex} |
|
210 |
|
211 \input{Ifexpr.tex} |
|
212 \index{datatypes|)} |
|
213 |
|
214 |
|
215 \section{Some Basic Types} |
|
216 |
|
217 This section introduces the types of natural numbers and ordered pairs. Also |
|
218 described is type \isa{option}, which is useful for modelling exceptional |
|
219 cases. |
|
220 |
|
221 \subsection{Natural Numbers} |
|
222 \label{sec:nat}\index{natural numbers}% |
|
223 \index{linear arithmetic|(} |
|
224 |
|
225 \input{fakenat.tex}\medskip |
|
226 \input{natsum.tex} |
|
227 |
|
228 \index{linear arithmetic|)} |
|
229 |
|
230 |
|
231 \subsection{Pairs} |
|
232 \input{pairs2.tex} |
|
233 |
|
234 \subsection{Datatype {\tt\slshape option}} |
|
235 \label{sec:option} |
|
236 \input{Option2.tex} |
|
237 |
|
238 \section{Definitions} |
|
239 \label{sec:Definitions} |
|
240 |
|
241 A definition is simply an abbreviation, i.e.\ a new name for an existing |
|
242 construction. In particular, definitions cannot be recursive. Isabelle offers |
|
243 definitions on the level of types and terms. Those on the type level are |
|
244 called \textbf{type synonyms}; those on the term level are simply called |
|
245 definitions. |
|
246 |
|
247 |
|
248 \subsection{Type Synonyms} |
|
249 |
|
250 \index{type synonyms}% |
|
251 Type synonyms are similar to those found in ML\@. They are created by a |
|
252 \commdx{type\protect\_synonym} command: |
|
253 |
|
254 \medskip |
|
255 \input{types.tex} |
|
256 |
|
257 \input{prime_def.tex} |
|
258 |
|
259 |
|
260 \section{The Definitional Approach} |
|
261 \label{sec:definitional} |
|
262 |
|
263 \index{Definitional Approach}% |
|
264 As we pointed out at the beginning of the chapter, asserting arbitrary |
|
265 axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order |
|
266 to avoid this danger, we advocate the definitional rather than |
|
267 the axiomatic approach: introduce new concepts by definitions. However, Isabelle/HOL seems to |
|
268 support many richer definitional constructs, such as |
|
269 \isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each |
|
270 \isacommand{primrec} function definition is turned into a proper |
|
271 (nonrecursive!) definition from which the user-supplied recursion equations are |
|
272 automatically proved. This process is |
|
273 hidden from the user, who does not have to understand the details. Other commands described |
|
274 later, like \isacommand{fun} and \isacommand{inductive}, work similarly. |
|
275 This strict adherence to the definitional approach reduces the risk of |
|
276 soundness errors. |
|
277 |
|
278 \chapter{More Functional Programming} |
|
279 |
|
280 The purpose of this chapter is to deepen your understanding of the |
|
281 concepts encountered so far and to introduce advanced forms of datatypes and |
|
282 recursive functions. The first two sections give a structured presentation of |
|
283 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss |
|
284 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}). You can |
|
285 skip them if you are not planning to perform proofs yourself. |
|
286 We then present a case |
|
287 study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced |
|
288 datatypes, including those involving function spaces, are covered in |
|
289 {\S}\ref{sec:advanced-datatypes}; it closes with another case study, search |
|
290 trees (``tries''). Finally we introduce \isacommand{fun}, a general |
|
291 form of recursive function definition that goes well beyond |
|
292 \isacommand{primrec} ({\S}\ref{sec:fun}). |
|
293 |
|
294 |
|
295 \section{Simplification} |
|
296 \label{sec:Simplification} |
|
297 \index{simplification|(} |
|
298 |
|
299 So far we have proved our theorems by \isa{auto}, which simplifies |
|
300 all subgoals. In fact, \isa{auto} can do much more than that. |
|
301 To go beyond toy examples, you |
|
302 need to understand the ingredients of \isa{auto}. This section covers the |
|
303 method that \isa{auto} always applies first, simplification. |
|
304 |
|
305 Simplification is one of the central theorem proving tools in Isabelle and |
|
306 many other systems. The tool itself is called the \textbf{simplifier}. |
|
307 This section introduces the many features of the simplifier |
|
308 and is required reading if you intend to perform proofs. Later on, |
|
309 {\S}\ref{sec:simplification-II} explains some more advanced features and a |
|
310 little bit of how the simplifier works. The serious student should read that |
|
311 section as well, in particular to understand why the simplifier did |
|
312 something unexpected. |
|
313 |
|
314 \subsection{What is Simplification?} |
|
315 |
|
316 In its most basic form, simplification means repeated application of |
|
317 equations from left to right. For example, taking the rules for \isa{\at} |
|
318 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of |
|
319 simplification steps: |
|
320 \begin{ttbox}\makeatother |
|
321 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[] |
|
322 \end{ttbox} |
|
323 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the |
|
324 equations are referred to as \bfindex{rewrite rules}. |
|
325 ``Rewriting'' is more honest than ``simplification'' because the terms do not |
|
326 necessarily become simpler in the process. |
|
327 |
|
328 The simplifier proves arithmetic goals as described in |
|
329 {\S}\ref{sec:nat} above. Arithmetic expressions are simplified using built-in |
|
330 procedures that go beyond mere rewrite rules. New simplification procedures |
|
331 can be coded and installed, but they are definitely not a matter for this |
|
332 tutorial. |
|
333 |
|
334 \input{simp.tex} |
|
335 |
|
336 \index{simplification|)} |
|
337 |
|
338 \input{Itrev.tex} |
|
339 \begin{exercise} |
|
340 \input{Plus.tex}% |
|
341 \end{exercise} |
|
342 \begin{exercise} |
|
343 \input{Tree2.tex}% |
|
344 \end{exercise} |
|
345 |
|
346 \input{CodeGen.tex} |
|
347 |
|
348 |
|
349 \section{Advanced Datatypes} |
|
350 \label{sec:advanced-datatypes} |
|
351 \index{datatype@\isacommand {datatype} (command)|(} |
|
352 \index{primrec@\isacommand {primrec} (command)|(} |
|
353 %|) |
|
354 |
|
355 This section presents advanced forms of datatypes: mutual and nested |
|
356 recursion. A series of examples will culminate in a treatment of the trie |
|
357 data structure. |
|
358 |
|
359 |
|
360 \subsection{Mutual Recursion} |
|
361 \label{sec:datatype-mut-rec} |
|
362 |
|
363 \input{ABexpr.tex} |
|
364 |
|
365 \subsection{Nested Recursion} |
|
366 \label{sec:nested-datatype} |
|
367 |
|
368 {\makeatother\input{Nested.tex}} |
|
369 |
|
370 |
|
371 \subsection{The Limits of Nested Recursion} |
|
372 \label{sec:nested-fun-datatype} |
|
373 |
|
374 How far can we push nested recursion? By the unfolding argument above, we can |
|
375 reduce nested to mutual recursion provided the nested recursion only involves |
|
376 previously defined datatypes. This does not include functions: |
|
377 \begin{isabelle} |
|
378 \isacommand{datatype} t = C "t \isasymRightarrow\ bool" |
|
379 \end{isabelle} |
|
380 This declaration is a real can of worms. |
|
381 In HOL it must be ruled out because it requires a type |
|
382 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the |
|
383 same cardinality --- an impossibility. For the same reason it is not possible |
|
384 to allow recursion involving the type \isa{t set}, which is isomorphic to |
|
385 \isa{t \isasymFun\ bool}. |
|
386 |
|
387 Fortunately, a limited form of recursion |
|
388 involving function spaces is permitted: the recursive type may occur on the |
|
389 right of a function arrow, but never on the left. Hence the above can of worms |
|
390 is ruled out but the following example of a potentially |
|
391 \index{infinitely branching trees}% |
|
392 infinitely branching tree is accepted: |
|
393 \smallskip |
|
394 |
|
395 \input{Fundata.tex} |
|
396 |
|
397 If you need nested recursion on the left of a function arrow, there are |
|
398 alternatives to pure HOL\@. In the Logic for Computable Functions |
|
399 (\rmindex{LCF}), types like |
|
400 \begin{isabelle} |
|
401 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam" |
|
402 \end{isabelle} |
|
403 do indeed make sense~\cite{paulson87}. Note the different arrow, |
|
404 \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow}, |
|
405 expressing the type of \emph{continuous} functions. |
|
406 There is even a version of LCF on top of HOL, |
|
407 called \rmindex{HOLCF}~\cite{MuellerNvOS99}. |
|
408 \index{datatype@\isacommand {datatype} (command)|)} |
|
409 \index{primrec@\isacommand {primrec} (command)|)} |
|
410 |
|
411 |
|
412 \subsection{Case Study: Tries} |
|
413 \label{sec:Trie} |
|
414 |
|
415 \index{tries|(}% |
|
416 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast |
|
417 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a |
|
418 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and |
|
419 ``cat''. When searching a string in a trie, the letters of the string are |
|
420 examined sequentially. Each letter determines which subtrie to search next. |
|
421 In this case study we model tries as a datatype, define a lookup and an |
|
422 update function, and prove that they behave as expected. |
|
423 |
|
424 \begin{figure}[htbp] |
|
425 \begin{center} |
|
426 \unitlength1mm |
|
427 \begin{picture}(60,30) |
|
428 \put( 5, 0){\makebox(0,0)[b]{l}} |
|
429 \put(25, 0){\makebox(0,0)[b]{e}} |
|
430 \put(35, 0){\makebox(0,0)[b]{n}} |
|
431 \put(45, 0){\makebox(0,0)[b]{r}} |
|
432 \put(55, 0){\makebox(0,0)[b]{t}} |
|
433 % |
|
434 \put( 5, 9){\line(0,-1){5}} |
|
435 \put(25, 9){\line(0,-1){5}} |
|
436 \put(44, 9){\line(-3,-2){9}} |
|
437 \put(45, 9){\line(0,-1){5}} |
|
438 \put(46, 9){\line(3,-2){9}} |
|
439 % |
|
440 \put( 5,10){\makebox(0,0)[b]{l}} |
|
441 \put(15,10){\makebox(0,0)[b]{n}} |
|
442 \put(25,10){\makebox(0,0)[b]{p}} |
|
443 \put(45,10){\makebox(0,0)[b]{a}} |
|
444 % |
|
445 \put(14,19){\line(-3,-2){9}} |
|
446 \put(15,19){\line(0,-1){5}} |
|
447 \put(16,19){\line(3,-2){9}} |
|
448 \put(45,19){\line(0,-1){5}} |
|
449 % |
|
450 \put(15,20){\makebox(0,0)[b]{a}} |
|
451 \put(45,20){\makebox(0,0)[b]{c}} |
|
452 % |
|
453 \put(30,30){\line(-3,-2){13}} |
|
454 \put(30,30){\line(3,-2){13}} |
|
455 \end{picture} |
|
456 \end{center} |
|
457 \caption{A Sample Trie} |
|
458 \label{fig:trie} |
|
459 \end{figure} |
|
460 |
|
461 Proper tries associate some value with each string. Since the |
|
462 information is stored only in the final node associated with the string, many |
|
463 nodes do not carry any value. This distinction is modeled with the help |
|
464 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}). |
|
465 \input{Trie.tex} |
|
466 \index{tries|)} |
|
467 |
|
468 \section{Total Recursive Functions: \isacommand{fun}} |
|
469 \label{sec:fun} |
|
470 \index{fun@\isacommand {fun} (command)|(}\index{functions!total|(} |
|
471 |
|
472 Although many total functions have a natural primitive recursive definition, |
|
473 this is not always the case. Arbitrary total recursive functions can be |
|
474 defined by means of \isacommand{fun}: you can use full pattern matching, |
|
475 recursion need not involve datatypes, and termination is proved by showing |
|
476 that the arguments of all recursive calls are smaller in a suitable sense. |
|
477 In this section we restrict ourselves to functions where Isabelle can prove |
|
478 termination automatically. More advanced function definitions, including user |
|
479 supplied termination proofs, nested recursion and partiality, are discussed |
|
480 in a separate tutorial~\cite{isabelle-function}. |
|
481 |
|
482 \input{fun0.tex} |
|
483 |
|
484 \index{fun@\isacommand {fun} (command)|)}\index{functions!total|)} |