|
1 \chapter{Functional Programming in HOL} |
|
2 |
|
3 Although on the surface this chapter is mainly concerned with how to write |
|
4 functional programs in HOL and how to verify them, most of the |
|
5 constructs and proof procedures introduced are general purpose and recur in |
|
6 any specification or verification task. |
|
7 |
|
8 The dedicated functional programmer should be warned: HOL offers only {\em |
|
9 total functional programming} --- all functions in HOL must be total; lazy |
|
10 data structures are not directly available. On the positive side, functions |
|
11 in HOL need not be computable: HOL is a specification language that goes well |
|
12 beyond what can be expressed as a program. However, for the time being we |
|
13 concentrate on the computable. |
|
14 |
|
15 \section{An introductory theory} |
|
16 \label{sec:intro-theory} |
|
17 |
|
18 Functional programming needs datatypes and functions. Both of them can be |
|
19 defined in a theory with a syntax reminiscent of languages like ML or |
|
20 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}. |
|
21 We will now examine it line by line. |
|
22 |
|
23 \begin{figure}[htbp] |
|
24 \begin{ttbox}\makeatother |
|
25 \input{ToyList2/ToyList1}\end{ttbox} |
|
26 \caption{A theory of lists} |
|
27 \label{fig:ToyList} |
|
28 \end{figure} |
|
29 |
|
30 {\makeatother\input{ToyList/document/ToyList.tex}} |
|
31 |
|
32 The complete proof script is shown in figure~\ref{fig:ToyList-proofs}. The |
|
33 concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyList-proofs} |
|
34 constitutes the complete theory \texttt{ToyList} and should reside in file |
|
35 \texttt{ToyList.thy}. It is good practice to present all declarations and |
|
36 definitions at the beginning of a theory to facilitate browsing. |
|
37 |
|
38 \begin{figure}[htbp] |
|
39 \begin{ttbox}\makeatother |
|
40 \input{ToyList2/ToyList2}\end{ttbox} |
|
41 \caption{Proofs about lists} |
|
42 \label{fig:ToyList-proofs} |
|
43 \end{figure} |
|
44 |
|
45 \subsubsection*{Review} |
|
46 |
|
47 This is the end of our toy proof. It should have familiarized you with |
|
48 \begin{itemize} |
|
49 \item the standard theorem proving procedure: |
|
50 state a goal (lemma or theorem); proceed with proof until a separate lemma is |
|
51 required; prove that lemma; come back to the original goal. |
|
52 \item a specific procedure that works well for functional programs: |
|
53 induction followed by all-out simplification via \isa{auto}. |
|
54 \item a basic repertoire of proof commands. |
|
55 \end{itemize} |
|
56 |
|
57 |
|
58 \section{Some helpful commands} |
|
59 \label{sec:commands-and-hints} |
|
60 |
|
61 This section discusses a few basic commands for manipulating the proof state |
|
62 and can be skipped by casual readers. |
|
63 |
|
64 There are two kinds of commands used during a proof: the actual proof |
|
65 commands and auxiliary commands for examining the proof state and controlling |
|
66 the display. Simple proof commands are of the form |
|
67 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a |
|
68 synonym for ``theorem proving function''. Typical examples are |
|
69 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout |
|
70 the tutorial. Unless stated otherwise you may assume that a method attacks |
|
71 merely the first subgoal. An exception is \isa{auto} which tries to solve all |
|
72 subgoals. |
|
73 |
|
74 The most useful auxiliary commands are: |
|
75 \begin{description} |
|
76 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the |
|
77 last command; \isacommand{undo} can be undone by |
|
78 \isacommand{redo}\indexbold{*redo}. Both are only needed at the shell |
|
79 level and should not occur in the final theory. |
|
80 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays |
|
81 the current proof state, for example when it has disappeared off the |
|
82 screen. |
|
83 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to |
|
84 print only the first $n$ subgoals from now on and redisplays the current |
|
85 proof state. This is helpful when there are many subgoals. |
|
86 \item[Modifying the order of subgoals:] |
|
87 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and |
|
88 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front. |
|
89 \item[Printing theorems:] |
|
90 \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$ |
|
91 prints the named theorems. |
|
92 \item[Displaying types:] We have already mentioned the flag |
|
93 \ttindex{show_types} above. It can also be useful for detecting typos in |
|
94 formulae early on. For example, if \texttt{show_types} is set and the goal |
|
95 \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output |
|
96 \par\noindent |
|
97 \begin{isabelle}% |
|
98 Variables:\isanewline |
|
99 ~~xs~::~'a~list |
|
100 \end{isabelle}% |
|
101 \par\noindent |
|
102 which tells us that Isabelle has correctly inferred that |
|
103 \isa{xs} is a variable of list type. On the other hand, had we |
|
104 made a typo as in \isa{rev(re xs) = xs}, the response |
|
105 \par\noindent |
|
106 \begin{isabelle}% |
|
107 Variables:\isanewline |
|
108 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline |
|
109 ~~xs~::~'a~list% |
|
110 \end{isabelle}% |
|
111 \par\noindent |
|
112 would have alerted us because of the unexpected variable \isa{re}. |
|
113 \item[Reading terms and types:] \isacommand{term}\indexbold{*term} |
|
114 \textit{string} reads, type-checks and prints the given string as a term in |
|
115 the current context; the inferred type is output as well. |
|
116 \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given |
|
117 string as a type in the current context. |
|
118 \item[(Re)loading theories:] When you start your interaction you must open a |
|
119 named theory with the line |
|
120 \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:}. Isabelle automatically |
|
121 loads all the required parent theories from their respective files (which |
|
122 may take a moment, unless the theories are already loaded and the files |
|
123 have not been modified). The only time when you need to load a theory by |
|
124 hand is when you simply want to check if it loads successfully without |
|
125 wanting to make use of the theory itself. This you can do by typing |
|
126 \isacommand{use\_thy}\indexbold{*use_thy}~\texttt{"T"}. |
|
127 |
|
128 If you suddenly discover that you need to modify a parent theory of your |
|
129 current theory you must first abandon your current theory (at the shell |
|
130 level type \isacommand{kill}\indexbold{*kill}). After the parent theory has |
|
131 been modified you go back to your original theory. When its first line |
|
132 \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the |
|
133 modified parent is reloaded automatically. |
|
134 \end{description} |
|
135 Further commands are found in the Isabelle/Isar Reference Manual. |
|
136 |
|
137 |
|
138 \section{Datatypes} |
|
139 \label{sec:datatype} |
|
140 |
|
141 Inductive datatypes are part of almost every non-trivial application of HOL. |
|
142 First we take another look at a very important example, the datatype of |
|
143 lists, before we turn to datatypes in general. The section closes with a |
|
144 case study. |
|
145 |
|
146 |
|
147 \subsection{Lists} |
|
148 \indexbold{*list} |
|
149 |
|
150 Lists are one of the essential datatypes in computing. Readers of this |
|
151 tutorial and users of HOL need to be familiar with their basic operations. |
|
152 Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory |
|
153 \texttt{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}. |
|
154 The latter contains many further operations. For example, the functions |
|
155 \isaindexbold{hd} (`head') and \isaindexbold{tl} (`tail') return the first |
|
156 element and the remainder of a list. (However, pattern-matching is usually |
|
157 preferable to \isa{hd} and \isa{tl}.) Theory \texttt{List} also contains |
|
158 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates |
|
159 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we |
|
160 always use HOL's predefined lists. |
|
161 |
|
162 |
|
163 \subsection{The general format} |
|
164 \label{sec:general-datatype} |
|
165 |
|
166 The general HOL \isacommand{datatype} definition is of the form |
|
167 \[ |
|
168 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~ |
|
169 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~ |
|
170 C@m~\tau@{m1}~\dots~\tau@{mk@m} |
|
171 \] |
|
172 where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct |
|
173 constructor names and $\tau@{ij}$ are types; it is customary to capitalize |
|
174 the first letter in constructor names. There are a number of |
|
175 restrictions (such as that the type should not be empty) detailed |
|
176 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them. |
|
177 |
|
178 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and |
|
179 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically |
|
180 during proofs by simplification. The same is true for the equations in |
|
181 primitive recursive function definitions. |
|
182 |
|
183 \subsection{Primitive recursion} |
|
184 |
|
185 Functions on datatypes are usually defined by recursion. In fact, most of the |
|
186 time they are defined by what is called \bfindex{primitive recursion}. |
|
187 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of |
|
188 equations |
|
189 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \] |
|
190 such that $C$ is a constructor of the datatype $t$ and all recursive calls of |
|
191 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus |
|
192 Isabelle immediately sees that $f$ terminates because one (fixed!) argument |
|
193 becomes smaller with every recursive call. There must be exactly one equation |
|
194 for each constructor. Their order is immaterial. |
|
195 A more general method for defining total recursive functions is explained in |
|
196 \S\ref{sec:recdef}. |
|
197 |
|
198 \begin{exercise} |
|
199 \input{Misc/document/Tree.tex}% |
|
200 \end{exercise} |
|
201 |
|
202 \subsection{Case expressions} |
|
203 \label{sec:case-expressions} |
|
204 |
|
205 HOL also features \isaindexbold{case}-expressions for analyzing |
|
206 elements of a datatype. For example, |
|
207 % case xs of [] => 0 | y#ys => y |
|
208 \begin{isabellepar}% |
|
209 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~y~\#~ys~{\isasymRightarrow}~y |
|
210 \end{isabellepar}% |
|
211 evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if |
|
212 \isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of |
|
213 the same type, it follows that \isa{y::nat} and hence |
|
214 \isa{xs::(nat)list}.) |
|
215 |
|
216 In general, if $e$ is a term of the datatype $t$ defined in |
|
217 \S\ref{sec:general-datatype} above, the corresponding |
|
218 \isa{case}-expression analyzing $e$ is |
|
219 \[ |
|
220 \begin{array}{rrcl} |
|
221 \isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\ |
|
222 \vdots \\ |
|
223 \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m |
|
224 \end{array} |
|
225 \] |
|
226 |
|
227 \begin{warn} |
|
228 {\em All} constructors must be present, their order is fixed, and nested |
|
229 patterns are not supported. Violating these restrictions results in strange |
|
230 error messages. |
|
231 \end{warn} |
|
232 \noindent |
|
233 Nested patterns can be simulated by nested \isa{case}-expressions: instead |
|
234 of |
|
235 % case xs of [] => 0 | [x] => x | x#(y#zs) => y |
|
236 \begin{isabellepar}% |
|
237 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~[x]~{\isasymRightarrow}~x~|~x~\#~y~\#~zs~{\isasymRightarrow}~y |
|
238 \end{isabellepar}% |
|
239 write |
|
240 % term(latex xsymbols symbols)"case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)"; |
|
241 \begin{isabellepar}% |
|
242 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~|~y~\#~zs~{\isasymRightarrow}~y)% |
|
243 \end{isabellepar}% |
|
244 Note that \isa{case}-expressions should be enclosed in parentheses to |
|
245 indicate their scope. |
|
246 |
|
247 \subsection{Structural induction and case distinction} |
|
248 \indexbold{structural induction} |
|
249 \indexbold{induction!structural} |
|
250 \indexbold{case distinction} |
|
251 |
|
252 Almost all the basic laws about a datatype are applied automatically during |
|
253 simplification. Only induction is invoked by hand via \isaindex{induct_tac}, |
|
254 which works for any datatype. In some cases, induction is overkill and a case |
|
255 distinction over all constructors of the datatype suffices. This is performed |
|
256 by \isaindexbold{case_tac}. A trivial example: |
|
257 |
|
258 \input{Misc/document/cases.tex}% |
|
259 |
|
260 Note that we do not need to give a lemma a name if we do not intend to refer |
|
261 to it explicitly in the future. |
|
262 |
|
263 \begin{warn} |
|
264 Induction is only allowed on free (or \isasymAnd-bound) variables that |
|
265 should not occur among the assumptions of the subgoal. Case distinction |
|
266 (\isa{case_tac}) works for arbitrary terms, which need to be |
|
267 quoted if they are non-atomic. |
|
268 \end{warn} |
|
269 |
|
270 |
|
271 \subsection{Case study: boolean expressions} |
|
272 \label{sec:boolex} |
|
273 |
|
274 The aim of this case study is twofold: it shows how to model boolean |
|
275 expressions and some algorithms for manipulating them, and it demonstrates |
|
276 the constructs introduced above. |
|
277 |
|
278 \input{Ifexpr/document/Ifexpr.tex} |
|
279 |
|
280 How does one come up with the required lemmas? Try to prove the main theorems |
|
281 without them and study carefully what \isa{auto} leaves unproved. This has |
|
282 to provide the clue. The necessity of universal quantification |
|
283 (\isa{\isasymforall{}t e}) in the two lemmas is explained in |
|
284 \S\ref{sec:InductionHeuristics} |
|
285 |
|
286 \begin{exercise} |
|
287 We strengthen the definition of a \isa{normal} If-expression as follows: |
|
288 the first argument of all \isa{IF}s must be a variable. Adapt the above |
|
289 development to this changed requirement. (Hint: you may need to formulate |
|
290 some of the goals as implications (\isasymimp) rather than equalities |
|
291 (\isa{=}).) |
|
292 \end{exercise} |
|
293 |
|
294 \section{Some basic types} |
|
295 |
|
296 |
|
297 \subsection{Natural numbers} |
|
298 \index{arithmetic|(} |
|
299 |
|
300 \input{Misc/document/fakenat.tex} |
|
301 \input{Misc/document/natsum.tex} |
|
302 |
|
303 The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun}, |
|
304 \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun}, |
|
305 \isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and |
|
306 \isaindexbold{max} are predefined, as are the relations |
|
307 \indexboldpos{\isasymle}{$HOL2arithrel} and |
|
308 \ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation |
|
309 \isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although |
|
310 Isabelle does not prove this completely automatically. Note that \isa{1} and |
|
311 \isa{2} are available as abbreviations for the corresponding |
|
312 \isa{Suc}-expressions. If you need the full set of numerals, |
|
313 see~\S\ref{nat-numerals}. |
|
314 |
|
315 \begin{warn} |
|
316 The operations \ttindexboldpos{+}{$HOL2arithfun}, |
|
317 \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun}, |
|
318 \isaindexbold{min}, \isaindexbold{max}, |
|
319 \indexboldpos{\isasymle}{$HOL2arithrel} and |
|
320 \ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available |
|
321 not just for natural numbers but at other types as well (see |
|
322 \S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x}, |
|
323 there is nothing to indicate that you are talking about natural numbers. |
|
324 Hence Isabelle can only infer that \isa{x} and \isa{y} are of some |
|
325 arbitrary type where \isa{+} is declared. As a consequence, you will be |
|
326 unable to prove the goal (although it may take you some time to realize |
|
327 what has happened if \texttt{show_types} is not set). In this particular |
|
328 example, you need to include an explicit type constraint, for example |
|
329 \isa{x+y = y+(x::nat)}. If there is enough contextual information this may |
|
330 not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}. |
|
331 \end{warn} |
|
332 |
|
333 Simple arithmetic goals are proved automatically by both \isa{auto} |
|
334 and the simplification methods introduced in \S\ref{sec:Simplification}. For |
|
335 example, |
|
336 |
|
337 \input{Misc/document/arith1.tex}% |
|
338 is proved automatically. The main restriction is that only addition is taken |
|
339 into account; other arithmetic operations and quantified formulae are ignored. |
|
340 |
|
341 For more complex goals, there is the special method |
|
342 \isaindexbold{arith} (which attacks the first subgoal). It proves |
|
343 arithmetic goals involving the usual logical connectives (\isasymnot, |
|
344 \isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and |
|
345 the operations \isa{+}, \isa{-}, \isa{min} and \isa{max}. For example, |
|
346 |
|
347 \input{Misc/document/arith2.tex}% |
|
348 succeeds because \isa{k*k} can be treated as atomic. |
|
349 In contrast, |
|
350 |
|
351 \input{Misc/document/arith3.tex}% |
|
352 is not even proved by \isa{arith} because the proof relies essentially |
|
353 on properties of multiplication. |
|
354 |
|
355 \begin{warn} |
|
356 The running time of \isa{arith} is exponential in the number of occurrences |
|
357 of \ttindexboldpos{-}{$HOL2arithfun}, \isaindexbold{min} and |
|
358 \isaindexbold{max} because they are first eliminated by case distinctions. |
|
359 |
|
360 \isa{arith} is incomplete even for the restricted class of formulae |
|
361 described above (known as ``linear arithmetic''). If divisibility plays a |
|
362 role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$. |
|
363 Fortunately, such examples are rare in practice. |
|
364 \end{warn} |
|
365 |
|
366 \index{arithmetic|)} |
|
367 |
|
368 |
|
369 \subsection{Products} |
|
370 |
|
371 HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ * |
|
372 $\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair |
|
373 are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) = $x$} and |
|
374 \isa{snd($x$,$y$) = $y$}. Tuples are simulated by pairs nested to the right: |
|
375 \isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and |
|
376 \isa{$\tau@1$ * $\tau@2$ * $\tau@3$} for \isa{$\tau@1$ * ($\tau@2$ * |
|
377 $\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}. |
|
378 |
|
379 It is possible to use (nested) tuples as patterns in abstractions, for |
|
380 example \isa{\isasymlambda(x,y,z).x+y+z} and |
|
381 \isa{\isasymlambda((x,y),z).x+y+z}. |
|
382 In addition to explicit $\lambda$-abstractions, tuple patterns can be used in |
|
383 most variable binding constructs. Typical examples are |
|
384 |
|
385 \input{Misc/document/pairs.tex}% |
|
386 Further important examples are quantifiers and sets (see~\S\ref{quant-pats}). |
|
387 |
|
388 %FIXME move stuff below into section on proofs about products? |
|
389 \begin{warn} |
|
390 Abstraction over pairs and tuples is merely a convenient shorthand for a |
|
391 more complex internal representation. Thus the internal and external form |
|
392 of a term may differ, which can affect proofs. If you want to avoid this |
|
393 complication, use \isa{fst} and \isa{snd}, i.e.\ write |
|
394 \isa{\isasymlambda{}p.~fst p + snd p} instead of |
|
395 \isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofs-about-products} for |
|
396 theorem proving with tuple patterns. |
|
397 \end{warn} |
|
398 |
|
399 Finally note that products, like natural numbers, are datatypes, which means |
|
400 in particular that \isa{induct_tac} and \isa{case_tac} are applicable to |
|
401 products (see \S\ref{proofs-about-products}). |
|
402 |
|
403 \section{Definitions} |
|
404 \label{sec:Definitions} |
|
405 |
|
406 A definition is simply an abbreviation, i.e.\ a new name for an existing |
|
407 construction. In particular, definitions cannot be recursive. Isabelle offers |
|
408 definitions on the level of types and terms. Those on the type level are |
|
409 called type synonyms, those on the term level are called (constant) |
|
410 definitions. |
|
411 |
|
412 |
|
413 \subsection{Type synonyms} |
|
414 \indexbold{type synonyms} |
|
415 |
|
416 Type synonyms are similar to those found in ML. Their syntax is fairly self |
|
417 explanatory: |
|
418 |
|
419 \input{Misc/document/types.tex}% |
|
420 |
|
421 Note that pattern-matching is not allowed, i.e.\ each definition must be of |
|
422 the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$. |
|
423 |
|
424 Section~\S\ref{sec:Simplification} explains how definitions are used |
|
425 in proofs. |
|
426 |
|
427 \begin{warn}% |
|
428 A common mistake when writing definitions is to introduce extra free |
|
429 variables on the right-hand side as in the following fictitious definition: |
|
430 \input{Misc/document/prime_def.tex}% |
|
431 \end{warn} |
|
432 |
|
433 |
|
434 \chapter{More Functional Programming} |
|
435 |
|
436 The purpose of this chapter is to deepen the reader's understanding of the |
|
437 concepts encountered so far and to introduce an advanced forms of datatypes |
|
438 and recursive functions. The first two sections give a structured |
|
439 presentation of theorem proving by simplification |
|
440 (\S\ref{sec:Simplification}) and discuss important heuristics for induction |
|
441 (\S\ref{sec:InductionHeuristics}). They can be skipped by readers less |
|
442 interested in proofs. They are followed by a case study, a compiler for |
|
443 expressions (\S\ref{sec:ExprCompiler}). Advanced datatypes, including those |
|
444 involving function spaces, are covered in \S\ref{sec:advanced-datatypes}, |
|
445 which closes with another case study, search trees (``tries''). Finally we |
|
446 introduce a very general form of recursive function definition which goes |
|
447 well beyond what \isacommand{primrec} allows (\S\ref{sec:recdef}). |
|
448 |
|
449 |
|
450 \section{Simplification} |
|
451 \label{sec:Simplification} |
|
452 \index{simplification|(} |
|
453 |
|
454 So far we have proved our theorems by \isa{auto}, which ``simplifies'' all |
|
455 subgoals. In fact, \isa{auto} can do much more than that, except that it |
|
456 did not need to so far. However, when you go beyond toy examples, you need to |
|
457 understand the ingredients of \isa{auto}. This section covers the method |
|
458 that \isa{auto} always applies first, namely simplification. |
|
459 |
|
460 Simplification is one of the central theorem proving tools in Isabelle and |
|
461 many other systems. The tool itself is called the \bfindex{simplifier}. The |
|
462 purpose of this section is twofold: to introduce the many features of the |
|
463 simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the |
|
464 simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should |
|
465 read \S\ref{sec:SimpFeatures}, and the serious student should read |
|
466 \S\ref{sec:SimpHow} as well in order to understand what happened in case |
|
467 things do not simplify as expected. |
|
468 |
|
469 |
|
470 \subsection{Using the simplifier} |
|
471 \label{sec:SimpFeatures} |
|
472 |
|
473 In its most basic form, simplification means repeated application of |
|
474 equations from left to right. For example, taking the rules for \isa{\at} |
|
475 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of |
|
476 simplification steps: |
|
477 \begin{ttbox}\makeatother |
|
478 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[] |
|
479 \end{ttbox} |
|
480 This is also known as \bfindex{term rewriting} and the equations are referred |
|
481 to as \bfindex{rewrite rules}. This is more honest than ``simplification'' |
|
482 because the terms do not necessarily become simpler in the process. |
|
483 |
|
484 \subsubsection{Simplification rules} |
|
485 \indexbold{simplification rules} |
|
486 |
|
487 To facilitate simplification, theorems can be declared to be simplification |
|
488 rules (with the help of the attribute \isa{[simp]}\index{*simp |
|
489 (attribute)}), in which case proofs by simplification make use of these |
|
490 rules by default. In addition the constructs \isacommand{datatype} and |
|
491 \isacommand{primrec} (and a few others) invisibly declare useful |
|
492 simplification rules. Explicit definitions are \emph{not} declared |
|
493 simplification rules automatically! |
|
494 |
|
495 Not merely equations but pretty much any theorem can become a simplification |
|
496 rule. The simplifier will try to make sense of it. For example, a theorem |
|
497 \isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details |
|
498 are explained in \S\ref{sec:SimpHow}. |
|
499 |
|
500 The simplification attribute of theorems can be turned on and off as follows: |
|
501 \begin{ttbox} |
|
502 theorems [simp] = \(list of theorem names\); |
|
503 theorems [simp del] = \(list of theorem names\); |
|
504 \end{ttbox} |
|
505 As a rule of thumb, rules that really simplify (like \isa{rev(rev xs) = |
|
506 xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification |
|
507 rules. Those of a more specific nature (e.g.\ distributivity laws, which |
|
508 alter the structure of terms considerably) should only be used selectively, |
|
509 i.e.\ they should not be default simplification rules. Conversely, it may |
|
510 also happen that a simplification rule needs to be disabled in certain |
|
511 proofs. Frequent changes in the simplification status of a theorem may |
|
512 indicates a badly designed theory. |
|
513 \begin{warn} |
|
514 Simplification may not terminate, for example if both $f(x) = g(x)$ and |
|
515 $g(x) = f(x)$ are simplification rules. It is the user's responsibility not |
|
516 to include simplification rules that can lead to nontermination, either on |
|
517 their own or in combination with other simplification rules. |
|
518 \end{warn} |
|
519 |
|
520 \subsubsection{The simplification method} |
|
521 \index{*simp (method)|bold} |
|
522 |
|
523 The general format of the simplification method is |
|
524 \begin{ttbox} |
|
525 simp \(list of modifiers\) |
|
526 \end{ttbox} |
|
527 where the list of \emph{modifiers} helps to fine tune the behaviour and may |
|
528 be empty. Most if not all of the proofs seen so far could have been performed |
|
529 with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks |
|
530 only the first subgoal and may thus need to be repeated. |
|
531 Note that \isa{simp} fails if nothing changes. |
|
532 |
|
533 \subsubsection{Adding and deleting simplification rules} |
|
534 |
|
535 If a certain theorem is merely needed in a few proofs by simplification, |
|
536 we do not need to make it a global simplification rule. Instead we can modify |
|
537 the set of simplification rules used in a simplification step by adding rules |
|
538 to it and/or deleting rules from it. The two modifiers for this are |
|
539 \begin{ttbox} |
|
540 add: \(list of theorem names\) |
|
541 del: \(list of theorem names\) |
|
542 \end{ttbox} |
|
543 In case you want to use only a specific list of theorems and ignore all |
|
544 others: |
|
545 \begin{ttbox} |
|
546 only: \(list of theorem names\) |
|
547 \end{ttbox} |
|
548 |
|
549 |
|
550 \subsubsection{Assumptions} |
|
551 \index{simplification!with/of assumptions} |
|
552 |
|
553 {\makeatother\input{Misc/document/asm_simp.tex}} |
|
554 |
|
555 \subsubsection{Rewriting with definitions} |
|
556 \index{simplification!with definitions} |
|
557 |
|
558 \input{Misc/document/def_rewr.tex} |
|
559 |
|
560 \begin{warn} |
|
561 If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand |
|
562 occurrences of $f$ with at least two arguments. Thus it is safer to define |
|
563 $f$~\isasymequiv~\isasymlambda$x\,y.\;t$. |
|
564 \end{warn} |
|
565 |
|
566 \subsubsection{Simplifying let-expressions} |
|
567 \index{simplification!of let-expressions} |
|
568 |
|
569 Proving a goal containing \isaindex{let}-expressions almost invariably |
|
570 requires the \isa{let}-con\-structs to be expanded at some point. Since |
|
571 \isa{let}-\isa{in} is just syntactic sugar for a defined constant (called |
|
572 \isa{Let}), expanding \isa{let}-constructs means rewriting with |
|
573 \isa{Let_def}: |
|
574 |
|
575 {\makeatother\input{Misc/document/let_rewr.tex}} |
|
576 |
|
577 \subsubsection{Conditional equations} |
|
578 |
|
579 \input{Misc/document/cond_rewr.tex} |
|
580 |
|
581 |
|
582 \subsubsection{Automatic case splits} |
|
583 \indexbold{case splits} |
|
584 \index{*split|(} |
|
585 |
|
586 {\makeatother\input{Misc/document/case_splits.tex}} |
|
587 |
|
588 \index{*split|)} |
|
589 |
|
590 \begin{warn} |
|
591 The simplifier merely simplifies the condition of an \isa{if} but not the |
|
592 \isa{then} or \isa{else} parts. The latter are simplified only after the |
|
593 condition reduces to \isa{True} or \isa{False}, or after splitting. The |
|
594 same is true for \isaindex{case}-expressions: only the selector is |
|
595 simplified at first, until either the expression reduces to one of the |
|
596 cases or it is split. |
|
597 \end{warn} |
|
598 |
|
599 \subsubsection{Arithmetic} |
|
600 \index{arithmetic} |
|
601 |
|
602 The simplifier routinely solves a small class of linear arithmetic formulae |
|
603 (over types \isa{nat} and \isa{int}): it only takes into account |
|
604 assumptions and conclusions that are (possibly negated) (in)equalities |
|
605 (\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus |
|
606 |
|
607 \input{Misc/document/arith1.tex}% |
|
608 is proved by simplification, whereas the only slightly more complex |
|
609 |
|
610 \input{Misc/document/arith4.tex}% |
|
611 is not proved by simplification and requires \isa{arith}. |
|
612 |
|
613 \subsubsection{Permutative rewrite rules} |
|
614 |
|
615 A rewrite rule is {\bf permutative} if the left-hand side and right-hand side |
|
616 are the same up to renaming of variables. The most common permutative rule |
|
617 is commutativity: $x+y = y+x$. Another example is $(x-y)-z = (x-z)-y$. Such |
|
618 rules are problematic because once they apply, they can be used forever. |
|
619 The simplifier is aware of this danger and treats permutative rules |
|
620 separately. For details see~\cite{isabelle-ref}. |
|
621 |
|
622 \subsubsection{Tracing} |
|
623 \indexbold{tracing the simplifier} |
|
624 |
|
625 {\makeatother\input{Misc/document/trace_simp.tex}} |
|
626 |
|
627 \subsection{How it works} |
|
628 \label{sec:SimpHow} |
|
629 |
|
630 \subsubsection{Higher-order patterns} |
|
631 |
|
632 \subsubsection{Local assumptions} |
|
633 |
|
634 \subsubsection{The preprocessor} |
|
635 |
|
636 \index{simplification|)} |
|
637 |
|
638 \section{Induction heuristics} |
|
639 \label{sec:InductionHeuristics} |
|
640 |
|
641 The purpose of this section is to illustrate some simple heuristics for |
|
642 inductive proofs. The first one we have already mentioned in our initial |
|
643 example: |
|
644 \begin{quote} |
|
645 {\em 1. Theorems about recursive functions are proved by induction.} |
|
646 \end{quote} |
|
647 In case the function has more than one argument |
|
648 \begin{quote} |
|
649 {\em 2. Do induction on argument number $i$ if the function is defined by |
|
650 recursion in argument number $i$.} |
|
651 \end{quote} |
|
652 When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @ |
|
653 zs)}} in \S\ref{sec:intro-proof} we find (a) \isa{\at} is recursive in |
|
654 the first argument, (b) \isa{xs} occurs only as the first argument of |
|
655 \isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as |
|
656 the second argument of \isa{\at}. Hence it is natural to perform induction |
|
657 on \isa{xs}. |
|
658 |
|
659 The key heuristic, and the main point of this section, is to |
|
660 generalize the goal before induction. The reason is simple: if the goal is |
|
661 too specific, the induction hypothesis is too weak to allow the induction |
|
662 step to go through. Let us now illustrate the idea with an example. |
|
663 |
|
664 {\makeatother\input{Misc/document/Itrev.tex}} |
|
665 |
|
666 A final point worth mentioning is the orientation of the equation we just |
|
667 proved: the more complex notion (\isa{itrev}) is on the left-hand |
|
668 side, the simpler \isa{rev} on the right-hand side. This constitutes |
|
669 another, albeit weak heuristic that is not restricted to induction: |
|
670 \begin{quote} |
|
671 {\em 5. The right-hand side of an equation should (in some sense) be |
|
672 simpler than the left-hand side.} |
|
673 \end{quote} |
|
674 The heuristic is tricky to apply because it is not obvious that |
|
675 \isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what |
|
676 happens if you try to prove the symmetric equation! |
|
677 |
|
678 |
|
679 \section{Case study: compiling expressions} |
|
680 \label{sec:ExprCompiler} |
|
681 |
|
682 {\makeatother\input{CodeGen/document/CodeGen.tex}} |
|
683 |
|
684 |
|
685 \section{Advanced datatypes} |
|
686 \label{sec:advanced-datatypes} |
|
687 \index{*datatype|(} |
|
688 \index{*primrec|(} |
|
689 %|) |
|
690 |
|
691 This section presents advanced forms of \isacommand{datatype}s. |
|
692 |
|
693 \subsection{Mutual recursion} |
|
694 \label{sec:datatype-mut-rec} |
|
695 |
|
696 \input{Datatype/document/ABexpr.tex} |
|
697 |
|
698 \subsection{Nested recursion} |
|
699 |
|
700 \input{Datatype/document/Nested.tex} |
|
701 |
|
702 |
|
703 \subsection{The limits of nested recursion} |
|
704 |
|
705 How far can we push nested recursion? By the unfolding argument above, we can |
|
706 reduce nested to mutual recursion provided the nested recursion only involves |
|
707 previously defined datatypes. This does not include functions: |
|
708 \begin{ttbox} |
|
709 datatype t = C (t => bool) |
|
710 \end{ttbox} |
|
711 is a real can of worms: in HOL it must be ruled out because it requires a type |
|
712 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the |
|
713 same cardinality---an impossibility. For the same reason it is not possible |
|
714 to allow recursion involving the type \isa{set}, which is isomorphic to |
|
715 \isa{t \isasymFun\ bool}. |
|
716 |
|
717 Fortunately, a limited form of recursion |
|
718 involving function spaces is permitted: the recursive type may occur on the |
|
719 right of a function arrow, but never on the left. Hence the above can of worms |
|
720 is ruled out but the following example of a potentially infinitely branching tree is |
|
721 accepted: |
|
722 |
|
723 \input{Datatype/document/Fundata.tex} |
|
724 \bigskip |
|
725 |
|
726 If you need nested recursion on the left of a function arrow, |
|
727 there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like |
|
728 \begin{ttbox} |
|
729 datatype lam = C (lam -> lam) |
|
730 \end{ttbox} |
|
731 do indeed make sense (note the intentionally different arrow \isa{->}!). |
|
732 There is even a version of LCF on top of HOL, called |
|
733 HOLCF~\cite{MuellerNvOS99}. |
|
734 |
|
735 \index{*primrec|)} |
|
736 \index{*datatype|)} |
|
737 |
|
738 \subsection{Case study: Tries} |
|
739 |
|
740 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast |
|
741 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a |
|
742 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and |
|
743 ``cat''. When searching a string in a trie, the letters of the string are |
|
744 examined sequentially. Each letter determines which subtrie to search next. |
|
745 In this case study we model tries as a datatype, define a lookup and an |
|
746 update function, and prove that they behave as expected. |
|
747 |
|
748 \begin{figure}[htbp] |
|
749 \begin{center} |
|
750 \unitlength1mm |
|
751 \begin{picture}(60,30) |
|
752 \put( 5, 0){\makebox(0,0)[b]{l}} |
|
753 \put(25, 0){\makebox(0,0)[b]{e}} |
|
754 \put(35, 0){\makebox(0,0)[b]{n}} |
|
755 \put(45, 0){\makebox(0,0)[b]{r}} |
|
756 \put(55, 0){\makebox(0,0)[b]{t}} |
|
757 % |
|
758 \put( 5, 9){\line(0,-1){5}} |
|
759 \put(25, 9){\line(0,-1){5}} |
|
760 \put(44, 9){\line(-3,-2){9}} |
|
761 \put(45, 9){\line(0,-1){5}} |
|
762 \put(46, 9){\line(3,-2){9}} |
|
763 % |
|
764 \put( 5,10){\makebox(0,0)[b]{l}} |
|
765 \put(15,10){\makebox(0,0)[b]{n}} |
|
766 \put(25,10){\makebox(0,0)[b]{p}} |
|
767 \put(45,10){\makebox(0,0)[b]{a}} |
|
768 % |
|
769 \put(14,19){\line(-3,-2){9}} |
|
770 \put(15,19){\line(0,-1){5}} |
|
771 \put(16,19){\line(3,-2){9}} |
|
772 \put(45,19){\line(0,-1){5}} |
|
773 % |
|
774 \put(15,20){\makebox(0,0)[b]{a}} |
|
775 \put(45,20){\makebox(0,0)[b]{c}} |
|
776 % |
|
777 \put(30,30){\line(-3,-2){13}} |
|
778 \put(30,30){\line(3,-2){13}} |
|
779 \end{picture} |
|
780 \end{center} |
|
781 \caption{A sample trie} |
|
782 \label{fig:trie} |
|
783 \end{figure} |
|
784 |
|
785 Proper tries associate some value with each string. Since the |
|
786 information is stored only in the final node associated with the string, many |
|
787 nodes do not carry any value. This distinction is captured by the |
|
788 following predefined datatype (from theory \texttt{Option}, which is a parent |
|
789 of \texttt{Main}): |
|
790 \input{Trie/document/Option2.tex} |
|
791 \indexbold{*option}\indexbold{*None}\indexbold{*Some} |
|
792 |
|
793 \input{Trie/document/Trie.tex} |
|
794 |
|
795 \begin{exercise} |
|
796 Write an improved version of \isa{update} that does not suffer from the |
|
797 space leak in the version above. Prove the main theorem for your improved |
|
798 \isa{update}. |
|
799 \end{exercise} |
|
800 |
|
801 \begin{exercise} |
|
802 Write a function to \emph{delete} entries from a trie. An easy solution is |
|
803 a clever modification of \isa{update} which allows both insertion and |
|
804 deletion with a single function. Prove (a modified version of) the main |
|
805 theorem above. Optimize you function such that it shrinks tries after |
|
806 deletion, if possible. |
|
807 \end{exercise} |
|
808 |
|
809 \section{Total recursive functions} |
|
810 \label{sec:recdef} |
|
811 \index{*recdef|(} |
|
812 |
|
813 Although many total functions have a natural primitive recursive definition, |
|
814 this is not always the case. Arbitrary total recursive functions can be |
|
815 defined by means of \isacommand{recdef}: you can use full pattern-matching, |
|
816 recursion need not involve datatypes, and termination is proved by showing |
|
817 that the arguments of all recursive calls are smaller in a suitable (user |
|
818 supplied) sense. |
|
819 |
|
820 \subsection{Defining recursive functions} |
|
821 |
|
822 \input{Recdef/document/examples.tex} |
|
823 |
|
824 \subsection{Proving termination} |
|
825 |
|
826 \input{Recdef/document/termination.tex} |
|
827 |
|
828 \subsection{Simplification with recdef} |
|
829 |
|
830 \input{Recdef/document/simplification.tex} |
|
831 |
|
832 |
|
833 \subsection{Induction} |
|
834 \index{induction!recursion|(} |
|
835 \index{recursion induction|(} |
|
836 |
|
837 \input{Recdef/document/Induction.tex} |
|
838 |
|
839 \index{induction!recursion|)} |
|
840 \index{recursion induction|)} |
|
841 \index{*recdef|)} |