
1 \chapter{Functional Programming in HOL} 

2 

3 Although on the surface this chapter is mainly concerned with how to write 

4 functional programs in HOL and how to verify them, most of the 

5 constructs and proof procedures introduced are general purpose and recur in 

6 any specification or verification task. 

7 

8 The dedicated functional programmer should be warned: HOL offers only {\em 

9 total functional programming}  all functions in HOL must be total; lazy 

10 data structures are not directly available. On the positive side, functions 

11 in HOL need not be computable: HOL is a specification language that goes well 

12 beyond what can be expressed as a program. However, for the time being we 

13 concentrate on the computable. 

14 

15 \section{An introductory theory} 

16 \label{sec:introtheory} 

17 

18 Functional programming needs datatypes and functions. Both of them can be 

19 defined in a theory with a syntax reminiscent of languages like ML or 

20 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}. 

21 We will now examine it line by line. 

22 

23 \begin{figure}[htbp] 

24 \begin{ttbox}\makeatother 

25 \input{ToyList2/ToyList1}\end{ttbox} 

26 \caption{A theory of lists} 

27 \label{fig:ToyList} 

28 \end{figure} 

29 

30 {\makeatother\input{ToyList/document/ToyList.tex}} 

31 

32 The complete proof script is shown in figure~\ref{fig:ToyListproofs}. The 

33 concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyListproofs} 

34 constitutes the complete theory \texttt{ToyList} and should reside in file 

35 \texttt{ToyList.thy}. It is good practice to present all declarations and 

36 definitions at the beginning of a theory to facilitate browsing. 

37 

38 \begin{figure}[htbp] 

39 \begin{ttbox}\makeatother 

40 \input{ToyList2/ToyList2}\end{ttbox} 

41 \caption{Proofs about lists} 

42 \label{fig:ToyListproofs} 

43 \end{figure} 

44 

45 \subsubsection*{Review} 

46 

47 This is the end of our toy proof. It should have familiarized you with 

48 \begin{itemize} 

49 \item the standard theorem proving procedure: 

50 state a goal (lemma or theorem); proceed with proof until a separate lemma is 

51 required; prove that lemma; come back to the original goal. 

52 \item a specific procedure that works well for functional programs: 

53 induction followed by allout simplification via \isa{auto}. 

54 \item a basic repertoire of proof commands. 

55 \end{itemize} 

56 

57 

58 \section{Some helpful commands} 

59 \label{sec:commandsandhints} 

60 

61 This section discusses a few basic commands for manipulating the proof state 

62 and can be skipped by casual readers. 

63 

64 There are two kinds of commands used during a proof: the actual proof 

65 commands and auxiliary commands for examining the proof state and controlling 

66 the display. Simple proof commands are of the form 

67 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a 

68 synonym for ``theorem proving function''. Typical examples are 

69 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout 

70 the tutorial. Unless stated otherwise you may assume that a method attacks 

71 merely the first subgoal. An exception is \isa{auto} which tries to solve all 

72 subgoals. 

73 

74 The most useful auxiliary commands are: 

75 \begin{description} 

76 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the 

77 last command; \isacommand{undo} can be undone by 

78 \isacommand{redo}\indexbold{*redo}. Both are only needed at the shell 

79 level and should not occur in the final theory. 

80 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays 

81 the current proof state, for example when it has disappeared off the 

82 screen. 

83 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to 

84 print only the first $n$ subgoals from now on and redisplays the current 

85 proof state. This is helpful when there are many subgoals. 

86 \item[Modifying the order of subgoals:] 

87 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and 

88 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front. 

89 \item[Printing theorems:] 

90 \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$ 

91 prints the named theorems. 

92 \item[Displaying types:] We have already mentioned the flag 

93 \ttindex{show_types} above. It can also be useful for detecting typos in 

94 formulae early on. For example, if \texttt{show_types} is set and the goal 

95 \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output 

96 \par\noindent 

97 \begin{isabelle}% 

98 Variables:\isanewline 

99 ~~xs~::~'a~list 

100 \end{isabelle}% 

101 \par\noindent 

102 which tells us that Isabelle has correctly inferred that 

103 \isa{xs} is a variable of list type. On the other hand, had we 

104 made a typo as in \isa{rev(re xs) = xs}, the response 

105 \par\noindent 

106 \begin{isabelle}% 

107 Variables:\isanewline 

108 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline 

109 ~~xs~::~'a~list% 

110 \end{isabelle}% 

111 \par\noindent 

112 would have alerted us because of the unexpected variable \isa{re}. 

113 \item[Reading terms and types:] \isacommand{term}\indexbold{*term} 

114 \textit{string} reads, typechecks and prints the given string as a term in 

115 the current context; the inferred type is output as well. 

116 \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given 

117 string as a type in the current context. 

118 \item[(Re)loading theories:] When you start your interaction you must open a 

119 named theory with the line 

120 \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:}. Isabelle automatically 

121 loads all the required parent theories from their respective files (which 

122 may take a moment, unless the theories are already loaded and the files 

123 have not been modified). The only time when you need to load a theory by 

124 hand is when you simply want to check if it loads successfully without 

125 wanting to make use of the theory itself. This you can do by typing 

126 \isacommand{use\_thy}\indexbold{*use_thy}~\texttt{"T"}. 

127 

128 If you suddenly discover that you need to modify a parent theory of your 

129 current theory you must first abandon your current theory (at the shell 

130 level type \isacommand{kill}\indexbold{*kill}). After the parent theory has 

131 been modified you go back to your original theory. When its first line 

132 \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the 

133 modified parent is reloaded automatically. 

134 \end{description} 

135 Further commands are found in the Isabelle/Isar Reference Manual. 

136 

137 

138 \section{Datatypes} 

139 \label{sec:datatype} 

140 

141 Inductive datatypes are part of almost every nontrivial application of HOL. 

142 First we take another look at a very important example, the datatype of 

143 lists, before we turn to datatypes in general. The section closes with a 

144 case study. 

145 

146 

147 \subsection{Lists} 

148 \indexbold{*list} 

149 

150 Lists are one of the essential datatypes in computing. Readers of this 

151 tutorial and users of HOL need to be familiar with their basic operations. 

152 Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory 

153 \texttt{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}. 

154 The latter contains many further operations. For example, the functions 

155 \isaindexbold{hd} (`head') and \isaindexbold{tl} (`tail') return the first 

156 element and the remainder of a list. (However, patternmatching is usually 

157 preferable to \isa{hd} and \isa{tl}.) Theory \texttt{List} also contains 

158 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates 

159 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we 

160 always use HOL's predefined lists. 

161 

162 

163 \subsection{The general format} 

164 \label{sec:generaldatatype} 

165 

166 The general HOL \isacommand{datatype} definition is of the form 

167 \[ 

168 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~ 

169 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~ 

170 C@m~\tau@{m1}~\dots~\tau@{mk@m} 

171 \] 

172 where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct 

173 constructor names and $\tau@{ij}$ are types; it is customary to capitalize 

174 the first letter in constructor names. There are a number of 

175 restrictions (such as that the type should not be empty) detailed 

176 elsewhere~\cite{isabelleHOL}. Isabelle notifies you if you violate them. 

177 

178 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and 

179 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically 

180 during proofs by simplification. The same is true for the equations in 

181 primitive recursive function definitions. 

182 

183 \subsection{Primitive recursion} 

184 

185 Functions on datatypes are usually defined by recursion. In fact, most of the 

186 time they are defined by what is called \bfindex{primitive recursion}. 

187 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of 

188 equations 

189 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \] 

190 such that $C$ is a constructor of the datatype $t$ and all recursive calls of 

191 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus 

192 Isabelle immediately sees that $f$ terminates because one (fixed!) argument 

193 becomes smaller with every recursive call. There must be exactly one equation 

194 for each constructor. Their order is immaterial. 

195 A more general method for defining total recursive functions is explained in 

196 \S\ref{sec:recdef}. 

197 

198 \begin{exercise} 

199 \input{Misc/document/Tree.tex}% 

200 \end{exercise} 

201 

202 \subsection{Case expressions} 

203 \label{sec:caseexpressions} 

204 

205 HOL also features \isaindexbold{case}expressions for analyzing 

206 elements of a datatype. For example, 

207 % case xs of [] => 0  y#ys => y 

208 \begin{isabellepar}% 

209 ~~~case~xs~of~[]~{\isasymRightarrow}~0~~y~\#~ys~{\isasymRightarrow}~y 

210 \end{isabellepar}% 

211 evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if 

212 \isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of 

213 the same type, it follows that \isa{y::nat} and hence 

214 \isa{xs::(nat)list}.) 

215 

216 In general, if $e$ is a term of the datatype $t$ defined in 

217 \S\ref{sec:generaldatatype} above, the corresponding 

218 \isa{case}expression analyzing $e$ is 

219 \[ 

220 \begin{array}{rrcl} 

221 \isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\ 

222 \vdots \\ 

223 \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m 

224 \end{array} 

225 \] 

226 

227 \begin{warn} 

228 {\em All} constructors must be present, their order is fixed, and nested 

229 patterns are not supported. Violating these restrictions results in strange 

230 error messages. 

231 \end{warn} 

232 \noindent 

233 Nested patterns can be simulated by nested \isa{case}expressions: instead 

234 of 

235 % case xs of [] => 0  [x] => x  x#(y#zs) => y 

236 \begin{isabellepar}% 

237 ~~~case~xs~of~[]~{\isasymRightarrow}~0~~[x]~{\isasymRightarrow}~x~~x~\#~y~\#~zs~{\isasymRightarrow}~y 

238 \end{isabellepar}% 

239 write 

240 % term(latex xsymbols symbols)"case xs of [] => 0  x#ys => (case ys of [] => x  y#zs => y)"; 

241 \begin{isabellepar}% 

242 ~~~case~xs~of~[]~{\isasymRightarrow}~0~~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~~y~\#~zs~{\isasymRightarrow}~y)% 

243 \end{isabellepar}% 

244 Note that \isa{case}expressions should be enclosed in parentheses to 

245 indicate their scope. 

246 

247 \subsection{Structural induction and case distinction} 

248 \indexbold{structural induction} 

249 \indexbold{induction!structural} 

250 \indexbold{case distinction} 

251 

252 Almost all the basic laws about a datatype are applied automatically during 

253 simplification. Only induction is invoked by hand via \isaindex{induct_tac}, 

254 which works for any datatype. In some cases, induction is overkill and a case 

255 distinction over all constructors of the datatype suffices. This is performed 

256 by \isaindexbold{case_tac}. A trivial example: 

257 

258 \input{Misc/document/cases.tex}% 

259 

260 Note that we do not need to give a lemma a name if we do not intend to refer 

261 to it explicitly in the future. 

262 

263 \begin{warn} 

264 Induction is only allowed on free (or \isasymAndbound) variables that 

265 should not occur among the assumptions of the subgoal. Case distinction 

266 (\isa{case_tac}) works for arbitrary terms, which need to be 

267 quoted if they are nonatomic. 

268 \end{warn} 

269 

270 

271 \subsection{Case study: boolean expressions} 

272 \label{sec:boolex} 

273 

274 The aim of this case study is twofold: it shows how to model boolean 

275 expressions and some algorithms for manipulating them, and it demonstrates 

276 the constructs introduced above. 

277 

278 \input{Ifexpr/document/Ifexpr.tex} 

279 

280 How does one come up with the required lemmas? Try to prove the main theorems 

281 without them and study carefully what \isa{auto} leaves unproved. This has 

282 to provide the clue. The necessity of universal quantification 

283 (\isa{\isasymforall{}t e}) in the two lemmas is explained in 

284 \S\ref{sec:InductionHeuristics} 

285 

286 \begin{exercise} 

287 We strengthen the definition of a \isa{normal} Ifexpression as follows: 

288 the first argument of all \isa{IF}s must be a variable. Adapt the above 

289 development to this changed requirement. (Hint: you may need to formulate 

290 some of the goals as implications (\isasymimp) rather than equalities 

291 (\isa{=}).) 

292 \end{exercise} 

293 

294 \section{Some basic types} 

295 

296 

297 \subsection{Natural numbers} 

298 \index{arithmetic(} 

299 

300 \input{Misc/document/fakenat.tex} 

301 \input{Misc/document/natsum.tex} 

302 

303 The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun}, 

304 \ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun}, 

305 \isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and 

306 \isaindexbold{max} are predefined, as are the relations 

307 \indexboldpos{\isasymle}{$HOL2arithrel} and 

308 \ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation 

309 \isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although 

310 Isabelle does not prove this completely automatically. Note that \isa{1} and 

311 \isa{2} are available as abbreviations for the corresponding 

312 \isa{Suc}expressions. If you need the full set of numerals, 

313 see~\S\ref{natnumerals}. 

314 

315 \begin{warn} 

316 The operations \ttindexboldpos{+}{$HOL2arithfun}, 

317 \ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun}, 

318 \isaindexbold{min}, \isaindexbold{max}, 

319 \indexboldpos{\isasymle}{$HOL2arithrel} and 

320 \ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available 

321 not just for natural numbers but at other types as well (see 

322 \S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x}, 

323 there is nothing to indicate that you are talking about natural numbers. 

324 Hence Isabelle can only infer that \isa{x} and \isa{y} are of some 

325 arbitrary type where \isa{+} is declared. As a consequence, you will be 

326 unable to prove the goal (although it may take you some time to realize 

327 what has happened if \texttt{show_types} is not set). In this particular 

328 example, you need to include an explicit type constraint, for example 

329 \isa{x+y = y+(x::nat)}. If there is enough contextual information this may 

330 not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}. 

331 \end{warn} 

332 

333 Simple arithmetic goals are proved automatically by both \isa{auto} 

334 and the simplification methods introduced in \S\ref{sec:Simplification}. For 

335 example, 

336 

337 \input{Misc/document/arith1.tex}% 

338 is proved automatically. The main restriction is that only addition is taken 

339 into account; other arithmetic operations and quantified formulae are ignored. 

340 

341 For more complex goals, there is the special method 

342 \isaindexbold{arith} (which attacks the first subgoal). It proves 

343 arithmetic goals involving the usual logical connectives (\isasymnot, 

344 \isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and 

345 the operations \isa{+}, \isa{}, \isa{min} and \isa{max}. For example, 

346 

347 \input{Misc/document/arith2.tex}% 

348 succeeds because \isa{k*k} can be treated as atomic. 

349 In contrast, 

350 

351 \input{Misc/document/arith3.tex}% 

352 is not even proved by \isa{arith} because the proof relies essentially 

353 on properties of multiplication. 

354 

355 \begin{warn} 

356 The running time of \isa{arith} is exponential in the number of occurrences 

357 of \ttindexboldpos{}{$HOL2arithfun}, \isaindexbold{min} and 

358 \isaindexbold{max} because they are first eliminated by case distinctions. 

359 

360 \isa{arith} is incomplete even for the restricted class of formulae 

361 described above (known as ``linear arithmetic''). If divisibility plays a 

362 role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$. 

363 Fortunately, such examples are rare in practice. 

364 \end{warn} 

365 

366 \index{arithmetic)} 

367 

368 

369 \subsection{Products} 

370 

371 HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ * 

372 $\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair 

373 are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) = $x$} and 

374 \isa{snd($x$,$y$) = $y$}. Tuples are simulated by pairs nested to the right: 

375 \isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and 

376 \isa{$\tau@1$ * $\tau@2$ * $\tau@3$} for \isa{$\tau@1$ * ($\tau@2$ * 

377 $\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}. 

378 

379 It is possible to use (nested) tuples as patterns in abstractions, for 

380 example \isa{\isasymlambda(x,y,z).x+y+z} and 

381 \isa{\isasymlambda((x,y),z).x+y+z}. 

382 In addition to explicit $\lambda$abstractions, tuple patterns can be used in 

383 most variable binding constructs. Typical examples are 

384 

385 \input{Misc/document/pairs.tex}% 

386 Further important examples are quantifiers and sets (see~\S\ref{quantpats}). 

387 

388 %FIXME move stuff below into section on proofs about products? 

389 \begin{warn} 

390 Abstraction over pairs and tuples is merely a convenient shorthand for a 

391 more complex internal representation. Thus the internal and external form 

392 of a term may differ, which can affect proofs. If you want to avoid this 

393 complication, use \isa{fst} and \isa{snd}, i.e.\ write 

394 \isa{\isasymlambda{}p.~fst p + snd p} instead of 

395 \isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofsaboutproducts} for 

396 theorem proving with tuple patterns. 

397 \end{warn} 

398 

399 Finally note that products, like natural numbers, are datatypes, which means 

400 in particular that \isa{induct_tac} and \isa{case_tac} are applicable to 

401 products (see \S\ref{proofsaboutproducts}). 

402 

403 \section{Definitions} 

404 \label{sec:Definitions} 

405 

406 A definition is simply an abbreviation, i.e.\ a new name for an existing 

407 construction. In particular, definitions cannot be recursive. Isabelle offers 

408 definitions on the level of types and terms. Those on the type level are 

409 called type synonyms, those on the term level are called (constant) 

410 definitions. 

411 

412 

413 \subsection{Type synonyms} 

414 \indexbold{type synonyms} 

415 

416 Type synonyms are similar to those found in ML. Their syntax is fairly self 

417 explanatory: 

418 

419 \input{Misc/document/types.tex}% 

420 

421 Note that patternmatching is not allowed, i.e.\ each definition must be of 

422 the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$. 

423 

424 Section~\S\ref{sec:Simplification} explains how definitions are used 

425 in proofs. 

426 

427 \begin{warn}% 

428 A common mistake when writing definitions is to introduce extra free 

429 variables on the righthand side as in the following fictitious definition: 

430 \input{Misc/document/prime_def.tex}% 

431 \end{warn} 

432 

433 

434 \chapter{More Functional Programming} 

435 

436 The purpose of this chapter is to deepen the reader's understanding of the 

437 concepts encountered so far and to introduce an advanced forms of datatypes 

438 and recursive functions. The first two sections give a structured 

439 presentation of theorem proving by simplification 

440 (\S\ref{sec:Simplification}) and discuss important heuristics for induction 

441 (\S\ref{sec:InductionHeuristics}). They can be skipped by readers less 

442 interested in proofs. They are followed by a case study, a compiler for 

443 expressions (\S\ref{sec:ExprCompiler}). Advanced datatypes, including those 

444 involving function spaces, are covered in \S\ref{sec:advanceddatatypes}, 

445 which closes with another case study, search trees (``tries''). Finally we 

446 introduce a very general form of recursive function definition which goes 

447 well beyond what \isacommand{primrec} allows (\S\ref{sec:recdef}). 

448 

449 

450 \section{Simplification} 

451 \label{sec:Simplification} 

452 \index{simplification(} 

453 

454 So far we have proved our theorems by \isa{auto}, which ``simplifies'' all 

455 subgoals. In fact, \isa{auto} can do much more than that, except that it 

456 did not need to so far. However, when you go beyond toy examples, you need to 

457 understand the ingredients of \isa{auto}. This section covers the method 

458 that \isa{auto} always applies first, namely simplification. 

459 

460 Simplification is one of the central theorem proving tools in Isabelle and 

461 many other systems. The tool itself is called the \bfindex{simplifier}. The 

462 purpose of this section is twofold: to introduce the many features of the 

463 simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the 

464 simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should 

465 read \S\ref{sec:SimpFeatures}, and the serious student should read 

466 \S\ref{sec:SimpHow} as well in order to understand what happened in case 

467 things do not simplify as expected. 

468 

469 

470 \subsection{Using the simplifier} 

471 \label{sec:SimpFeatures} 

472 

473 In its most basic form, simplification means repeated application of 

474 equations from left to right. For example, taking the rules for \isa{\at} 

475 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of 

476 simplification steps: 

477 \begin{ttbox}\makeatother 

478 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[] 

479 \end{ttbox} 

480 This is also known as \bfindex{term rewriting} and the equations are referred 

481 to as \bfindex{rewrite rules}. This is more honest than ``simplification'' 

482 because the terms do not necessarily become simpler in the process. 

483 

484 \subsubsection{Simplification rules} 

485 \indexbold{simplification rules} 

486 

487 To facilitate simplification, theorems can be declared to be simplification 

488 rules (with the help of the attribute \isa{[simp]}\index{*simp 

489 (attribute)}), in which case proofs by simplification make use of these 

490 rules by default. In addition the constructs \isacommand{datatype} and 

491 \isacommand{primrec} (and a few others) invisibly declare useful 

492 simplification rules. Explicit definitions are \emph{not} declared 

493 simplification rules automatically! 

494 

495 Not merely equations but pretty much any theorem can become a simplification 

496 rule. The simplifier will try to make sense of it. For example, a theorem 

497 \isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details 

498 are explained in \S\ref{sec:SimpHow}. 

499 

500 The simplification attribute of theorems can be turned on and off as follows: 

501 \begin{ttbox} 

502 theorems [simp] = \(list of theorem names\); 

503 theorems [simp del] = \(list of theorem names\); 

504 \end{ttbox} 

505 As a rule of thumb, rules that really simplify (like \isa{rev(rev xs) = 

506 xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification 

507 rules. Those of a more specific nature (e.g.\ distributivity laws, which 

508 alter the structure of terms considerably) should only be used selectively, 

509 i.e.\ they should not be default simplification rules. Conversely, it may 

510 also happen that a simplification rule needs to be disabled in certain 

511 proofs. Frequent changes in the simplification status of a theorem may 

512 indicates a badly designed theory. 

513 \begin{warn} 

514 Simplification may not terminate, for example if both $f(x) = g(x)$ and 

515 $g(x) = f(x)$ are simplification rules. It is the user's responsibility not 

516 to include simplification rules that can lead to nontermination, either on 

517 their own or in combination with other simplification rules. 

518 \end{warn} 

519 

520 \subsubsection{The simplification method} 

521 \index{*simp (method)bold} 

522 

523 The general format of the simplification method is 

524 \begin{ttbox} 

525 simp \(list of modifiers\) 

526 \end{ttbox} 

527 where the list of \emph{modifiers} helps to fine tune the behaviour and may 

528 be empty. Most if not all of the proofs seen so far could have been performed 

529 with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks 

530 only the first subgoal and may thus need to be repeated. 

531 Note that \isa{simp} fails if nothing changes. 

532 

533 \subsubsection{Adding and deleting simplification rules} 

534 

535 If a certain theorem is merely needed in a few proofs by simplification, 

536 we do not need to make it a global simplification rule. Instead we can modify 

537 the set of simplification rules used in a simplification step by adding rules 

538 to it and/or deleting rules from it. The two modifiers for this are 

539 \begin{ttbox} 

540 add: \(list of theorem names\) 

541 del: \(list of theorem names\) 

542 \end{ttbox} 

543 In case you want to use only a specific list of theorems and ignore all 

544 others: 

545 \begin{ttbox} 

546 only: \(list of theorem names\) 

547 \end{ttbox} 

548 

549 

550 \subsubsection{Assumptions} 

551 \index{simplification!with/of assumptions} 

552 

553 {\makeatother\input{Misc/document/asm_simp.tex}} 

554 

555 \subsubsection{Rewriting with definitions} 

556 \index{simplification!with definitions} 

557 

558 \input{Misc/document/def_rewr.tex} 

559 

560 \begin{warn} 

561 If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand 

562 occurrences of $f$ with at least two arguments. Thus it is safer to define 

563 $f$~\isasymequiv~\isasymlambda$x\,y.\;t$. 

564 \end{warn} 

565 

566 \subsubsection{Simplifying letexpressions} 

567 \index{simplification!of letexpressions} 

568 

569 Proving a goal containing \isaindex{let}expressions almost invariably 

570 requires the \isa{let}con\structs to be expanded at some point. Since 

571 \isa{let}\isa{in} is just syntactic sugar for a defined constant (called 

572 \isa{Let}), expanding \isa{let}constructs means rewriting with 

573 \isa{Let_def}: 

574 

575 {\makeatother\input{Misc/document/let_rewr.tex}} 

576 

577 \subsubsection{Conditional equations} 

578 

579 \input{Misc/document/cond_rewr.tex} 

580 

581 

582 \subsubsection{Automatic case splits} 

583 \indexbold{case splits} 

584 \index{*split(} 

585 

586 {\makeatother\input{Misc/document/case_splits.tex}} 

587 

588 \index{*split)} 

589 

590 \begin{warn} 

591 The simplifier merely simplifies the condition of an \isa{if} but not the 

592 \isa{then} or \isa{else} parts. The latter are simplified only after the 

593 condition reduces to \isa{True} or \isa{False}, or after splitting. The 

594 same is true for \isaindex{case}expressions: only the selector is 

595 simplified at first, until either the expression reduces to one of the 

596 cases or it is split. 

597 \end{warn} 

598 

599 \subsubsection{Arithmetic} 

600 \index{arithmetic} 

601 

602 The simplifier routinely solves a small class of linear arithmetic formulae 

603 (over types \isa{nat} and \isa{int}): it only takes into account 

604 assumptions and conclusions that are (possibly negated) (in)equalities 

605 (\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus 

606 

607 \input{Misc/document/arith1.tex}% 

608 is proved by simplification, whereas the only slightly more complex 

609 

610 \input{Misc/document/arith4.tex}% 

611 is not proved by simplification and requires \isa{arith}. 

612 

613 \subsubsection{Permutative rewrite rules} 

614 

615 A rewrite rule is {\bf permutative} if the lefthand side and righthand side 

616 are the same up to renaming of variables. The most common permutative rule 

617 is commutativity: $x+y = y+x$. Another example is $(xy)z = (xz)y$. Such 

618 rules are problematic because once they apply, they can be used forever. 

619 The simplifier is aware of this danger and treats permutative rules 

620 separately. For details see~\cite{isabelleref}. 

621 

622 \subsubsection{Tracing} 

623 \indexbold{tracing the simplifier} 

624 

625 {\makeatother\input{Misc/document/trace_simp.tex}} 

626 

627 \subsection{How it works} 

628 \label{sec:SimpHow} 

629 

630 \subsubsection{Higherorder patterns} 

631 

632 \subsubsection{Local assumptions} 

633 

634 \subsubsection{The preprocessor} 

635 

636 \index{simplification)} 

637 

638 \section{Induction heuristics} 

639 \label{sec:InductionHeuristics} 

640 

641 The purpose of this section is to illustrate some simple heuristics for 

642 inductive proofs. The first one we have already mentioned in our initial 

643 example: 

644 \begin{quote} 

645 {\em 1. Theorems about recursive functions are proved by induction.} 

646 \end{quote} 

647 In case the function has more than one argument 

648 \begin{quote} 

649 {\em 2. Do induction on argument number $i$ if the function is defined by 

650 recursion in argument number $i$.} 

651 \end{quote} 

652 When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @ 

653 zs)}} in \S\ref{sec:introproof} we find (a) \isa{\at} is recursive in 

654 the first argument, (b) \isa{xs} occurs only as the first argument of 

655 \isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as 

656 the second argument of \isa{\at}. Hence it is natural to perform induction 

657 on \isa{xs}. 

658 

659 The key heuristic, and the main point of this section, is to 

660 generalize the goal before induction. The reason is simple: if the goal is 

661 too specific, the induction hypothesis is too weak to allow the induction 

662 step to go through. Let us now illustrate the idea with an example. 

663 

664 {\makeatother\input{Misc/document/Itrev.tex}} 

665 

666 A final point worth mentioning is the orientation of the equation we just 

667 proved: the more complex notion (\isa{itrev}) is on the lefthand 

668 side, the simpler \isa{rev} on the righthand side. This constitutes 

669 another, albeit weak heuristic that is not restricted to induction: 

670 \begin{quote} 

671 {\em 5. The righthand side of an equation should (in some sense) be 

672 simpler than the lefthand side.} 

673 \end{quote} 

674 The heuristic is tricky to apply because it is not obvious that 

675 \isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what 

676 happens if you try to prove the symmetric equation! 

677 

678 

679 \section{Case study: compiling expressions} 

680 \label{sec:ExprCompiler} 

681 

682 {\makeatother\input{CodeGen/document/CodeGen.tex}} 

683 

684 

685 \section{Advanced datatypes} 

686 \label{sec:advanceddatatypes} 

687 \index{*datatype(} 

688 \index{*primrec(} 

689 %) 

690 

691 This section presents advanced forms of \isacommand{datatype}s. 

692 

693 \subsection{Mutual recursion} 

694 \label{sec:datatypemutrec} 

695 

696 \input{Datatype/document/ABexpr.tex} 

697 

698 \subsection{Nested recursion} 

699 

700 \input{Datatype/document/Nested.tex} 

701 

702 

703 \subsection{The limits of nested recursion} 

704 

705 How far can we push nested recursion? By the unfolding argument above, we can 

706 reduce nested to mutual recursion provided the nested recursion only involves 

707 previously defined datatypes. This does not include functions: 

708 \begin{ttbox} 

709 datatype t = C (t => bool) 

710 \end{ttbox} 

711 is a real can of worms: in HOL it must be ruled out because it requires a type 

712 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the 

713 same cardinalityan impossibility. For the same reason it is not possible 

714 to allow recursion involving the type \isa{set}, which is isomorphic to 

715 \isa{t \isasymFun\ bool}. 

716 

717 Fortunately, a limited form of recursion 

718 involving function spaces is permitted: the recursive type may occur on the 

719 right of a function arrow, but never on the left. Hence the above can of worms 

720 is ruled out but the following example of a potentially infinitely branching tree is 

721 accepted: 

722 

723 \input{Datatype/document/Fundata.tex} 

724 \bigskip 

725 

726 If you need nested recursion on the left of a function arrow, 

727 there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like 

728 \begin{ttbox} 

729 datatype lam = C (lam > lam) 

730 \end{ttbox} 

731 do indeed make sense (note the intentionally different arrow \isa{>}!). 

732 There is even a version of LCF on top of HOL, called 

733 HOLCF~\cite{MuellerNvOS99}. 

734 

735 \index{*primrec)} 

736 \index{*datatype)} 

737 

738 \subsection{Case study: Tries} 

739 

740 Tries are a classic search tree data structure~\cite{Knuth375} for fast 

741 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a 

742 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and 

743 ``cat''. When searching a string in a trie, the letters of the string are 

744 examined sequentially. Each letter determines which subtrie to search next. 

745 In this case study we model tries as a datatype, define a lookup and an 

746 update function, and prove that they behave as expected. 

747 

748 \begin{figure}[htbp] 

749 \begin{center} 

750 \unitlength1mm 

751 \begin{picture}(60,30) 

752 \put( 5, 0){\makebox(0,0)[b]{l}} 

753 \put(25, 0){\makebox(0,0)[b]{e}} 

754 \put(35, 0){\makebox(0,0)[b]{n}} 

755 \put(45, 0){\makebox(0,0)[b]{r}} 

756 \put(55, 0){\makebox(0,0)[b]{t}} 

757 % 

758 \put( 5, 9){\line(0,1){5}} 

759 \put(25, 9){\line(0,1){5}} 

760 \put(44, 9){\line(3,2){9}} 

761 \put(45, 9){\line(0,1){5}} 

762 \put(46, 9){\line(3,2){9}} 

763 % 

764 \put( 5,10){\makebox(0,0)[b]{l}} 

765 \put(15,10){\makebox(0,0)[b]{n}} 

766 \put(25,10){\makebox(0,0)[b]{p}} 

767 \put(45,10){\makebox(0,0)[b]{a}} 

768 % 

769 \put(14,19){\line(3,2){9}} 

770 \put(15,19){\line(0,1){5}} 

771 \put(16,19){\line(3,2){9}} 

772 \put(45,19){\line(0,1){5}} 

773 % 

774 \put(15,20){\makebox(0,0)[b]{a}} 

775 \put(45,20){\makebox(0,0)[b]{c}} 

776 % 

777 \put(30,30){\line(3,2){13}} 

778 \put(30,30){\line(3,2){13}} 

779 \end{picture} 

780 \end{center} 

781 \caption{A sample trie} 

782 \label{fig:trie} 

783 \end{figure} 

784 

785 Proper tries associate some value with each string. Since the 

786 information is stored only in the final node associated with the string, many 

787 nodes do not carry any value. This distinction is captured by the 

788 following predefined datatype (from theory \texttt{Option}, which is a parent 

789 of \texttt{Main}): 

790 \input{Trie/document/Option2.tex} 

791 \indexbold{*option}\indexbold{*None}\indexbold{*Some} 

792 

793 \input{Trie/document/Trie.tex} 

794 

795 \begin{exercise} 

796 Write an improved version of \isa{update} that does not suffer from the 

797 space leak in the version above. Prove the main theorem for your improved 

798 \isa{update}. 

799 \end{exercise} 

800 

801 \begin{exercise} 

802 Write a function to \emph{delete} entries from a trie. An easy solution is 

803 a clever modification of \isa{update} which allows both insertion and 

804 deletion with a single function. Prove (a modified version of) the main 

805 theorem above. Optimize you function such that it shrinks tries after 

806 deletion, if possible. 

807 \end{exercise} 

808 

809 \section{Total recursive functions} 

810 \label{sec:recdef} 

811 \index{*recdef(} 

812 

813 Although many total functions have a natural primitive recursive definition, 

814 this is not always the case. Arbitrary total recursive functions can be 

815 defined by means of \isacommand{recdef}: you can use full patternmatching, 

816 recursion need not involve datatypes, and termination is proved by showing 

817 that the arguments of all recursive calls are smaller in a suitable (user 

818 supplied) sense. 

819 

820 \subsection{Defining recursive functions} 

821 

822 \input{Recdef/document/examples.tex} 

823 

824 \subsection{Proving termination} 

825 

826 \input{Recdef/document/termination.tex} 

827 

828 \subsection{Simplification with recdef} 

829 

830 \input{Recdef/document/simplification.tex} 

831 

832 

833 \subsection{Induction} 

834 \index{induction!recursion(} 

835 \index{recursion induction(} 

836 

837 \input{Recdef/document/Induction.tex} 

838 

839 \index{induction!recursion)} 

840 \index{recursion induction)} 

841 \index{*recdef)} 