doc-src/TutorialI/fp.tex
 changeset 8743 3253c6046d57 child 8771 026f37a86ea7
equal inserted replaced
8742:8a5b3f58b944 8743:3253c6046d57

     1 \chapter{Functional Programming in HOL}

     2

     3 Although on the surface this chapter is mainly concerned with how to write

     4 functional programs in HOL and how to verify them, most of the

     5 constructs and proof procedures introduced are general purpose and recur in

     6 any specification or verification task.

     7

     8 The dedicated functional programmer should be warned: HOL offers only {\em

     9   total functional programming} --- all functions in HOL must be total; lazy

    10 data structures are not directly available. On the positive side, functions

    11 in HOL need not be computable: HOL is a specification language that goes well

    12 beyond what can be expressed as a program. However, for the time being we

    13 concentrate on the computable.

    14

    15 \section{An introductory theory}

    16 \label{sec:intro-theory}

    17

    18 Functional programming needs datatypes and functions. Both of them can be

    19 defined in a theory with a syntax reminiscent of languages like ML or

    20 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.

    21 We will now examine it line by line.

    22

    23 \begin{figure}[htbp]

    24 \begin{ttbox}\makeatother

    25 \input{ToyList2/ToyList1}\end{ttbox}

    26 \caption{A theory of lists}

    27 \label{fig:ToyList}

    28 \end{figure}

    29

    30 {\makeatother\input{ToyList/document/ToyList.tex}}

    31

    32 The complete proof script is shown in figure~\ref{fig:ToyList-proofs}. The

    33 concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyList-proofs}

    34 constitutes the complete theory \texttt{ToyList} and should reside in file

    35 \texttt{ToyList.thy}. It is good practice to present all declarations and

    36 definitions at the beginning of a theory to facilitate browsing.

    37

    38 \begin{figure}[htbp]

    39 \begin{ttbox}\makeatother

    40 \input{ToyList2/ToyList2}\end{ttbox}

    41 \caption{Proofs about lists}

    42 \label{fig:ToyList-proofs}

    43 \end{figure}

    44

    45 \subsubsection*{Review}

    46

    47 This is the end of our toy proof. It should have familiarized you with

    48 \begin{itemize}

    49 \item the standard theorem proving procedure:

    50 state a goal (lemma or theorem); proceed with proof until a separate lemma is

    51 required; prove that lemma; come back to the original goal.

    52 \item a specific procedure that works well for functional programs:

    53 induction followed by all-out simplification via \isa{auto}.

    54 \item a basic repertoire of proof commands.

    55 \end{itemize}

    56

    57

    58 \section{Some helpful commands}

    59 \label{sec:commands-and-hints}

    60

    61 This section discusses a few basic commands for manipulating the proof state

    62 and can be skipped by casual readers.

    63

    64 There are two kinds of commands used during a proof: the actual proof

    65 commands and auxiliary commands for examining the proof state and controlling

    66 the display. Simple proof commands are of the form

    67 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a

    68 synonym for theorem proving function''. Typical examples are

    69 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout

    70 the tutorial.  Unless stated otherwise you may assume that a method attacks

    71 merely the first subgoal. An exception is \isa{auto} which tries to solve all

    72 subgoals.

    73

    74 The most useful auxiliary commands are:

    75 \begin{description}

    76 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the

    77   last command; \isacommand{undo} can be undone by

    78   \isacommand{redo}\indexbold{*redo}.  Both are only needed at the shell

    79   level and should not occur in the final theory.

    80 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays

    81   the current proof state, for example when it has disappeared off the

    82   screen.

    83 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to

    84   print only the first $n$ subgoals from now on and redisplays the current

    85   proof state. This is helpful when there are many subgoals.

    86 \item[Modifying the order of subgoals:]

    87 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and

    88 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.

    89 \item[Printing theorems:]

    90   \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$

    91   prints the named theorems.

    92 \item[Displaying types:] We have already mentioned the flag

    93   \ttindex{show_types} above. It can also be useful for detecting typos in

    94   formulae early on. For example, if \texttt{show_types} is set and the goal

    95   \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output

    96 \par\noindent

    97 \begin{isabelle}%

    98 Variables:\isanewline

    99 ~~xs~::~'a~list

   100 \end{isabelle}%

   101 \par\noindent

   102 which tells us that Isabelle has correctly inferred that

   103 \isa{xs} is a variable of list type. On the other hand, had we

   104 made a typo as in \isa{rev(re xs) = xs}, the response

   105 \par\noindent

   106 \begin{isabelle}%

   107 Variables:\isanewline

   108 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline

   109 ~~xs~::~'a~list%

   110 \end{isabelle}%

   111 \par\noindent

   112 would have alerted us because of the unexpected variable \isa{re}.

   113 \item[Reading terms and types:] \isacommand{term}\indexbold{*term}

   114   \textit{string} reads, type-checks and prints the given string as a term in

   115   the current context; the inferred type is output as well.

   116   \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given

   117   string as a type in the current context.

   118 \item[(Re)loading theories:] When you start your interaction you must open a

   119   named theory with the line

   120   \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:}. Isabelle automatically

   121   loads all the required parent theories from their respective files (which

   122   may take a moment, unless the theories are already loaded and the files

   123   have not been modified). The only time when you need to load a theory by

   124   hand is when you simply want to check if it loads successfully without

   125   wanting to make use of the theory itself. This you can do by typing

   126   \isacommand{use\_thy}\indexbold{*use_thy}~\texttt{"T"}.

   127

   128   If you suddenly discover that you need to modify a parent theory of your

   129   current theory you must first abandon your current theory (at the shell

   130   level type \isacommand{kill}\indexbold{*kill}). After the parent theory has

   131   been modified you go back to your original theory. When its first line

   132   \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the

   133   modified parent is reloaded automatically.

   134 \end{description}

   135 Further commands are found in the Isabelle/Isar Reference Manual.

   136

   137

   138 \section{Datatypes}

   139 \label{sec:datatype}

   140

   141 Inductive datatypes are part of almost every non-trivial application of HOL.

   142 First we take another look at a very important example, the datatype of

   143 lists, before we turn to datatypes in general. The section closes with a

   144 case study.

   145

   146

   147 \subsection{Lists}

   148 \indexbold{*list}

   149

   150 Lists are one of the essential datatypes in computing. Readers of this

   151 tutorial and users of HOL need to be familiar with their basic operations.

   152 Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory

   153 \texttt{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

   154 The latter contains many further operations. For example, the functions

   155 \isaindexbold{hd} (head') and \isaindexbold{tl} (tail') return the first

   156 element and the remainder of a list. (However, pattern-matching is usually

   157 preferable to \isa{hd} and \isa{tl}.)  Theory \texttt{List} also contains

   158 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates

   159 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we

   160 always use HOL's predefined lists.

   161

   162

   163 \subsection{The general format}

   164 \label{sec:general-datatype}

   165

   166 The general HOL \isacommand{datatype} definition is of the form

   167 $    168 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~    169 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~    170 C@m~\tau@{m1}~\dots~\tau@{mk@m}    171$

   172 where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct

   173 constructor names and $\tau@{ij}$ are types; it is customary to capitalize

   174 the first letter in constructor names. There are a number of

   175 restrictions (such as that the type should not be empty) detailed

   176 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.

   177

   178 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and

   179 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically

   180 during proofs by simplification.  The same is true for the equations in

   181 primitive recursive function definitions.

   182

   183 \subsection{Primitive recursion}

   184

   185 Functions on datatypes are usually defined by recursion. In fact, most of the

   186 time they are defined by what is called \bfindex{primitive recursion}.

   187 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of

   188 equations

   189 $f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r$

   190 such that $C$ is a constructor of the datatype $t$ and all recursive calls of

   191 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus

   192 Isabelle immediately sees that $f$ terminates because one (fixed!) argument

   193 becomes smaller with every recursive call. There must be exactly one equation

   194 for each constructor.  Their order is immaterial.

   195 A more general method for defining total recursive functions is explained in

   196 \S\ref{sec:recdef}.

   197

   198 \begin{exercise}

   199 \input{Misc/document/Tree.tex}%

   200 \end{exercise}

   201

   202 \subsection{Case expressions}

   203 \label{sec:case-expressions}

   204

   205 HOL also features \isaindexbold{case}-expressions for analyzing

   206 elements of a datatype. For example,

   207 % case xs of [] => 0 | y#ys => y

   208 \begin{isabellepar}%

   209 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~y~\#~ys~{\isasymRightarrow}~y

   210 \end{isabellepar}%

   211 evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if

   212 \isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of

   213 the same type, it follows that \isa{y::nat} and hence

   214 \isa{xs::(nat)list}.)

   215

   216 In general, if $e$ is a term of the datatype $t$ defined in

   217 \S\ref{sec:general-datatype} above, the corresponding

   218 \isa{case}-expression analyzing $e$ is

   219 $    220 \begin{array}{rrcl}    221 \isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\    222 \vdots \\    223 \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m    224 \end{array}    225$

   226

   227 \begin{warn}

   228 {\em All} constructors must be present, their order is fixed, and nested

   229 patterns are not supported.  Violating these restrictions results in strange

   230 error messages.

   231 \end{warn}

   232 \noindent

   233 Nested patterns can be simulated by nested \isa{case}-expressions: instead

   234 of

   235 % case xs of [] => 0 | [x] => x | x#(y#zs) => y

   236 \begin{isabellepar}%

   237 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~[x]~{\isasymRightarrow}~x~|~x~\#~y~\#~zs~{\isasymRightarrow}~y

   238 \end{isabellepar}%

   239 write

   240 % term(latex xsymbols symbols)"case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)";

   241 \begin{isabellepar}%

   242 ~~~case~xs~of~[]~{\isasymRightarrow}~0~|~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~|~y~\#~zs~{\isasymRightarrow}~y)%

   243 \end{isabellepar}%

   244 Note that \isa{case}-expressions should be enclosed in parentheses to

   245 indicate their scope.

   246

   247 \subsection{Structural induction and case distinction}

   248 \indexbold{structural induction}

   249 \indexbold{induction!structural}

   250 \indexbold{case distinction}

   251

   252 Almost all the basic laws about a datatype are applied automatically during

   253 simplification. Only induction is invoked by hand via \isaindex{induct_tac},

   254 which works for any datatype. In some cases, induction is overkill and a case

   255 distinction over all constructors of the datatype suffices. This is performed

   256 by \isaindexbold{case_tac}. A trivial example:

   257

   258 \input{Misc/document/cases.tex}%

   259

   260 Note that we do not need to give a lemma a name if we do not intend to refer

   261 to it explicitly in the future.

   262

   263 \begin{warn}

   264   Induction is only allowed on free (or \isasymAnd-bound) variables that

   265   should not occur among the assumptions of the subgoal.  Case distinction

   266   (\isa{case_tac}) works for arbitrary terms, which need to be

   267   quoted if they are non-atomic.

   268 \end{warn}

   269

   270

   271 \subsection{Case study: boolean expressions}

   272 \label{sec:boolex}

   273

   274 The aim of this case study is twofold: it shows how to model boolean

   275 expressions and some algorithms for manipulating them, and it demonstrates

   276 the constructs introduced above.

   277

   278 \input{Ifexpr/document/Ifexpr.tex}

   279

   280 How does one come up with the required lemmas? Try to prove the main theorems

   281 without them and study carefully what \isa{auto} leaves unproved. This has

   282 to provide the clue.  The necessity of universal quantification

   283 (\isa{\isasymforall{}t e}) in the two lemmas is explained in

   284 \S\ref{sec:InductionHeuristics}

   285

   286 \begin{exercise}

   287   We strengthen the definition of a \isa{normal} If-expression as follows:

   288   the first argument of all \isa{IF}s must be a variable. Adapt the above

   289   development to this changed requirement. (Hint: you may need to formulate

   290   some of the goals as implications (\isasymimp) rather than equalities

   291   (\isa{=}).)

   292 \end{exercise}

   293

   294 \section{Some basic types}

   295

   296

   297 \subsection{Natural numbers}

   298 \index{arithmetic|(}

   299

   300 \input{Misc/document/fakenat.tex}

   301 \input{Misc/document/natsum.tex}

   302

   303 The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},    304 \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},    305 \isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and    306 \isaindexbold{max} are predefined, as are the relations    307 \indexboldpos{\isasymle}{$HOL2arithrel} and

   308 \ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation    309 \isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although    310 Isabelle does not prove this completely automatically. Note that \isa{1} and    311 \isa{2} are available as abbreviations for the corresponding    312 \isa{Suc}-expressions. If you need the full set of numerals,    313 see~\S\ref{nat-numerals}.    314     315 \begin{warn}    316 The operations \ttindexboldpos{+}{$HOL2arithfun},

   317   \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},

   318   \isaindexbold{min}, \isaindexbold{max},

   319   \indexboldpos{\isasymle}{$HOL2arithrel} and    320 \ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available

   321   not just for natural numbers but at other types as well (see

   322   \S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x},

   323   there is nothing to indicate that you are talking about natural numbers.

   324   Hence Isabelle can only infer that \isa{x} and \isa{y} are of some

   325   arbitrary type where \isa{+} is declared. As a consequence, you will be

   326   unable to prove the goal (although it may take you some time to realize

   327   what has happened if \texttt{show_types} is not set).  In this particular

   328   example, you need to include an explicit type constraint, for example

   329   \isa{x+y = y+(x::nat)}.  If there is enough contextual information this may

   330   not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}.

   331 \end{warn}

   332

   333 Simple arithmetic goals are proved automatically by both \isa{auto}

   334 and the simplification methods introduced in \S\ref{sec:Simplification}.  For

   335 example,

   336

   337 \input{Misc/document/arith1.tex}%

   338 is proved automatically. The main restriction is that only addition is taken

   339 into account; other arithmetic operations and quantified formulae are ignored.

   340

   341 For more complex goals, there is the special method

   342 \isaindexbold{arith} (which attacks the first subgoal). It proves

   343 arithmetic goals involving the usual logical connectives (\isasymnot,

   344 \isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and

   345 the operations \isa{+}, \isa{-}, \isa{min} and \isa{max}. For example,

   346

   347 \input{Misc/document/arith2.tex}%

   348 succeeds because \isa{k*k} can be treated as atomic.

   349 In contrast,

   350

   351 \input{Misc/document/arith3.tex}%

   352 is not even proved by \isa{arith} because the proof relies essentially

   353 on properties of multiplication.

   354

   355 \begin{warn}

   356   The running time of \isa{arith} is exponential in the number of occurrences

   357   of \ttindexboldpos{-}{$HOL2arithfun}, \isaindexbold{min} and    358 \isaindexbold{max} because they are first eliminated by case distinctions.    359     360 \isa{arith} is incomplete even for the restricted class of formulae    361 described above (known as linear arithmetic''). If divisibility plays a    362 role, it may fail to prove a valid formula, for example$m+m \neq n+n+1$.    363 Fortunately, such examples are rare in practice.    364 \end{warn}    365     366 \index{arithmetic|)}    367     368     369 \subsection{Products}    370     371 HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \texttt{$\tau@1$*    372$\tau@2$} provided each$a@i$is of type$\tau@i$. The components of a pair    373 are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) =$x$} and    374 \isa{snd($x$,$y$) =$y$}. Tuples are simulated by pairs nested to the right:    375 \isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and    376 \isa{$\tau@1$*$\tau@2$*$\tau@3$} for \isa{$\tau@1$* ($\tau@2$*    377$\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) =$a@2$}.    378     379 It is possible to use (nested) tuples as patterns in abstractions, for    380 example \isa{\isasymlambda(x,y,z).x+y+z} and    381 \isa{\isasymlambda((x,y),z).x+y+z}.    382 In addition to explicit$\lambda$-abstractions, tuple patterns can be used in    383 most variable binding constructs. Typical examples are    384     385 \input{Misc/document/pairs.tex}%    386 Further important examples are quantifiers and sets (see~\S\ref{quant-pats}).    387     388 %FIXME move stuff below into section on proofs about products?    389 \begin{warn}    390 Abstraction over pairs and tuples is merely a convenient shorthand for a    391 more complex internal representation. Thus the internal and external form    392 of a term may differ, which can affect proofs. If you want to avoid this    393 complication, use \isa{fst} and \isa{snd}, i.e.\ write    394 \isa{\isasymlambda{}p.~fst p + snd p} instead of    395 \isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofs-about-products} for    396 theorem proving with tuple patterns.    397 \end{warn}    398     399 Finally note that products, like natural numbers, are datatypes, which means    400 in particular that \isa{induct_tac} and \isa{case_tac} are applicable to    401 products (see \S\ref{proofs-about-products}).    402     403 \section{Definitions}    404 \label{sec:Definitions}    405     406 A definition is simply an abbreviation, i.e.\ a new name for an existing    407 construction. In particular, definitions cannot be recursive. Isabelle offers    408 definitions on the level of types and terms. Those on the type level are    409 called type synonyms, those on the term level are called (constant)    410 definitions.    411     412     413 \subsection{Type synonyms}    414 \indexbold{type synonyms}    415     416 Type synonyms are similar to those found in ML. Their syntax is fairly self    417 explanatory:    418     419 \input{Misc/document/types.tex}%    420     421 Note that pattern-matching is not allowed, i.e.\ each definition must be of    422 the form$f\,x@1\,\dots\,x@n~\isasymequiv~t$.    423     424 Section~\S\ref{sec:Simplification} explains how definitions are used    425 in proofs.    426     427 \begin{warn}%    428 A common mistake when writing definitions is to introduce extra free    429 variables on the right-hand side as in the following fictitious definition:    430 \input{Misc/document/prime_def.tex}%    431 \end{warn}    432     433     434 \chapter{More Functional Programming}    435     436 The purpose of this chapter is to deepen the reader's understanding of the    437 concepts encountered so far and to introduce an advanced forms of datatypes    438 and recursive functions. The first two sections give a structured    439 presentation of theorem proving by simplification    440 (\S\ref{sec:Simplification}) and discuss important heuristics for induction    441 (\S\ref{sec:InductionHeuristics}). They can be skipped by readers less    442 interested in proofs. They are followed by a case study, a compiler for    443 expressions (\S\ref{sec:ExprCompiler}). Advanced datatypes, including those    444 involving function spaces, are covered in \S\ref{sec:advanced-datatypes},    445 which closes with another case study, search trees (tries''). Finally we    446 introduce a very general form of recursive function definition which goes    447 well beyond what \isacommand{primrec} allows (\S\ref{sec:recdef}).    448     449     450 \section{Simplification}    451 \label{sec:Simplification}    452 \index{simplification|(}    453     454 So far we have proved our theorems by \isa{auto}, which simplifies'' all    455 subgoals. In fact, \isa{auto} can do much more than that, except that it    456 did not need to so far. However, when you go beyond toy examples, you need to    457 understand the ingredients of \isa{auto}. This section covers the method    458 that \isa{auto} always applies first, namely simplification.    459     460 Simplification is one of the central theorem proving tools in Isabelle and    461 many other systems. The tool itself is called the \bfindex{simplifier}. The    462 purpose of this section is twofold: to introduce the many features of the    463 simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the    464 simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should    465 read \S\ref{sec:SimpFeatures}, and the serious student should read    466 \S\ref{sec:SimpHow} as well in order to understand what happened in case    467 things do not simplify as expected.    468     469     470 \subsection{Using the simplifier}    471 \label{sec:SimpFeatures}    472     473 In its most basic form, simplification means repeated application of    474 equations from left to right. For example, taking the rules for \isa{\at}    475 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of    476 simplification steps:    477 \begin{ttbox}\makeatother    478 (0#1#[]) @ [] $$\leadsto$$ 0#((1#[]) @ []) $$\leadsto$$ 0#(1#([] @ [])) $$\leadsto$$ 0#1#[]    479 \end{ttbox}    480 This is also known as \bfindex{term rewriting} and the equations are referred    481 to as \bfindex{rewrite rules}. This is more honest than simplification''    482 because the terms do not necessarily become simpler in the process.    483     484 \subsubsection{Simplification rules}    485 \indexbold{simplification rules}    486     487 To facilitate simplification, theorems can be declared to be simplification    488 rules (with the help of the attribute \isa{[simp]}\index{*simp    489 (attribute)}), in which case proofs by simplification make use of these    490 rules by default. In addition the constructs \isacommand{datatype} and    491 \isacommand{primrec} (and a few others) invisibly declare useful    492 simplification rules. Explicit definitions are \emph{not} declared    493 simplification rules automatically!    494     495 Not merely equations but pretty much any theorem can become a simplification    496 rule. The simplifier will try to make sense of it. For example, a theorem    497 \isasymnot$P$is automatically turned into \isa{$P$= False}. The details    498 are explained in \S\ref{sec:SimpHow}.    499     500 The simplification attribute of theorems can be turned on and off as follows:    501 \begin{ttbox}    502 theorems [simp] = $$list of theorem names$$;    503 theorems [simp del] = $$list of theorem names$$;    504 \end{ttbox}    505 As a rule of thumb, rules that really simplify (like \isa{rev(rev xs) =    506 xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification    507 rules. Those of a more specific nature (e.g.\ distributivity laws, which    508 alter the structure of terms considerably) should only be used selectively,    509 i.e.\ they should not be default simplification rules. Conversely, it may    510 also happen that a simplification rule needs to be disabled in certain    511 proofs. Frequent changes in the simplification status of a theorem may    512 indicates a badly designed theory.    513 \begin{warn}    514 Simplification may not terminate, for example if both$f(x) = g(x)$and    515$g(x) = f(x)$are simplification rules. It is the user's responsibility not    516 to include simplification rules that can lead to nontermination, either on    517 their own or in combination with other simplification rules.    518 \end{warn}    519     520 \subsubsection{The simplification method}    521 \index{*simp (method)|bold}    522     523 The general format of the simplification method is    524 \begin{ttbox}    525 simp $$list of modifiers$$    526 \end{ttbox}    527 where the list of \emph{modifiers} helps to fine tune the behaviour and may    528 be empty. Most if not all of the proofs seen so far could have been performed    529 with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks    530 only the first subgoal and may thus need to be repeated.    531 Note that \isa{simp} fails if nothing changes.    532     533 \subsubsection{Adding and deleting simplification rules}    534     535 If a certain theorem is merely needed in a few proofs by simplification,    536 we do not need to make it a global simplification rule. Instead we can modify    537 the set of simplification rules used in a simplification step by adding rules    538 to it and/or deleting rules from it. The two modifiers for this are    539 \begin{ttbox}    540 add: $$list of theorem names$$    541 del: $$list of theorem names$$    542 \end{ttbox}    543 In case you want to use only a specific list of theorems and ignore all    544 others:    545 \begin{ttbox}    546 only: $$list of theorem names$$    547 \end{ttbox}    548     549     550 \subsubsection{Assumptions}    551 \index{simplification!with/of assumptions}    552     553 {\makeatother\input{Misc/document/asm_simp.tex}}    554     555 \subsubsection{Rewriting with definitions}    556 \index{simplification!with definitions}    557     558 \input{Misc/document/def_rewr.tex}    559     560 \begin{warn}    561 If you have defined$f\,x\,y~\isasymequiv~t$then you can only expand    562 occurrences of$f$with at least two arguments. Thus it is safer to define    563$f$~\isasymequiv~\isasymlambda$x\,y.\;t$.    564 \end{warn}    565     566 \subsubsection{Simplifying let-expressions}    567 \index{simplification!of let-expressions}    568     569 Proving a goal containing \isaindex{let}-expressions almost invariably    570 requires the \isa{let}-con\-structs to be expanded at some point. Since    571 \isa{let}-\isa{in} is just syntactic sugar for a defined constant (called    572 \isa{Let}), expanding \isa{let}-constructs means rewriting with    573 \isa{Let_def}:    574     575 {\makeatother\input{Misc/document/let_rewr.tex}}    576     577 \subsubsection{Conditional equations}    578     579 \input{Misc/document/cond_rewr.tex}    580     581     582 \subsubsection{Automatic case splits}    583 \indexbold{case splits}    584 \index{*split|(}    585     586 {\makeatother\input{Misc/document/case_splits.tex}}    587     588 \index{*split|)}    589     590 \begin{warn}    591 The simplifier merely simplifies the condition of an \isa{if} but not the    592 \isa{then} or \isa{else} parts. The latter are simplified only after the    593 condition reduces to \isa{True} or \isa{False}, or after splitting. The    594 same is true for \isaindex{case}-expressions: only the selector is    595 simplified at first, until either the expression reduces to one of the    596 cases or it is split.    597 \end{warn}    598     599 \subsubsection{Arithmetic}    600 \index{arithmetic}    601     602 The simplifier routinely solves a small class of linear arithmetic formulae    603 (over types \isa{nat} and \isa{int}): it only takes into account    604 assumptions and conclusions that are (possibly negated) (in)equalities    605 (\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus    606     607 \input{Misc/document/arith1.tex}%    608 is proved by simplification, whereas the only slightly more complex    609     610 \input{Misc/document/arith4.tex}%    611 is not proved by simplification and requires \isa{arith}.    612     613 \subsubsection{Permutative rewrite rules}    614     615 A rewrite rule is {\bf permutative} if the left-hand side and right-hand side    616 are the same up to renaming of variables. The most common permutative rule    617 is commutativity:$x+y = y+x$. Another example is$(x-y)-z = (x-z)-y$. Such    618 rules are problematic because once they apply, they can be used forever.    619 The simplifier is aware of this danger and treats permutative rules    620 separately. For details see~\cite{isabelle-ref}.    621     622 \subsubsection{Tracing}    623 \indexbold{tracing the simplifier}    624     625 {\makeatother\input{Misc/document/trace_simp.tex}}    626     627 \subsection{How it works}    628 \label{sec:SimpHow}    629     630 \subsubsection{Higher-order patterns}    631     632 \subsubsection{Local assumptions}    633     634 \subsubsection{The preprocessor}    635     636 \index{simplification|)}    637     638 \section{Induction heuristics}    639 \label{sec:InductionHeuristics}    640     641 The purpose of this section is to illustrate some simple heuristics for    642 inductive proofs. The first one we have already mentioned in our initial    643 example:    644 \begin{quote}    645 {\em 1. Theorems about recursive functions are proved by induction.}    646 \end{quote}    647 In case the function has more than one argument    648 \begin{quote}    649 {\em 2. Do induction on argument number$i$if the function is defined by    650 recursion in argument number$i\$.}

   651 \end{quote}

   652 When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @

   653   zs)}} in \S\ref{sec:intro-proof} we find (a) \isa{\at} is recursive in

   654 the first argument, (b) \isa{xs} occurs only as the first argument of

   655 \isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as

   656 the second argument of \isa{\at}. Hence it is natural to perform induction

   657 on \isa{xs}.

   658

   659 The key heuristic, and the main point of this section, is to

   660 generalize the goal before induction. The reason is simple: if the goal is

   661 too specific, the induction hypothesis is too weak to allow the induction

   662 step to go through. Let us now illustrate the idea with an example.

   663

   664 {\makeatother\input{Misc/document/Itrev.tex}}

   665

   666 A final point worth mentioning is the orientation of the equation we just

   667 proved: the more complex notion (\isa{itrev}) is on the left-hand

   668 side, the simpler \isa{rev} on the right-hand side. This constitutes

   669 another, albeit weak heuristic that is not restricted to induction:

   670 \begin{quote}

   671   {\em 5. The right-hand side of an equation should (in some sense) be

   672     simpler than the left-hand side.}

   673 \end{quote}

   674 The heuristic is tricky to apply because it is not obvious that

   675 \isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what

   676 happens if you try to prove the symmetric equation!

   677

   678

   679 \section{Case study: compiling expressions}

   680 \label{sec:ExprCompiler}

   681

   682 {\makeatother\input{CodeGen/document/CodeGen.tex}}

   683

   684

   685 \section{Advanced datatypes}

   686 \label{sec:advanced-datatypes}

   687 \index{*datatype|(}

   688 \index{*primrec|(}

   689 %|)

   690

   691 This section presents advanced forms of \isacommand{datatype}s.

   692

   693 \subsection{Mutual recursion}

   694 \label{sec:datatype-mut-rec}

   695

   696 \input{Datatype/document/ABexpr.tex}

   697

   698 \subsection{Nested recursion}

   699

   700 \input{Datatype/document/Nested.tex}

   701

   702

   703 \subsection{The limits of nested recursion}

   704

   705 How far can we push nested recursion? By the unfolding argument above, we can

   706 reduce nested to mutual recursion provided the nested recursion only involves

   707 previously defined datatypes. This does not include functions:

   708 \begin{ttbox}

   709 datatype t = C (t => bool)

   710 \end{ttbox}

   711 is a real can of worms: in HOL it must be ruled out because it requires a type

   712 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the

   713 same cardinality---an impossibility. For the same reason it is not possible

   714 to allow recursion involving the type \isa{set}, which is isomorphic to

   715 \isa{t \isasymFun\ bool}.

   716

   717 Fortunately, a limited form of recursion

   718 involving function spaces is permitted: the recursive type may occur on the

   719 right of a function arrow, but never on the left. Hence the above can of worms

   720 is ruled out but the following example of a potentially infinitely branching tree is

   721 accepted:

   722

   723 \input{Datatype/document/Fundata.tex}

   724 \bigskip

   725

   726 If you need nested recursion on the left of a function arrow,

   727 there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like

   728 \begin{ttbox}

   729 datatype lam = C (lam -> lam)

   730 \end{ttbox}

   731 do indeed make sense (note the intentionally different arrow \isa{->}!).

   732 There is even a version of LCF on top of HOL, called

   733 HOLCF~\cite{MuellerNvOS99}.

   734

   735 \index{*primrec|)}

   736 \index{*datatype|)}

   737

   738 \subsection{Case study: Tries}

   739

   740 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast

   741 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a

   742 trie containing the words all'', an'', ape'', can'', car'' and

   743 cat''.  When searching a string in a trie, the letters of the string are

   744 examined sequentially. Each letter determines which subtrie to search next.

   745 In this case study we model tries as a datatype, define a lookup and an

   746 update function, and prove that they behave as expected.

   747

   748 \begin{figure}[htbp]

   749 \begin{center}

   750 \unitlength1mm

   751 \begin{picture}(60,30)

   752 \put( 5, 0){\makebox(0,0)[b]{l}}

   753 \put(25, 0){\makebox(0,0)[b]{e}}

   754 \put(35, 0){\makebox(0,0)[b]{n}}

   755 \put(45, 0){\makebox(0,0)[b]{r}}

   756 \put(55, 0){\makebox(0,0)[b]{t}}

   757 %

   758 \put( 5, 9){\line(0,-1){5}}

   759 \put(25, 9){\line(0,-1){5}}

   760 \put(44, 9){\line(-3,-2){9}}

   761 \put(45, 9){\line(0,-1){5}}

   762 \put(46, 9){\line(3,-2){9}}

   763 %

   764 \put( 5,10){\makebox(0,0)[b]{l}}

   765 \put(15,10){\makebox(0,0)[b]{n}}

   766 \put(25,10){\makebox(0,0)[b]{p}}

   767 \put(45,10){\makebox(0,0)[b]{a}}

   768 %

   769 \put(14,19){\line(-3,-2){9}}

   770 \put(15,19){\line(0,-1){5}}

   771 \put(16,19){\line(3,-2){9}}

   772 \put(45,19){\line(0,-1){5}}

   773 %

   774 \put(15,20){\makebox(0,0)[b]{a}}

   775 \put(45,20){\makebox(0,0)[b]{c}}

   776 %

   777 \put(30,30){\line(-3,-2){13}}

   778 \put(30,30){\line(3,-2){13}}

   779 \end{picture}

   780 \end{center}

   781 \caption{A sample trie}

   782 \label{fig:trie}

   783 \end{figure}

   784

   785 Proper tries associate some value with each string. Since the

   786 information is stored only in the final node associated with the string, many

   787 nodes do not carry any value. This distinction is captured by the

   788 following predefined datatype (from theory \texttt{Option}, which is a parent

   789 of \texttt{Main}):

   790 \input{Trie/document/Option2.tex}

   791 \indexbold{*option}\indexbold{*None}\indexbold{*Some}

   792

   793 \input{Trie/document/Trie.tex}

   794

   795 \begin{exercise}

   796   Write an improved version of \isa{update} that does not suffer from the

   797   space leak in the version above. Prove the main theorem for your improved

   798   \isa{update}.

   799 \end{exercise}

   800

   801 \begin{exercise}

   802   Write a function to \emph{delete} entries from a trie. An easy solution is

   803   a clever modification of \isa{update} which allows both insertion and

   804   deletion with a single function.  Prove (a modified version of) the main

   805   theorem above. Optimize you function such that it shrinks tries after

   806   deletion, if possible.

   807 \end{exercise}

   808

   809 \section{Total recursive functions}

   810 \label{sec:recdef}

   811 \index{*recdef|(}

   812

   813 Although many total functions have a natural primitive recursive definition,

   814 this is not always the case. Arbitrary total recursive functions can be

   815 defined by means of \isacommand{recdef}: you can use full pattern-matching,

   816 recursion need not involve datatypes, and termination is proved by showing

   817 that the arguments of all recursive calls are smaller in a suitable (user

   818 supplied) sense.

   819

   820 \subsection{Defining recursive functions}

   821

   822 \input{Recdef/document/examples.tex}

   823

   824 \subsection{Proving termination}

   825

   826 \input{Recdef/document/termination.tex}

   827

   828 \subsection{Simplification with recdef}

   829

   830 \input{Recdef/document/simplification.tex}

   831

   832

   833 \subsection{Induction}

   834 \index{induction!recursion|(}

   835 \index{recursion induction|(}

   836

   837 \input{Recdef/document/Induction.tex}

   838

   839 \index{induction!recursion|)}

   840 \index{recursion induction|)}

   841 \index{*recdef|)}