21212

1 
(* Title: Doc/IsarAdvanced/Functions/Thy/Fundefs.thy


2 
ID: $Id$


3 
Author: Alexander Krauss, TU Muenchen


4 


5 
Tutorial for function definitions with the new "function" package.


6 
*)


7 


8 
theory Functions


9 
imports Main


10 
begin


11 


12 
chapter {* Defining Recursive Functions in Isabelle/HOL *}


13 


14 
section {* Function Definition for Dummies *}


15 


16 
text {*


17 
In most cases, defining a recursive function is just as simple as other definitions:


18 
*}


19 


20 
fun fib :: "nat \<Rightarrow> nat"


21 
where


22 
"fib 0 = 1"


23 
 "fib (Suc 0) = 1"


24 
 "fib (Suc (Suc n)) = fib n + fib (Suc n)"


25 


26 
text {*


27 
The function always terminates, since the argument of gets smaller in every


28 
recursive call. Termination is an


29 
important requirement, since it prevents inconsistencies: From


30 
the "definition" @{text "f(n) = f(n) + 1"} we could prove


31 
@{text "0 = 1"} by subtracting @{text "f(n)"} on both sides.


32 


33 
Isabelle tries to prove termination automatically when a function is


34 
defined. We will later look at cases where this fails and see what to


35 
do then.


36 
*}


37 


38 
subsection {* Pattern matching *}


39 

22065

40 
text {* \label{patmatch}

21212

41 
Like in functional programming, functions can be defined by pattern


42 
matching. At the moment we will only consider \emph{datatype


43 
patterns}, which only consist of datatype constructors and


44 
variables.


45 


46 
If patterns overlap, the order of the equations is taken into


47 
account. The following function inserts a fixed element between any


48 
two elements of a list:


49 
*}


50 


51 
fun sep :: "'a \<Rightarrow> 'a list \<Rightarrow> 'a list"


52 
where


53 
"sep a (x#y#xs) = x # a # sep a (y # xs)"


54 
 "sep a xs = xs"


55 


56 
text {*


57 
Overlapping patterns are interpreted as "increments" to what is


58 
already there: The second equation is only meant for the cases where


59 
the first one does not match. Consequently, Isabelle replaces it

22065

60 
internally by the remaining cases, making the patterns disjoint:

21212

61 
*}


62 

22065

63 
thm sep.simps

21212

64 

22065

65 
text {* @{thm [display] sep.simps[no_vars]} *}

21212

66 


67 
text {*


68 
The equations from function definitions are automatically used in


69 
simplification:


70 
*}


71 

22065

72 
lemma "sep (0::nat) [1, 2, 3] = [1, 0, 2, 0, 3]"

21212

73 
by simp


74 


75 


76 


77 
subsection {* Induction *}


78 

22065

79 
text {*


80 


81 
Isabelle provides customized induction rules for recursive functions.


82 
See \cite[\S3.5.4]{isatutorial}.


83 
*}

21212

84 


85 

22065

86 
section {* Full form definitions *}

21212

87 


88 
text {*


89 
Up to now, we were using the \cmd{fun} command, which provides a


90 
convenient shorthand notation for simple function definitions. In


91 
this mode, Isabelle tries to solve all the necessary proof obligations


92 
automatically. If a proof does not go through, the definition is

22065

93 
rejected. This can either mean that the definition is indeed faulty,


94 
or that the default proof procedures are just not smart enough (or


95 
rather: not designed) to handle the definition.


96 


97 
By expanding the abbreviated \cmd{fun} to the full \cmd{function}


98 
command, the proof obligations become visible and can be analyzed or


99 
solved manually.


100 


101 
\end{isamarkuptext}


102 


103 


104 
\fbox{\parbox{\textwidth}{


105 
\noindent\cmd{fun} @{text "f :: \<tau>"}\\%


106 
\cmd{where}\isanewline%


107 
\ \ {\it equations}\isanewline%


108 
\ \ \quad\vdots


109 
}}

21212

110 

22065

111 
\begin{isamarkuptext}


112 
\vspace*{1em}


113 
\noindent abbreviates


114 
\end{isamarkuptext}

21212

115 

22065

116 
\fbox{\parbox{\textwidth}{


117 
\noindent\cmd{function} @{text "("}\cmd{sequential}@{text ") f :: \<tau>"}\\%


118 
\cmd{where}\isanewline%


119 
\ \ {\it equations}\isanewline%


120 
\ \ \quad\vdots\\%


121 
\cmd{by} @{text "pat_completeness auto"}\\%


122 
\cmd{termination by} @{text "lexicographic_order"}


123 
}}

21212

124 

22065

125 
\begin{isamarkuptext}


126 
\vspace*{1em}


127 
\noindent Some declarations and proofs have now become explicit:

21212

128 


129 
\begin{enumerate}

22065

130 
\item The \cmd{sequential} option enables the preprocessing of

21212

131 
pattern overlaps we already saw. Without this option, the equations


132 
must already be disjoint and complete. The automatic completion only


133 
works with datatype patterns.


134 


135 
\item A function definition now produces a proof obligation which


136 
expresses completeness and compatibility of patterns (We talk about

22065

137 
this later). The combination of the methods @{text "pat_completeness"} and


138 
@{text "auto"} is used to solve this proof obligation.

21212

139 


140 
\item A termination proof follows the definition, started by the

22065

141 
\cmd{termination} command, which sets up the goal. The @{text


142 
"lexicographic_order"} method can prove termination of a certain


143 
class of functions by searching for a suitable lexicographic


144 
combination of size measures.


145 
\end{enumerate}


146 
Whenever a \cmd{fun} command fails, it is usually a good idea to


147 
expand the syntax to the more verbose \cmd{function} form, to see


148 
what is actually going on.

21212

149 
*}


150 


151 


152 
section {* Proving termination *}


153 


154 
text {*


155 
Consider the following function, which sums up natural numbers up to

22065

156 
@{text "N"}, using a counter @{text "i"}:

21212

157 
*}


158 


159 
function sum :: "nat \<Rightarrow> nat \<Rightarrow> nat"


160 
where


161 
"sum i N = (if i > N then 0 else i + sum (Suc i) N)"

22065

162 
by pat_completeness auto

21212

163 


164 
text {*

22065

165 
\noindent The @{text "lexicographic_order"} method fails on this example, because none of the

21212

166 
arguments decreases in the recursive call.


167 


168 
A more general method for termination proofs is to supply a wellfounded


169 
relation on the argument type, and to show that the argument


170 
decreases in every recursive call.


171 


172 
The termination argument for @{text "sum"} is based on the fact that


173 
the \emph{difference} between @{text "i"} and @{text "N"} gets


174 
smaller in every step, and that the recursion stops when @{text "i"}


175 
is greater then @{text "n"}. Phrased differently, the expression


176 
@{text "N + 1  i"} decreases in every recursive call.


177 

22065

178 
We can use this expression as a measure function suitable to prove termination.

21212

179 
*}


180 


181 
termination

22065

182 
by (relation "measure (\<lambda>(i,N). N + 1  i)") auto

21212

183 


184 
text {*

22065

185 
The @{text relation} method takes a relation of


186 
type @{typ "('a \<times> 'a) set"}, where @{typ "'a"} is the argument type of


187 
the function. If the function has multiple curried arguments, then


188 
these are packed together into a tuple, as it happened in the above


189 
example.

21212

190 

22065

191 
The predefined function @{term_type "measure"} is a very common way of


192 
specifying termination relations in terms of a mapping into the


193 
natural numbers.


194 


195 
After the invocation of @{text "relation"}, we must prove that (a)


196 
the relation we supplied is wellfounded, and (b) that the arguments


197 
of recursive calls indeed decrease with respect to the


198 
relation. These goals are all solved by the subsequent call to


199 
@{text "auto"}.


200 


201 
Let us complicate the function a little, by adding some more


202 
recursive calls:

21212

203 
*}


204 


205 
function foo :: "nat \<Rightarrow> nat \<Rightarrow> nat"


206 
where


207 
"foo i N = (if i > N


208 
then (if N = 0 then 0 else foo 0 (N  1))


209 
else i + foo (Suc i) N)"


210 
by pat_completeness auto


211 


212 
text {*


213 
When @{text "i"} has reached @{text "N"}, it starts at zero again


214 
and @{text "N"} is decremented.


215 
This corresponds to a nested


216 
loop where one index counts up and the other down. Termination can


217 
be proved using a lexicographic combination of two measures, namely

22065

218 
the value of @{text "N"} and the above difference. The @{const


219 
"measures"} combinator generalizes @{text "measure"} by taking a


220 
list of measure functions.

21212

221 
*}


222 


223 
termination

22065

224 
by (relation "measures [\<lambda>(i, N). N, \<lambda>(i,N). N + 1  i]") auto

21212

225 


226 


227 
section {* Mutual Recursion *}


228 


229 
text {*


230 
If two or more functions call one another mutually, they have to be defined


231 
in one step. The simplest example are probably @{text "even"} and @{text "odd"}:


232 
*}


233 

22065

234 
function even :: "nat \<Rightarrow> bool"


235 
and odd :: "nat \<Rightarrow> bool"

21212

236 
where


237 
"even 0 = True"


238 
 "odd 0 = False"


239 
 "even (Suc n) = odd n"


240 
 "odd (Suc n) = even n"

22065

241 
by pat_completeness auto

21212

242 


243 
text {*


244 
To solve the problem of mutual dependencies, Isabelle internally


245 
creates a single function operating on the sum


246 
type. Then the original functions are defined as


247 
projections. Consequently, termination has to be proved


248 
simultaneously for both functions, by specifying a measure on the


249 
sum type:


250 
*}


251 


252 
termination

22065

253 
by (relation "measure (\<lambda>x. case x of Inl n \<Rightarrow> n  Inr n \<Rightarrow> n)")


254 
auto


255 


256 
subsection {* Induction for mutual recursion *}


257 


258 
text {*


259 


260 
When functions are mutually recursive, proving properties about them


261 
generally requires simultaneous induction. The induction rules


262 
generated from the definitions reflect this.


263 


264 
Let us prove something about @{const even} and @{const odd}:


265 
*}


266 


267 
lemma


268 
"even n = (n mod 2 = 0)"


269 
"odd n = (n mod 2 = 1)"


270 


271 
txt {*


272 
We apply simultaneous induction, specifying the induction variable


273 
for both goals, separated by \cmd{and}: *}


274 


275 
apply (induct n and n rule: even_odd.induct)


276 


277 
txt {*


278 
We get four subgoals, which correspond to the clauses in the


279 
definition of @{const even} and @{const odd}:


280 
@{subgoals[display,indent=0]}


281 
Simplification solves the first two goals, leaving us with two


282 
statements about the @{text "mod"} operation to prove:


283 
*}


284 


285 
apply simp_all


286 


287 
txt {*


288 
@{subgoals[display,indent=0]}


289 


290 
\noindent These can be handeled by the descision procedure for


291 
presburger arithmethic.


292 


293 
*}


294 


295 
apply presburger


296 
apply presburger


297 
done

21212

298 

22065

299 
text {*


300 
Even if we were just interested in one of the statements proved by


301 
simultaneous induction, the other ones may be necessary to


302 
strengthen the induction hypothesis. If we had left out the statement


303 
about @{const odd} (by substituting it with @{term "True"}, our


304 
proof would have failed:


305 
*}


306 


307 
lemma


308 
"even n = (n mod 2 = 0)"


309 
"True"


310 
apply (induct n rule: even_odd.induct)


311 


312 
txt {*


313 
\noindent Now the third subgoal is a dead end, since we have no


314 
useful induction hypothesis:


315 


316 
@{subgoals[display,indent=0]}


317 
*}


318 


319 
oops


320 


321 
section {* More general patterns *}


322 


323 
subsection {* Avoiding pattern splitting *}


324 


325 
text {*


326 


327 
Up to now, we used pattern matching only on datatypes, and the


328 
patterns were always disjoint and complete, and if they weren't,


329 
they were made disjoint automatically like in the definition of


330 
@{const "sep"} in \S\ref{patmatch}.


331 


332 
This splitting can significantly increase the number of equations


333 
involved, and is not always necessary. The following simple example


334 
shows the problem:


335 


336 
Suppose we are modelling incomplete knowledge about the world by a


337 
threevalued datatype, which has values for @{term "T"}, @{term "F"}


338 
and @{term "X"} for true, false and uncertain propositions.


339 
*}


340 


341 
datatype P3 = T  F  X


342 


343 
text {* Then the conjunction of such values can be defined as follows: *}


344 


345 
fun And :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"


346 
where


347 
"And T p = p"


348 
"And p T = p"


349 
"And p F = F"


350 
"And F p = F"


351 
"And X X = X"

21212

352 


353 

22065

354 
text {*


355 
This definition is useful, because the equations can directly be used


356 
as rules to simplify expressions. But the patterns overlap, e.g.~the


357 
expression @{term "And T T"} is matched by the first two


358 
equations. By default, Isabelle makes the patterns disjoint by


359 
splitting them up, producing instances:


360 
*}


361 


362 
thm And.simps


363 


364 
text {*


365 
@{thm[indent=4] And.simps}


366 


367 
\vspace*{1em}


368 
\noindent There are several problems with this approach:


369 


370 
\begin{enumerate}


371 
\item When datatypes have many constructors, there can be an


372 
explosion of equations. For @{const "And"}, we get seven instead of


373 
five equation, which can be tolerated, but this is just a small


374 
example.


375 


376 
\item Since splitting makes the equations "more special", they


377 
do not always match in rewriting. While the term @{term "And x F"}


378 
can be simplified to @{term "F"} by the original specification, a


379 
(manual) case split on @{term "x"} is now necessary.


380 


381 
\item The splitting also concerns the induction rule @{text


382 
"And.induct"}. Instead of five premises it now has seven, which


383 
means that our induction proofs will have more cases.


384 


385 
\item In general, it increases clarity if we get the same definition


386 
back which we put in.


387 
\end{enumerate}


388 


389 
On the other hand, a definition needs to be consistent and defining


390 
both @{term "f x = True"} and @{term "f x = False"} is a bad


391 
idea. So if we don't want Isabelle to mangle our definitions, we


392 
will have to prove that this is not necessary. By using the full


393 
definition form withour the \cmd{sequential} option, we get this


394 
behaviour:


395 
*}


396 


397 
function And2 :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"


398 
where


399 
"And2 T p = p"


400 
"And2 p T = p"


401 
"And2 p F = F"


402 
"And2 F p = F"


403 
"And2 X X = X"


404 


405 
txt {*


406 
Now it is also time to look at the subgoals generated by a


407 
function definition. In this case, they are:


408 


409 
@{subgoals[display,indent=0]}


410 


411 
The first subgoal expresses the completeness of the patterns. It has


412 
the form of an elimination rule and states that every @{term x} of


413 
the function's input type must match one of the patterns. It could


414 
be equivalently stated as a disjunction of existential statements:


415 
@{term "(\<exists>p. x = (T, p)) \<or> (\<exists>p. x = (p, T)) \<or> (\<exists>p. x = (p, F)) \<or>


416 
(\<exists>p. x = (F, p)) \<or> (x = (X, X))"} If the patterns just involve


417 
datatypes, we can solve it with the @{text "pat_completeness"} method:


418 
*}


419 


420 
apply pat_completeness


421 


422 
txt {*


423 
The remaining subgoals express \emph{pattern compatibility}. We do


424 
allow that a value is matched by more than one patterns, but in this


425 
case, the result (i.e.~the right hand sides of the equations) must


426 
also be equal. For each pair of two patterns, there is one such


427 
subgoal. Usually this needs injectivity of the constructors, which


428 
is used automatically by @{text "auto"}.


429 
*}


430 


431 
by auto

21212

432 


433 

22065

434 
subsection {* Nonconstructor patterns *}

21212

435 


436 
text {* FIXME *}


437 

22065

438 


439 
subsection {* Nonconstructor patterns *}

21212

440 


441 
text {* FIXME *}


442 


443 


444 


445 

22065

446 


447 


448 
section {* Partiality *}


449 


450 
text {*


451 
In HOL, all functions are total. A function @{term "f"} applied to


452 
@{term "x"} always has a value @{term "f x"}, and there is no notion


453 
of undefinedness.


454 


455 
FIXME


456 
*}


457 


458 


459 


460 


461 
section {* Nested recursion *}


462 


463 
text {*


464 


465 


466 


467 


468 


469 


470 
FIXME *}


471 


472 


473 


474 


475 


476 

21212

477 
end
