doc-src/Functions/Thy/Functions.thy
changeset 30242 aea5d7fa7ef5
parent 30226 2f4684e2ea95
child 33856 14a658faadb6
equal deleted inserted replaced
30241:3a1aef73b2b2 30242:aea5d7fa7ef5
       
     1 (*  Title:      doc-src/IsarAdvanced/Functions/Thy/Fundefs.thy
       
     2     Author:     Alexander Krauss, TU Muenchen
       
     3 
       
     4 Tutorial for function definitions with the new "function" package.
       
     5 *)
       
     6 
       
     7 theory Functions
       
     8 imports Main
       
     9 begin
       
    10 
       
    11 section {* Function Definitions for Dummies *}
       
    12 
       
    13 text {*
       
    14   In most cases, defining a recursive function is just as simple as other definitions:
       
    15 *}
       
    16 
       
    17 fun fib :: "nat \<Rightarrow> nat"
       
    18 where
       
    19   "fib 0 = 1"
       
    20 | "fib (Suc 0) = 1"
       
    21 | "fib (Suc (Suc n)) = fib n + fib (Suc n)"
       
    22 
       
    23 text {*
       
    24   The syntax is rather self-explanatory: We introduce a function by
       
    25   giving its name, its type, 
       
    26   and a set of defining recursive equations.
       
    27   If we leave out the type, the most general type will be
       
    28   inferred, which can sometimes lead to surprises: Since both @{term
       
    29   "1::nat"} and @{text "+"} are overloaded, we would end up
       
    30   with @{text "fib :: nat \<Rightarrow> 'a::{one,plus}"}.
       
    31 *}
       
    32 
       
    33 text {*
       
    34   The function always terminates, since its argument gets smaller in
       
    35   every recursive call. 
       
    36   Since HOL is a logic of total functions, termination is a
       
    37   fundamental requirement to prevent inconsistencies\footnote{From the
       
    38   \qt{definition} @{text "f(n) = f(n) + 1"} we could prove 
       
    39   @{text "0 = 1"} by subtracting @{text "f(n)"} on both sides.}.
       
    40   Isabelle tries to prove termination automatically when a definition
       
    41   is made. In \S\ref{termination}, we will look at cases where this
       
    42   fails and see what to do then.
       
    43 *}
       
    44 
       
    45 subsection {* Pattern matching *}
       
    46 
       
    47 text {* \label{patmatch}
       
    48   Like in functional programming, we can use pattern matching to
       
    49   define functions. At the moment we will only consider \emph{constructor
       
    50   patterns}, which only consist of datatype constructors and
       
    51   variables. Furthermore, patterns must be linear, i.e.\ all variables
       
    52   on the left hand side of an equation must be distinct. In
       
    53   \S\ref{genpats} we discuss more general pattern matching.
       
    54 
       
    55   If patterns overlap, the order of the equations is taken into
       
    56   account. The following function inserts a fixed element between any
       
    57   two elements of a list:
       
    58 *}
       
    59 
       
    60 fun sep :: "'a \<Rightarrow> 'a list \<Rightarrow> 'a list"
       
    61 where
       
    62   "sep a (x#y#xs) = x # a # sep a (y # xs)"
       
    63 | "sep a xs       = xs"
       
    64 
       
    65 text {* 
       
    66   Overlapping patterns are interpreted as \qt{increments} to what is
       
    67   already there: The second equation is only meant for the cases where
       
    68   the first one does not match. Consequently, Isabelle replaces it
       
    69   internally by the remaining cases, making the patterns disjoint:
       
    70 *}
       
    71 
       
    72 thm sep.simps
       
    73 
       
    74 text {* @{thm [display] sep.simps[no_vars]} *}
       
    75 
       
    76 text {* 
       
    77   \noindent The equations from function definitions are automatically used in
       
    78   simplification:
       
    79 *}
       
    80 
       
    81 lemma "sep 0 [1, 2, 3] = [1, 0, 2, 0, 3]"
       
    82 by simp
       
    83 
       
    84 subsection {* Induction *}
       
    85 
       
    86 text {*
       
    87 
       
    88   Isabelle provides customized induction rules for recursive
       
    89   functions. These rules follow the recursive structure of the
       
    90   definition. Here is the rule @{text sep.induct} arising from the
       
    91   above definition of @{const sep}:
       
    92 
       
    93   @{thm [display] sep.induct}
       
    94   
       
    95   We have a step case for list with at least two elements, and two
       
    96   base cases for the zero- and the one-element list. Here is a simple
       
    97   proof about @{const sep} and @{const map}
       
    98 *}
       
    99 
       
   100 lemma "map f (sep x ys) = sep (f x) (map f ys)"
       
   101 apply (induct x ys rule: sep.induct)
       
   102 
       
   103 txt {*
       
   104   We get three cases, like in the definition.
       
   105 
       
   106   @{subgoals [display]}
       
   107 *}
       
   108 
       
   109 apply auto 
       
   110 done
       
   111 text {*
       
   112 
       
   113   With the \cmd{fun} command, you can define about 80\% of the
       
   114   functions that occur in practice. The rest of this tutorial explains
       
   115   the remaining 20\%.
       
   116 *}
       
   117 
       
   118 
       
   119 section {* fun vs.\ function *}
       
   120 
       
   121 text {* 
       
   122   The \cmd{fun} command provides a
       
   123   convenient shorthand notation for simple function definitions. In
       
   124   this mode, Isabelle tries to solve all the necessary proof obligations
       
   125   automatically. If any proof fails, the definition is
       
   126   rejected. This can either mean that the definition is indeed faulty,
       
   127   or that the default proof procedures are just not smart enough (or
       
   128   rather: not designed) to handle the definition.
       
   129 
       
   130   By expanding the abbreviation to the more verbose \cmd{function} command, these proof obligations become visible and can be analyzed or
       
   131   solved manually. The expansion from \cmd{fun} to \cmd{function} is as follows:
       
   132 
       
   133 \end{isamarkuptext}
       
   134 
       
   135 
       
   136 \[\left[\;\begin{minipage}{0.25\textwidth}\vspace{6pt}
       
   137 \cmd{fun} @{text "f :: \<tau>"}\\%
       
   138 \cmd{where}\\%
       
   139 \hspace*{2ex}{\it equations}\\%
       
   140 \hspace*{2ex}\vdots\vspace*{6pt}
       
   141 \end{minipage}\right]
       
   142 \quad\equiv\quad
       
   143 \left[\;\begin{minipage}{0.48\textwidth}\vspace{6pt}
       
   144 \cmd{function} @{text "("}\cmd{sequential}@{text ") f :: \<tau>"}\\%
       
   145 \cmd{where}\\%
       
   146 \hspace*{2ex}{\it equations}\\%
       
   147 \hspace*{2ex}\vdots\\%
       
   148 \cmd{by} @{text "pat_completeness auto"}\\%
       
   149 \cmd{termination by} @{text "lexicographic_order"}\vspace{6pt}
       
   150 \end{minipage}
       
   151 \right]\]
       
   152 
       
   153 \begin{isamarkuptext}
       
   154   \vspace*{1em}
       
   155   \noindent Some details have now become explicit:
       
   156 
       
   157   \begin{enumerate}
       
   158   \item The \cmd{sequential} option enables the preprocessing of
       
   159   pattern overlaps which we already saw. Without this option, the equations
       
   160   must already be disjoint and complete. The automatic completion only
       
   161   works with constructor patterns.
       
   162 
       
   163   \item A function definition produces a proof obligation which
       
   164   expresses completeness and compatibility of patterns (we talk about
       
   165   this later). The combination of the methods @{text "pat_completeness"} and
       
   166   @{text "auto"} is used to solve this proof obligation.
       
   167 
       
   168   \item A termination proof follows the definition, started by the
       
   169   \cmd{termination} command. This will be explained in \S\ref{termination}.
       
   170  \end{enumerate}
       
   171   Whenever a \cmd{fun} command fails, it is usually a good idea to
       
   172   expand the syntax to the more verbose \cmd{function} form, to see
       
   173   what is actually going on.
       
   174  *}
       
   175 
       
   176 
       
   177 section {* Termination *}
       
   178 
       
   179 text {*\label{termination}
       
   180   The method @{text "lexicographic_order"} is the default method for
       
   181   termination proofs. It can prove termination of a
       
   182   certain class of functions by searching for a suitable lexicographic
       
   183   combination of size measures. Of course, not all functions have such
       
   184   a simple termination argument. For them, we can specify the termination
       
   185   relation manually.
       
   186 *}
       
   187 
       
   188 subsection {* The {\tt relation} method *}
       
   189 text{*
       
   190   Consider the following function, which sums up natural numbers up to
       
   191   @{text "N"}, using a counter @{text "i"}:
       
   192 *}
       
   193 
       
   194 function sum :: "nat \<Rightarrow> nat \<Rightarrow> nat"
       
   195 where
       
   196   "sum i N = (if i > N then 0 else i + sum (Suc i) N)"
       
   197 by pat_completeness auto
       
   198 
       
   199 text {*
       
   200   \noindent The @{text "lexicographic_order"} method fails on this example, because none of the
       
   201   arguments decreases in the recursive call, with respect to the standard size ordering.
       
   202   To prove termination manually, we must provide a custom wellfounded relation.
       
   203 
       
   204   The termination argument for @{text "sum"} is based on the fact that
       
   205   the \emph{difference} between @{text "i"} and @{text "N"} gets
       
   206   smaller in every step, and that the recursion stops when @{text "i"}
       
   207   is greater than @{text "N"}. Phrased differently, the expression 
       
   208   @{text "N + 1 - i"} always decreases.
       
   209 
       
   210   We can use this expression as a measure function suitable to prove termination.
       
   211 *}
       
   212 
       
   213 termination sum
       
   214 apply (relation "measure (\<lambda>(i,N). N + 1 - i)")
       
   215 
       
   216 txt {*
       
   217   The \cmd{termination} command sets up the termination goal for the
       
   218   specified function @{text "sum"}. If the function name is omitted, it
       
   219   implicitly refers to the last function definition.
       
   220 
       
   221   The @{text relation} method takes a relation of
       
   222   type @{typ "('a \<times> 'a) set"}, where @{typ "'a"} is the argument type of
       
   223   the function. If the function has multiple curried arguments, then
       
   224   these are packed together into a tuple, as it happened in the above
       
   225   example.
       
   226 
       
   227   The predefined function @{term[source] "measure :: ('a \<Rightarrow> nat) \<Rightarrow> ('a \<times> 'a) set"} constructs a
       
   228   wellfounded relation from a mapping into the natural numbers (a
       
   229   \emph{measure function}). 
       
   230 
       
   231   After the invocation of @{text "relation"}, we must prove that (a)
       
   232   the relation we supplied is wellfounded, and (b) that the arguments
       
   233   of recursive calls indeed decrease with respect to the
       
   234   relation:
       
   235 
       
   236   @{subgoals[display,indent=0]}
       
   237 
       
   238   These goals are all solved by @{text "auto"}:
       
   239 *}
       
   240 
       
   241 apply auto
       
   242 done
       
   243 
       
   244 text {*
       
   245   Let us complicate the function a little, by adding some more
       
   246   recursive calls: 
       
   247 *}
       
   248 
       
   249 function foo :: "nat \<Rightarrow> nat \<Rightarrow> nat"
       
   250 where
       
   251   "foo i N = (if i > N 
       
   252               then (if N = 0 then 0 else foo 0 (N - 1))
       
   253               else i + foo (Suc i) N)"
       
   254 by pat_completeness auto
       
   255 
       
   256 text {*
       
   257   When @{text "i"} has reached @{text "N"}, it starts at zero again
       
   258   and @{text "N"} is decremented.
       
   259   This corresponds to a nested
       
   260   loop where one index counts up and the other down. Termination can
       
   261   be proved using a lexicographic combination of two measures, namely
       
   262   the value of @{text "N"} and the above difference. The @{const
       
   263   "measures"} combinator generalizes @{text "measure"} by taking a
       
   264   list of measure functions.  
       
   265 *}
       
   266 
       
   267 termination 
       
   268 by (relation "measures [\<lambda>(i, N). N, \<lambda>(i,N). N + 1 - i]") auto
       
   269 
       
   270 subsection {* How @{text "lexicographic_order"} works *}
       
   271 
       
   272 (*fun fails :: "nat \<Rightarrow> nat list \<Rightarrow> nat"
       
   273 where
       
   274   "fails a [] = a"
       
   275 | "fails a (x#xs) = fails (x + a) (x # xs)"
       
   276 *)
       
   277 
       
   278 text {*
       
   279   To see how the automatic termination proofs work, let's look at an
       
   280   example where it fails\footnote{For a detailed discussion of the
       
   281   termination prover, see \cite{bulwahnKN07}}:
       
   282 
       
   283 \end{isamarkuptext}  
       
   284 \cmd{fun} @{text "fails :: \"nat \<Rightarrow> nat list \<Rightarrow> nat\""}\\%
       
   285 \cmd{where}\\%
       
   286 \hspace*{2ex}@{text "\"fails a [] = a\""}\\%
       
   287 |\hspace*{1.5ex}@{text "\"fails a (x#xs) = fails (x + a) (x#xs)\""}\\
       
   288 \begin{isamarkuptext}
       
   289 
       
   290 \noindent Isabelle responds with the following error:
       
   291 
       
   292 \begin{isabelle}
       
   293 *** Unfinished subgoals:\newline
       
   294 *** (a, 1, <):\newline
       
   295 *** \ 1.~@{text "\<And>x. x = 0"}\newline
       
   296 *** (a, 1, <=):\newline
       
   297 *** \ 1.~False\newline
       
   298 *** (a, 2, <):\newline
       
   299 *** \ 1.~False\newline
       
   300 *** Calls:\newline
       
   301 *** a) @{text "(a, x # xs) -->> (x + a, x # xs)"}\newline
       
   302 *** Measures:\newline
       
   303 *** 1) @{text "\<lambda>x. size (fst x)"}\newline
       
   304 *** 2) @{text "\<lambda>x. size (snd x)"}\newline
       
   305 *** Result matrix:\newline
       
   306 *** \ \ \ \ 1\ \ 2  \newline
       
   307 *** a:  ?   <= \newline
       
   308 *** Could not find lexicographic termination order.\newline
       
   309 *** At command "fun".\newline
       
   310 \end{isabelle}
       
   311 *}
       
   312 
       
   313 
       
   314 text {*
       
   315   The key to this error message is the matrix at the bottom. The rows
       
   316   of that matrix correspond to the different recursive calls (In our
       
   317   case, there is just one). The columns are the function's arguments 
       
   318   (expressed through different measure functions, which map the
       
   319   argument tuple to a natural number). 
       
   320 
       
   321   The contents of the matrix summarize what is known about argument
       
   322   descents: The second argument has a weak descent (@{text "<="}) at the
       
   323   recursive call, and for the first argument nothing could be proved,
       
   324   which is expressed by @{text "?"}. In general, there are the values
       
   325   @{text "<"}, @{text "<="} and @{text "?"}.
       
   326 
       
   327   For the failed proof attempts, the unfinished subgoals are also
       
   328   printed. Looking at these will often point to a missing lemma.
       
   329 
       
   330 %  As a more real example, here is quicksort:
       
   331 *}
       
   332 (*
       
   333 function qs :: "nat list \<Rightarrow> nat list"
       
   334 where
       
   335   "qs [] = []"
       
   336 | "qs (x#xs) = qs [y\<in>xs. y < x] @ x # qs [y\<in>xs. y \<ge> x]"
       
   337 by pat_completeness auto
       
   338 
       
   339 termination apply lexicographic_order
       
   340 
       
   341 text {* If we try @{text "lexicographic_order"} method, we get the
       
   342   following error *}
       
   343 termination by (lexicographic_order simp:l2)
       
   344 
       
   345 lemma l: "x \<le> y \<Longrightarrow> x < Suc y" by arith
       
   346 
       
   347 function 
       
   348 
       
   349 *)
       
   350 
       
   351 section {* Mutual Recursion *}
       
   352 
       
   353 text {*
       
   354   If two or more functions call one another mutually, they have to be defined
       
   355   in one step. Here are @{text "even"} and @{text "odd"}:
       
   356 *}
       
   357 
       
   358 function even :: "nat \<Rightarrow> bool"
       
   359     and odd  :: "nat \<Rightarrow> bool"
       
   360 where
       
   361   "even 0 = True"
       
   362 | "odd 0 = False"
       
   363 | "even (Suc n) = odd n"
       
   364 | "odd (Suc n) = even n"
       
   365 by pat_completeness auto
       
   366 
       
   367 text {*
       
   368   To eliminate the mutual dependencies, Isabelle internally
       
   369   creates a single function operating on the sum
       
   370   type @{typ "nat + nat"}. Then, @{const even} and @{const odd} are
       
   371   defined as projections. Consequently, termination has to be proved
       
   372   simultaneously for both functions, by specifying a measure on the
       
   373   sum type: 
       
   374 *}
       
   375 
       
   376 termination 
       
   377 by (relation "measure (\<lambda>x. case x of Inl n \<Rightarrow> n | Inr n \<Rightarrow> n)") auto
       
   378 
       
   379 text {* 
       
   380   We could also have used @{text lexicographic_order}, which
       
   381   supports mutual recursive termination proofs to a certain extent.
       
   382 *}
       
   383 
       
   384 subsection {* Induction for mutual recursion *}
       
   385 
       
   386 text {*
       
   387 
       
   388   When functions are mutually recursive, proving properties about them
       
   389   generally requires simultaneous induction. The induction rule @{text "even_odd.induct"}
       
   390   generated from the above definition reflects this.
       
   391 
       
   392   Let us prove something about @{const even} and @{const odd}:
       
   393 *}
       
   394 
       
   395 lemma even_odd_mod2:
       
   396   "even n = (n mod 2 = 0)"
       
   397   "odd n = (n mod 2 = 1)"
       
   398 
       
   399 txt {* 
       
   400   We apply simultaneous induction, specifying the induction variable
       
   401   for both goals, separated by \cmd{and}:  *}
       
   402 
       
   403 apply (induct n and n rule: even_odd.induct)
       
   404 
       
   405 txt {* 
       
   406   We get four subgoals, which correspond to the clauses in the
       
   407   definition of @{const even} and @{const odd}:
       
   408   @{subgoals[display,indent=0]}
       
   409   Simplification solves the first two goals, leaving us with two
       
   410   statements about the @{text "mod"} operation to prove:
       
   411 *}
       
   412 
       
   413 apply simp_all
       
   414 
       
   415 txt {* 
       
   416   @{subgoals[display,indent=0]} 
       
   417 
       
   418   \noindent These can be handled by Isabelle's arithmetic decision procedures.
       
   419   
       
   420 *}
       
   421 
       
   422 apply arith
       
   423 apply arith
       
   424 done
       
   425 
       
   426 text {*
       
   427   In proofs like this, the simultaneous induction is really essential:
       
   428   Even if we are just interested in one of the results, the other
       
   429   one is necessary to strengthen the induction hypothesis. If we leave
       
   430   out the statement about @{const odd} and just write @{term True} instead,
       
   431   the same proof fails:
       
   432 *}
       
   433 
       
   434 lemma failed_attempt:
       
   435   "even n = (n mod 2 = 0)"
       
   436   "True"
       
   437 apply (induct n rule: even_odd.induct)
       
   438 
       
   439 txt {*
       
   440   \noindent Now the third subgoal is a dead end, since we have no
       
   441   useful induction hypothesis available:
       
   442 
       
   443   @{subgoals[display,indent=0]} 
       
   444 *}
       
   445 
       
   446 oops
       
   447 
       
   448 section {* General pattern matching *}
       
   449 text{*\label{genpats} *}
       
   450 
       
   451 subsection {* Avoiding automatic pattern splitting *}
       
   452 
       
   453 text {*
       
   454 
       
   455   Up to now, we used pattern matching only on datatypes, and the
       
   456   patterns were always disjoint and complete, and if they weren't,
       
   457   they were made disjoint automatically like in the definition of
       
   458   @{const "sep"} in \S\ref{patmatch}.
       
   459 
       
   460   This automatic splitting can significantly increase the number of
       
   461   equations involved, and this is not always desirable. The following
       
   462   example shows the problem:
       
   463   
       
   464   Suppose we are modeling incomplete knowledge about the world by a
       
   465   three-valued datatype, which has values @{term "T"}, @{term "F"}
       
   466   and @{term "X"} for true, false and uncertain propositions, respectively. 
       
   467 *}
       
   468 
       
   469 datatype P3 = T | F | X
       
   470 
       
   471 text {* \noindent Then the conjunction of such values can be defined as follows: *}
       
   472 
       
   473 fun And :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
       
   474 where
       
   475   "And T p = p"
       
   476 | "And p T = p"
       
   477 | "And p F = F"
       
   478 | "And F p = F"
       
   479 | "And X X = X"
       
   480 
       
   481 
       
   482 text {* 
       
   483   This definition is useful, because the equations can directly be used
       
   484   as simplification rules. But the patterns overlap: For example,
       
   485   the expression @{term "And T T"} is matched by both the first and
       
   486   the second equation. By default, Isabelle makes the patterns disjoint by
       
   487   splitting them up, producing instances:
       
   488 *}
       
   489 
       
   490 thm And.simps
       
   491 
       
   492 text {*
       
   493   @{thm[indent=4] And.simps}
       
   494   
       
   495   \vspace*{1em}
       
   496   \noindent There are several problems with this:
       
   497 
       
   498   \begin{enumerate}
       
   499   \item If the datatype has many constructors, there can be an
       
   500   explosion of equations. For @{const "And"}, we get seven instead of
       
   501   five equations, which can be tolerated, but this is just a small
       
   502   example.
       
   503 
       
   504   \item Since splitting makes the equations \qt{less general}, they
       
   505   do not always match in rewriting. While the term @{term "And x F"}
       
   506   can be simplified to @{term "F"} with the original equations, a
       
   507   (manual) case split on @{term "x"} is now necessary.
       
   508 
       
   509   \item The splitting also concerns the induction rule @{text
       
   510   "And.induct"}. Instead of five premises it now has seven, which
       
   511   means that our induction proofs will have more cases.
       
   512 
       
   513   \item In general, it increases clarity if we get the same definition
       
   514   back which we put in.
       
   515   \end{enumerate}
       
   516 
       
   517   If we do not want the automatic splitting, we can switch it off by
       
   518   leaving out the \cmd{sequential} option. However, we will have to
       
   519   prove that our pattern matching is consistent\footnote{This prevents
       
   520   us from defining something like @{term "f x = True"} and @{term "f x
       
   521   = False"} simultaneously.}:
       
   522 *}
       
   523 
       
   524 function And2 :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
       
   525 where
       
   526   "And2 T p = p"
       
   527 | "And2 p T = p"
       
   528 | "And2 p F = F"
       
   529 | "And2 F p = F"
       
   530 | "And2 X X = X"
       
   531 
       
   532 txt {*
       
   533   \noindent Now let's look at the proof obligations generated by a
       
   534   function definition. In this case, they are:
       
   535 
       
   536   @{subgoals[display,indent=0]}\vspace{-1.2em}\hspace{3cm}\vdots\vspace{1.2em}
       
   537 
       
   538   The first subgoal expresses the completeness of the patterns. It has
       
   539   the form of an elimination rule and states that every @{term x} of
       
   540   the function's input type must match at least one of the patterns\footnote{Completeness could
       
   541   be equivalently stated as a disjunction of existential statements: 
       
   542 @{term "(\<exists>p. x = (T, p)) \<or> (\<exists>p. x = (p, T)) \<or> (\<exists>p. x = (p, F)) \<or>
       
   543   (\<exists>p. x = (F, p)) \<or> (x = (X, X))"}, and you can use the method @{text atomize_elim} to get that form instead.}. If the patterns just involve
       
   544   datatypes, we can solve it with the @{text "pat_completeness"}
       
   545   method:
       
   546 *}
       
   547 
       
   548 apply pat_completeness
       
   549 
       
   550 txt {*
       
   551   The remaining subgoals express \emph{pattern compatibility}. We do
       
   552   allow that an input value matches multiple patterns, but in this
       
   553   case, the result (i.e.~the right hand sides of the equations) must
       
   554   also be equal. For each pair of two patterns, there is one such
       
   555   subgoal. Usually this needs injectivity of the constructors, which
       
   556   is used automatically by @{text "auto"}.
       
   557 *}
       
   558 
       
   559 by auto
       
   560 
       
   561 
       
   562 subsection {* Non-constructor patterns *}
       
   563 
       
   564 text {*
       
   565   Most of Isabelle's basic types take the form of inductive datatypes,
       
   566   and usually pattern matching works on the constructors of such types. 
       
   567   However, this need not be always the case, and the \cmd{function}
       
   568   command handles other kind of patterns, too.
       
   569 
       
   570   One well-known instance of non-constructor patterns are
       
   571   so-called \emph{$n+k$-patterns}, which are a little controversial in
       
   572   the functional programming world. Here is the initial fibonacci
       
   573   example with $n+k$-patterns:
       
   574 *}
       
   575 
       
   576 function fib2 :: "nat \<Rightarrow> nat"
       
   577 where
       
   578   "fib2 0 = 1"
       
   579 | "fib2 1 = 1"
       
   580 | "fib2 (n + 2) = fib2 n + fib2 (Suc n)"
       
   581 
       
   582 (*<*)ML_val "goals_limit := 1"(*>*)
       
   583 txt {*
       
   584   This kind of matching is again justified by the proof of pattern
       
   585   completeness and compatibility. 
       
   586   The proof obligation for pattern completeness states that every natural number is
       
   587   either @{term "0::nat"}, @{term "1::nat"} or @{term "n +
       
   588   (2::nat)"}:
       
   589 
       
   590   @{subgoals[display,indent=0]}
       
   591 
       
   592   This is an arithmetic triviality, but unfortunately the
       
   593   @{text arith} method cannot handle this specific form of an
       
   594   elimination rule. However, we can use the method @{text
       
   595   "atomize_elim"} to do an ad-hoc conversion to a disjunction of
       
   596   existentials, which can then be solved by the arithmetic decision procedure.
       
   597   Pattern compatibility and termination are automatic as usual.
       
   598 *}
       
   599 (*<*)ML_val "goals_limit := 10"(*>*)
       
   600 apply atomize_elim
       
   601 apply arith
       
   602 apply auto
       
   603 done
       
   604 termination by lexicographic_order
       
   605 text {*
       
   606   We can stretch the notion of pattern matching even more. The
       
   607   following function is not a sensible functional program, but a
       
   608   perfectly valid mathematical definition:
       
   609 *}
       
   610 
       
   611 function ev :: "nat \<Rightarrow> bool"
       
   612 where
       
   613   "ev (2 * n) = True"
       
   614 | "ev (2 * n + 1) = False"
       
   615 apply atomize_elim
       
   616 by arith+
       
   617 termination by (relation "{}") simp
       
   618 
       
   619 text {*
       
   620   This general notion of pattern matching gives you a certain freedom
       
   621   in writing down specifications. However, as always, such freedom should
       
   622   be used with care:
       
   623 
       
   624   If we leave the area of constructor
       
   625   patterns, we have effectively departed from the world of functional
       
   626   programming. This means that it is no longer possible to use the
       
   627   code generator, and expect it to generate ML code for our
       
   628   definitions. Also, such a specification might not work very well together with
       
   629   simplification. Your mileage may vary.
       
   630 *}
       
   631 
       
   632 
       
   633 subsection {* Conditional equations *}
       
   634 
       
   635 text {* 
       
   636   The function package also supports conditional equations, which are
       
   637   similar to guards in a language like Haskell. Here is Euclid's
       
   638   algorithm written with conditional patterns\footnote{Note that the
       
   639   patterns are also overlapping in the base case}:
       
   640 *}
       
   641 
       
   642 function gcd :: "nat \<Rightarrow> nat \<Rightarrow> nat"
       
   643 where
       
   644   "gcd x 0 = x"
       
   645 | "gcd 0 y = y"
       
   646 | "x < y \<Longrightarrow> gcd (Suc x) (Suc y) = gcd (Suc x) (y - x)"
       
   647 | "\<not> x < y \<Longrightarrow> gcd (Suc x) (Suc y) = gcd (x - y) (Suc y)"
       
   648 by (atomize_elim, auto, arith)
       
   649 termination by lexicographic_order
       
   650 
       
   651 text {*
       
   652   By now, you can probably guess what the proof obligations for the
       
   653   pattern completeness and compatibility look like. 
       
   654 
       
   655   Again, functions with conditional patterns are not supported by the
       
   656   code generator.
       
   657 *}
       
   658 
       
   659 
       
   660 subsection {* Pattern matching on strings *}
       
   661 
       
   662 text {*
       
   663   As strings (as lists of characters) are normal datatypes, pattern
       
   664   matching on them is possible, but somewhat problematic. Consider the
       
   665   following definition:
       
   666 
       
   667 \end{isamarkuptext}
       
   668 \noindent\cmd{fun} @{text "check :: \"string \<Rightarrow> bool\""}\\%
       
   669 \cmd{where}\\%
       
   670 \hspace*{2ex}@{text "\"check (''good'') = True\""}\\%
       
   671 @{text "| \"check s = False\""}
       
   672 \begin{isamarkuptext}
       
   673 
       
   674   \noindent An invocation of the above \cmd{fun} command does not
       
   675   terminate. What is the problem? Strings are lists of characters, and
       
   676   characters are a datatype with a lot of constructors. Splitting the
       
   677   catch-all pattern thus leads to an explosion of cases, which cannot
       
   678   be handled by Isabelle.
       
   679 
       
   680   There are two things we can do here. Either we write an explicit
       
   681   @{text "if"} on the right hand side, or we can use conditional patterns:
       
   682 *}
       
   683 
       
   684 function check :: "string \<Rightarrow> bool"
       
   685 where
       
   686   "check (''good'') = True"
       
   687 | "s \<noteq> ''good'' \<Longrightarrow> check s = False"
       
   688 by auto
       
   689 
       
   690 
       
   691 section {* Partiality *}
       
   692 
       
   693 text {* 
       
   694   In HOL, all functions are total. A function @{term "f"} applied to
       
   695   @{term "x"} always has the value @{term "f x"}, and there is no notion
       
   696   of undefinedness. 
       
   697   This is why we have to do termination
       
   698   proofs when defining functions: The proof justifies that the
       
   699   function can be defined by wellfounded recursion.
       
   700 
       
   701   However, the \cmd{function} package does support partiality to a
       
   702   certain extent. Let's look at the following function which looks
       
   703   for a zero of a given function f. 
       
   704 *}
       
   705 
       
   706 function (*<*)(domintros, tailrec)(*>*)findzero :: "(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat"
       
   707 where
       
   708   "findzero f n = (if f n = 0 then n else findzero f (Suc n))"
       
   709 by pat_completeness auto
       
   710 (*<*)declare findzero.simps[simp del](*>*)
       
   711 
       
   712 text {*
       
   713   \noindent Clearly, any attempt of a termination proof must fail. And without
       
   714   that, we do not get the usual rules @{text "findzero.simps"} and 
       
   715   @{text "findzero.induct"}. So what was the definition good for at all?
       
   716 *}
       
   717 
       
   718 subsection {* Domain predicates *}
       
   719 
       
   720 text {*
       
   721   The trick is that Isabelle has not only defined the function @{const findzero}, but also
       
   722   a predicate @{term "findzero_dom"} that characterizes the values where the function
       
   723   terminates: the \emph{domain} of the function. If we treat a
       
   724   partial function just as a total function with an additional domain
       
   725   predicate, we can derive simplification and
       
   726   induction rules as we do for total functions. They are guarded
       
   727   by domain conditions and are called @{text psimps} and @{text
       
   728   pinduct}: 
       
   729 *}
       
   730 
       
   731 text {*
       
   732   \noindent\begin{minipage}{0.79\textwidth}@{thm[display,margin=85] findzero.psimps}\end{minipage}
       
   733   \hfill(@{text "findzero.psimps"})
       
   734   \vspace{1em}
       
   735 
       
   736   \noindent\begin{minipage}{0.79\textwidth}@{thm[display,margin=85] findzero.pinduct}\end{minipage}
       
   737   \hfill(@{text "findzero.pinduct"})
       
   738 *}
       
   739 
       
   740 text {*
       
   741   Remember that all we
       
   742   are doing here is use some tricks to make a total function appear
       
   743   as if it was partial. We can still write the term @{term "findzero
       
   744   (\<lambda>x. 1) 0"} and like any other term of type @{typ nat} it is equal
       
   745   to some natural number, although we might not be able to find out
       
   746   which one. The function is \emph{underdefined}.
       
   747 
       
   748   But it is defined enough to prove something interesting about it. We
       
   749   can prove that if @{term "findzero f n"}
       
   750   terminates, it indeed returns a zero of @{term f}:
       
   751 *}
       
   752 
       
   753 lemma findzero_zero: "findzero_dom (f, n) \<Longrightarrow> f (findzero f n) = 0"
       
   754 
       
   755 txt {* \noindent We apply induction as usual, but using the partial induction
       
   756   rule: *}
       
   757 
       
   758 apply (induct f n rule: findzero.pinduct)
       
   759 
       
   760 txt {* \noindent This gives the following subgoals:
       
   761 
       
   762   @{subgoals[display,indent=0]}
       
   763 
       
   764   \noindent The hypothesis in our lemma was used to satisfy the first premise in
       
   765   the induction rule. However, we also get @{term
       
   766   "findzero_dom (f, n)"} as a local assumption in the induction step. This
       
   767   allows to unfold @{term "findzero f n"} using the @{text psimps}
       
   768   rule, and the rest is trivial. Since the @{text psimps} rules carry the
       
   769   @{text "[simp]"} attribute by default, we just need a single step:
       
   770  *}
       
   771 apply simp
       
   772 done
       
   773 
       
   774 text {*
       
   775   Proofs about partial functions are often not harder than for total
       
   776   functions. Fig.~\ref{findzero_isar} shows a slightly more
       
   777   complicated proof written in Isar. It is verbose enough to show how
       
   778   partiality comes into play: From the partial induction, we get an
       
   779   additional domain condition hypothesis. Observe how this condition
       
   780   is applied when calls to @{term findzero} are unfolded.
       
   781 *}
       
   782 
       
   783 text_raw {*
       
   784 \begin{figure}
       
   785 \hrule\vspace{6pt}
       
   786 \begin{minipage}{0.8\textwidth}
       
   787 \isabellestyle{it}
       
   788 \isastyle\isamarkuptrue
       
   789 *}
       
   790 lemma "\<lbrakk>findzero_dom (f, n); x \<in> {n ..< findzero f n}\<rbrakk> \<Longrightarrow> f x \<noteq> 0"
       
   791 proof (induct rule: findzero.pinduct)
       
   792   fix f n assume dom: "findzero_dom (f, n)"
       
   793                and IH: "\<lbrakk>f n \<noteq> 0; x \<in> {Suc n ..< findzero f (Suc n)}\<rbrakk> \<Longrightarrow> f x \<noteq> 0"
       
   794                and x_range: "x \<in> {n ..< findzero f n}"
       
   795   have "f n \<noteq> 0"
       
   796   proof 
       
   797     assume "f n = 0"
       
   798     with dom have "findzero f n = n" by simp
       
   799     with x_range show False by auto
       
   800   qed
       
   801   
       
   802   from x_range have "x = n \<or> x \<in> {Suc n ..< findzero f n}" by auto
       
   803   thus "f x \<noteq> 0"
       
   804   proof
       
   805     assume "x = n"
       
   806     with `f n \<noteq> 0` show ?thesis by simp
       
   807   next
       
   808     assume "x \<in> {Suc n ..< findzero f n}"
       
   809     with dom and `f n \<noteq> 0` have "x \<in> {Suc n ..< findzero f (Suc n)}" by simp
       
   810     with IH and `f n \<noteq> 0`
       
   811     show ?thesis by simp
       
   812   qed
       
   813 qed
       
   814 text_raw {*
       
   815 \isamarkupfalse\isabellestyle{tt}
       
   816 \end{minipage}\vspace{6pt}\hrule
       
   817 \caption{A proof about a partial function}\label{findzero_isar}
       
   818 \end{figure}
       
   819 *}
       
   820 
       
   821 subsection {* Partial termination proofs *}
       
   822 
       
   823 text {*
       
   824   Now that we have proved some interesting properties about our
       
   825   function, we should turn to the domain predicate and see if it is
       
   826   actually true for some values. Otherwise we would have just proved
       
   827   lemmas with @{term False} as a premise.
       
   828 
       
   829   Essentially, we need some introduction rules for @{text
       
   830   findzero_dom}. The function package can prove such domain
       
   831   introduction rules automatically. But since they are not used very
       
   832   often (they are almost never needed if the function is total), this
       
   833   functionality is disabled by default for efficiency reasons. So we have to go
       
   834   back and ask for them explicitly by passing the @{text
       
   835   "(domintros)"} option to the function package:
       
   836 
       
   837 \vspace{1ex}
       
   838 \noindent\cmd{function} @{text "(domintros) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
       
   839 \cmd{where}\isanewline%
       
   840 \ \ \ldots\\
       
   841 
       
   842   \noindent Now the package has proved an introduction rule for @{text findzero_dom}:
       
   843 *}
       
   844 
       
   845 thm findzero.domintros
       
   846 
       
   847 text {*
       
   848   @{thm[display] findzero.domintros}
       
   849 
       
   850   Domain introduction rules allow to show that a given value lies in the
       
   851   domain of a function, if the arguments of all recursive calls
       
   852   are in the domain as well. They allow to do a \qt{single step} in a
       
   853   termination proof. Usually, you want to combine them with a suitable
       
   854   induction principle.
       
   855 
       
   856   Since our function increases its argument at recursive calls, we
       
   857   need an induction principle which works \qt{backwards}. We will use
       
   858   @{text inc_induct}, which allows to do induction from a fixed number
       
   859   \qt{downwards}:
       
   860 
       
   861   \begin{center}@{thm inc_induct}\hfill(@{text "inc_induct"})\end{center}
       
   862 
       
   863   Figure \ref{findzero_term} gives a detailed Isar proof of the fact
       
   864   that @{text findzero} terminates if there is a zero which is greater
       
   865   or equal to @{term n}. First we derive two useful rules which will
       
   866   solve the base case and the step case of the induction. The
       
   867   induction is then straightforward, except for the unusual induction
       
   868   principle.
       
   869 
       
   870 *}
       
   871 
       
   872 text_raw {*
       
   873 \begin{figure}
       
   874 \hrule\vspace{6pt}
       
   875 \begin{minipage}{0.8\textwidth}
       
   876 \isabellestyle{it}
       
   877 \isastyle\isamarkuptrue
       
   878 *}
       
   879 lemma findzero_termination:
       
   880   assumes "x \<ge> n" and "f x = 0"
       
   881   shows "findzero_dom (f, n)"
       
   882 proof - 
       
   883   have base: "findzero_dom (f, x)"
       
   884     by (rule findzero.domintros) (simp add:`f x = 0`)
       
   885 
       
   886   have step: "\<And>i. findzero_dom (f, Suc i) 
       
   887     \<Longrightarrow> findzero_dom (f, i)"
       
   888     by (rule findzero.domintros) simp
       
   889 
       
   890   from `x \<ge> n` show ?thesis
       
   891   proof (induct rule:inc_induct)
       
   892     show "findzero_dom (f, x)" by (rule base)
       
   893   next
       
   894     fix i assume "findzero_dom (f, Suc i)"
       
   895     thus "findzero_dom (f, i)" by (rule step)
       
   896   qed
       
   897 qed      
       
   898 text_raw {*
       
   899 \isamarkupfalse\isabellestyle{tt}
       
   900 \end{minipage}\vspace{6pt}\hrule
       
   901 \caption{Termination proof for @{text findzero}}\label{findzero_term}
       
   902 \end{figure}
       
   903 *}
       
   904       
       
   905 text {*
       
   906   Again, the proof given in Fig.~\ref{findzero_term} has a lot of
       
   907   detail in order to explain the principles. Using more automation, we
       
   908   can also have a short proof:
       
   909 *}
       
   910 
       
   911 lemma findzero_termination_short:
       
   912   assumes zero: "x >= n" 
       
   913   assumes [simp]: "f x = 0"
       
   914   shows "findzero_dom (f, n)"
       
   915 using zero
       
   916 by (induct rule:inc_induct) (auto intro: findzero.domintros)
       
   917     
       
   918 text {*
       
   919   \noindent It is simple to combine the partial correctness result with the
       
   920   termination lemma:
       
   921 *}
       
   922 
       
   923 lemma findzero_total_correctness:
       
   924   "f x = 0 \<Longrightarrow> f (findzero f 0) = 0"
       
   925 by (blast intro: findzero_zero findzero_termination)
       
   926 
       
   927 subsection {* Definition of the domain predicate *}
       
   928 
       
   929 text {*
       
   930   Sometimes it is useful to know what the definition of the domain
       
   931   predicate looks like. Actually, @{text findzero_dom} is just an
       
   932   abbreviation:
       
   933 
       
   934   @{abbrev[display] findzero_dom}
       
   935 
       
   936   The domain predicate is the \emph{accessible part} of a relation @{const
       
   937   findzero_rel}, which was also created internally by the function
       
   938   package. @{const findzero_rel} is just a normal
       
   939   inductive predicate, so we can inspect its definition by
       
   940   looking at the introduction rules @{text findzero_rel.intros}.
       
   941   In our case there is just a single rule:
       
   942 
       
   943   @{thm[display] findzero_rel.intros}
       
   944 
       
   945   The predicate @{const findzero_rel}
       
   946   describes the \emph{recursion relation} of the function
       
   947   definition. The recursion relation is a binary relation on
       
   948   the arguments of the function that relates each argument to its
       
   949   recursive calls. In general, there is one introduction rule for each
       
   950   recursive call.
       
   951 
       
   952   The predicate @{term "accp findzero_rel"} is the accessible part of
       
   953   that relation. An argument belongs to the accessible part, if it can
       
   954   be reached in a finite number of steps (cf.~its definition in @{text
       
   955   "Wellfounded.thy"}).
       
   956 
       
   957   Since the domain predicate is just an abbreviation, you can use
       
   958   lemmas for @{const accp} and @{const findzero_rel} directly. Some
       
   959   lemmas which are occasionally useful are @{text accpI}, @{text
       
   960   accp_downward}, and of course the introduction and elimination rules
       
   961   for the recursion relation @{text "findzero.intros"} and @{text "findzero.cases"}.
       
   962 *}
       
   963 
       
   964 (*lemma findzero_nicer_domintros:
       
   965   "f x = 0 \<Longrightarrow> findzero_dom (f, x)"
       
   966   "findzero_dom (f, Suc x) \<Longrightarrow> findzero_dom (f, x)"
       
   967 by (rule accpI, erule findzero_rel.cases, auto)+
       
   968 *)
       
   969   
       
   970 subsection {* A Useful Special Case: Tail recursion *}
       
   971 
       
   972 text {*
       
   973   The domain predicate is our trick that allows us to model partiality
       
   974   in a world of total functions. The downside of this is that we have
       
   975   to carry it around all the time. The termination proof above allowed
       
   976   us to replace the abstract @{term "findzero_dom (f, n)"} by the more
       
   977   concrete @{term "(x \<ge> n \<and> f x = (0::nat))"}, but the condition is still
       
   978   there and can only be discharged for special cases.
       
   979   In particular, the domain predicate guards the unfolding of our
       
   980   function, since it is there as a condition in the @{text psimp}
       
   981   rules. 
       
   982 
       
   983   Now there is an important special case: We can actually get rid
       
   984   of the condition in the simplification rules, \emph{if the function
       
   985   is tail-recursive}. The reason is that for all tail-recursive
       
   986   equations there is a total function satisfying them, even if they
       
   987   are non-terminating. 
       
   988 
       
   989 %  A function is tail recursive, if each call to the function is either
       
   990 %  equal
       
   991 %
       
   992 %  So the outer form of the 
       
   993 %
       
   994 %if it can be written in the following
       
   995 %  form:
       
   996 %  {term[display] "f x = (if COND x then BASE x else f (LOOP x))"}
       
   997 
       
   998 
       
   999   The function package internally does the right construction and can
       
  1000   derive the unconditional simp rules, if we ask it to do so. Luckily,
       
  1001   our @{const "findzero"} function is tail-recursive, so we can just go
       
  1002   back and add another option to the \cmd{function} command:
       
  1003 
       
  1004 \vspace{1ex}
       
  1005 \noindent\cmd{function} @{text "(domintros, tailrec) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
       
  1006 \cmd{where}\isanewline%
       
  1007 \ \ \ldots\\%
       
  1008 
       
  1009   
       
  1010   \noindent Now, we actually get unconditional simplification rules, even
       
  1011   though the function is partial:
       
  1012 *}
       
  1013 
       
  1014 thm findzero.simps
       
  1015 
       
  1016 text {*
       
  1017   @{thm[display] findzero.simps}
       
  1018 
       
  1019   \noindent Of course these would make the simplifier loop, so we better remove
       
  1020   them from the simpset:
       
  1021 *}
       
  1022 
       
  1023 declare findzero.simps[simp del]
       
  1024 
       
  1025 text {* 
       
  1026   Getting rid of the domain conditions in the simplification rules is
       
  1027   not only useful because it simplifies proofs. It is also required in
       
  1028   order to use Isabelle's code generator to generate ML code
       
  1029   from a function definition.
       
  1030   Since the code generator only works with equations, it cannot be
       
  1031   used with @{text "psimp"} rules. Thus, in order to generate code for
       
  1032   partial functions, they must be defined as a tail recursion.
       
  1033   Luckily, many functions have a relatively natural tail recursive
       
  1034   definition.
       
  1035 *}
       
  1036 
       
  1037 section {* Nested recursion *}
       
  1038 
       
  1039 text {*
       
  1040   Recursive calls which are nested in one another frequently cause
       
  1041   complications, since their termination proof can depend on a partial
       
  1042   correctness property of the function itself. 
       
  1043 
       
  1044   As a small example, we define the \qt{nested zero} function:
       
  1045 *}
       
  1046 
       
  1047 function nz :: "nat \<Rightarrow> nat"
       
  1048 where
       
  1049   "nz 0 = 0"
       
  1050 | "nz (Suc n) = nz (nz n)"
       
  1051 by pat_completeness auto
       
  1052 
       
  1053 text {*
       
  1054   If we attempt to prove termination using the identity measure on
       
  1055   naturals, this fails:
       
  1056 *}
       
  1057 
       
  1058 termination
       
  1059   apply (relation "measure (\<lambda>n. n)")
       
  1060   apply auto
       
  1061 
       
  1062 txt {*
       
  1063   We get stuck with the subgoal
       
  1064 
       
  1065   @{subgoals[display]}
       
  1066 
       
  1067   Of course this statement is true, since we know that @{const nz} is
       
  1068   the zero function. And in fact we have no problem proving this
       
  1069   property by induction.
       
  1070 *}
       
  1071 (*<*)oops(*>*)
       
  1072 lemma nz_is_zero: "nz_dom n \<Longrightarrow> nz n = 0"
       
  1073   by (induct rule:nz.pinduct) auto
       
  1074 
       
  1075 text {*
       
  1076   We formulate this as a partial correctness lemma with the condition
       
  1077   @{term "nz_dom n"}. This allows us to prove it with the @{text
       
  1078   pinduct} rule before we have proved termination. With this lemma,
       
  1079   the termination proof works as expected:
       
  1080 *}
       
  1081 
       
  1082 termination
       
  1083   by (relation "measure (\<lambda>n. n)") (auto simp: nz_is_zero)
       
  1084 
       
  1085 text {*
       
  1086   As a general strategy, one should prove the statements needed for
       
  1087   termination as a partial property first. Then they can be used to do
       
  1088   the termination proof. This also works for less trivial
       
  1089   examples. Figure \ref{f91} defines the 91-function, a well-known
       
  1090   challenge problem due to John McCarthy, and proves its termination.
       
  1091 *}
       
  1092 
       
  1093 text_raw {*
       
  1094 \begin{figure}
       
  1095 \hrule\vspace{6pt}
       
  1096 \begin{minipage}{0.8\textwidth}
       
  1097 \isabellestyle{it}
       
  1098 \isastyle\isamarkuptrue
       
  1099 *}
       
  1100 
       
  1101 function f91 :: "nat \<Rightarrow> nat"
       
  1102 where
       
  1103   "f91 n = (if 100 < n then n - 10 else f91 (f91 (n + 11)))"
       
  1104 by pat_completeness auto
       
  1105 
       
  1106 lemma f91_estimate: 
       
  1107   assumes trm: "f91_dom n" 
       
  1108   shows "n < f91 n + 11"
       
  1109 using trm by induct auto
       
  1110 
       
  1111 termination
       
  1112 proof
       
  1113   let ?R = "measure (\<lambda>x. 101 - x)"
       
  1114   show "wf ?R" ..
       
  1115 
       
  1116   fix n :: nat assume "\<not> 100 < n" -- "Assumptions for both calls"
       
  1117 
       
  1118   thus "(n + 11, n) \<in> ?R" by simp -- "Inner call"
       
  1119 
       
  1120   assume inner_trm: "f91_dom (n + 11)" -- "Outer call"
       
  1121   with f91_estimate have "n + 11 < f91 (n + 11) + 11" .
       
  1122   with `\<not> 100 < n` show "(f91 (n + 11), n) \<in> ?R" by simp
       
  1123 qed
       
  1124 
       
  1125 text_raw {*
       
  1126 \isamarkupfalse\isabellestyle{tt}
       
  1127 \end{minipage}
       
  1128 \vspace{6pt}\hrule
       
  1129 \caption{McCarthy's 91-function}\label{f91}
       
  1130 \end{figure}
       
  1131 *}
       
  1132 
       
  1133 
       
  1134 section {* Higher-Order Recursion *}
       
  1135 
       
  1136 text {*
       
  1137   Higher-order recursion occurs when recursive calls
       
  1138   are passed as arguments to higher-order combinators such as @{const
       
  1139   map}, @{term filter} etc.
       
  1140   As an example, imagine a datatype of n-ary trees:
       
  1141 *}
       
  1142 
       
  1143 datatype 'a tree = 
       
  1144   Leaf 'a 
       
  1145 | Branch "'a tree list"
       
  1146 
       
  1147 
       
  1148 text {* \noindent We can define a function which swaps the left and right subtrees recursively, using the 
       
  1149   list functions @{const rev} and @{const map}: *}
       
  1150 
       
  1151 fun mirror :: "'a tree \<Rightarrow> 'a tree"
       
  1152 where
       
  1153   "mirror (Leaf n) = Leaf n"
       
  1154 | "mirror (Branch l) = Branch (rev (map mirror l))"
       
  1155 
       
  1156 text {*
       
  1157   Although the definition is accepted without problems, let us look at the termination proof:
       
  1158 *}
       
  1159 
       
  1160 termination proof
       
  1161   txt {*
       
  1162 
       
  1163   As usual, we have to give a wellfounded relation, such that the
       
  1164   arguments of the recursive calls get smaller. But what exactly are
       
  1165   the arguments of the recursive calls when mirror is given as an
       
  1166   argument to @{const map}? Isabelle gives us the
       
  1167   subgoals
       
  1168 
       
  1169   @{subgoals[display,indent=0]} 
       
  1170 
       
  1171   So the system seems to know that @{const map} only
       
  1172   applies the recursive call @{term "mirror"} to elements
       
  1173   of @{term "l"}, which is essential for the termination proof.
       
  1174 
       
  1175   This knowledge about @{const map} is encoded in so-called congruence rules,
       
  1176   which are special theorems known to the \cmd{function} command. The
       
  1177   rule for @{const map} is
       
  1178 
       
  1179   @{thm[display] map_cong}
       
  1180 
       
  1181   You can read this in the following way: Two applications of @{const
       
  1182   map} are equal, if the list arguments are equal and the functions
       
  1183   coincide on the elements of the list. This means that for the value 
       
  1184   @{term "map f l"} we only have to know how @{term f} behaves on
       
  1185   the elements of @{term l}.
       
  1186 
       
  1187   Usually, one such congruence rule is
       
  1188   needed for each higher-order construct that is used when defining
       
  1189   new functions. In fact, even basic functions like @{const
       
  1190   If} and @{const Let} are handled by this mechanism. The congruence
       
  1191   rule for @{const If} states that the @{text then} branch is only
       
  1192   relevant if the condition is true, and the @{text else} branch only if it
       
  1193   is false:
       
  1194 
       
  1195   @{thm[display] if_cong}
       
  1196   
       
  1197   Congruence rules can be added to the
       
  1198   function package by giving them the @{term fundef_cong} attribute.
       
  1199 
       
  1200   The constructs that are predefined in Isabelle, usually
       
  1201   come with the respective congruence rules.
       
  1202   But if you define your own higher-order functions, you may have to
       
  1203   state and prove the required congruence rules yourself, if you want to use your
       
  1204   functions in recursive definitions. 
       
  1205 *}
       
  1206 (*<*)oops(*>*)
       
  1207 
       
  1208 subsection {* Congruence Rules and Evaluation Order *}
       
  1209 
       
  1210 text {* 
       
  1211   Higher order logic differs from functional programming languages in
       
  1212   that it has no built-in notion of evaluation order. A program is
       
  1213   just a set of equations, and it is not specified how they must be
       
  1214   evaluated. 
       
  1215 
       
  1216   However for the purpose of function definition, we must talk about
       
  1217   evaluation order implicitly, when we reason about termination.
       
  1218   Congruence rules express that a certain evaluation order is
       
  1219   consistent with the logical definition. 
       
  1220 
       
  1221   Consider the following function.
       
  1222 *}
       
  1223 
       
  1224 function f :: "nat \<Rightarrow> bool"
       
  1225 where
       
  1226   "f n = (n = 0 \<or> f (n - 1))"
       
  1227 (*<*)by pat_completeness auto(*>*)
       
  1228 
       
  1229 text {*
       
  1230   For this definition, the termination proof fails. The default configuration
       
  1231   specifies no congruence rule for disjunction. We have to add a
       
  1232   congruence rule that specifies left-to-right evaluation order:
       
  1233 
       
  1234   \vspace{1ex}
       
  1235   \noindent @{thm disj_cong}\hfill(@{text "disj_cong"})
       
  1236   \vspace{1ex}
       
  1237 
       
  1238   Now the definition works without problems. Note how the termination
       
  1239   proof depends on the extra condition that we get from the congruence
       
  1240   rule.
       
  1241 
       
  1242   However, as evaluation is not a hard-wired concept, we
       
  1243   could just turn everything around by declaring a different
       
  1244   congruence rule. Then we can make the reverse definition:
       
  1245 *}
       
  1246 
       
  1247 lemma disj_cong2[fundef_cong]: 
       
  1248   "(\<not> Q' \<Longrightarrow> P = P') \<Longrightarrow> (Q = Q') \<Longrightarrow> (P \<or> Q) = (P' \<or> Q')"
       
  1249   by blast
       
  1250 
       
  1251 fun f' :: "nat \<Rightarrow> bool"
       
  1252 where
       
  1253   "f' n = (f' (n - 1) \<or> n = 0)"
       
  1254 
       
  1255 text {*
       
  1256   \noindent These examples show that, in general, there is no \qt{best} set of
       
  1257   congruence rules.
       
  1258 
       
  1259   However, such tweaking should rarely be necessary in
       
  1260   practice, as most of the time, the default set of congruence rules
       
  1261   works well.
       
  1262 *}
       
  1263 
       
  1264 end