doc-src/IsarAdvanced/Functions/Thy/Functions.thy
changeset 23188 595a0e24bd8e
parent 23003 4b0bf04a4d68
child 23805 953eb3c5f793
equal deleted inserted replaced
23187:6fc9c1eca94d 23188:595a0e24bd8e
     7 
     7 
     8 theory Functions
     8 theory Functions
     9 imports Main
     9 imports Main
    10 begin
    10 begin
    11 
    11 
    12 section {* Function Definition for Dummies *}
    12 section {* Function Definitions for Dummies *}
    13 
    13 
    14 text {*
    14 text {*
    15   In most cases, defining a recursive function is just as simple as other definitions:
    15   In most cases, defining a recursive function is just as simple as other definitions:
    16 
       
    17   Like in functional programming, a function definition consists of a 
       
    18 
       
    19 *}
    16 *}
    20 
    17 
    21 fun fib :: "nat \<Rightarrow> nat"
    18 fun fib :: "nat \<Rightarrow> nat"
    22 where
    19 where
    23   "fib 0 = 1"
    20   "fib 0 = 1"
    25 | "fib (Suc (Suc n)) = fib n + fib (Suc n)"
    22 | "fib (Suc (Suc n)) = fib n + fib (Suc n)"
    26 
    23 
    27 text {*
    24 text {*
    28   The syntax is rather self-explanatory: We introduce a function by
    25   The syntax is rather self-explanatory: We introduce a function by
    29   giving its name, its type and a set of defining recursive
    26   giving its name, its type and a set of defining recursive
    30   equations. 
    27   equations. Note that the function is not primitive recursive.
    31 *}
    28 *}
    32 
       
    33 
       
    34 
       
    35 
       
    36 
    29 
    37 text {*
    30 text {*
    38   The function always terminates, since its argument gets smaller in
    31   The function always terminates, since its argument gets smaller in
    39   every recursive call. Termination is an important requirement, since
    32   every recursive call. 
    40   it prevents inconsistencies: From the "definition" @{text "f(n) =
    33   Since HOL is a logic of total functions, termination is a
    41   f(n) + 1"} we could prove @{text "0 = 1"} by subtracting @{text
    34   fundamental requirement to prevent inconsistencies\footnote{From the
    42   "f(n)"} on both sides.
    35   \qt{definition} @{text "f(n) = f(n) + 1"} we could prove 
    43 
    36   @{text "0 = 1"} by subtracting @{text "f(n)"} on both sides.}.
    44   Isabelle tries to prove termination automatically when a function is
    37 
    45   defined. We will later look at cases where this fails and see what to
    38   Isabelle tries to prove termination automatically when a definition
    46   do then.
    39   is made. In \S\ref{termination}, we will look at cases where this
       
    40   fails and see what to do then.
    47 *}
    41 *}
    48 
    42 
    49 subsection {* Pattern matching *}
    43 subsection {* Pattern matching *}
    50 
    44 
    51 text {* \label{patmatch}
    45 text {* \label{patmatch}
    52   Like in functional programming, we can use pattern matching to
    46   Like in functional programming, we can use pattern matching to
    53   define functions. At the moment we will only consider \emph{constructor
    47   define functions. At the moment we will only consider \emph{constructor
    54   patterns}, which only consist of datatype constructors and
    48   patterns}, which only consist of datatype constructors and
    55   variables.
    49   (linear occurrences of) variables.
    56 
    50 
    57   If patterns overlap, the order of the equations is taken into
    51   If patterns overlap, the order of the equations is taken into
    58   account. The following function inserts a fixed element between any
    52   account. The following function inserts a fixed element between any
    59   two elements of a list:
    53   two elements of a list:
    60 *}
    54 *}
    63 where
    57 where
    64   "sep a (x#y#xs) = x # a # sep a (y # xs)"
    58   "sep a (x#y#xs) = x # a # sep a (y # xs)"
    65 | "sep a xs       = xs"
    59 | "sep a xs       = xs"
    66 
    60 
    67 text {* 
    61 text {* 
    68   Overlapping patterns are interpreted as "increments" to what is
    62   Overlapping patterns are interpreted as \qt{increments} to what is
    69   already there: The second equation is only meant for the cases where
    63   already there: The second equation is only meant for the cases where
    70   the first one does not match. Consequently, Isabelle replaces it
    64   the first one does not match. Consequently, Isabelle replaces it
    71   internally by the remaining cases, making the patterns disjoint:
    65   internally by the remaining cases, making the patterns disjoint:
    72 *}
    66 *}
    73 
    67 
    78 text {* 
    72 text {* 
    79   The equations from function definitions are automatically used in
    73   The equations from function definitions are automatically used in
    80   simplification:
    74   simplification:
    81 *}
    75 *}
    82 
    76 
    83 lemma "sep (0::nat) [1, 2, 3] = [1, 0, 2, 0, 3]"
    77 lemma "sep 0 [1, 2, 3] = [1, 0, 2, 0, 3]"
    84 by simp
    78 by simp
    85 
    79 
    86 subsection {* Induction *}
    80 subsection {* Induction *}
    87 
    81 
    88 text {*
    82 text {*
    89 
    83 
    90   Isabelle provides customized induction rules for recursive functions.  
    84   Isabelle provides customized induction rules for recursive functions.  
    91   See \cite[\S3.5.4]{isa-tutorial}.
    85   See \cite[\S3.5.4]{isa-tutorial}. \fixme{Cases?}
    92 *}
    86 
    93 
    87 
    94 
    88   With the \cmd{fun} command, you can define about 80\% of the
    95 section {* Full form definitions *}
    89   functions that occur in practice. The rest of this tutorial explains
       
    90   the remaining 20\%.
       
    91 *}
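To illustrate these induction rules, here is a small sketch: a property of
@{const sep} proved by induction along its recursion structure, using the
@{text "sep.induct"} rule just mentioned (the lemma itself is only an
illustration).

lemma "map f (sep x ys) = sep (f x) (map f ys)"
by (induct x ys rule: sep.induct) auto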
       
    92 
       
    93 
       
    94 section {* fun vs.\ function *}
    96 
    95 
    97 text {* 
    96 text {* 
    98   Up to now, we were using the \cmd{fun} command, which provides a
    97   The \cmd{fun} command provides a
    99   convenient shorthand notation for simple function definitions. In
    98   convenient shorthand notation for simple function definitions. In
   100   this mode, Isabelle tries to solve all the necessary proof obligations
    99   this mode, Isabelle tries to solve all the necessary proof obligations
   101   automatically. If a proof does not go through, the definition is
   100   automatically. If a proof fails, the definition is
   102   rejected. This can either mean that the definition is indeed faulty,
   101   rejected. This can either mean that the definition is indeed faulty,
   103   or that the default proof procedures are just not smart enough (or
   102   or that the default proof procedures are just not smart enough (or
   104   rather: not designed) to handle the definition.
   103   rather: not designed) to handle the definition.
   105 
   104 
   106   By expanding the abbreviated \cmd{fun} to the full \cmd{function}
   105   By expanding the abbreviation to the more verbose \cmd{function} command, these proof obligations become visible and can be analyzed or
   107   command, the proof obligations become visible and can be analyzed or
   106   solved manually. The expansion from \cmd{fun} to \cmd{function} is as follows:
   108   solved manually.
       
   109 
   107 
   110 \end{isamarkuptext}
   108 \end{isamarkuptext}
   111 
   109 
   112 
   110 
   113 \fbox{\parbox{\textwidth}{
   111 \[\left[\;\begin{minipage}{0.25\textwidth}\vspace{6pt}
   114 \noindent\cmd{fun} @{text "f :: \<tau>"}\\%
   112 \cmd{fun} @{text "f :: \<tau>"}\\%
   115 \cmd{where}\isanewline%
   113 \cmd{where}\\%
   116 \ \ {\it equations}\isanewline%
   114 \hspace*{2ex}{\it equations}\\%
   117 \ \ \quad\vdots
   115 \hspace*{2ex}\vdots\vspace*{6pt}
   118 }}
   116 \end{minipage}\right]
   119 
   117 \quad\equiv\quad
   120 \begin{isamarkuptext}
   118 \left[\;\begin{minipage}{0.45\textwidth}\vspace{6pt}
   121 \vspace*{1em}
   119 \cmd{function} @{text "("}\cmd{sequential}@{text ") f :: \<tau>"}\\%
   122 \noindent abbreviates
   120 \cmd{where}\\%
   123 \end{isamarkuptext}
   121 \hspace*{2ex}{\it equations}\\%
   124 
   122 \hspace*{2ex}\vdots\\%
   125 \fbox{\parbox{\textwidth}{
       
   126 \noindent\cmd{function} @{text "("}\cmd{sequential}@{text ") f :: \<tau>"}\\%
       
   127 \cmd{where}\isanewline%
       
   128 \ \ {\it equations}\isanewline%
       
   129 \ \ \quad\vdots\\%
       
   130 \cmd{by} @{text "pat_completeness auto"}\\%
   123 \cmd{by} @{text "pat_completeness auto"}\\%
   131 \cmd{termination by} @{text "lexicographic_order"}
   124 \cmd{termination by} @{text "lexicographic_order"}\vspace{6pt}
   132 }}
   125 \end{minipage}
       
   126 \right]\]
   133 
   127 
   134 \begin{isamarkuptext}
   128 \begin{isamarkuptext}
   135   \vspace*{1em}
   129   \vspace*{1em}
   136   \noindent Some declarations and proofs have now become explicit:
   130   \noindent Some details have now become explicit:
   137 
   131 
   138   \begin{enumerate}
   132   \begin{enumerate}
   139   \item The \cmd{sequential} option enables the preprocessing of
   133   \item The \cmd{sequential} option enables the preprocessing of
   140   pattern overlaps we already saw. Without this option, the equations
   134   pattern overlaps we already saw. Without this option, the equations
   141   must already be disjoint and complete. The automatic completion only
   135   must already be disjoint and complete. The automatic completion only
   142   works with datatype patterns.
   136   works with constructor patterns.
   143 
   137 
   144   \item A function definition now produces a proof obligation which
   138   \item A function definition produces a proof obligation which
   145   expresses completeness and compatibility of patterns (We talk about
   139   expresses completeness and compatibility of patterns (we talk about
   146   this later). The combination of the methods @{text "pat_completeness"} and
   140   this later). The combination of the methods @{text "pat_completeness"} and
   147   @{text "auto"} is used to solve this proof obligation.
   141   @{text "auto"} is used to solve this proof obligation.
   148 
   142 
   149   \item A termination proof follows the definition, started by the
   143   \item A termination proof follows the definition, started by the
   150   \cmd{termination} command, which sets up the goal. The @{text
   144   \cmd{termination} command. This will be explained in \S\ref{termination}.
   151   "lexicographic_order"} method can prove termination of a certain
       
   152   class of functions by searching for a suitable lexicographic
       
   153   combination of size measures.
       
   154  \end{enumerate}
   145  \end{enumerate}
   155   Whenever a \cmd{fun} command fails, it is usually a good idea to
   146   Whenever a \cmd{fun} command fails, it is usually a good idea to
   156   expand the syntax to the more verbose \cmd{function} form, to see
   147   expand the syntax to the more verbose \cmd{function} form, to see
   157   what is actually going on.
   148   what is actually going on.
   158  *}
   149  *}
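For instance, the @{text fib} function from the beginning of this tutorial
could just as well be written in the expanded form. A sketch, using a fresh
name only to avoid a clash with the existing definition:

function (sequential) fib' :: "nat \<Rightarrow> nat"
where
  "fib' 0 = 1"
| "fib' (Suc 0) = 1"
| "fib' (Suc (Suc n)) = fib' n + fib' (Suc n)"
(* pattern completeness and compatibility, then termination, as in the scheme above *)
by pat_completeness auto
termination by lexicographic_order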
   159 
   150 
   160 
   151 
   161 section {* Proving termination *}
   152 section {* Termination *}
   162 
   153 
   163 text {*
   154 text {*\label{termination}
       
   155   The @{text "lexicographic_order"} method can prove termination of a
       
   156   certain class of functions by searching for a suitable lexicographic
       
   157   combination of size measures. Of course, not all functions have such
       
   158   a simple termination argument.
       
   159 *}
       
   160 
       
   161 subsection {* The {\tt relation} method *}
       
   162 text{*
   164   Consider the following function, which sums up natural numbers up to
   163   Consider the following function, which sums up natural numbers up to
   165   @{text "N"}, using a counter @{text "i"}:
   164   @{text "N"}, using a counter @{text "i"}:
   166 *}
   165 *}
   167 
   166 
   168 function sum :: "nat \<Rightarrow> nat \<Rightarrow> nat"
   167 function sum :: "nat \<Rightarrow> nat \<Rightarrow> nat"
   171 by pat_completeness auto
   170 by pat_completeness auto
   172 
   171 
   173 text {*
   172 text {*
   174   \noindent The @{text "lexicographic_order"} method fails on this example, because none of the
   173   \noindent The @{text "lexicographic_order"} method fails on this example, because none of the
   175   arguments decreases in the recursive call.
   174   arguments decreases in the recursive call.
   176 
   175   % FIXME: simps and induct only appear after "termination"
   177   A more general method for termination proofs is to supply a wellfounded
   176 
       
   177   The easiest way of doing termination proofs is to supply a wellfounded
   178   relation on the argument type, and to show that the argument
   178   relation on the argument type, and to show that the argument
   179   decreases in every recursive call. 
   179   decreases in every recursive call. 
   180 
   180 
   181   The termination argument for @{text "sum"} is based on the fact that
   181   The termination argument for @{text "sum"} is based on the fact that
   182   the \emph{difference} between @{text "i"} and @{text "N"} gets
   182   the \emph{difference} between @{text "i"} and @{text "N"} gets
   186 
   186 
   187   We can use this expression as a measure function suitable to prove termination.
   187   We can use this expression as a measure function suitable to prove termination.
   188 *}
   188 *}
   189 
   189 
   190 termination sum
   190 termination sum
   191 by (relation "measure (\<lambda>(i,N). N + 1 - i)") auto
   191 apply (relation "measure (\<lambda>(i,N). N + 1 - i)")
   192 
   192 
   193 text {*
   193 txt {*
   194   The \cmd{termination} command sets up the termination goal for the
   194   The \cmd{termination} command sets up the termination goal for the
   195   specified function @{text "sum"}. If the function name is omitted it
   195   specified function @{text "sum"}. If the function name is omitted, it
   196   implicitly refers to the last function definition.
   196   implicitly refers to the last function definition.
   197 
   197 
   198   The @{text relation} method takes a relation of
   198   The @{text relation} method takes a relation of
   199   type @{typ "('a \<times> 'a) set"}, where @{typ "'a"} is the argument type of
   199   type @{typ "('a \<times> 'a) set"}, where @{typ "'a"} is the argument type of
   200   the function. If the function has multiple curried arguments, then
   200   the function. If the function has multiple curried arguments, then
   201   these are packed together into a tuple, as it happened in the above
   201   these are packed together into a tuple, as it happened in the above
   202   example.
   202   example.
   203 
   203 
   204   The predefined function @{term_type "measure"} is a very common way of
   204   The predefined function @{term_type "measure"} constructs a
   205   specifying termination relations in terms of a mapping into the
   205   wellfounded relation from a mapping into the natural numbers (a
   206   natural numbers.
   206   \emph{measure function}). 
   207 
   207 
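  Concretely, @{text "measure f"} contains exactly the pairs @{text "(x, y)"}
  with @{text "f x < f y"}, and every relation of this form is wellfounded.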
   208   After the invocation of @{text "relation"}, we must prove that (a)
   208   After the invocation of @{text "relation"}, we must prove that (a)
   209   the relation we supplied is wellfounded, and (b) that the arguments
   209   the relation we supplied is wellfounded, and (b) that the arguments
   210   of recursive calls indeed decrease with respect to the
   210   of recursive calls indeed decrease with respect to the
   211   relation. These goals are all solved by the subsequent call to
   211   relation:
   212   @{text "auto"}.
   212 
   213 
   213   @{subgoals[display,indent=0]}
       
   214 
       
   215   These goals are all solved by @{text "auto"}:
       
   216 *}
       
   217 
       
   218 apply auto
       
   219 done
       
   220 
       
   221 text {*
   214   Let us complicate the function a little, by adding some more
   222   Let us complicate the function a little, by adding some more
   215   recursive calls: 
   223   recursive calls: 
   216 *}
   224 *}
   217 
   225 
   218 function foo :: "nat \<Rightarrow> nat \<Rightarrow> nat"
   226 function foo :: "nat \<Rightarrow> nat \<Rightarrow> nat"
   234 *}
   242 *}
   235 
   243 
   236 termination 
   244 termination 
   237 by (relation "measures [\<lambda>(i, N). N, \<lambda>(i,N). N + 1 - i]") auto
   245 by (relation "measures [\<lambda>(i, N). N, \<lambda>(i,N). N + 1 - i]") auto
   238 
   246 
   239 subsection {* Manual Termination Proofs *}
   247 subsection {* How @{text "lexicographic_order"} works *}
   240 
   248 
   241 text {*
   249 (*fun fails :: "nat \<Rightarrow> nat list \<Rightarrow> nat"
   242   The @{text relation} method is often useful, but not
   250 where
   243   necessary. Since termination proofs are just normal Isabelle proofs,
   251   "fails a [] = a"
   244   they can also be carried out manually: 
   252 | "fails a (x#xs) = fails (x + a) (x # xs)"
   245 *}
   253 *)
   246 
   254 
   247 function id :: "nat \<Rightarrow> nat"
   255 text {*
   248 where
   256   To see how the automatic termination proofs work, let's look at an
   249   "id 0 = 0"
   257   example where it fails\footnote{For a detailed discussion of the
   250 | "id (Suc n) = Suc (id n)"
   258   termination prover, see \cite{bulwahnKN07}}:
       
   259 
       
   260 \end{isamarkuptext}  
       
   261 \cmd{fun} @{text "fails :: \"nat \<Rightarrow> nat list \<Rightarrow> nat\""}\\%
       
   262 \cmd{where}\\%
       
   263 \hspace*{2ex}@{text "\"fails a [] = a\""}\\%
       
   264 |\hspace*{1.5ex}@{text "\"fails a (x#xs) = fails (x + a) (x#xs)\""}\\
       
   265 \begin{isamarkuptext}
       
   266 
       
   267 \noindent Isabelle responds with the following error:
       
   268 
       
   269 \begin{isabelle}
       
   270 *** Could not find lexicographic termination order:\newline
       
   271 *** \ \ \ \ 1\ \ 2  \newline
       
   272 *** a:  N   <= \newline
       
   273 *** Calls:\newline
       
   274 *** a) @{text "(a, x # xs) -->> (x + a, x # xs)"}\newline
       
   275 *** Measures:\newline
       
   276 *** 1) @{text "\<lambda>x. size (fst x)"}\newline
       
   277 *** 2) @{text "\<lambda>x. size (snd x)"}\newline
       
   278 *** Unfinished subgoals:\newline
       
   279 *** @{text "\<And>a x xs."}\newline
       
    280 *** \quad @{text "(\<lambda>x. size (fst x)) (x + a, x # xs)"}\newline
       
   281 ***  \quad @{text "\<le> (\<lambda>x. size (fst x)) (a, x # xs)"}\newline
       
   282 ***  @{text "1. \<And>x. x = 0"}\newline
       
   283 *** At command "fun".\newline
       
   284 \end{isabelle}
       
   285 *}
       
   286 
       
   287 
       
   288 text {*
       
   289 
       
   290 
       
    291   The key to this error message is the matrix at the top. The rows
       
   292   of that matrix correspond to the different recursive calls (In our
       
   293   case, there is just one). The columns are the function's arguments 
       
   294   (expressed through different measure functions, which map the
       
   295   argument tuple to a natural number). 
       
   296 
       
   297   The contents of the matrix summarize what is known about argument
       
   298   descents: The second argument has a weak descent (@{text "<="}) at the
       
   299   recursive call, and for the first argument nothing could be proved,
       
   300   which is expressed by @{text N}. In general, there are the values
       
   301   @{text "<"}, @{text "<="} and @{text "N"}.
       
   302 
       
   303   For the failed proof attempts, the unfinished subgoals are also
       
   304   printed. Looking at these will often point us to a missing lemma.
       
   305 
       
   306 %  As a more real example, here is quicksort:
       
   307 *}
       
   308 (*
       
   309 function qs :: "nat list \<Rightarrow> nat list"
       
   310 where
       
   311   "qs [] = []"
       
   312 | "qs (x#xs) = qs [y\<in>xs. y < x] @ x # qs [y\<in>xs. y \<ge> x]"
   251 by pat_completeness auto
   313 by pat_completeness auto
   252 
   314 
   253 termination
   315 termination apply lexicographic_order
   254 proof 
   316 
   255   show "wf less_than" ..
    317 text {* If we try the @{text "lexicographic_order"} method, we get the
   256 next
   318   following error *}
   257   fix n show "(n, Suc n) \<in> less_than" by simp
   319 termination by (lexicographic_order simp:l2)
   258 qed
   320 
   259 
   321 lemma l: "x \<le> y \<Longrightarrow> x < Suc y" by arith
   260 text {*
   322 
   261   Of course this is just a trivial example, but manual proofs can
   323 function 
   262   sometimes be the only choice if faced with very hard termination problems.
   324 
   263 *}
   325 *)
   264 
       
   265 
   326 
   266 section {* Mutual Recursion *}
   327 section {* Mutual Recursion *}
   267 
   328 
   268 text {*
   329 text {*
   269   If two or more functions call one another mutually, they have to be defined
   330   If two or more functions call one another mutually, they have to be defined
   270   in one step. The simplest example are probably @{text "even"} and @{text "odd"}:
   331   in one step. Here are @{text "even"} and @{text "odd"}:
   271 *}
   332 *}
   272 
   333 
   273 function even :: "nat \<Rightarrow> bool"
   334 function even :: "nat \<Rightarrow> bool"
   274     and odd  :: "nat \<Rightarrow> bool"
   335     and odd  :: "nat \<Rightarrow> bool"
   275 where
   336 where
   278 | "even (Suc n) = odd n"
   339 | "even (Suc n) = odd n"
   279 | "odd (Suc n) = even n"
   340 | "odd (Suc n) = even n"
   280 by pat_completeness auto
   341 by pat_completeness auto
   281 
   342 
   282 text {*
   343 text {*
   283   To solve the problem of mutual dependencies, Isabelle internally
   344   To eliminate the mutual dependencies, Isabelle internally
   284   creates a single function operating on the sum
   345   creates a single function operating on the sum
   285   type. Then the original functions are defined as
   346   type @{typ "nat + nat"}. Then, @{const even} and @{const odd} are
   286   projections. Consequently, termination has to be proved
   347   defined as projections. Consequently, termination has to be proved
   287   simultaneously for both functions, by specifying a measure on the
   348   simultaneously for both functions, by specifying a measure on the
   288   sum type: 
   349   sum type: 
   289 *}
   350 *}
   290 
   351 
   291 termination 
   352 termination 
   292 by (relation "measure (\<lambda>x. case x of Inl n \<Rightarrow> n | Inr n \<Rightarrow> n)") 
   353 by (relation "measure (\<lambda>x. case x of Inl n \<Rightarrow> n | Inr n \<Rightarrow> n)") auto
   293    auto
   354 
       
   355 text {* 
       
   356   We could also have used @{text lexicographic_order}, which
       
   357   supports mutual recursive termination proofs to a certain extent.
       
   358 *}
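For instance, the explicit @{text relation} proof above could have been
replaced by the following one-liner (only one of the two termination proofs
can be kept, of course):

termination by lexicographic_order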
   294 
   359 
   295 subsection {* Induction for mutual recursion *}
   360 subsection {* Induction for mutual recursion *}
   296 
   361 
   297 text {*
   362 text {*
   298 
   363 
   299   When functions are mutually recursive, proving properties about them
   364   When functions are mutually recursive, proving properties about them
   300   generally requires simultaneous induction. The induction rules
   365   generally requires simultaneous induction. The induction rule @{text "even_odd.induct"}
   301   generated from the definitions reflect this.
   366   generated from the above definition reflects this.
   302 
   367 
   303   Let us prove something about @{const even} and @{const odd}:
   368   Let us prove something about @{const even} and @{const odd}:
   304 *}
   369 *}
   305 
   370 
   306 lemma 
   371 lemma even_odd_mod2:
   307   "even n = (n mod 2 = 0)"
   372   "even n = (n mod 2 = 0)"
   308   "odd n = (n mod 2 = 1)"
   373   "odd n = (n mod 2 = 1)"
   309 
   374 
   310 txt {* 
   375 txt {* 
   311   We apply simultaneous induction, specifying the induction variable
   376   We apply simultaneous induction, specifying the induction variable
   325 
   390 
   326 txt {* 
   391 txt {* 
   327   @{subgoals[display,indent=0]} 
   392   @{subgoals[display,indent=0]} 
   328 
   393 
    329   \noindent These can be handled by the decision procedure for
    394   \noindent These can be handled by the decision procedure for
    330   presburger arithmetic.
    395   arithmetic.
   331   
   396   
   332 *}
   397 *}
   333 
   398 
   334 apply presburger
   399 apply presburger -- {* \fixme{arith} *}
   335 apply presburger
   400 apply presburger
   336 done
   401 done
   337 
   402 
   338 text {*
   403 text {*
   339   Even if we were just interested in one of the statements proved by
   404   In proofs like this, the simultaneous induction is really essential:
   340   simultaneous induction, the other ones may be necessary to
   405   Even if we are just interested in one of the results, the other
   341   strengthen the induction hypothesis. If we had left out the statement
   406   one is necessary to strengthen the induction hypothesis. If we leave
   342   about @{const odd} (by substituting it with @{term "True"}, our
   407   out the statement about @{const odd} (by substituting it with @{term
   343   proof would have failed:
   408   "True"}), the same proof fails:
   344 *}
   409 *}
   345 
   410 
   346 lemma 
   411 lemma failed_attempt:
   347   "even n = (n mod 2 = 0)"
   412   "even n = (n mod 2 = 0)"
   348   "True"
   413   "True"
   349 apply (induct n rule: even_odd.induct)
   414 apply (induct n rule: even_odd.induct)
   350 
   415 
   351 txt {*
   416 txt {*
   352   \noindent Now the third subgoal is a dead end, since we have no
   417   \noindent Now the third subgoal is a dead end, since we have no
   353   useful induction hypothesis:
   418   useful induction hypothesis available:
   354 
   419 
   355   @{subgoals[display,indent=0]} 
   420   @{subgoals[display,indent=0]} 
   356 *}
   421 *}
   357 
   422 
   358 oops
   423 oops
   359 
   424 
   360 section {* More general patterns *}
   425 section {* General pattern matching *}
   361 
   426 
   362 subsection {* Avoiding pattern splitting *}
   427 subsection {* Avoiding automatic pattern splitting *}
   363 
   428 
   364 text {*
   429 text {*
   365 
   430 
   366   Up to now, we used pattern matching only on datatypes, and the
   431   Up to now, we used pattern matching only on datatypes, and the
   367   patterns were always disjoint and complete, and if they weren't,
   432   patterns were always disjoint and complete, and if they weren't,
   368   they were made disjoint automatically like in the definition of
   433   they were made disjoint automatically like in the definition of
   369   @{const "sep"} in \S\ref{patmatch}.
   434   @{const "sep"} in \S\ref{patmatch}.
   370 
   435 
   371   This splitting can significantly increase the number of equations
   436   This automatic splitting can significantly increase the number of
   372   involved, and is not always necessary. The following simple example
   437   equations involved, and this is not always desirable. The following
   373   shows the problem:
   438   example shows the problem:
   374   
   439   
   375   Suppose we are modelling incomplete knowledge about the world by a
   440   Suppose we are modelling incomplete knowledge about the world by a
   376   three-valued datatype, which has values @{term "T"}, @{term "F"}
   441   three-valued datatype, which has values @{term "T"}, @{term "F"}
   377   and @{term "X"} for true, false and uncertain propositions, respectively. 
   442   and @{term "X"} for true, false and uncertain propositions, respectively. 
   378 *}
   443 *}
   379 
   444 
   380 datatype P3 = T | F | X
   445 datatype P3 = T | F | X
   381 
   446 
   382 text {* Then the conjunction of such values can be defined as follows: *}
   447 text {* \noindent Then the conjunction of such values can be defined as follows: *}
   383 
   448 
   384 fun And :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
   449 fun And :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
   385 where
   450 where
   386   "And T p = p"
   451   "And T p = p"
   387 | "And p T = p"
   452 | "And p T = p"
   390 | "And X X = X"
   455 | "And X X = X"
   391 
   456 
   392 
   457 
   393 text {* 
   458 text {* 
   394   This definition is useful, because the equations can directly be used
   459   This definition is useful, because the equations can directly be used
   395   as rules to simplify expressions. But the patterns overlap, e.g.~the
    460   as simplification rules. But the patterns overlap: For example,
   396   expression @{term "And T T"} is matched by the first two
   461   the expression @{term "And T T"} is matched by both the first and
   397   equations. By default, Isabelle makes the patterns disjoint by
   462   the second equation. By default, Isabelle makes the patterns disjoint by
   398   splitting them up, producing instances:
   463   splitting them up, producing instances:
   399 *}
   464 *}
   400 
   465 
   401 thm And.simps
   466 thm And.simps
   402 
   467 
   405   
   470   
   406   \vspace*{1em}
   471   \vspace*{1em}
   407   \noindent There are several problems with this:
   472   \noindent There are several problems with this:
   408 
   473 
   409   \begin{enumerate}
   474   \begin{enumerate}
   410   \item When datatypes have many constructors, there can be an
   475   \item If the datatype has many constructors, there can be an
   411   explosion of equations. For @{const "And"}, we get seven instead of
   476   explosion of equations. For @{const "And"}, we get seven instead of
   412   five equations, which can be tolerated, but this is just a small
   477   five equations, which can be tolerated, but this is just a small
   413   example.
   478   example.
   414 
   479 
   415   \item Since splitting makes the equations "less general", they
   480   \item Since splitting makes the equations \qt{less general}, they
   416   do not always match in rewriting. While the term @{term "And x F"}
   481   do not always match in rewriting. While the term @{term "And x F"}
   417   can be simplified to @{term "F"} by the original specification, a
   482   can be simplified to @{term "F"} with the original equations, a
   418   (manual) case split on @{term "x"} is now necessary.
   483   (manual) case split on @{term "x"} is now necessary.
   419 
   484 
   420   \item The splitting also concerns the induction rule @{text
   485   \item The splitting also concerns the induction rule @{text
   421   "And.induct"}. Instead of five premises it now has seven, which
   486   "And.induct"}. Instead of five premises it now has seven, which
   422   means that our induction proofs will have more cases.
   487   means that our induction proofs will have more cases.
   423 
   488 
   424   \item In general, it increases clarity if we get the same definition
   489   \item In general, it increases clarity if we get the same definition
   425   back which we put in.
   490   back which we put in.
   426   \end{enumerate}
   491   \end{enumerate}
   427 
   492 
   428   On the other hand, a definition needs to be consistent and defining
   493   If we do not want the automatic splitting, we can switch it off by
   429   both @{term "f x = True"} and @{term "f x = False"} is a bad
   494   leaving out the \cmd{sequential} option. However, we will have to
   430   idea. So if we don't want Isabelle to mangle our definitions, we
   495   prove that our pattern matching is consistent\footnote{This prevents
   431   will have to prove that this is not necessary. By using the full
   496   us from defining something like @{term "f x = True"} and @{term "f x
   432   definition form without the \cmd{sequential} option, we get this
   497   = False"} simultaneously.}:
   433   behaviour: 
       
   434 *}
   498 *}
   435 
   499 
   436 function And2 :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
   500 function And2 :: "P3 \<Rightarrow> P3 \<Rightarrow> P3"
   437 where
   501 where
   438   "And2 T p = p"
   502   "And2 T p = p"
   440 | "And2 p F = F"
   504 | "And2 p F = F"
   441 | "And2 F p = F"
   505 | "And2 F p = F"
   442 | "And2 X X = X"
   506 | "And2 X X = X"
   443 
   507 
   444 txt {*
   508 txt {*
   445   Now it is also time to look at the subgoals generated by a
   509   \noindent Now let's look at the proof obligations generated by a
   446   function definition. In this case, they are:
   510   function definition. In this case, they are:
   447 
   511 
   448   @{subgoals[display,indent=0]} 
   512   @{subgoals[display,indent=0]}\vspace{-1.2em}\hspace{3cm}\vdots\vspace{1.2em}
   449 
   513 
   450   The first subgoal expresses the completeness of the patterns. It has
   514   The first subgoal expresses the completeness of the patterns. It has
   451   the form of an elimination rule and states that every @{term x} of
   515   the form of an elimination rule and states that every @{term x} of
   452   the function's input type must match one of the patterns. It could
   516   the function's input type must match at least one of the patterns\footnote{Completeness could
   453   be equivalently stated as a disjunction of existential statements: 
   517   be equivalently stated as a disjunction of existential statements: 
   454 @{term "(\<exists>p. x = (T, p)) \<or> (\<exists>p. x = (p, T)) \<or> (\<exists>p. x = (p, F)) \<or>
   518 @{term "(\<exists>p. x = (T, p)) \<or> (\<exists>p. x = (p, T)) \<or> (\<exists>p. x = (p, F)) \<or>
   455   (\<exists>p. x = (F, p)) \<or> (x = (X, X))"} If the patterns just involve
   519   (\<exists>p. x = (F, p)) \<or> (x = (X, X))"}.}. If the patterns just involve
   456   datatypes, we can solve it with the @{text "pat_completeness"} method:
   520   datatypes, we can solve it with the @{text "pat_completeness"}
       
   521   method:
   457 *}
   522 *}
   458 
   523 
   459 apply pat_completeness
   524 apply pat_completeness
   460 
   525 
   461 txt {*
   526 txt {*
   462   The remaining subgoals express \emph{pattern compatibility}. We do
   527   The remaining subgoals express \emph{pattern compatibility}. We do
   463   allow that a value is matched by more than one patterns, but in this
   528   allow that an input value matches multiple patterns, but in this
   464   case, the result (i.e.~the right hand sides of the equations) must
   529   case, the result (i.e.~the right hand sides of the equations) must
   465   also be equal. For each pair of two patterns, there is one such
   530   also be equal. For each pair of two patterns, there is one such
   466   subgoal. Usually this needs injectivity of the constructors, which
   531   subgoal. Usually this needs injectivity of the constructors, which
   467   is used automatically by @{text "auto"}.
   532   is used automatically by @{text "auto"}.
   468 *}
   533 *}
   470 by auto
   535 by auto
   471 
   536 
   472 
   537 
   473 subsection {* Non-constructor patterns *}
   538 subsection {* Non-constructor patterns *}
   474 
   539 
   475 text {* FIXME *}
   540 text {*
   476 
   541   Most of Isabelle's basic types take the form of inductive data types
   477 
   542   with constructors. However, this is not true for all of them. The
       
   543   integers, for instance, are defined using the usual algebraic
       
   544   quotient construction, thus they are not an \qt{official} datatype.
       
   545 
       
    546   Of course, we might want to do pattern matching there, too. This is possible, as the following example shows:
       
   547 
       
   548 
       
   549 
       
   550 *}
       
   551 
       
   552 function Abs :: "int \<Rightarrow> nat"
       
   553 where
       
   554   "Abs (int n) = n"
       
   555 | "Abs (- int (Suc n)) = n"
       
   556 by (erule int_cases) auto
       
   557 termination by (relation "{}") simp
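Note that @{const Abs} makes no recursive calls, so the empty relation is
enough as a termination relation here: the only remaining obligation is its
wellfoundedness, which @{text simp} discharges.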
       
   558 
       
   559 text {*
       
   560   This kind of matching is again justified by the proof of pattern
       
   561   completeness and compatibility. Here, the existing lemma @{text
       
   562   int_cases} is used:
       
   563 
       
   564   \begin{center}@{thm int_cases}\hfill(@{text "int_cases"})\end{center}
       
   565 *}
       
   566 text {*
       
   567   One well-known instance of non-constructor patterns are the
       
   568   so-called \emph{$n+k$-patterns}, which are a little controversial in
       
   569   the functional programming world. Here is the initial fibonacci
       
   570   example with $n+k$-patterns:
       
   571 *}
       
   572 
       
   573 function fib2 :: "nat \<Rightarrow> nat"
       
   574 where
       
   575   "fib2 0 = 1"
       
   576 | "fib2 1 = 1"
       
   577 | "fib2 (n + 2) = fib2 n + fib2 (Suc n)"
       
   578 
       
   579 (*<*)ML "goals_limit := 1"(*>*)
       
   580 txt {*
       
   581   The proof obligation for pattern completeness states that every natural number is
       
   582   either @{term "0::nat"}, @{term "1::nat"} or @{term "n +
       
   583   (2::nat)"}:
       
   584 
       
   585   @{subgoals[display,indent=0]}
       
   586 
       
   587   This is an arithmetic triviality, but unfortunately the
       
   588   @{text arith} method cannot handle this specific form of an
       
   589   elimination rule. We have to do a case split on @{text P} first,
       
   590   which can be conveniently done using the @{text
       
   591   classical} rule. Pattern compatibility and termination are automatic as usual.
       
   592 *}
       
   593 (*<*)ML "goals_limit := 10"(*>*)
       
   594 
       
   595 apply (rule classical, simp, arith)
       
   596 apply auto
       
   597 done
       
   598 termination by lexicographic_order
       
   599 
       
   600 text {*
       
   601   We can stretch the notion of pattern matching even more. The
       
   602   following function is not a sensible functional program, but a
       
   603   perfectly valid mathematical definition:
       
   604 *}
       
   605 
       
   606 function ev :: "nat \<Rightarrow> bool"
       
   607 where
       
   608   "ev (2 * n) = True"
       
   609 | "ev (2 * n + 1) = False"
       
   610 by (rule classical, simp) arith+
       
   611 termination by (relation "{}") simp
       
   612 
       
   613 text {*
       
   614   This general notion of pattern matching gives you the full freedom
       
   615   of mathematical specifications. However, as always, freedom should
       
   616   be used with care:
       
   617 
       
   618   If we leave the area of constructor
       
   619   patterns, we have effectively departed from the world of functional
       
   620   programming. This means that it is no longer possible to use the
       
   621   code generator, and expect it to generate ML code for our
       
   622   definitions. Also, such a specification might not work very well together with
       
   623   simplification. Your mileage may vary.
       
   624 *}
       
   625 
       
   626 
       
   627 subsection {* Conditional equations *}
       
   628 
       
   629 text {* 
       
   630   The function package also supports conditional equations, which are
       
   631   similar to guards in a language like Haskell. Here is Euclid's
       
   632   algorithm written with conditional patterns\footnote{Note that the
       
   633   patterns are also overlapping in the base case}:
       
   634 *}
       
   635 
       
   636 function gcd :: "nat \<Rightarrow> nat \<Rightarrow> nat"
       
   637 where
       
   638   "gcd x 0 = x"
       
   639 | "gcd 0 y = y"
       
   640 | "x < y \<Longrightarrow> gcd (Suc x) (Suc y) = gcd (Suc x) (y - x)"
       
   641 | "\<not> x < y \<Longrightarrow> gcd (Suc x) (Suc y) = gcd (x - y) (Suc y)"
       
   642 by (rule classical, auto, arith)
       
   643 termination by lexicographic_order
       
   644 
       
   645 text {*
       
   646   By now, you can probably guess what the proof obligations for the
       
   647   pattern completeness and compatibility look like. 
       
   648 
       
   649   Again, functions with conditional patterns are not supported by the
       
   650   code generator.
       
   651 *}
       
   652 
       
   653 
       
   654 subsection {* Pattern matching on strings *}
       
   655 
       
   656 text {*
       
    657   Strings, being lists of characters, are normal data types, so pattern
       
   658   matching on them is possible, but somewhat problematic. Consider the
       
   659   following definition:
       
   660 
       
   661 \end{isamarkuptext}
       
   662 \noindent\cmd{fun} @{text "check :: \"string \<Rightarrow> bool\""}\\%
       
   663 \cmd{where}\\%
       
   664 \hspace*{2ex}@{text "\"check (''good'') = True\""}\\%
       
   665 @{text "| \"check s = False\""}
       
   666 \begin{isamarkuptext}
       
   667 
       
   668   An invocation of the above \cmd{fun} command does not
       
   669   terminate. What is the problem? Strings are lists of characters, and
       
   670   characters are a data type with a lot of constructors. Splitting the
       
   671   catch-all pattern thus leads to an explosion of cases, which cannot
       
   672   be handled by Isabelle.
       
   673 
       
   674   There are two things we can do here. Either we write an explicit
       
   675   @{text "if"} on the right hand side, or we can use conditional patterns:
       
   676 *}
       
   677 
       
   678 function check :: "string \<Rightarrow> bool"
       
   679 where
       
   680   "check (''good'') = True"
       
   681 | "s \<noteq> ''good'' \<Longrightarrow> check s = False"
       
   682 by auto
   478 
   683 
   479 
   684 
   480 section {* Partiality *}
   685 section {* Partiality *}
   481 
   686 
   482 text {* 
   687 text {* 
   483   In HOL, all functions are total. A function @{term "f"} applied to
   688   In HOL, all functions are total. A function @{term "f"} applied to
   484   @{term "x"} always has a value @{term "f x"}, and there is no notion
   689   @{term "x"} always has the value @{term "f x"}, and there is no notion
   485   of undefinedness. 
   690   of undefinedness. 
   486   
   691   
   487   This property of HOL is the reason why we have to do termination
   692   This is why we have to do termination
   488   proofs when defining functions: The termination proof justifies the
   693   proofs when defining functions: The proof justifies that the
   489   definition of the function by wellfounded recursion.
   694   function can be defined by wellfounded recursion.
   490 
   695 
   491   However, the \cmd{function} package still supports partiality. Let's
   696   However, the \cmd{function} package does support partiality to a
   492   look at the following function which searches for a zero in the
   697   certain extent. Let's look at the following function which looks
   493   function f. 
   698   for a zero of a given function f. 
   494 *}
   699 *}
   495 
   700 
   496 function (*<*)(domintros, tailrec)(*>*)findzero :: "(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat"
   701 function (*<*)(domintros, tailrec)(*>*)findzero :: "(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat"
   497 where
   702 where
   498   "findzero f n = (if f n = 0 then n else findzero f (Suc n))"
   703   "findzero f n = (if f n = 0 then n else findzero f (Suc n))"
   508 subsection {* Domain predicates *}
   713 subsection {* Domain predicates *}
   509 
   714 
   510 text {*
   715 text {*
   511   The trick is that Isabelle has not only defined the function @{const findzero}, but also
   716   The trick is that Isabelle has not only defined the function @{const findzero}, but also
   512   a predicate @{term "findzero_dom"} that characterizes the values where the function
   717   a predicate @{term "findzero_dom"} that characterizes the values where the function
   513   terminates: the \emph{domain} of the function. In Isabelle/HOL, a
   718   terminates: the \emph{domain} of the function. If we treat a
   514   partial function is just a total function with an additional domain
   719   partial function just as a total function with an additional domain
   515   predicate. Like with total functions, we get simplification and
   720   predicate, we can derive simplification and
   516   induction rules, but they are guarded by the domain conditions and
   721   induction rules as we do for total functions. They are guarded
   517   called @{text psimps} and @{text pinduct}:
   722   by domain conditions and are called @{text psimps} and @{text
       
   723   pinduct}: 
   518 *}
   724 *}
   519 
   725 
   520 thm findzero.psimps
   726 thm findzero.psimps
   521 
   727 
   522 text {*
   728 text {*
   528 text {*
   734 text {*
   529   @{thm[display] findzero.pinduct}
   735   @{thm[display] findzero.pinduct}
   530 *}
   736 *}
   531 
   737 
   532 text {*
   738 text {*
   533   As already mentioned, HOL does not support true partiality. All we
   739   Remember that all we
   534   are doing here is using some tricks to make a total function appear
    740   are doing here is using some tricks to make a total function appear
   535   as if it was partial. We can still write the term @{term "findzero
   741   as if it was partial. We can still write the term @{term "findzero
   536   (\<lambda>x. 1) 0"} and like any other term of type @{typ nat} it is equal
   742   (\<lambda>x. 1) 0"} and like any other term of type @{typ nat} it is equal
   537   to some natural number, although we might not be able to find out
   743   to some natural number, although we might not be able to find out
   538   which one (we will discuss this further in \S\ref{default}). The
   744   which one. The function is \emph{underdefined}.
   539   function is \emph{underdefined}.
   745 
   540 
   746   But it is enough defined to prove something interesting about it. We
   541   But it is enough defined to prove something about it. We can prove
   747   can prove that if @{term "findzero f n"}
   542   that if @{term "findzero f n"}
       
   543   it terminates, it indeed returns a zero of @{term f}:
   748   it terminates, it indeed returns a zero of @{term f}:
   544 *}
   749 *}
   545 
   750 
   546 lemma findzero_zero: "findzero_dom (f, n) \<Longrightarrow> f (findzero f n) = 0"
   751 lemma findzero_zero: "findzero_dom (f, n) \<Longrightarrow> f (findzero f n) = 0"
   547 
   752 
   552 
   757 
   553 txt {* This gives the following subgoals:
   758 txt {* This gives the following subgoals:
   554 
   759 
   555   @{subgoals[display,indent=0]}
   760   @{subgoals[display,indent=0]}
   556 
   761 
   557   The premise in our lemma was used to satisfy the first premise in
   762   The hypothesis in our lemma was used to satisfy the first premise in
   558   the induction rule. However, now we can also use @{term
   763   the induction rule. However, we also get @{term
   559   "findzero_dom (f, n)"} as an assumption in the induction step. This
   764   "findzero_dom (f, n)"} as a local assumption in the induction step. This
    560   allows us to unfold @{term "findzero f n"} using the @{text psimps}
    765   allows us to unfold @{term "findzero f n"} using the @{text psimps}
   561   rule, and the rest is trivial. Since @{text psimps} rules carry the
   766   rule, and the rest is trivial. Since the @{text psimps} rules carry the
   562   @{text "[simp]"} attribute by default, we just need a single step:
   767   @{text "[simp]"} attribute by default, we just need a single step:
   563  *}
   768  *}
   564 apply simp
   769 apply simp
   565 done
   770 done
   566 
   771 
   573   is applied when calls to @{term findzero} are unfolded.
   778   is applied when calls to @{term findzero} are unfolded.
   574 *}
   779 *}
   575 
   780 
   576 text_raw {*
   781 text_raw {*
   577 \begin{figure}
   782 \begin{figure}
   578 \begin{center}
   783 \hrule\vspace{6pt}
   579 \begin{minipage}{0.8\textwidth}
   784 \begin{minipage}{0.8\textwidth}
   580 \isabellestyle{it}
   785 \isabellestyle{it}
   581 \isastyle\isamarkuptrue
   786 \isastyle\isamarkuptrue
   582 *}
   787 *}
   583 lemma "\<lbrakk>findzero_dom (f, n); x \<in> {n ..< findzero f n}\<rbrakk> \<Longrightarrow> f x \<noteq> 0"
   788 lemma "\<lbrakk>findzero_dom (f, n); x \<in> {n ..< findzero f n}\<rbrakk> \<Longrightarrow> f x \<noteq> 0"
   584 proof (induct rule: findzero.pinduct)
   789 proof (induct rule: findzero.pinduct)
   585   fix f n assume dom: "findzero_dom (f, n)"
   790   fix f n assume dom: "findzero_dom (f, n)"
   586     and IH: "\<lbrakk>f n \<noteq> 0; x \<in> {Suc n..<findzero f (Suc n)}\<rbrakk>
   791                and IH: "\<lbrakk>f n \<noteq> 0; x \<in> {Suc n ..< findzero f (Suc n)}\<rbrakk> \<Longrightarrow> f x \<noteq> 0"
   587              \<Longrightarrow> f x \<noteq> 0"
   792                and x_range: "x \<in> {n ..< findzero f n}"
   588     and x_range: "x \<in> {n..<findzero f n}"
       
   589   
       
   590   have "f n \<noteq> 0"
   793   have "f n \<noteq> 0"
   591   proof 
   794   proof 
   592     assume "f n = 0"
   795     assume "f n = 0"
   593     with dom have "findzero f n = n" by simp
   796     with dom have "findzero f n = n" by simp
   594     with x_range show False by auto
   797     with x_range show False by auto
   598   thus "f x \<noteq> 0"
   801   thus "f x \<noteq> 0"
   599   proof
   802   proof
   600     assume "x = n"
   803     assume "x = n"
   601     with `f n \<noteq> 0` show ?thesis by simp
   804     with `f n \<noteq> 0` show ?thesis by simp
   602   next
   805   next
   603     assume "x \<in> {Suc n..<findzero f n}"
   806     assume "x \<in> {Suc n ..< findzero f n}"
   604     with dom and `f n \<noteq> 0` have "x \<in> {Suc n ..< findzero f (Suc n)}"
   807     with dom and `f n \<noteq> 0` have "x \<in> {Suc n ..< findzero f (Suc n)}"
   605       by simp
   808       by simp
   606     with IH and `f n \<noteq> 0`
   809     with IH and `f n \<noteq> 0`
   607     show ?thesis by simp
   810     show ?thesis by simp
   608   qed
   811   qed
   609 qed
   812 qed
   610 text_raw {*
   813 text_raw {*
   611 \isamarkupfalse\isabellestyle{tt}
   814 \isamarkupfalse\isabellestyle{tt}
   612 \end{minipage}\end{center}
   815 \end{minipage}\vspace{6pt}\hrule
   613 \caption{A proof about a partial function}\label{findzero_isar}
   816 \caption{A proof about a partial function}\label{findzero_isar}
   614 \end{figure}
   817 \end{figure}
   615 *}
   818 *}
   616 
   819 
   617 subsection {* Partial termination proofs *}
   820 subsection {* Partial termination proofs *}
   623   lemmas with @{term False} as a premise.
   826   lemmas with @{term False} as a premise.
   624 
   827 
   625   Essentially, we need some introduction rules for @{text
   828   Essentially, we need some introduction rules for @{text
   626   findzero_dom}. The function package can prove such domain
   829   findzero_dom}. The function package can prove such domain
   627   introduction rules automatically. But since they are not used very
   830   introduction rules automatically. But since they are not used very
   628   often (they are almost never needed if the function is total), they
   831   often (they are almost never needed if the function is total), this
   629   are disabled by default for efficiency reasons. So we have to go
   832   functionality is disabled by default for efficiency reasons. So we have to go
   630   back and ask for them explicitly by passing the @{text
   833   back and ask for them explicitly by passing the @{text
   631   "(domintros)"} option to the function package:
   834   "(domintros)"} option to the function package:
   632 
   835 
       
   836 \vspace{1ex}
   633 \noindent\cmd{function} @{text "(domintros) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
   837 \noindent\cmd{function} @{text "(domintros) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
   634 \cmd{where}\isanewline%
   838 \cmd{where}\isanewline%
   635 \ \ \ldots\\
   839 \ \ \ldots\\
   636 \cmd{by} @{text "pat_completeness auto"}\\%
   840 
   637 
   841   \noindent Now the package has proved an introduction rule for @{text findzero_dom}:
   638 
       
   639   Now the package has proved an introduction rule for @{text findzero_dom}:
       
   640 *}
   842 *}
   641 
   843 
   642 thm findzero.domintros
   844 thm findzero.domintros
   643 
   845 
   644 text {*
   846 text {*
   653   Since our function increases its argument at recursive calls, we
   855   Since our function increases its argument at recursive calls, we
   654   need an induction principle which works \qt{backwards}. We will use
   856   need an induction principle which works \qt{backwards}. We will use
    655   @{text inc_induct}, which allows us to do induction from a fixed number
    857   @{text inc_induct}, which allows us to do induction from a fixed number
   656   \qt{downwards}:
   858   \qt{downwards}:
   657 
   859 
   658   @{thm[display] inc_induct}
   860   \begin{center}@{thm inc_induct}\hfill(@{text "inc_induct"})\end{center}
   659 
   861 
   660   Fig.~\ref{findzero_term} gives a detailed Isar proof of the fact
   862   Figure \ref{findzero_term} gives a detailed Isar proof of the fact
   661   that @{text findzero} terminates if there is a zero which is greater
   863   that @{text findzero} terminates if there is a zero which is greater
   662   or equal to @{term n}. First we derive two useful rules which will
   864   or equal to @{term n}. First we derive two useful rules which will
   663   solve the base case and the step case of the induction. The
   865   solve the base case and the step case of the induction. The
    664   induction is then straightforward, except for the unusual induction
    866   induction is then straightforward, except for the unusual induction
   665   principle.
   867   principle.
   666 
   868 
   667 *}
   869 *}
   668 
   870 
   669 text_raw {*
   871 text_raw {*
   670 \begin{figure}
   872 \begin{figure}
   671 \begin{center}
   873 \hrule\vspace{6pt}
   672 \begin{minipage}{0.8\textwidth}
   874 \begin{minipage}{0.8\textwidth}
   673 \isabellestyle{it}
   875 \isabellestyle{it}
   674 \isastyle\isamarkuptrue
   876 \isastyle\isamarkuptrue
   675 *}
   877 *}
   676 lemma findzero_termination:
   878 lemma findzero_termination:
   677   assumes "x >= n" 
   879   assumes "x \<ge> n" and "f x = 0"
   678   assumes "f x = 0"
       
   679   shows "findzero_dom (f, n)"
   880   shows "findzero_dom (f, n)"
   680 proof - 
   881 proof - 
   681   have base: "findzero_dom (f, x)"
   882   have base: "findzero_dom (f, x)"
   682     by (rule findzero.domintros) (simp add:`f x = 0`)
   883     by (rule findzero.domintros) (simp add:`f x = 0`)
   683 
   884 
   684   have step: "\<And>i. findzero_dom (f, Suc i) 
   885   have step: "\<And>i. findzero_dom (f, Suc i) 
   685     \<Longrightarrow> findzero_dom (f, i)"
   886     \<Longrightarrow> findzero_dom (f, i)"
   686     by (rule findzero.domintros) simp
   887     by (rule findzero.domintros) simp
   687 
   888 
   688   from `x \<ge> n`
   889   from `x \<ge> n` show ?thesis
   689   show ?thesis
       
   690   proof (induct rule:inc_induct)
   890   proof (induct rule:inc_induct)
   691     show "findzero_dom (f, x)"
   891     show "findzero_dom (f, x)" by (rule base)
   692       by (rule base)
       
   693   next
   892   next
   694     fix i assume "findzero_dom (f, Suc i)"
   893     fix i assume "findzero_dom (f, Suc i)"
   695     thus "findzero_dom (f, i)"
   894     thus "findzero_dom (f, i)" by (rule step)
   696       by (rule step)
       
   697   qed
   895   qed
   698 qed      
   896 qed      
   699 text_raw {*
   897 text_raw {*
   700 \isamarkupfalse\isabellestyle{tt}
   898 \isamarkupfalse\isabellestyle{tt}
   701 \end{minipage}\end{center}
   899 \end{minipage}\vspace{6pt}\hrule
   702 \caption{Termination proof for @{text findzero}}\label{findzero_term}
   900 \caption{Termination proof for @{text findzero}}\label{findzero_term}
   703 \end{figure}
   901 \end{figure}
   704 *}
   902 *}
   705       
   903       
   706 text {*
   904 text {*
   715   shows "findzero_dom (f, n)"
   913   shows "findzero_dom (f, n)"
   716   using zero
   914   using zero
   717   by (induct rule:inc_induct) (auto intro: findzero.domintros)
   915   by (induct rule:inc_induct) (auto intro: findzero.domintros)
   718     
   916     
   719 text {*
   917 text {*
   720   It is simple to combine the partial correctness result with the
   918   \noindent It is simple to combine the partial correctness result with the
   721   termination lemma:
   919   termination lemma:
   722 *}
   920 *}
   723 
   921 
   724 lemma findzero_total_correctness:
   922 lemma findzero_total_correctness:
   725   "f x = 0 \<Longrightarrow> f (findzero f 0) = 0"
   923   "f x = 0 \<Longrightarrow> f (findzero f 0) = 0"
    732   predicate actually is. In fact, @{text findzero_dom} is just an
    930   predicate actually is. In fact, @{text findzero_dom} is just an
   733   abbreviation:
   931   abbreviation:
   734 
   932 
   735   @{abbrev[display] findzero_dom}
   933   @{abbrev[display] findzero_dom}
   736 
   934 
   737   The domain predicate is the accessible part of a relation @{const
   935   The domain predicate is the \emph{accessible part} of a relation @{const
   738   findzero_rel}, which was also created internally by the function
   936   findzero_rel}, which was also created internally by the function
   739   package. @{const findzero_rel} is just a normal
   937   package. @{const findzero_rel} is just a normal
   740   inductively defined predicate, so we can inspect its definition by
   938   inductive predicate, so we can inspect its definition by
   741   looking at the introduction rules @{text findzero_rel.intros}.
   939   looking at the introduction rules @{text findzero_rel.intros}.
   742   In our case there is just a single rule:
   940   In our case there is just a single rule:
   743 
   941 
   744   @{thm[display] findzero_rel.intros}
   942   @{thm[display] findzero_rel.intros}
   745 
   943 
   746   The relation @{const findzero_rel}, expressed as a binary predicate,
   944   The predicate @{const findzero_rel}
   747   describes the \emph{recursion relation} of the function
   945   describes the \emph{recursion relation} of the function
   748   definition. The recursion relation is a binary relation on
   946   definition. The recursion relation is a binary relation on
   749   the arguments of the function that relates each argument to its
   947   the arguments of the function that relates each argument to its
   750   recursive calls. In general, there is one introduction rule for each
   948   recursive calls. In general, there is one introduction rule for each
   751   recursive call.
   949   recursive call.
   752 
   950 
   753   The predicate @{term "acc findzero_rel"} is the \emph{accessible part} of
   951   The predicate @{term "acc findzero_rel"} is the accessible part of
   754   that relation. An argument belongs to the accessible part, if it can
   952   that relation. An argument belongs to the accessible part, if it can
   755   be reached in a finite number of steps. 
   953   be reached in a finite number of steps (cf.~its definition in @{text
       
   954   "Accessible_Part.thy"}).
   756 
   955 
   757   Since the domain predicate is just an abbreviation, you can use
   956   Since the domain predicate is just an abbreviation, you can use
   758   lemmas for @{const acc} and @{const findzero_rel} directly. Some
   957   lemmas for @{const acc} and @{const findzero_rel} directly. Some
   759   lemmas which are occasionally useful are @{text accI}, @{text
   958   lemmas which are occasionally useful are @{text accI}, @{text
   760   acc_downward}, and of course the introduction and elimination rules
   959   acc_downward}, and of course the introduction and elimination rules
   772 text {*
   971 text {*
   773   The domain predicate is our trick that allows us to model partiality
   972   The domain predicate is our trick that allows us to model partiality
   774   in a world of total functions. The downside of this is that we have
   973   in a world of total functions. The downside of this is that we have
   775   to carry it around all the time. The termination proof above allowed
   974   to carry it around all the time. The termination proof above allowed
   776   us to replace the abstract @{term "findzero_dom (f, n)"} by the more
   975   us to replace the abstract @{term "findzero_dom (f, n)"} by the more
   777   concrete @{term "(x \<ge> n \<and> f x = 0)"}, but the condition is still
   976   concrete @{term "(x \<ge> n \<and> f x = (0::nat))"}, but the condition is still
   778   there and it won't go away soon. 
   977   there and can only be discharged for special cases.
   779 
   978   In particular, the domain predicate guards the unfolding of our
   780   In particular, the domain predicate guard the unfolding of our
       
   781   function, since it is there as a condition in the @{text psimp}
   979   function, since it is there as a condition in the @{text psimp}
   782   rules. 
   980   rules. 
   783 
       
   784   On the other hand, we must be happy about the domain predicate,
       
   785   since it guarantees that all this is at all possible without losing
       
   786   consistency. 
       
   787 
   981 
   788   Now there is an important special case: We can actually get rid
   982   Now there is an important special case: We can actually get rid
   789   of the condition in the simplification rules, \emph{if the function
   983   of the condition in the simplification rules, \emph{if the function
   790   is tail-recursive}. The reason is that for all tail-recursive
   984   is tail-recursive}. The reason is that for all tail-recursive
   791   equations there is a total function satisfying them, even if they
   985   equations there is a total function satisfying them, even if they
   792   are non-terminating. 
   986   are non-terminating. 
   793 
   987 
       
   988 %  A function is tail recursive, if each call to the function is either
       
   989 %  equal
       
   990 %
       
   991 %  So the outer form of the 
       
   992 %
       
   993 %if it can be written in the following
       
   994 %  form:
       
   995 %  {term[display] "f x = (if COND x then BASE x else f (LOOP x))"}
       
   996 
       
   997 
   794   The function package internally does the right construction and can
   998   The function package internally does the right construction and can
   795   derive the unconditional simp rules, if we ask it to do so. Luckily,
   999   derive the unconditional simp rules, if we ask it to do so. Luckily,
   796   our @{const "findzero"} function is tail-recursive, so we can just go
  1000   our @{const "findzero"} function is tail-recursive, so we can just go
   797   back and add another option to the \cmd{function} command:
  1001   back and add another option to the \cmd{function} command:
   798 
  1002 
       
  1003 \vspace{1ex}
   799 \noindent\cmd{function} @{text "(domintros, tailrec) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
  1004 \noindent\cmd{function} @{text "(domintros, tailrec) findzero :: \"(nat \<Rightarrow> nat) \<Rightarrow> nat \<Rightarrow> nat\""}\\%
   800 \cmd{where}\isanewline%
  1005 \cmd{where}\isanewline%
   801 \ \ \ldots\\%
  1006 \ \ \ldots\\%
   802 
  1007 
   803   
  1008   
   804   Now, we actually get the unconditional simplification rules, even
  1009   \noindent Now, we actually get unconditional simplification rules, even
   805   though the function is partial:
  1010   though the function is partial:
   806 *}
  1011 *}
   807 
  1012 
   808 thm findzero.simps
  1013 thm findzero.simps
   809 
  1014 
   810 text {*
  1015 text {*
   811   @{thm[display] findzero.simps}
  1016   @{thm[display] findzero.simps}
   812 
  1017 
   813   Of course these would make the simplifier loop, so we better remove
  1018   \noindent Of course these would make the simplifier loop, so we better remove
   814   them from the simpset:
  1019   them from the simpset:
   815 *}
  1020 *}
   816 
  1021 
   817 declare findzero.simps[simp del]
  1022 declare findzero.simps[simp del]
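        
        text {*
          \noindent The equations remain available for explicit use: a concrete
          value can still be computed by supplying them to the simplifier
          explicitly. A minimal sketch (the premises about @{text "f"} are
          hypothetical and merely let the conditionals resolve):
        *}
        
        lemma "f 0 \<noteq> 0 \<Longrightarrow> f (Suc 0) = 0 \<Longrightarrow> findzero f 0 = Suc 0"
          by (simp add: findzero.simps)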
   818 
  1023 
   819 
  1024 text {* 
   820 text {* \fixme{Code generation ???} *}
  1025   Getting rid of the domain conditions in the simplification rules is
       
  1026   not only useful because it simplifies proofs. It is also required in
       
  1027   order to use Isabelle's code generator to generate ML code
       
  1028   from a function definition.
       
  1029   Since the code generator only works with equations, it cannot be
       
  1030   used with @{text "psimp"} rules. Thus, in order to generate code for
       
   1031   partial functions, they must be defined by tail recursion.
       
  1032   Luckily, many functions have a relatively natural tail recursive
       
  1033   definition.
       
  1034 *}
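        
        text {*
          \noindent For illustration, an invocation could look roughly as
          follows (a sketch only: it assumes a code generator interface along
          the lines of the \cmd{export\_code} command, and the concrete
          command and syntax depend on the Isabelle version in use):
        
        \vspace{1ex}
        \noindent\cmd{export\_code} @{text "findzero in SML"}
        *}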
   821 
  1035 
   822 section {* Nested recursion *}
  1036 section {* Nested recursion *}
   823 
  1037 
   824 text {*
  1038 text {*
   825   Recursive calls which are nested in one another frequently cause
  1039   Recursive calls which are nested in one another frequently cause
   870 
  1084 
   871 text {*
  1085 text {*
   872   As a general strategy, one should prove the statements needed for
  1086   As a general strategy, one should prove the statements needed for
   873   termination as a partial property first. Then they can be used to do
  1087   termination as a partial property first. Then they can be used to do
   874   the termination proof. This also works for less trivial
  1088   the termination proof. This also works for less trivial
   875   examples. Figure \ref{f91} defines the well-known 91-function by
  1089   examples. Figure \ref{f91} defines the 91-function, a well-known
   876   McCarthy \cite{?} and proves its termination.
  1090   challenge problem due to John McCarthy, and proves its termination.
   877 *}
  1091 *}
   878 
  1092 
   879 text_raw {*
  1093 text_raw {*
   880 \begin{figure}
  1094 \begin{figure}
   881 \begin{center}
  1095 \hrule\vspace{6pt}
   882 \begin{minipage}{0.8\textwidth}
  1096 \begin{minipage}{0.8\textwidth}
   883 \isabellestyle{it}
  1097 \isabellestyle{it}
   884 \isastyle\isamarkuptrue
  1098 \isastyle\isamarkuptrue
   885 *}
  1099 *}
   886 
  1100 
   887 function f91 :: "nat => nat"
  1101 function f91 :: "nat \<Rightarrow> nat"
   888 where
  1102 where
   889   "f91 n = (if 100 < n then n - 10 else f91 (f91 (n + 11)))"
  1103   "f91 n = (if 100 < n then n - 10 else f91 (f91 (n + 11)))"
   890 by pat_completeness auto
  1104 by pat_completeness auto
   891 
  1105 
   892 lemma f91_estimate: 
  1106 lemma f91_estimate: 
   908   with `\<not> 100 < n` show "(f91 (n + 11), n) \<in> ?R" by simp 
  1122   with `\<not> 100 < n` show "(f91 (n + 11), n) \<in> ?R" by simp 
   909 qed
  1123 qed
   910 
  1124 
   911 text_raw {*
  1125 text_raw {*
   912 \isamarkupfalse\isabellestyle{tt}
  1126 \isamarkupfalse\isabellestyle{tt}
   913 \end{minipage}\end{center}
  1127 \end{minipage}
       
  1128 \vspace{6pt}\hrule
   914 \caption{McCarthy's 91-function}\label{f91}
  1129 \caption{McCarthy's 91-function}\label{f91}
   915 \end{figure}
  1130 \end{figure}
   916 *}
  1131 *}
   917 
  1132 
   918 
  1133 
  1042 
  1257 
  1043   @{thm[display] mapeven.simps}
  1258   @{thm[display] mapeven.simps}
  1044   \end{exercise}
  1259   \end{exercise}
  1045   
  1260   
  1046   \begin{exercise}
  1261   \begin{exercise}
  1047   What happens if the congruence rule for @{const If} is
   1262   See what happens if the congruence rule for @{const If} is
  1048   disabled by declaring @{text "if_cong[fundef_cong del]"}?
  1263   disabled by declaring @{text "if_cong[fundef_cong del]"}?
  1049   \end{exercise}
  1264   \end{exercise}
  1050 
  1265 
  1051   Note that in some cases there is no \qt{best} congruence rule.
  1266   Note that in some cases there is no \qt{best} congruence rule.
  1052   \fixme
  1267   \fixme{}
  1053 
  1268 
  1054 *}
  1269 *}
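        
        text {*
          \noindent Congruence rules are added to and removed from the function
          package's setup via the @{text "fundef_cong"} attribute. For
          instance, the rule for @{const If} that was disabled in the exercise
          above can simply be re-registered:
        *}
        
        declare if_cong [fundef_cong]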
  1055 
  1270 
  1056 
       
  1057 
       
  1058 
       
  1059 
       
  1060 
       
  1061 
       
  1062 section {* Appendix: Predefined Congruence Rules *}
       
  1063 
       
  1064 (*<*)
       
  1065 syntax (Rule output)
       
  1066   "==>" :: "prop \<Rightarrow> prop \<Rightarrow> prop"
       
  1067   ("\<^raw:\mbox{}\inferrule{\mbox{>_\<^raw:}}>\<^raw:{\mbox{>_\<^raw:}}>")
       
  1068 
       
  1069   "_bigimpl" :: "asms \<Rightarrow> prop \<Rightarrow> prop"
       
  1070   ("\<^raw:\mbox{}\inferrule{>_\<^raw:}>\<^raw:{\mbox{>_\<^raw:}}>")
       
  1071 
       
  1072   "_asms" :: "prop \<Rightarrow> asms \<Rightarrow> asms" 
       
  1073   ("\<^raw:\mbox{>_\<^raw:}\\>/ _")
       
  1074 
       
  1075   "_asm" :: "prop \<Rightarrow> asms" ("\<^raw:\mbox{>_\<^raw:}>")
       
  1076 
       
  1077 definition 
       
  1078 FixImp :: "prop \<Rightarrow> prop \<Rightarrow> prop" 
       
  1079 where
       
  1080   "FixImp (PROP A) (PROP B) \<equiv> (PROP A \<Longrightarrow> PROP B)"
       
  1081 notation (output)
       
  1082   FixImp (infixr "\<Longrightarrow>" 1)
       
  1083 
       
  1084 setup {*
       
  1085 let
       
  1086   val fix_imps = map_aterms (fn Const ("==>", T) => Const ("Functions.FixImp", T) | t => t)
       
  1087   fun transform t = Logic.list_implies (map fix_imps (Logic.strip_imp_prems t), Logic.strip_imp_concl t)
       
  1088 in
       
  1089   TermStyle.add_style "topl" (K transform)
       
  1090 end
  1271 end
  1091 *}
       
  1092 (*>*)
       
  1093 
       
  1094 subsection {* Basic Control Structures *}
       
  1095 
       
  1096 text {*
       
  1097 
       
  1098 @{thm_style[mode=Rule] topl if_cong}
       
  1099 
       
  1100 @{thm_style [mode=Rule] topl let_cong}
       
  1101 
       
  1102 *}
       
  1103 
       
  1104 subsection {* Data Types *}
       
  1105 
       
  1106 text {*
       
  1107 
       
  1108 For each \cmd{datatype} definition, a congruence rule for the case
       
   1109   combinator is registered automatically. Here are the rules for
       
  1110   @{text "nat"} and @{text "list"}:
       
  1111 
       
  1112 \begin{center}@{thm_style[mode=Rule] topl nat.case_cong}\end{center}
       
  1113 
       
  1114 \begin{center}@{thm_style[mode=Rule] topl list.case_cong}\end{center}
       
  1115 
       
  1116 *}
       
  1117 
       
  1118 subsection {* List combinators *}
       
  1119 
       
  1120 
       
  1121 text {*
       
  1122 
       
  1123 @{thm_style[mode=Rule] topl map_cong}
       
  1124 
       
  1125 @{thm_style[mode=Rule] topl filter_cong}
       
  1126 
       
  1127 @{thm_style[mode=Rule] topl foldr_cong}
       
  1128 
       
  1129 @{thm_style[mode=Rule] topl foldl_cong}
       
  1130 
       
  1131 Similar: takewhile, dropwhile
       
  1132 
       
  1133 *}
       
  1134 
       
  1135 subsection {* Sets *}
       
  1136 
       
  1137 
       
  1138 text {*
       
  1139 
       
  1140 @{thm_style[mode=Rule] topl ball_cong}
       
  1141 
       
  1142 @{thm_style[mode=Rule] topl bex_cong}
       
  1143 
       
  1144 @{thm_style[mode=Rule] topl UN_cong}
       
  1145 
       
  1146 @{thm_style[mode=Rule] topl INT_cong}
       
  1147 
       
  1148 @{thm_style[mode=Rule] topl image_cong}
       
  1149 
       
  1150 
       
  1151 *}
       
  1152 
       
  1153 
       
  1154 
       
  1155 
       
  1156 
       
  1157 
       
  1158 end