(*<*)
theory Types_and_funs
imports Main
begin
(*>*)
text{*
\vspace{-5ex}
\section{Type and Function Definitions}

Type synonyms are abbreviations for existing types, for example
\index{string@@{text string}}*}

type_synonym string = "char list"

text{*
Type synonyms are expanded after parsing and are not present in the internal representation and output. They are mere conveniences for the reader.

\subsection{Datatypes}
\label{sec:datatypes}

The general form of a datatype definition looks like this:
\begin{quote}
\begin{tabular}{@ {}rclcll}
\indexed{\isacom{datatype}}{datatype} @{text "('a\<^sub>1,\<dots>,'a\<^sub>n)t"}
     & = & $C_1 \ @{text"\""}\tau_{1,1}@{text"\""} \dots @{text"\""}\tau_{1,n_1}@{text"\""}$ \\
     & $|$ & \dots \\
     & $|$ & $C_k \ @{text"\""}\tau_{k,1}@{text"\""} \dots @{text"\""}\tau_{k,n_k}@{text"\""}$
\end{tabular}
\end{quote}
It introduces the constructors \
$C_i :: \tau_{i,1}\Rightarrow \cdots \Rightarrow \tau_{i,n_i} \Rightarrow$~@{text "('a\<^sub>1,\<dots>,'a\<^sub>n)t"} \ and expresses that any value of this type is built from these constructors in a unique manner. Uniqueness is implied by the following
properties of the constructors:
\begin{itemize}
\item \emph{Distinctness:} $C_i\ \ldots \neq C_j\ \dots$ \quad if $i \neq j$
\item \emph{Injectivity:}
\begin{tabular}[t]{l}
 $(C_i \ x_1 \dots x_{n_i} = C_i \ y_1 \dots y_{n_i}) =$\\
 $(x_1 = y_1 \land \dots \land x_{n_i} = y_{n_i})$
\end{tabular}
\end{itemize}
The fact that any value of the datatype is built from the constructors implies
the \concept{structural induction}\index{induction} rule: to show
$P~x$ for all $x$ of type @{text "('a\<^sub>1,\<dots>,'a\<^sub>n)t"},
one needs to show $P(C_i\ x_1 \dots x_{n_i})$ (for each $i$) assuming
$P(x_j)$ for all $j$ where $\tau_{i,j} =$~@{text "('a\<^sub>1,\<dots>,'a\<^sub>n)t"}.
Distinctness and injectivity are applied automatically by @{text auto}
and other proof methods. Induction must be applied explicitly.

As in functional programming languages, datatype values can be taken apart
with case expressions\index{case expression}\index{case expression@@{text "case ... of"}}, for example
\begin{quote}
\noquotes{@{term[source] "(case xs of [] \<Rightarrow> 0 | x # _ \<Rightarrow> Suc x)"}}
\end{quote}
Case expressions must be enclosed in parentheses.
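*}

text{* As a quick check of our own (not part of the original text), such a
case expression can be evaluated directly: *}

value "(case [1::nat, 2, 3] of [] \<Rightarrow> 0 | x # _ \<Rightarrow> Suc x)"

text{* which evaluates to @{text "2"}.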

As an example of a datatype beyond @{typ nat} and @{text list}, consider binary trees:
*}

datatype 'a tree = Tip | Node  "'a tree"  'a  "'a tree"

text{* with a mirror function: *}

fun mirror :: "'a tree \<Rightarrow> 'a tree" where
"mirror Tip = Tip" |
"mirror (Node l a r) = Node (mirror r) a (mirror l)"

text{* The following lemma illustrates induction: *}

lemma "mirror(mirror t) = t"
apply(induction t)

txt{* yields
@{subgoals[display]}
The induction step contains two induction hypotheses, one for each subtree.
An application of @{text auto} finishes the proof.

A very simple but also very useful datatype is the predefined
@{datatype[display] option}\index{option@@{text option}}\index{None@@{const None}}\index{Some@@{const Some}}
Its sole purpose is to add a new element @{const None} to an existing
type @{typ 'a}. To make sure that @{const None} is distinct from all the
elements of @{typ 'a}, you wrap them up in @{const Some} and call
the new type @{typ"'a option"}. A typical application is a lookup function
on a list of key-value pairs, often called an association list:
*}
(*<*)
apply auto
done
(*>*)
fun lookup :: "('a * 'b) list \<Rightarrow> 'a \<Rightarrow> 'b option" where
"lookup [] x = None" |
"lookup ((a,b) # ps) x = (if a = x then Some b else lookup ps x)"

text{*
Note that @{text"\<tau>\<^sub>1 * \<tau>\<^sub>2"} is the type of pairs, also written @{text"\<tau>\<^sub>1 \<times> \<tau>\<^sub>2"}.
Pairs can be taken apart either by pattern matching (as above) or with the
projection functions @{const fst} and @{const snd}: @{thm fst_conv[of x y]} and @{thm snd_conv[of x y]}.
Tuples are simulated by pairs nested to the right: @{term"(a,b,c)"}
is short for @{text"(a, (b, c))"} and @{text "\<tau>\<^sub>1 \<times> \<tau>\<^sub>2 \<times> \<tau>\<^sub>3"} is short for
@{text "\<tau>\<^sub>1 \<times> (\<tau>\<^sub>2 \<times> \<tau>\<^sub>3)"}.

\subsection{Definitions}

Non-recursive functions can be defined as in the following example:
\index{definition@\isacom{definition}}*}

definition sq :: "nat \<Rightarrow> nat" where
"sq n = n * n"

text{* Such definitions do not allow pattern matching but only
@{text"f x\<^sub>1 \<dots> x\<^sub>n = t"}, where @{text f} does not occur in @{text t}.

\subsection{Abbreviations}

Abbreviations are similar to definitions:
\index{abbreviation@\isacom{abbreviation}}*}

abbreviation sq' :: "nat \<Rightarrow> nat" where
"sq' n \<equiv> n * n"

text{* The key difference is that @{const sq'} is only syntactic sugar:
after parsing, @{term"sq' t"} is replaced by \mbox{@{term"t*t"}};
before printing, every occurrence of @{term"u*u"} is replaced by
\mbox{@{term"sq' u"}}.  Internally, @{const sq'} does not exist.
This is the
advantage of abbreviations over definitions: definitions need to be expanded
explicitly (\autoref{sec:rewr-defs}) whereas abbreviations are already
expanded upon parsing. However, abbreviations should be introduced sparingly:
if abused, they can lead to a confusing discrepancy between the internal and
external view of a term.

The ASCII representation of @{text"\<equiv>"} is \texttt{==} or \xsymbol{equiv}.
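*}

text{* To see the difference (a check of our own, not from the original
text): because abbreviations are expanded during parsing, the following goal
is literally @{text"n * n = n * n"} and is proved by reflexivity alone,
without unfolding anything: *}

lemma "sq' n = n * n"
apply(rule refl)
done

text{*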

\subsection{Recursive Functions}
\label{sec:recursive-funs}

Recursive functions are defined with \indexed{\isacom{fun}}{fun} by pattern matching
over datatype constructors. The order of equations matters, as in
functional programming languages. However, all HOL functions must be
total. This simplifies the logic --- terms are always defined --- but means
that recursive functions must terminate. Otherwise one could define a
function @{prop"f n = f n + (1::nat)"} and conclude \mbox{@{prop"(0::nat) = 1"}} by
subtracting @{term"f n"} on both sides.

Isabelle's automatic termination checker requires that the arguments of
recursive calls on the right-hand side be strictly smaller than the
arguments on the left-hand side. In the simplest case, this means that one
fixed argument position decreases in size with each recursive call. The size
is measured as the number of constructors (excluding 0-ary ones, e.g., @{text
Nil}). Lexicographic combinations are also recognized. In more complicated
situations, the user may have to prove termination by hand. For details
see~@{cite Krauss}.

Functions defined with \isacom{fun} come with their own induction schema
that mirrors the recursion schema and is derived from the termination
order. For example,
*}

fun div2 :: "nat \<Rightarrow> nat" where
"div2 0 = 0" |
"div2 (Suc 0) = 0" |
"div2 (Suc(Suc n)) = Suc(div2 n)"

text{* does not just define @{const div2} but also proves a
customized induction rule:
\[
\inferrule{
\mbox{@{thm (prem 1) div2.induct}}\\
\mbox{@{thm (prem 2) div2.induct}}\\
\mbox{@{thm (prem 3) div2.induct}}}
{\mbox{@{thm (concl) div2.induct[of _ "m"]}}}
\]
This customized induction rule can simplify inductive proofs. For example,
*}

lemma "div2(n) = n div 2"
apply(induction n rule: div2.induct)

txt{* (where the infix @{text div} is the predefined division operation)
yields the subgoals
@{subgoals[display,margin=65]}
An application of @{text auto} finishes the proof.
Had we used ordinary structural induction on @{text n},
the proof would have needed an additional
case analysis in the induction step.

This example leads to the following induction heuristic:
\begin{quote}
\emph{Let @{text f} be a recursive function.
If the definition of @{text f} is more complicated
than having one equation for each constructor of some datatype,
then properties of @{text f} are best proved via @{text "f.induct"}.\index{inductionrule@@{text".induct"}}}
\end{quote}

The general case is often called \concept{computation induction},
because the induction follows the (terminating!) computation.
For every defining equation
\begin{quote}
@{text "f(e) = \<dots> f(r\<^sub>1) \<dots> f(r\<^sub>k) \<dots>"}
\end{quote}
where @{text"f(r\<^sub>i)"}, @{text"i=1\<dots>k"}, are all the recursive calls,
the induction rule @{text"f.induct"} contains one premise of the form
\begin{quote}
@{text"P(r\<^sub>1) \<Longrightarrow> \<dots> \<Longrightarrow> P(r\<^sub>k) \<Longrightarrow> P(e)"}
\end{quote}
If @{text "f :: \<tau>\<^sub>1 \<Rightarrow> \<dots> \<Rightarrow> \<tau>\<^sub>n \<Rightarrow> \<tau>"} then @{text"f.induct"} is applied like this:
\begin{quote}
\isacom{apply}@{text"(induction x\<^sub>1 \<dots> x\<^sub>n rule: f.induct)"}\index{inductionrule@@{text"induction ... rule:"}}
\end{quote}
where typically there is a call @{text"f x\<^sub>1 \<dots> x\<^sub>n"} in the goal.
But note that the induction rule does not mention @{text f} at all,
except in its name, and is applicable independently of @{text f}.


\subsection*{Exercises}

\begin{exercise}
Starting from the type @{text "'a tree"} defined in the text, define
a function @{text "contents ::"} @{typ "'a tree \<Rightarrow> 'a list"}
that collects all values in a tree in a list, in any order,
without removing duplicates.
Then define a function @{text "treesum ::"} @{typ "nat tree \<Rightarrow> nat"}
that sums up all values in a tree of natural numbers
and prove @{prop "treesum t = listsum(contents t)"}.
\end{exercise}

\begin{exercise}
Define a new type @{text "'a tree2"} of binary trees where values are also
stored in the leaves of the tree.  Also reformulate the
@{const mirror} function accordingly. Define two functions
@{text "pre_order"} and @{text "post_order"} of type @{text "'a tree2 \<Rightarrow> 'a list"}
that traverse a tree and collect all stored values in the respective order in
a list. Prove @{prop "pre_order (mirror t) = rev (post_order t)"}.
\end{exercise}

\begin{exercise}
Define a function @{text "intersperse ::"} @{typ "'a \<Rightarrow> 'a list \<Rightarrow> 'a list"}
such that @{text "intersperse a [x\<^sub>1, ..., x\<^sub>n] = [x\<^sub>1, a, x\<^sub>2, a, ..., a, x\<^sub>n]"}.
Now prove that @{prop "map f (intersperse a xs) = intersperse (f a) (map f xs)"}.
\end{exercise}


\section{Induction Heuristics}\index{induction heuristics}

We have already noted that theorems about recursive functions are proved by
induction. If the function has more than one argument, we have followed
this heuristic in the proofs about the append function:
\begin{quote}
\emph{Perform induction on argument number $i$\\
 if the function is defined by recursion on argument number $i$.}
\end{quote}
The key heuristic, and the main point of this section, is to
\emph{generalize the goal before induction}.
The reason is simple: if the goal is
too specific, the induction hypothesis is too weak to allow the induction
step to go through. Let us illustrate the idea with an example.

Function @{const rev} has quadratic worst-case running time
because it calls append for each element of the list and
append is linear in its first argument.  A linear time version of
@{const rev} requires an extra argument where the result is accumulated
gradually, using only~@{text"#"}:
*}
(*<*)
apply auto
done
(*>*)
fun itrev :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
"itrev []        ys = ys" |
"itrev (x#xs) ys = itrev xs (x#ys)"

text{* The behaviour of @{const itrev} is simple: it reverses
its first argument by stacking its elements onto the second argument,
and it returns that second argument when the first one becomes
empty. Note that @{const itrev} is tail-recursive: it can be
compiled into a loop; no stack is necessary for executing it.

Naturally, we would like to show that @{const itrev} does indeed reverse
its first argument provided the second one is empty:
*}

lemma "itrev xs [] = rev xs"

txt{* There is no choice as to the induction variable:
*}

apply(induction xs)
apply(auto)

txt{*
Unfortunately, this attempt does not prove
the induction step:
@{subgoals[display,margin=70]}
The induction hypothesis is too weak.  The fixed
argument,~@{term"[]"}, prevents it from rewriting the conclusion.
This example suggests a heuristic:
\begin{quote}
\emph{Generalize goals for induction by replacing constants by variables.}
\end{quote}
Of course one cannot do this naively: @{prop"itrev xs ys = rev xs"} is
just not true.  The correct generalization is
*}
(*<*)oops(*>*)
lemma "itrev xs ys = rev xs @ ys"
(*<*)apply(induction xs, auto)(*>*)
txt{*
If @{text ys} is replaced by @{term"[]"}, the right-hand side simplifies to
@{term"rev xs"}, as required.
In this instance it was easy to guess the right generalization.
Other situations can require a good deal of creativity.

Although we now have two variables, only @{text xs} is suitable for
induction, and we repeat our proof attempt. Unfortunately, we are still
not there:
@{subgoals[display,margin=65]}
The induction hypothesis is still too weak, but this time it takes no
intuition to generalize: the problem is that the @{text ys}
in the induction hypothesis is fixed,
but the induction hypothesis needs to be applied with
@{term"a # ys"} instead of @{text ys}. Hence we prove the theorem
for all @{text ys} instead of a fixed one. We can instruct induction
to perform this generalization for us by adding @{text "arbitrary: ys"}\index{arbitrary@@{text"arbitrary:"}}.
*}
(*<*)oops
lemma "itrev xs ys = rev xs @ ys"
(*>*)
apply(induction xs arbitrary: ys)

txt{* The induction hypothesis in the induction step is now universally quantified over @{text ys}:
@{subgoals[display,margin=65]}
Thus the proof succeeds:
*}

apply auto
done
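
text{* As a small addendum of our own (not part of the original text):
naming the generalized lemma lets us recover the original statement, because
@{prop"rev xs @ [] = rev xs"}. The @{text "simp add:"} syntax used here is
explained in \autoref{sec:simp}. *}

lemma itrev_rev_app: "itrev xs ys = rev xs @ ys"  (* the name is ours *)
apply(induction xs arbitrary: ys)
apply auto
done

lemma "itrev xs [] = rev xs"
apply(simp add: itrev_rev_app)
done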

text{*
This leads to another heuristic for generalization:
\begin{quote}
\emph{Generalize induction by generalizing all free
variables\\ {\em(except the induction variable itself)}.}
\end{quote}
Generalization is best performed with @{text"arbitrary: y\<^sub>1 \<dots> y\<^sub>k"}.
This heuristic prevents trivial failures like the one above.
However, it should not be applied blindly.
It is not always required, and the additional quantifiers can complicate
matters in some cases. The variables that need to be quantified are typically
those that change in recursive calls.


\subsection*{Exercises}

\begin{exercise}
Write a tail-recursive variant of the @{text add} function on @{typ nat}:
@{term "itadd :: nat \<Rightarrow> nat \<Rightarrow> nat"}.
Tail-recursive means that in the recursive case, @{text itadd} needs to call
itself directly: \mbox{@{term"itadd (Suc m) n"}} @{text"= itadd \<dots>"}.
Prove @{prop "itadd m n = add m n"}.
\end{exercise}


\section{Simplification}

So far we have talked a lot about simplifying terms without explaining the concept.
\conceptidx{Simplification}{simplification} means
\begin{itemize}
\item using equations $l = r$ from left to right (only),
\item as long as possible.
\end{itemize}
To emphasize the directionality, equations that have been given the
@{text"simp"} attribute are called \conceptidx{simplification rules}{simplification rule}.
Logically, they are still symmetric, but proofs by
simplification use them only in the left-to-right direction.  The proof tool
that performs simplifications is called the \concept{simplifier}. It is the
basis of @{text auto} and other related proof methods.

The idea of simplification is best explained by an example. Given the
simplification rules
\[
\begin{array}{rcl@ {\quad}l}
@{term"0 + n::nat"} &@{text"="}& @{text n} & (1) \\
@{term"(Suc m) + n"} &@{text"="}& @{term"Suc (m + n)"} & (2) \\
@{text"(Suc m \<le> Suc n)"} &@{text"="}& @{text"(m \<le> n)"} & (3)\\
@{text"(0 \<le> m)"} &@{text"="}& @{const True} & (4)
\end{array}
\]
the formula @{prop"0 + Suc 0 \<le> Suc 0 + x"} is simplified to @{const True}
as follows:
\[
\begin{array}{r@ {}c@ {}l@ {\quad}l}
@{text"(0 + Suc 0"} & \leq & @{text"Suc 0 + x)"}  & \stackrel{(1)}{=} \\
@{text"(Suc 0"}     & \leq & @{text"Suc 0 + x)"}  & \stackrel{(2)}{=} \\
@{text"(Suc 0"}     & \leq & @{text"Suc (0 + x))"} & \stackrel{(3)}{=} \\
@{text"(0"}         & \leq & @{text"0 + x)"}      & \stackrel{(4)}{=} \\[1ex]
 & @{const True}
\end{array}
\]
Simplification is often also called \concept{rewriting}
and simplification rules \conceptidx{rewrite rules}{rewrite rule}.
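*}

text{* In Isabelle, the entire rewriting chain above is carried out by a
single call of the simplifier (a quick check of our own, not part of the
original text): *}

lemma "0 + Suc 0 \<le> Suc 0 + x"
apply simp
done

text{*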

\subsection{Simplification Rules}

The attribute @{text"simp"} declares theorems to be simplification rules,
which the simplifier will use automatically. In addition,
\isacom{datatype} and \isacom{fun} commands implicitly declare some
simplification rules: \isacom{datatype} the distinctness and injectivity
rules, \isacom{fun} the defining equations. Definitions are not declared
as simplification rules automatically! Nearly any theorem can become a
simplification rule. The simplifier will try to transform it into an
equation. For example, the theorem @{prop"\<not> P"} is turned into @{prop"P = False"}.

Only equations that really simplify, like @{prop"rev (rev xs) = xs"} and
@{prop"xs @ [] = xs"}, should be declared as simplification
rules. Equations that may be counterproductive as simplification rules
should only be used in specific proof steps (see \autoref{sec:simp} below).
Distributivity laws, for example, alter the structure of terms
and can produce an exponential blow-up.
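*}

text{* The @{text simp} attribute can be attached directly to a lemma.
A minimal sketch of our own (the statement is already a predefined
simplification rule, so this merely re-declares it): *}

lemma app_Nil2[simp]: "xs @ [] = xs"  (* our own name for this copy *)
apply(induction xs)
apply auto
done

text{*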

\subsection{Conditional Simplification Rules}

Simplification rules can be conditional.  Before applying such a rule, the
simplifier will first try to prove the preconditions, again by
simplification. For example, given the simplification rules
\begin{quote}
@{prop"p(0::nat) = True"}\\
@{prop"p(x) \<Longrightarrow> f(x) = g(x)"},
\end{quote}
the term @{term"f(0::nat)"} simplifies to @{term"g(0::nat)"}
but @{term"f(1::nat)"} does not simplify because @{prop"p(1::nat)"}
is not provable.
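*}

text{* A concrete instance of our own (not from the original text): the
predefined conditional simplification rule @{thm[source] hd_Cons_tl},
\mbox{@{prop"xs \<noteq> [] \<Longrightarrow> hd xs # tl xs = xs"}}, fires below because the
simplifier can prove its precondition @{prop"xs @ [a] \<noteq> []"} by
simplification: *}

lemma "hd (xs @ [a]) # tl (xs @ [a]) = xs @ [a]"
apply simp
done

text{*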

\subsection{Termination}

Simplification can run forever, for example if both @{prop"f x = g x"} and
@{prop"g x = f x"} are simplification rules. It is the user's
responsibility not to include simplification rules that can lead to
nontermination, either on their own or in combination with other
simplification rules. The right-hand side of a simplification rule should
always be ``simpler'' than the left-hand side --- in some sense. But since
termination is undecidable, such a check cannot be automated completely
and Isabelle makes little attempt to detect nontermination.

When conditional simplification rules are applied, their preconditions are
proved first. Hence all preconditions need to be
simpler than the left-hand side of the conclusion. For example
\begin{quote}
@{prop "n < m \<Longrightarrow> (n < Suc m) = True"}
\end{quote}
is suitable as a simplification rule: both @{prop"n<m"} and @{const True}
are simpler than \mbox{@{prop"n < Suc m"}}. But
\begin{quote}
@{prop "Suc n < m \<Longrightarrow> (n < m) = True"}
\end{quote}
leads to nontermination: when trying to rewrite @{prop"n<m"} to @{const True}
one first has to prove \mbox{@{prop"Suc n < m"}}, which can be rewritten to @{const True} provided @{prop"Suc(Suc n) < m"}, \emph{ad infinitum}.

\subsection{The \indexed{@{text simp}}{simp} Proof Method}
\label{sec:simp}

So far we have only used the proof method @{text auto}.  Method @{text simp}
is the key component of @{text auto}, but @{text auto} can do much more. In
some cases, @{text auto} is overeager and modifies the proof state too much.
In such cases the more predictable @{text simp} method should be used.
Given a goal
\begin{quote}
@{text"1. \<lbrakk> P\<^sub>1; \<dots>; P\<^sub>m \<rbrakk> \<Longrightarrow> C"}
\end{quote}
the command
\begin{quote}
\isacom{apply}@{text"(simp add: th\<^sub>1 \<dots> th\<^sub>n)"}
\end{quote}
simplifies the assumptions @{text "P\<^sub>i"} and the conclusion @{text C} using
\begin{itemize}
\item all simplification rules, including the ones coming from \isacom{datatype} and \isacom{fun},
\item the additional lemmas @{text"th\<^sub>1 \<dots> th\<^sub>n"}, and
\item the assumptions.
\end{itemize}
In addition to or instead of @{text add} there is also @{text del} for removing
simplification rules temporarily. Both are optional. Method @{text auto}
can be modified similarly:
\begin{quote}
\isacom{apply}@{text"(auto simp add: \<dots> simp del: \<dots>)"}
\end{quote}
Here the modifiers are @{text"simp add"} and @{text"simp del"}
instead of just @{text add} and @{text del} because @{text auto}
does not just perform simplification.

Note that @{text simp} acts only on subgoal 1, whereas @{text auto} acts on
all subgoals. There is also @{text simp_all}, which applies @{text simp} to
all subgoals.

\subsection{Rewriting with Definitions}
\label{sec:rewr-defs}

Definitions introduced by the command \isacom{definition}
can also be used as simplification rules,
but by default they are not: the simplifier does not expand them
automatically. Definitions are intended for introducing abstract concepts and
not merely as abbreviations. Of course, we need to expand the definition
initially, but once we have proved enough abstract properties of the new
constant, we can forget its original definition. This style makes proofs more
robust: if the definition has to be changed, only the proofs of the abstract
properties will be affected.

The definition of a function @{text f} is a theorem named @{text f_def}
and can be added to a call of @{text simp} like any other theorem:
\begin{quote}
\isacom{apply}@{text"(simp add: f_def)"}
\end{quote}
In particular, let-expressions can be unfolded by
making @{thm[source] Let_def} a simplification rule.
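*}

text{* For example (a small check of our own, not from the original text),
this property of the @{const sq} function from above needs @{text sq_def};
the list of theorems @{thm[source] algebra_simps} handles the remaining
algebraic rearrangement: *}

lemma "sq n + n = n * (n + 1)"
apply(simp add: sq_def algebra_simps)
done

text{*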

\subsection{Case Splitting with @{text simp}}

Goals containing if-expressions are automatically split into two cases by
@{text simp} using the rule
\begin{quote}
@{prop"P(if A then s else t) = ((A \<longrightarrow> P(s)) \<and> (~A \<longrightarrow> P(t)))"}
\end{quote}
For example, @{text simp} can prove
\begin{quote}
@{prop"(A \<and> B) = (if A then B else False)"}
\end{quote}
because both @{prop"A \<longrightarrow> (A & B) = B"} and @{prop"~A \<longrightarrow> (A & B) = False"}
simplify to @{const True}.

We can split case-expressions similarly. For @{text nat} the rule looks like this:
@{prop[display,margin=65,indent=4]"P(case e of 0 \<Rightarrow> a | Suc n \<Rightarrow> b n) = ((e = 0 \<longrightarrow> P a) & (ALL n. e = Suc n \<longrightarrow> P(b n)))"}
Case expressions are not split automatically by @{text simp}, but @{text simp}
can be instructed to do so:
\begin{quote}
\isacom{apply}@{text"(simp split: nat.split)"}
\end{quote}
splits all case-expressions over natural numbers. For an arbitrary
datatype @{text t} it is @{text "t.split"}\index{split@@{text".split"}} instead of @{thm[source] nat.split}.
Method @{text auto} can be modified in exactly the same way.
The modifier \indexed{@{text "split:"}}{split} can be followed by multiple names.
Splitting if- or case-expressions in the assumptions requires
@{text "split: if_splits"} or @{text "split: t.splits"}.


\subsection*{Exercises}

\exercise\label{exe:tree0}
Define a datatype @{text tree0} of binary tree skeletons that store no
information, neither in the inner nodes nor in the leaves.
Define a function @{text "nodes :: tree0 \<Rightarrow> nat"} that counts the number of
all nodes (inner nodes and leaves) in such a tree.
Consider the following recursive function:
*}
(*<*)
datatype tree0 = Tip | Node tree0 tree0
(*>*)
fun explode :: "nat \<Rightarrow> tree0 \<Rightarrow> tree0" where
"explode 0 t = t" |
"explode (Suc n) t = explode n (Node t t)"

text {*
Find an equation expressing the size of a tree after exploding it
(\noquotes{@{term [source] "nodes (explode n t)"}}) as a function
of @{term "nodes t"} and @{text n}. Prove your equation.
You may use the usual arithmetic operators, including the exponentiation
operator ``@{text"^"}''. For example, \noquotes{@{prop [source] "2 ^ 2 = 4"}}.

Hint: simplifying with the list of theorems @{thm[source] algebra_simps}
takes care of common algebraic properties of the arithmetic operators.
\endexercise

\exercise
Define arithmetic expressions in one variable over integers (type @{typ int})
as a datatype:
*}

datatype exp = Var | Const int | Add exp exp | Mult exp exp

text{*
Define a function \noquotes{@{term [source]"eval :: exp \<Rightarrow> int \<Rightarrow> int"}}
such that @{term"eval e x"} evaluates @{text e} at the value
@{text x}.

A polynomial can be represented as a list of coefficients, starting with
the constant. For example, @{term "[4, 2, -1, 3::int]"} represents the
polynomial $4 + 2x - x^2 + 3x^3$.
Define a function \noquotes{@{term [source] "evalp :: int list \<Rightarrow> int \<Rightarrow> int"}}
that evaluates a polynomial at the given value.
Define a function \noquotes{@{term[source] "coeffs :: exp \<Rightarrow> int list"}}
that transforms an expression into a polynomial. This may require auxiliary
functions. Prove that @{text coeffs} preserves the value of the expression:
\mbox{@{prop"evalp (coeffs e) x = eval e x"}.}
Hint: consider the hint in Exercise~\ref{exe:tree0}.
\endexercise
*}
(*<*)
end
(*>*)