src/Doc/Tutorial/document/fp.tex
changeset 48985 5386df44a037
parent 48966 6e15de7dd871
child 57083 5c26000e1042
equal deleted inserted replaced
48984:f51d4a302962 48985:5386df44a037
       
     1 \chapter{Functional Programming in HOL}
       
     2 
       
     3 This chapter describes how to write
       
     4 functional programs in HOL and how to verify them.  However, 
       
     5 most of the constructs and
       
     6 proof procedures introduced are general and recur in any specification
       
     7 or verification task.  We really should speak of functional
       
     8 \emph{modelling} rather than functional \emph{programming}: 
       
     9 our primary aim is not
       
    10 to write programs but to design abstract models of systems.  HOL is
       
    11 a specification language that goes well beyond what can be expressed as a
       
    12 program. However, for the time being we concentrate on the computable.
       
    13 
       
    14 If you are a purist functional programmer, please note that all functions
       
    15 in HOL must be total:
       
    16 they must terminate for all inputs.  Lazy data structures are not
       
    17 directly available.
       
    18 
       
    19 \section{An Introductory Theory}
       
    20 \label{sec:intro-theory}
       
    21 
       
    22 Functional programming needs datatypes and functions. Both of them can be
       
    23 defined in a theory with a syntax reminiscent of languages like ML or
       
    24 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
       
    25 We will now examine it line by line.
       
    26 
       
    27 \begin{figure}[htbp]
       
    28 \begin{ttbox}\makeatother
       
    29 \input{ToyList1}\end{ttbox}
       
    30 \caption{A Theory of Lists}
       
    31 \label{fig:ToyList}
       
    32 \end{figure}
       
    33 
       
    34 \index{*ToyList example|(}
       
    35 {\makeatother\medskip\input{ToyList.tex}}
       
    36 
       
    37 The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The
       
    38 concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}
       
    39 constitutes the complete theory \texttt{ToyList} and should reside in file
       
    40 \texttt{ToyList.thy}.
       
    41 % It is good practice to present all declarations and
       
    42 %definitions at the beginning of a theory to facilitate browsing.%
       
    43 \index{*ToyList example|)}
       
    44 
       
    45 \begin{figure}[htbp]
       
    46 \begin{ttbox}\makeatother
       
    47 \input{ToyList2}\end{ttbox}
       
    48 \caption{Proofs about Lists}
       
    49 \label{fig:ToyList-proofs}
       
    50 \end{figure}
       
    51 
       
    52 \subsubsection*{Review}
       
    53 
       
    54 This is the end of our toy proof. It should have familiarized you with
       
    55 \begin{itemize}
       
    56 \item the standard theorem proving procedure:
       
    57 state a goal (lemma or theorem); proceed with proof until a separate lemma is
       
    58 required; prove that lemma; come back to the original goal.
       
    59 \item a specific procedure that works well for functional programs:
       
    60 induction followed by all-out simplification via \isa{auto}.
       
    61 \item a basic repertoire of proof commands.
       
    62 \end{itemize}
       
    63 
       
    64 \begin{warn}
       
    65 It is tempting to think that all lemmas should have the \isa{simp} attribute
       
    66 just because this was the case in the example above. However, in that example
       
    67 all lemmas were equations, and the right-hand side was simpler than the
       
    68 left-hand side --- an ideal situation for simplification purposes. Unless
       
    69 this is clearly the case, novices should refrain from awarding a lemma the
       
    70 \isa{simp} attribute, which has a global effect. Instead, lemmas can be
       
    71 applied locally where they are needed, which is discussed in the following
       
    72 chapter.
       
    73 \end{warn}
       
    74 
       
    75 \section{Some Helpful Commands}
       
    76 \label{sec:commands-and-hints}
       
    77 
       
    78 This section discusses a few basic commands for manipulating the proof state
       
    79 and can be skipped by casual readers.
       
    80 
       
    81 There are two kinds of commands used during a proof: the actual proof
       
    82 commands and auxiliary commands for examining the proof state and controlling
       
    83 the display. Simple proof commands are of the form
       
    84 \commdx{apply}(\textit{method}), where \textit{method} is typically 
       
    85 \isa{induct_tac} or \isa{auto}.  All such theorem proving operations
       
    86 are referred to as \bfindex{methods}, and further ones are
       
    87 introduced throughout the tutorial.  Unless stated otherwise, you may
       
    88 assume that a method attacks merely the first subgoal. An exception is
       
    89 \isa{auto}, which tries to solve all subgoals.
       
    90 
       
    91 The most useful auxiliary commands are as follows:
       
    92 \begin{description}
       
    93 \item[Modifying the order of subgoals:]
       
    94 \commdx{defer} moves the first subgoal to the end and
       
    95 \commdx{prefer}~$n$ moves subgoal $n$ to the front.
       
    96 \item[Printing theorems:]
       
    97   \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
       
    98   prints the named theorems.
       
    99 \item[Reading terms and types:] \commdx{term}
       
   100   \textit{string} reads, type-checks and prints the given string as a term in
       
   101   the current context; the inferred type is output as well.
       
   102   \commdx{typ} \textit{string} reads and prints the given
       
   103   string as a type in the current context.
       
   104 \end{description}
       
   105 Further commands are found in the Isabelle/Isar Reference
       
   106 Manual~\cite{isabelle-isar-ref}.
       
   107 
       
   108 \begin{pgnote}
       
   109 Clicking on the \pgmenu{State} button redisplays the current proof state.
       
   110 This is helpful in case commands like \isacommand{thm} have overwritten it.
       
   111 \end{pgnote}
       
   112 
       
   113 We now examine Isabelle's functional programming constructs systematically,
       
   114 starting with inductive datatypes.
       
   115 
       
   116 
       
   117 \section{Datatypes}
       
   118 \label{sec:datatype}
       
   119 
       
   120 \index{datatypes|(}%
       
   121 Inductive datatypes are part of almost every non-trivial application of HOL.
       
   122 First we take another look at an important example, the datatype of
       
   123 lists, before we turn to datatypes in general. The section closes with a
       
   124 case study.
       
   125 
       
   126 
       
   127 \subsection{Lists}
       
   128 
       
   129 \index{*list (type)}%
       
   130 Lists are one of the essential datatypes in computing.  We expect that you
       
   131 are already familiar with their basic operations.
       
   132 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
       
   133 \thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
       
   134 The latter contains many further operations. For example, the functions
       
   135 \cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first
       
   136 element and the remainder of a list. (However, pattern matching is usually
       
   137 preferable to \isa{hd} and \isa{tl}.)  
       
   138 Also available are higher-order functions like \isa{map} and \isa{filter}.
       
   139 Theory \isa{List} also contains
       
   140 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
       
   141 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we
       
   142 always use HOL's predefined lists by building on theory \isa{Main}.
       
   143 \begin{warn}
       
   144 Looking ahead to sets and quanifiers in Part II:
       
   145 The best way to express that some element \isa{x} is in a list \isa{xs} is
       
   146 \isa{x $\in$ set xs}, where \isa{set} is a function that turns a list into the
       
   147 set of its elements.
       
   148 By the same device you can also write bounded quantifiers like
       
   149 \isa{$\forall$x $\in$ set xs} or embed lists in other set expressions.
       
   150 \end{warn}
       
   151 
       
   152 
       
   153 \subsection{The General Format}
       
   154 \label{sec:general-datatype}
       
   155 
       
   156 The general HOL \isacommand{datatype} definition is of the form
       
   157 \[
       
   158 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
       
   159 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
       
   160 C@m~\tau@{m1}~\dots~\tau@{mk@m}
       
   161 \]
       
   162 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
       
   163 constructor names and $\tau@{ij}$ are types; it is customary to capitalize
       
   164 the first letter in constructor names. There are a number of
       
   165 restrictions (such as that the type should not be empty) detailed
       
   166 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
       
   167 
       
   168 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
       
   169 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
       
   170 during proofs by simplification.  The same is true for the equations in
       
   171 primitive recursive function definitions.
       
   172 
       
   173 Every\footnote{Except for advanced datatypes where the recursion involves
       
   174 ``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$
       
   175 comes equipped with a \isa{size} function from $t$ into the natural numbers
       
   176 (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\
       
   177 \isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}.  In general,
       
   178 \cdx{size} returns
       
   179 \begin{itemize}
       
   180 \item zero for all constructors that do not have an argument of type $t$,
       
   181 \item one plus the sum of the sizes of all arguments of type~$t$,
       
   182 for all other constructors.
       
   183 \end{itemize}
       
   184 Note that because
       
   185 \isa{size} is defined on every datatype, it is overloaded; on lists
       
   186 \isa{size} is also called \sdx{length}, which is not overloaded.
       
   187 Isabelle will always show \isa{size} on lists as \isa{length}.
       
   188 
       
   189 
       
   190 \subsection{Primitive Recursion}
       
   191 
       
   192 \index{recursion!primitive}%
       
   193 Functions on datatypes are usually defined by recursion. In fact, most of the
       
   194 time they are defined by what is called \textbf{primitive recursion} over some
       
   195 datatype $t$. This means that the recursion equations must be of the form
       
   196 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
       
   197 such that $C$ is a constructor of $t$ and all recursive calls of
       
   198 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
       
   199 Isabelle immediately sees that $f$ terminates because one (fixed!) argument
       
   200 becomes smaller with every recursive call. There must be at most one equation
       
   201 for each constructor.  Their order is immaterial.
       
   202 A more general method for defining total recursive functions is introduced in
       
   203 {\S}\ref{sec:fun}.
       
   204 
       
   205 \begin{exercise}\label{ex:Tree}
       
   206 \input{Tree.tex}%
       
   207 \end{exercise}
       
   208 
       
   209 \input{case_exprs.tex}
       
   210 
       
   211 \input{Ifexpr.tex}
       
   212 \index{datatypes|)}
       
   213 
       
   214 
       
   215 \section{Some Basic Types}
       
   216 
       
   217 This section introduces the types of natural numbers and ordered pairs.  Also
       
   218 described is type \isa{option}, which is useful for modelling exceptional
       
   219 cases. 
       
   220 
       
   221 \subsection{Natural Numbers}
       
   222 \label{sec:nat}\index{natural numbers}%
       
   223 \index{linear arithmetic|(}
       
   224 
       
   225 \input{fakenat.tex}\medskip
       
   226 \input{natsum.tex}
       
   227 
       
   228 \index{linear arithmetic|)}
       
   229 
       
   230 
       
   231 \subsection{Pairs}
       
   232 \input{pairs2.tex}
       
   233 
       
   234 \subsection{Datatype {\tt\slshape option}}
       
   235 \label{sec:option}
       
   236 \input{Option2.tex}
       
   237 
       
   238 \section{Definitions}
       
   239 \label{sec:Definitions}
       
   240 
       
   241 A definition is simply an abbreviation, i.e.\ a new name for an existing
       
   242 construction. In particular, definitions cannot be recursive. Isabelle offers
       
   243 definitions on the level of types and terms. Those on the type level are
       
   244 called \textbf{type synonyms}; those on the term level are simply called 
       
   245 definitions.
       
   246 
       
   247 
       
   248 \subsection{Type Synonyms}
       
   249 
       
   250 \index{type synonyms}%
       
   251 Type synonyms are similar to those found in ML\@. They are created by a 
       
   252 \commdx{type\protect\_synonym} command:
       
   253 
       
   254 \medskip
       
   255 \input{types.tex}
       
   256 
       
   257 \input{prime_def.tex}
       
   258 
       
   259 
       
   260 \section{The Definitional Approach}
       
   261 \label{sec:definitional}
       
   262 
       
   263 \index{Definitional Approach}%
       
   264 As we pointed out at the beginning of the chapter, asserting arbitrary
       
   265 axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order
       
   266 to avoid this danger, we advocate the definitional rather than
       
   267 the axiomatic approach: introduce new concepts by definitions. However,  Isabelle/HOL seems to
       
   268 support many richer definitional constructs, such as
       
   269 \isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each
       
   270 \isacommand{primrec} function definition is turned into a proper
       
   271 (nonrecursive!) definition from which the user-supplied recursion equations are
       
   272 automatically proved.  This process is
       
   273 hidden from the user, who does not have to understand the details.  Other commands described
       
   274 later, like \isacommand{fun} and \isacommand{inductive}, work similarly.  
       
   275 This strict adherence to the definitional approach reduces the risk of 
       
   276 soundness errors.
       
   277 
       
   278 \chapter{More Functional Programming}
       
   279 
       
   280 The purpose of this chapter is to deepen your understanding of the
       
   281 concepts encountered so far and to introduce advanced forms of datatypes and
       
   282 recursive functions. The first two sections give a structured presentation of
       
   283 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss
       
   284 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}).  You can
       
   285 skip them if you are not planning to perform proofs yourself.
       
   286 We then present a case
       
   287 study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced
       
   288 datatypes, including those involving function spaces, are covered in
       
   289 {\S}\ref{sec:advanced-datatypes}; it closes with another case study, search
       
   290 trees (``tries'').  Finally we introduce \isacommand{fun}, a general
       
   291 form of recursive function definition that goes well beyond 
       
   292 \isacommand{primrec} ({\S}\ref{sec:fun}).
       
   293 
       
   294 
       
   295 \section{Simplification}
       
   296 \label{sec:Simplification}
       
   297 \index{simplification|(}
       
   298 
       
   299 So far we have proved our theorems by \isa{auto}, which simplifies
       
   300 all subgoals. In fact, \isa{auto} can do much more than that. 
       
   301 To go beyond toy examples, you
       
   302 need to understand the ingredients of \isa{auto}.  This section covers the
       
   303 method that \isa{auto} always applies first, simplification.
       
   304 
       
   305 Simplification is one of the central theorem proving tools in Isabelle and
       
   306 many other systems. The tool itself is called the \textbf{simplifier}. 
       
   307 This section introduces the many features of the simplifier
       
   308 and is required reading if you intend to perform proofs.  Later on,
       
   309 {\S}\ref{sec:simplification-II} explains some more advanced features and a
       
   310 little bit of how the simplifier works. The serious student should read that
       
   311 section as well, in particular to understand why the simplifier did
       
   312 something unexpected.
       
   313 
       
   314 \subsection{What is Simplification?}
       
   315 
       
   316 In its most basic form, simplification means repeated application of
       
   317 equations from left to right. For example, taking the rules for \isa{\at}
       
   318 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
       
   319 simplification steps:
       
   320 \begin{ttbox}\makeatother
       
   321 (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
       
   322 \end{ttbox}
       
   323 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
       
   324 equations are referred to as \bfindex{rewrite rules}.
       
   325 ``Rewriting'' is more honest than ``simplification'' because the terms do not
       
   326 necessarily become simpler in the process.
       
   327 
       
   328 The simplifier proves arithmetic goals as described in
       
   329 {\S}\ref{sec:nat} above.  Arithmetic expressions are simplified using built-in
       
   330 procedures that go beyond mere rewrite rules.  New simplification procedures
       
   331 can be coded and installed, but they are definitely not a matter for this
       
   332 tutorial. 
       
   333 
       
   334 \input{simp.tex}
       
   335 
       
   336 \index{simplification|)}
       
   337 
       
   338 \input{Itrev.tex}
       
   339 \begin{exercise}
       
   340 \input{Plus.tex}%
       
   341 \end{exercise}
       
   342 \begin{exercise}
       
   343 \input{Tree2.tex}%
       
   344 \end{exercise}
       
   345 
       
   346 \input{CodeGen.tex}
       
   347 
       
   348 
       
   349 \section{Advanced Datatypes}
       
   350 \label{sec:advanced-datatypes}
       
   351 \index{datatype@\isacommand {datatype} (command)|(}
       
   352 \index{primrec@\isacommand {primrec} (command)|(}
       
   353 %|)
       
   354 
       
   355 This section presents advanced forms of datatypes: mutual and nested
       
   356 recursion.  A series of examples will culminate in a treatment of the trie
       
   357 data structure.
       
   358 
       
   359 
       
   360 \subsection{Mutual Recursion}
       
   361 \label{sec:datatype-mut-rec}
       
   362 
       
   363 \input{ABexpr.tex}
       
   364 
       
   365 \subsection{Nested Recursion}
       
   366 \label{sec:nested-datatype}
       
   367 
       
   368 {\makeatother\input{Nested.tex}}
       
   369 
       
   370 
       
   371 \subsection{The Limits of Nested Recursion}
       
   372 \label{sec:nested-fun-datatype}
       
   373 
       
   374 How far can we push nested recursion? By the unfolding argument above, we can
       
   375 reduce nested to mutual recursion provided the nested recursion only involves
       
   376 previously defined datatypes. This does not include functions:
       
   377 \begin{isabelle}
       
   378 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"
       
   379 \end{isabelle}
       
   380 This declaration is a real can of worms.
       
   381 In HOL it must be ruled out because it requires a type
       
   382 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
       
   383 same cardinality --- an impossibility. For the same reason it is not possible
       
   384 to allow recursion involving the type \isa{t set}, which is isomorphic to
       
   385 \isa{t \isasymFun\ bool}.
       
   386 
       
   387 Fortunately, a limited form of recursion
       
   388 involving function spaces is permitted: the recursive type may occur on the
       
   389 right of a function arrow, but never on the left. Hence the above can of worms
       
   390 is ruled out but the following example of a potentially 
       
   391 \index{infinitely branching trees}%
       
   392 infinitely branching tree is accepted:
       
   393 \smallskip
       
   394 
       
   395 \input{Fundata.tex}
       
   396 
       
   397 If you need nested recursion on the left of a function arrow, there are
       
   398 alternatives to pure HOL\@.  In the Logic for Computable Functions 
       
   399 (\rmindex{LCF}), types like
       
   400 \begin{isabelle}
       
   401 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
       
   402 \end{isabelle}
       
   403 do indeed make sense~\cite{paulson87}.  Note the different arrow,
       
   404 \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},
       
   405 expressing the type of \emph{continuous} functions. 
       
   406 There is even a version of LCF on top of HOL,
       
   407 called \rmindex{HOLCF}~\cite{MuellerNvOS99}.
       
   408 \index{datatype@\isacommand {datatype} (command)|)}
       
   409 \index{primrec@\isacommand {primrec} (command)|)}
       
   410 
       
   411 
       
   412 \subsection{Case Study: Tries}
       
   413 \label{sec:Trie}
       
   414 
       
   415 \index{tries|(}%
       
   416 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
       
   417 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
       
   418 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
       
   419 ``cat''.  When searching a string in a trie, the letters of the string are
       
   420 examined sequentially. Each letter determines which subtrie to search next.
       
   421 In this case study we model tries as a datatype, define a lookup and an
       
   422 update function, and prove that they behave as expected.
       
   423 
       
   424 \begin{figure}[htbp]
       
   425 \begin{center}
       
   426 \unitlength1mm
       
   427 \begin{picture}(60,30)
       
   428 \put( 5, 0){\makebox(0,0)[b]{l}}
       
   429 \put(25, 0){\makebox(0,0)[b]{e}}
       
   430 \put(35, 0){\makebox(0,0)[b]{n}}
       
   431 \put(45, 0){\makebox(0,0)[b]{r}}
       
   432 \put(55, 0){\makebox(0,0)[b]{t}}
       
   433 %
       
   434 \put( 5, 9){\line(0,-1){5}}
       
   435 \put(25, 9){\line(0,-1){5}}
       
   436 \put(44, 9){\line(-3,-2){9}}
       
   437 \put(45, 9){\line(0,-1){5}}
       
   438 \put(46, 9){\line(3,-2){9}}
       
   439 %
       
   440 \put( 5,10){\makebox(0,0)[b]{l}}
       
   441 \put(15,10){\makebox(0,0)[b]{n}}
       
   442 \put(25,10){\makebox(0,0)[b]{p}}
       
   443 \put(45,10){\makebox(0,0)[b]{a}}
       
   444 %
       
   445 \put(14,19){\line(-3,-2){9}}
       
   446 \put(15,19){\line(0,-1){5}}
       
   447 \put(16,19){\line(3,-2){9}}
       
   448 \put(45,19){\line(0,-1){5}}
       
   449 %
       
   450 \put(15,20){\makebox(0,0)[b]{a}}
       
   451 \put(45,20){\makebox(0,0)[b]{c}}
       
   452 %
       
   453 \put(30,30){\line(-3,-2){13}}
       
   454 \put(30,30){\line(3,-2){13}}
       
   455 \end{picture}
       
   456 \end{center}
       
   457 \caption{A Sample Trie}
       
   458 \label{fig:trie}
       
   459 \end{figure}
       
   460 
       
   461 Proper tries associate some value with each string. Since the
       
   462 information is stored only in the final node associated with the string, many
       
   463 nodes do not carry any value. This distinction is modeled with the help
       
   464 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).
       
   465 \input{Trie.tex}
       
   466 \index{tries|)}
       
   467 
       
   468 \section{Total Recursive Functions: \isacommand{fun}}
       
   469 \label{sec:fun}
       
   470 \index{fun@\isacommand {fun} (command)|(}\index{functions!total|(}
       
   471 
       
   472 Although many total functions have a natural primitive recursive definition,
       
   473 this is not always the case. Arbitrary total recursive functions can be
       
   474 defined by means of \isacommand{fun}: you can use full pattern matching,
       
   475 recursion need not involve datatypes, and termination is proved by showing
       
   476 that the arguments of all recursive calls are smaller in a suitable sense.
       
   477 In this section we restrict ourselves to functions where Isabelle can prove
       
   478 termination automatically. More advanced function definitions, including user
       
   479 supplied termination proofs, nested recursion and partiality, are discussed
       
   480 in a separate tutorial~\cite{isabelle-function}.
       
   481 
       
   482 \input{fun0.tex}
       
   483 
       
   484 \index{fun@\isacommand {fun} (command)|)}\index{functions!total|)}