src/Doc/Tutorial/document/fp.tex
changeset 48985 5386df44a037
parent 48966 6e15de7dd871
child 57083 5c26000e1042
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/Doc/Tutorial/document/fp.tex	Tue Aug 28 18:57:32 2012 +0200
@@ -0,0 +1,484 @@
+\chapter{Functional Programming in HOL}
+
+This chapter describes how to write
+functional programs in HOL and how to verify them.  However, 
+most of the constructs and
+proof procedures introduced are general and recur in any specification
+or verification task.  We really should speak of functional
+\emph{modelling} rather than functional \emph{programming}: 
+our primary aim is not
+to write programs but to design abstract models of systems.  HOL is
+a specification language that goes well beyond what can be expressed as a
+program. However, for the time being we concentrate on the computable.
+
+If you are a purist functional programmer, please note that all functions
+in HOL must be total:
+they must terminate for all inputs.  Lazy data structures are not
+directly available.
+
+\section{An Introductory Theory}
+\label{sec:intro-theory}
+
+Functional programming needs datatypes and functions. Both of them can be
+defined in a theory with a syntax reminiscent of languages like ML or
+Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
+We will now examine it line by line.
+
+\begin{figure}[htbp]
+\begin{ttbox}\makeatother
+\input{ToyList1}\end{ttbox}
+\caption{A Theory of Lists}
+\label{fig:ToyList}
+\end{figure}
+
+\index{*ToyList example|(}
+{\makeatother\medskip\input{ToyList.tex}}
+
+The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The
+concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}
+constitutes the complete theory \texttt{ToyList} and should reside in file
+\texttt{ToyList.thy}.
+% It is good practice to present all declarations and
+%definitions at the beginning of a theory to facilitate browsing.%
+\index{*ToyList example|)}
+
+\begin{figure}[htbp]
+\begin{ttbox}\makeatother
+\input{ToyList2}\end{ttbox}
+\caption{Proofs about Lists}
+\label{fig:ToyList-proofs}
+\end{figure}
+
+\subsubsection*{Review}
+
+This is the end of our toy proof. It should have familiarized you with
+\begin{itemize}
+\item the standard theorem proving procedure:
+state a goal (lemma or theorem); proceed with proof until a separate lemma is
+required; prove that lemma; come back to the original goal.
+\item a specific procedure that works well for functional programs:
+induction followed by all-out simplification via \isa{auto}.
+\item a basic repertoire of proof commands.
+\end{itemize}
+
+\begin{warn}
+It is tempting to think that all lemmas should have the \isa{simp} attribute
+just because this was the case in the example above. However, in that example
+all lemmas were equations, and the right-hand side was simpler than the
+left-hand side --- an ideal situation for simplification purposes. Unless
+this is clearly the case, novices should refrain from awarding a lemma the
+\isa{simp} attribute, which has a global effect. Instead, lemmas can be
+applied locally where they are needed, which is discussed in the following
+chapter.
+\end{warn}
+
+\section{Some Helpful Commands}
+\label{sec:commands-and-hints}
+
+This section discusses a few basic commands for manipulating the proof state
+and can be skipped by casual readers.
+
+There are two kinds of commands used during a proof: the actual proof
+commands and auxiliary commands for examining the proof state and controlling
+the display. Simple proof commands are of the form
+\commdx{apply}(\textit{method}), where \textit{method} is typically 
+\isa{induct_tac} or \isa{auto}.  All such theorem proving operations
+are referred to as \bfindex{methods}, and further ones are
+introduced throughout the tutorial.  Unless stated otherwise, you may
+assume that a method attacks merely the first subgoal. An exception is
+\isa{auto}, which tries to solve all subgoals.
+
+The most useful auxiliary commands are as follows:
+\begin{description}
+\item[Modifying the order of subgoals:]
+\commdx{defer} moves the first subgoal to the end and
+\commdx{prefer}~$n$ moves subgoal $n$ to the front.
+\item[Printing theorems:]
+  \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
+  prints the named theorems.
+\item[Reading terms and types:] \commdx{term}
+  \textit{string} reads, type-checks and prints the given string as a term in
+  the current context; the inferred type is output as well.
+  \commdx{typ} \textit{string} reads and prints the given
+  string as a type in the current context.
+\end{description}
+Further commands are found in the Isabelle/Isar Reference
+Manual~\cite{isabelle-isar-ref}.
+
+\begin{pgnote}
+Clicking on the \pgmenu{State} button redisplays the current proof state.
+This is helpful in case commands like \isacommand{thm} have overwritten it.
+\end{pgnote}
+
+We now examine Isabelle's functional programming constructs systematically,
+starting with inductive datatypes.
+
+
+\section{Datatypes}
+\label{sec:datatype}
+
+\index{datatypes|(}%
+Inductive datatypes are part of almost every non-trivial application of HOL.
+First we take another look at an important example, the datatype of
+lists, before we turn to datatypes in general. The section closes with a
+case study.
+
+
+\subsection{Lists}
+
+\index{*list (type)}%
+Lists are one of the essential datatypes in computing.  We expect that you
+are already familiar with their basic operations.
+Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
+\thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
+The latter contains many further operations. For example, the functions
+\cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first
+element and the remainder of a list. (However, pattern matching is usually
+preferable to \isa{hd} and \isa{tl}.)  
+Also available are higher-order functions like \isa{map} and \isa{filter}.
+Theory \isa{List} also contains
+more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
+$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we
+always use HOL's predefined lists by building on theory \isa{Main}.
+\begin{warn}
+Looking ahead to sets and quanifiers in Part II:
+The best way to express that some element \isa{x} is in a list \isa{xs} is
+\isa{x $\in$ set xs}, where \isa{set} is a function that turns a list into the
+set of its elements.
+By the same device you can also write bounded quantifiers like
+\isa{$\forall$x $\in$ set xs} or embed lists in other set expressions.
+\end{warn}
+
+
+\subsection{The General Format}
+\label{sec:general-datatype}
+
+The general HOL \isacommand{datatype} definition is of the form
+\[
+\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
+C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
+C@m~\tau@{m1}~\dots~\tau@{mk@m}
+\]
+where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
+constructor names and $\tau@{ij}$ are types; it is customary to capitalize
+the first letter in constructor names. There are a number of
+restrictions (such as that the type should not be empty) detailed
+elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
+
+Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
+\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
+during proofs by simplification.  The same is true for the equations in
+primitive recursive function definitions.
+
+Every\footnote{Except for advanced datatypes where the recursion involves
+``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$
+comes equipped with a \isa{size} function from $t$ into the natural numbers
+(see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\
+\isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}.  In general,
+\cdx{size} returns
+\begin{itemize}
+\item zero for all constructors that do not have an argument of type $t$,
+\item one plus the sum of the sizes of all arguments of type~$t$,
+for all other constructors.
+\end{itemize}
+Note that because
+\isa{size} is defined on every datatype, it is overloaded; on lists
+\isa{size} is also called \sdx{length}, which is not overloaded.
+Isabelle will always show \isa{size} on lists as \isa{length}.
+
+
+\subsection{Primitive Recursion}
+
+\index{recursion!primitive}%
+Functions on datatypes are usually defined by recursion. In fact, most of the
+time they are defined by what is called \textbf{primitive recursion} over some
+datatype $t$. This means that the recursion equations must be of the form
+\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
+such that $C$ is a constructor of $t$ and all recursive calls of
+$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
+Isabelle immediately sees that $f$ terminates because one (fixed!) argument
+becomes smaller with every recursive call. There must be at most one equation
+for each constructor.  Their order is immaterial.
+A more general method for defining total recursive functions is introduced in
+{\S}\ref{sec:fun}.
+
+\begin{exercise}\label{ex:Tree}
+\input{Tree.tex}%
+\end{exercise}
+
+\input{case_exprs.tex}
+
+\input{Ifexpr.tex}
+\index{datatypes|)}
+
+
+\section{Some Basic Types}
+
+This section introduces the types of natural numbers and ordered pairs.  Also
+described is type \isa{option}, which is useful for modelling exceptional
+cases. 
+
+\subsection{Natural Numbers}
+\label{sec:nat}\index{natural numbers}%
+\index{linear arithmetic|(}
+
+\input{fakenat.tex}\medskip
+\input{natsum.tex}
+
+\index{linear arithmetic|)}
+
+
+\subsection{Pairs}
+\input{pairs2.tex}
+
+\subsection{Datatype {\tt\slshape option}}
+\label{sec:option}
+\input{Option2.tex}
+
+\section{Definitions}
+\label{sec:Definitions}
+
+A definition is simply an abbreviation, i.e.\ a new name for an existing
+construction. In particular, definitions cannot be recursive. Isabelle offers
+definitions on the level of types and terms. Those on the type level are
+called \textbf{type synonyms}; those on the term level are simply called 
+definitions.
+
+
+\subsection{Type Synonyms}
+
+\index{type synonyms}%
+Type synonyms are similar to those found in ML\@. They are created by a 
+\commdx{type\protect\_synonym} command:
+
+\medskip
+\input{types.tex}
+
+\input{prime_def.tex}
+
+
+\section{The Definitional Approach}
+\label{sec:definitional}
+
+\index{Definitional Approach}%
+As we pointed out at the beginning of the chapter, asserting arbitrary
+axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order
+to avoid this danger, we advocate the definitional rather than
+the axiomatic approach: introduce new concepts by definitions. However,  Isabelle/HOL seems to
+support many richer definitional constructs, such as
+\isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each
+\isacommand{primrec} function definition is turned into a proper
+(nonrecursive!) definition from which the user-supplied recursion equations are
+automatically proved.  This process is
+hidden from the user, who does not have to understand the details.  Other commands described
+later, like \isacommand{fun} and \isacommand{inductive}, work similarly.  
+This strict adherence to the definitional approach reduces the risk of 
+soundness errors.
+
+\chapter{More Functional Programming}
+
+The purpose of this chapter is to deepen your understanding of the
+concepts encountered so far and to introduce advanced forms of datatypes and
+recursive functions. The first two sections give a structured presentation of
+theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss
+important heuristics for induction ({\S}\ref{sec:InductionHeuristics}).  You can
+skip them if you are not planning to perform proofs yourself.
+We then present a case
+study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced
+datatypes, including those involving function spaces, are covered in
+{\S}\ref{sec:advanced-datatypes}; it closes with another case study, search
+trees (``tries'').  Finally we introduce \isacommand{fun}, a general
+form of recursive function definition that goes well beyond 
+\isacommand{primrec} ({\S}\ref{sec:fun}).
+
+
+\section{Simplification}
+\label{sec:Simplification}
+\index{simplification|(}
+
+So far we have proved our theorems by \isa{auto}, which simplifies
+all subgoals. In fact, \isa{auto} can do much more than that. 
+To go beyond toy examples, you
+need to understand the ingredients of \isa{auto}.  This section covers the
+method that \isa{auto} always applies first, simplification.
+
+Simplification is one of the central theorem proving tools in Isabelle and
+many other systems. The tool itself is called the \textbf{simplifier}. 
+This section introduces the many features of the simplifier
+and is required reading if you intend to perform proofs.  Later on,
+{\S}\ref{sec:simplification-II} explains some more advanced features and a
+little bit of how the simplifier works. The serious student should read that
+section as well, in particular to understand why the simplifier did
+something unexpected.
+
+\subsection{What is Simplification?}
+
+In its most basic form, simplification means repeated application of
+equations from left to right. For example, taking the rules for \isa{\at}
+and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
+simplification steps:
+\begin{ttbox}\makeatother
+(0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
+\end{ttbox}
+This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
+equations are referred to as \bfindex{rewrite rules}.
+``Rewriting'' is more honest than ``simplification'' because the terms do not
+necessarily become simpler in the process.
+
+The simplifier proves arithmetic goals as described in
+{\S}\ref{sec:nat} above.  Arithmetic expressions are simplified using built-in
+procedures that go beyond mere rewrite rules.  New simplification procedures
+can be coded and installed, but they are definitely not a matter for this
+tutorial. 
+
+\input{simp.tex}
+
+\index{simplification|)}
+
+\input{Itrev.tex}
+\begin{exercise}
+\input{Plus.tex}%
+\end{exercise}
+\begin{exercise}
+\input{Tree2.tex}%
+\end{exercise}
+
+\input{CodeGen.tex}
+
+
+\section{Advanced Datatypes}
+\label{sec:advanced-datatypes}
+\index{datatype@\isacommand {datatype} (command)|(}
+\index{primrec@\isacommand {primrec} (command)|(}
+%|)
+
+This section presents advanced forms of datatypes: mutual and nested
+recursion.  A series of examples will culminate in a treatment of the trie
+data structure.
+
+
+\subsection{Mutual Recursion}
+\label{sec:datatype-mut-rec}
+
+\input{ABexpr.tex}
+
+\subsection{Nested Recursion}
+\label{sec:nested-datatype}
+
+{\makeatother\input{Nested.tex}}
+
+
+\subsection{The Limits of Nested Recursion}
+\label{sec:nested-fun-datatype}
+
+How far can we push nested recursion? By the unfolding argument above, we can
+reduce nested to mutual recursion provided the nested recursion only involves
+previously defined datatypes. This does not include functions:
+\begin{isabelle}
+\isacommand{datatype} t = C "t \isasymRightarrow\ bool"
+\end{isabelle}
+This declaration is a real can of worms.
+In HOL it must be ruled out because it requires a type
+\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
+same cardinality --- an impossibility. For the same reason it is not possible
+to allow recursion involving the type \isa{t set}, which is isomorphic to
+\isa{t \isasymFun\ bool}.
+
+Fortunately, a limited form of recursion
+involving function spaces is permitted: the recursive type may occur on the
+right of a function arrow, but never on the left. Hence the above can of worms
+is ruled out but the following example of a potentially 
+\index{infinitely branching trees}%
+infinitely branching tree is accepted:
+\smallskip
+
+\input{Fundata.tex}
+
+If you need nested recursion on the left of a function arrow, there are
+alternatives to pure HOL\@.  In the Logic for Computable Functions 
+(\rmindex{LCF}), types like
+\begin{isabelle}
+\isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
+\end{isabelle}
+do indeed make sense~\cite{paulson87}.  Note the different arrow,
+\isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},
+expressing the type of \emph{continuous} functions. 
+There is even a version of LCF on top of HOL,
+called \rmindex{HOLCF}~\cite{MuellerNvOS99}.
+\index{datatype@\isacommand {datatype} (command)|)}
+\index{primrec@\isacommand {primrec} (command)|)}
+
+
+\subsection{Case Study: Tries}
+\label{sec:Trie}
+
+\index{tries|(}%
+Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
+indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
+trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
+``cat''.  When searching a string in a trie, the letters of the string are
+examined sequentially. Each letter determines which subtrie to search next.
+In this case study we model tries as a datatype, define a lookup and an
+update function, and prove that they behave as expected.
+
+\begin{figure}[htbp]
+\begin{center}
+\unitlength1mm
+\begin{picture}(60,30)
+\put( 5, 0){\makebox(0,0)[b]{l}}
+\put(25, 0){\makebox(0,0)[b]{e}}
+\put(35, 0){\makebox(0,0)[b]{n}}
+\put(45, 0){\makebox(0,0)[b]{r}}
+\put(55, 0){\makebox(0,0)[b]{t}}
+%
+\put( 5, 9){\line(0,-1){5}}
+\put(25, 9){\line(0,-1){5}}
+\put(44, 9){\line(-3,-2){9}}
+\put(45, 9){\line(0,-1){5}}
+\put(46, 9){\line(3,-2){9}}
+%
+\put( 5,10){\makebox(0,0)[b]{l}}
+\put(15,10){\makebox(0,0)[b]{n}}
+\put(25,10){\makebox(0,0)[b]{p}}
+\put(45,10){\makebox(0,0)[b]{a}}
+%
+\put(14,19){\line(-3,-2){9}}
+\put(15,19){\line(0,-1){5}}
+\put(16,19){\line(3,-2){9}}
+\put(45,19){\line(0,-1){5}}
+%
+\put(15,20){\makebox(0,0)[b]{a}}
+\put(45,20){\makebox(0,0)[b]{c}}
+%
+\put(30,30){\line(-3,-2){13}}
+\put(30,30){\line(3,-2){13}}
+\end{picture}
+\end{center}
+\caption{A Sample Trie}
+\label{fig:trie}
+\end{figure}
+
+Proper tries associate some value with each string. Since the
+information is stored only in the final node associated with the string, many
+nodes do not carry any value. This distinction is modeled with the help
+of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).
+\input{Trie.tex}
+\index{tries|)}
+
+\section{Total Recursive Functions: \isacommand{fun}}
+\label{sec:fun}
+\index{fun@\isacommand {fun} (command)|(}\index{functions!total|(}
+
+Although many total functions have a natural primitive recursive definition,
+this is not always the case. Arbitrary total recursive functions can be
+defined by means of \isacommand{fun}: you can use full pattern matching,
+recursion need not involve datatypes, and termination is proved by showing
+that the arguments of all recursive calls are smaller in a suitable sense.
+In this section we restrict ourselves to functions where Isabelle can prove
+termination automatically. More advanced function definitions, including user
+supplied termination proofs, nested recursion and partiality, are discussed
+in a separate tutorial~\cite{isabelle-function}.
+
+\input{fun0.tex}
+
+\index{fun@\isacommand {fun} (command)|)}\index{functions!total|)}