103

1 
\documentstyle[11pt,a4,proof,lcp,alltt,amssymbols,draft]{article}


2 
\newif\ifCADE


3 
\CADEfalse


4 


5 
\title{A Fixedpoint Approach to Implementing (Co)Inductive Definitions\\


6 
DRAFT\thanks{Research funded by the SERC (grants GR/G53279,


7 
GR/H40570) and by the ESPRIT Basic Research Action 6453 `Types'.}}


8 


9 
\author{{\em Lawrence C. Paulson}\\


10 
Computer Laboratory, University of Cambridge}


11 
\date{\today}


12 
\setcounter{secnumdepth}{2} \setcounter{tocdepth}{2}


13 


14 
\def\picture #1 by #2 (#3){% pictures: width by height (name)


15 
\message{Picture #3}


16 
\vbox to #2{\hrule width #1 height 0pt depth 0pt


17 
\vfill \special{picture #3}}}


18 


19 


20 
\newcommand\sbs{\subseteq}


21 
\newcommand\List[1]{\lbrakk#1\rbrakk}


22 
\let\To=\Rightarrow


23 
\newcommand\Var[1]{{?\!#1}}


24 


25 


26 
%%%\newcommand\Pow{{\tt Pow}}


27 
\let\pow=\wp


28 
\newcommand\RepFun{{\tt RepFun}}


29 
\newcommand\pair[1]{\langle#1\rangle}


30 
\newcommand\cons{{\tt cons}}


31 
\def\succ{{\tt succ}}


32 
\newcommand\split{{\tt split}}


33 
\newcommand\fst{{\tt fst}}


34 
\newcommand\snd{{\tt snd}}


35 
\newcommand\converse{{\tt converse}}


36 
\newcommand\domain{{\tt domain}}


37 
\newcommand\range{{\tt range}}


38 
\newcommand\field{{\tt field}}


39 
\newcommand\bndmono{\hbox{\tt bnd\_mono}}


40 
\newcommand\lfp{{\tt lfp}}


41 
\newcommand\gfp{{\tt gfp}}


42 
\newcommand\id{{\tt id}}


43 
\newcommand\trans{{\tt trans}}


44 
\newcommand\wf{{\tt wf}}


45 
\newcommand\wfrec{\hbox{\tt wfrec}}


46 
\newcommand\nat{{\tt nat}}


47 
\newcommand\natcase{\hbox{\tt nat\_case}}


48 
\newcommand\transrec{{\tt transrec}}


49 
\newcommand\rank{{\tt rank}}


50 
\newcommand\univ{{\tt univ}}


51 
\newcommand\Vrec{{\tt Vrec}}


52 
\newcommand\Inl{{\tt Inl}}


53 
\newcommand\Inr{{\tt Inr}}


54 
\newcommand\case{{\tt case}}


55 
\newcommand\lst{{\tt list}}


56 
\newcommand\Nil{{\tt Nil}}


57 
\newcommand\Cons{{\tt Cons}}


58 
\newcommand\lstcase{\hbox{\tt list\_case}}


59 
\newcommand\lstrec{\hbox{\tt list\_rec}}


60 
\newcommand\length{{\tt length}}


61 
\newcommand\listn{{\tt listn}}


62 
\newcommand\acc{{\tt acc}}


63 
\newcommand\primrec{{\tt primrec}}


64 
\newcommand\SC{{\tt SC}}


65 
\newcommand\CONST{{\tt CONST}}


66 
\newcommand\PROJ{{\tt PROJ}}


67 
\newcommand\COMP{{\tt COMP}}


68 
\newcommand\PREC{{\tt PREC}}


69 


70 
\newcommand\quniv{{\tt quniv}}


71 
\newcommand\llist{{\tt llist}}


72 
\newcommand\LNil{{\tt LNil}}


73 
\newcommand\LCons{{\tt LCons}}


74 
\newcommand\lconst{{\tt lconst}}


75 
\newcommand\lleq{{\tt lleq}}


76 
\newcommand\map{{\tt map}}


77 
\newcommand\term{{\tt term}}


78 
\newcommand\Apply{{\tt Apply}}


79 
\newcommand\termcase{{\tt term\_case}}


80 
\newcommand\rev{{\tt rev}}


81 
\newcommand\reflect{{\tt reflect}}


82 
\newcommand\tree{{\tt tree}}


83 
\newcommand\forest{{\tt forest}}


84 
\newcommand\Part{{\tt Part}}


85 
\newcommand\TF{{\tt tree\_forest}}


86 
\newcommand\Tcons{{\tt Tcons}}


87 
\newcommand\Fcons{{\tt Fcons}}


88 
\newcommand\Fnil{{\tt Fnil}}


89 
\newcommand\TFcase{\hbox{\tt TF\_case}}


90 
\newcommand\Fin{{\tt Fin}}


91 
\newcommand\QInl{{\tt QInl}}


92 
\newcommand\QInr{{\tt QInr}}


93 
\newcommand\qsplit{{\tt qsplit}}


94 
\newcommand\qcase{{\tt qcase}}


95 
\newcommand\Con{{\tt Con}}


96 
\newcommand\data{{\tt data}}


97 


98 
\sloppy


99 
\binperiod %%%treat . like a binary operator


100 


101 
\begin{document}


102 
\pagestyle{empty}


103 
\begin{titlepage}


104 
\maketitle


105 
\begin{abstract}


106 
Several theorem provers provide commands for formalizing recursive


107 
datatypes and/or inductively defined sets. This paper presents a new


108 
approach, based on fixedpoint definitions. It is unusually general:


109 
it admits all monotone inductive definitions. It is conceptually simple,


110 
which has allowed the easy implementation of mutual recursion and other


111 
conveniences. It also handles coinductive definitions: simply replace


112 
the least fixedpoint by a greatest fixedpoint. This represents the first


113 
automated support for coinductive definitions.


114 


115 
Examples include lists of $n$ elements, the accessible part of a relation


116 
and the set of primitive recursive functions. One example of a


117 
coinductive definition is bisimulations for lazy lists. \ifCADE\else


118 
Recursive datatypes are examined in detail, as well as one example of a


119 
``codatatype'': lazy lists. The appendices are simple user's manuals


120 
for this Isabelle/ZF package.\fi


121 


122 
The method has been implemented in Isabelle's ZF set theory. It should


123 
be applicable to any logic in which the KnasterTarski Theorem can be


124 
proved. The paper briefly describes a method of formalizing


125 
nonwellfounded data structures in standard ZF set theory.


126 
\end{abstract}


127 
%


128 
\begin{center} Copyright \copyright{} \number\year{} by Lawrence C. Paulson


129 
\end{center}


130 
\thispagestyle{empty}


131 
\end{titlepage}


132 


133 
\tableofcontents


134 
\cleardoublepage


135 
\pagenumbering{arabic}\pagestyle{headings}\DRAFT


136 


137 
\section{Introduction}


138 
Several theorem provers provide commands for formalizing recursive data


139 
structures, like lists and trees. Examples include Boyer and Moore's shell


140 
principle~\cite{bm79} and Melham's recursive type package for the HOL


141 
system~\cite{melham89}. Such data structures are called {\bf datatypes}


142 
below, by analogy with {\tt datatype} definitions in Standard~ML\@.


143 


144 
A datatype is but one example of a {\bf inductive definition}. This


145 
specifies the least set closed under given rules~\cite{aczel77}. The


146 
collection of theorems in a logic is inductively defined. A structural


147 
operational semantics~\cite{hennessy90} is an inductive definition of a


148 
reduction or evaluation relation on programs. A few theorem provers


149 
provide commands for formalizing inductive definitions; these include


150 
Coq~\cite{paulin92} and again the HOL system~\cite{camilleri92}.


151 


152 
The dual notion is that of a {\bf coinductive definition}. This specifies


153 
the greatest set closed under given rules. Important examples include


154 
using bisimulation relations to formalize equivalence of


155 
processes~\cite{milner89} or lazy functional programs~\cite{abramsky90}.


156 
Other examples include lazy lists and other infinite data structures; these


157 
are called {\bf codatatypes} below.


158 


159 
Most existing implementations of datatype and inductive definitions accept


160 
an artifically narrow class of inputs, and are not easily extended. The


161 
shell principle and Coq's inductive definition rules are built into the


162 
underlying logic. Melham's packages derive datatypes and inductive


163 
definitions from specialized constructions in higherorder logic.


164 


165 
This paper describes a package based on a fixedpoint approach. Least


166 
fixedpoints yield inductive definitions; greatest fixedpoints yield


167 
coinductive definitions. The package is uniquely powerful:


168 
\begin{itemize}


169 
\item It accepts the largest natural class of inductive definitions, namely


170 
all monotone inductive definitions.


171 
\item It accepts a wide class of datatype definitions.


172 
\item It handles coinductive and codatatype definitions. Most of


173 
the discussion below applies equally to inductive and coinductive


174 
definitions, and most of the code is shared. To my knowledge, this is


175 
the only package supporting coinductive definitions.


176 
\item Definitions may be mutually recursive.


177 
\end{itemize}


178 
The package is implemented in Isabelle~\cite{isabelleintro}, using ZF set


179 
theory \cite{paulsonsetI,paulsonsetII}. However, the fixedpoint


180 
approach is independent of Isabelle. The recursion equations are specified


181 
as introduction rules for the mutually recursive sets. The package


182 
transforms these rules into a mapping over sets, and attempts to prove that


183 
the mapping is monotonic and welltyped. If successful, the package


184 
makes fixedpoint definitions and proves the introduction, elimination and


185 
(co)induction rules. The package consists of several Standard ML


186 
functors~\cite{paulson91}; it accepts its argument and returns its result


187 
as ML structures.


188 


189 
Most datatype packages equip the new datatype with some means of expressing


190 
recursive functions. This is the main thing lacking from my package. Its


191 
fixedpoint operators define recursive sets, not recursive functions. But


192 
the Isabelle/ZF theory provides wellfounded recursion and other logical


193 
tools to simplify this task~\cite{paulsonsetII}.


194 


195 
\S2 briefly introduces the least and greatest fixedpoint operators. \S3


196 
discusses the form of introduction rules, mutual recursion and other points


197 
common to inductive and coinductive definitions. \S4 discusses induction


198 
and coinduction rules separately. \S5 presents several examples,


199 
including a coinductive definition. \S6 describes datatype definitions,


200 
while \S7 draws brief conclusions. \ifCADE\else The appendices are simple


201 
user's manuals for this Isabelle/ZF package.\fi


202 


203 
Most of the definitions and theorems shown below have been generated by the


204 
package. I have renamed some variables to improve readability.


205 


206 
\section{Fixedpoint operators}


207 
In set theory, the least and greatest fixedpoint operators are defined as


208 
follows:


209 
\begin{eqnarray*}


210 
\lfp(D,h) & \equiv & \inter\{X\sbs D. h(X)\sbs X\} \\


211 
\gfp(D,h) & \equiv & \union\{X\sbs D. X\sbs h(X)\}


212 
\end{eqnarray*}


213 
Say that $h$ is {\bf bounded by}~$D$ if $h(D)\sbs D$, and {\bf monotone} if


214 
$h(A)\sbs h(B)$ for all $A$ and $B$ such that $A\sbs B\sbs D$. If $h$ is


215 
bounded by~$D$ and monotone then both operators yield fixedpoints:


216 
\begin{eqnarray*}


217 
\lfp(D,h) & = & h(\lfp(D,h)) \\


218 
\gfp(D,h) & = & h(\gfp(D,h))


219 
\end{eqnarray*}


220 
These equations are instances of the KnasterTarski theorem, which states


221 
that every monotonic function over a complete lattice has a


222 
fixedpoint~\cite{davey&priestley}. It is obvious from their definitions


223 
that $\lfp$ must be the least fixedpoint, and $\gfp$ the greatest.


224 


225 
This fixedpoint theory is simple. The KnasterTarski theorem is easy to


226 
prove. Showing monotonicity of~$h$ is trivial, in typical cases. We must


227 
also exhibit a bounding set~$D$ for~$h$. Sometimes this is trivial, as


228 
when a set of ``theorems'' is (co)inductively defined over some previously


229 
existing set of ``formulae.'' But defining the bounding set for


230 
(co)datatype definitions requires some effort; see~\S\ref{datasec} below.


231 


232 


233 
\section{Elements of an inductive or coinductive definition}\label{basicsec}


234 
Consider a (co)inductive definition of the sets $R_1$, \ldots,~$R_n$, in


235 
mutual recursion. They will be constructed from previously existing sets


236 
$D_1$, \ldots,~$D_n$, respectively, which are called their {\bf domains}.


237 
The construction yields not $R_i\sbs D_i$ but $R_i\sbs D_1+\cdots+D_n$, where


238 
$R_i$ is the image of~$D_i$ under an injection~\cite[\S4.5]{paulsonsetII}.


239 


240 
The definition may involve arbitrary parameters $\vec{p}=p_1$,


241 
\ldots,~$p_k$. Each recursive set then has the form $R_i(\vec{p})$. The


242 
parameters must be identical every time they occur within a definition. This


243 
would appear to be a serious restriction compared with other systems such as


244 
Coq~\cite{paulin92}. For instance, we cannot define the lists of


245 
$n$ elements as the set $\listn(A,n)$ using rules where the parameter~$n$


246 
varies. \S\ref{listnsec} describes how to express this definition using the


247 
package.


248 


249 
To avoid clutter below, the recursive sets are shown as simply $R_i$


250 
instead of $R_i(\vec{p})$.


251 


252 
\subsection{The form of the introduction rules}\label{introsec}


253 
The body of the definition consists of the desired introduction rules,


254 
specified as strings. The conclusion of each rule must have the form $t\in


255 
R_i$, where $t$ is any term. Premises typically have the same form, but


256 
they can have the more general form $t\in M(R_i)$ or express arbitrary


257 
sideconditions.


258 


259 
The premise $t\in M(R_i)$ is permitted if $M$ is a monotonic operator on


260 
sets, satisfying the rule


261 
\[ \infer{M(A)\sbs M(B)}{A\sbs B} \]


262 
The inductive definition package must be supplied monotonicity rules for


263 
all such premises.


264 


265 
Because any monotonic $M$ may appear in premises, the criteria for an


266 
acceptable definition is semantic rather than syntactic. A suitable choice


267 
of~$M$ and~$t$ can express a lot. The powerset operator $\pow$ is


268 
monotone, and the premise $t\in\pow(R)$ expresses $t\sbs R$; see


269 
\S\ref{accsec} for an example. The `list of' operator is monotone, and


270 
the premise $t\in\lst(R)$ avoids having to encode the effect of~$\lst(R)$


271 
using mutual recursion; see \S\ref{primrecsec} and also my earlier


272 
paper~\cite[\S4.4]{paulsonsetII}.


273 


274 
Introduction rules may also contain {\bf sideconditions}. These are


275 
premises consisting of arbitrary formulae not mentioning the recursive


276 
sets. Sideconditions typically involve typechecking. One example is the


277 
premise $a\in A$ in the following rule from the definition of lists:


278 
\[ \infer{\Cons(a,l)\in\lst(A)}{a\in A & l\in\lst(A)} \]


279 


280 
\subsection{The fixedpoint definitions}


281 
The package translates the list of desired introduction rules into a fixedpoint


282 
definition. Consider, as a running example, the finite set operator


283 
$\Fin(A)$: the set of all finite subsets of~$A$. It can be


284 
defined as the least set closed under the rules


285 
\[ \emptyset\in\Fin(A) \qquad


286 
\infer{\{a\}\un b\in\Fin(A)}{a\in A & b\in\Fin(A)}


287 
\]


288 


289 
The domain for a (co)inductive definition must be some existing set closed


290 
under the rules. A suitable domain for $\Fin(A)$ is $\pow(A)$, the set of all


291 
subsets of~$A$. The package generates the definition


292 
\begin{eqnarray*}


293 
\Fin(A) & \equiv & \lfp(\pow(A), \;


294 
\begin{array}[t]{r@{\,}l}


295 
\lambda X. \{z\in\pow(A). & z=\emptyset \disj{} \\


296 
&(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in X)\})


297 
\end{array}


298 
\end{eqnarray*}


299 
The contribution of each rule to the definition of $\Fin(A)$ should be


300 
obvious. A coinductive definition is similar but uses $\gfp$ instead


301 
of~$\lfp$.


302 


303 
The package must prove that the fixedpoint operator is applied to a


304 
monotonic function. If the introduction rules have the form described


305 
above, and if the package is supplied a monotonicity theorem for every


306 
$t\in M(R_i)$ premise, then this proof is trivial.\footnote{Due to the


307 
presence of logical connectives in the fixedpoint's body, the


308 
monotonicity proof requires some unusual rules. These state that the


309 
connectives $\conj$, $\disj$ and $\exists$ are monotonic with respect to


310 
the partial ordering on unary predicates given by $P\sqsubseteq Q$ if and


311 
only if $\forall x.P(x)\imp Q(x)$.}


312 


313 
The result structure returns the definitions of the recursive sets as a theorem


314 
list called {\tt defs}. It also returns, as the theorem {\tt unfold}, a


315 
fixedpoint equation such as


316 
\begin{eqnarray*}


317 
\Fin(A) & = &


318 
\begin{array}[t]{r@{\,}l}


319 
\{z\in\pow(A). & z=\emptyset \disj{} \\


320 
&(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in \Fin(A))\}


321 
\end{array}


322 
\end{eqnarray*}


323 
It also returns, as the theorem {\tt dom\_subset}, an inclusion such as


324 
$\Fin(A)\sbs\pow(A)$.


325 


326 


327 
\subsection{Mutual recursion} \label{mutualsec}


328 
In a mutually recursive definition, the domain for the fixedoint construction


329 
is the disjoint sum of the domain~$D_i$ of each~$R_i$, for $i=1$,


330 
\ldots,~$n$. The package uses the injections of the


331 
binary disjoint sum, typically $\Inl$ and~$\Inr$, to express injections


332 
$h_{1,n}$, \ldots, $h_{n,n}$ for the $n$ary disjoint sum $D_1+\cdots+D_n$.


333 


334 
As discussed elsewhere \cite[\S4.5]{paulsonsetII}, Isabelle/ZF defines the


335 
operator $\Part$ to support mutual recursion. The set $\Part(A,h)$


336 
contains those elements of~$A$ having the form~$h(z)$:


337 
\begin{eqnarray*}


338 
\Part(A,h) & \equiv & \{x\in A. \exists z. x=h(z)\}.


339 
\end{eqnarray*}


340 
For mutually recursive sets $R_1$, \ldots,~$R_n$ with


341 
$n>1$, the package makes $n+1$ definitions. The first defines a set $R$ using


342 
a fixedpoint operator. The remaining $n$ definitions have the form


343 
\begin{eqnarray*}


344 
R_i & \equiv & \Part(R,h_{i,n}), \qquad i=1,\ldots, n.


345 
\end{eqnarray*}


346 
It follows that $R=R_1\un\cdots\un R_n$, where the $R_i$ are pairwise disjoint.


347 


348 


349 
\subsection{Proving the introduction rules}


350 
The uesr supplies the package with the desired form of the introduction


351 
rules. Once it has derived the theorem {\tt unfold}, it attempts


352 
to prove the introduction rules. From the user's point of view, this is the


353 
trickiest stage; the proofs often fail. The task is to show that the domain


354 
$D_1+\cdots+D_n$ of the combined set $R_1\un\cdots\un R_n$ is


355 
closed under all the introduction rules. This essentially involves replacing


356 
each~$R_i$ by $D_1+\cdots+D_n$ in each of the introduction rules and


357 
attempting to prove the result.


358 


359 
Consider the $\Fin(A)$ example. After substituting $\pow(A)$ for $\Fin(A)$


360 
in the rules, the package must prove


361 
\[ \emptyset\in\pow(A) \qquad


362 
\infer{\{a\}\un b\in\pow(A)}{a\in A & b\in\pow(A)}


363 
\]


364 
Such proofs can be regarded as typechecking the definition. The user


365 
supplies the package with typechecking rules to apply. Usually these are


366 
general purpose rules from the ZF theory. They could however be rules


367 
specifically proved for a particular inductive definition; sometimes this is


368 
the easiest way to get the definition through!


369 


370 
The package returns the introduction rules as the theorem list {\tt intrs}.


371 


372 
\subsection{The elimination rule}


373 
The elimination rule, called {\tt elim}, is derived in a straightforward


374 
manner. Applying the rule performs a case analysis, with one case for each


375 
introduction rule. Continuing our example, the elimination rule for $\Fin(A)$


376 
is


377 
\[ \infer{Q}{x\in\Fin(A) & \infer*{Q}{[x=\emptyset]}


378 
& \infer*{Q}{[x=\{a\}\un b & a\in A &b\in\Fin(A)]_{a,b}} }


379 
\]


380 
The package also returns a function {\tt mk\_cases}, for generating simplified


381 
instances of the elimination rule. However, {\tt mk\_cases} only works for


382 
datatypes and for inductive definitions involving datatypes, such as an


383 
inductively defined relation between lists. It instantiates {\tt elim}


384 
with a usersupplied term, then simplifies the cases using the freeness of


385 
the underlying datatype.


386 


387 


388 
\section{Induction and coinduction rules}


389 
Here we must consider inductive and coinductive definitions separately.


390 
For an inductive definition, the package returns an induction rule derived


391 
directly from the properties of least fixedpoints, as well as a modified


392 
rule for mutual recursion and inductively defined relations. For a


393 
coinductive definition, the package returns a basic coinduction rule.


394 


395 
\subsection{The basic induction rule}\label{basicindsec}


396 
The basic rule, called simply {\tt induct}, is appropriate in most situations.


397 
For inductive definitions, it is strong rule induction~\cite{camilleri92}; for


398 
datatype definitions (see below), it is just structural induction.


399 


400 
The induction rule for an inductively defined set~$R$ has the following form.


401 
The major premise is $x\in R$. There is a minor premise for each


402 
introduction rule:


403 
\begin{itemize}


404 
\item If the introduction rule concludes $t\in R_i$, then the minor premise


405 
is~$P(t)$.


406 


407 
\item The minor premise's eigenvariables are precisely the introduction


408 
rule's free variables that are not parameters of~$R$  for instance, $A$


409 
is not an eigenvariable in the $\Fin(A)$ rule below.


410 


411 
\item If the introduction rule has a premise $t\in R_i$, then the minor


412 
premise discharges the assumption $t\in R_i$ and the induction


413 
hypothesis~$P(t)$. If the introduction rule has a premise $t\in M(R_i)$


414 
then the minor premise discharges the single assumption


415 
\[ t\in M(\{z\in R_i. P(z)\}). \]


416 
Because $M$ is monotonic, this assumption implies $t\in M(R_i)$. The


417 
occurrence of $P$ gives the effect of an induction hypothesis, which may be


418 
exploited by appealing to properties of~$M$.


419 
\end{itemize}


420 
The rule for $\Fin(A)$ is


421 
\[ \infer{P(x)}{x\in\Fin(A) & P(\emptyset)


422 
& \infer*{P(\{a\}\un b)}{[a\in A & b\in\Fin(A) & P(b)]_{a,b}} }


423 
\]


424 
Stronger induction rules often suggest themselves. In the case of


425 
$\Fin(A)$, the Isabelle/ZF theory proceeds to derive a rule whose third


426 
premise discharges the extra assumption $a\not\in b$. Most special induction


427 
rules must be proved manually, but the package proves a rule for mutual


428 
induction and inductive relations.


429 


430 
\subsection{Mutual induction}


431 
The mutual induction rule is called {\tt


432 
mutual\_induct}. It differs from the basic rule in several respects:


433 
\begin{itemize}


434 
\item Instead of a single predicate~$P$, it uses $n$ predicates $P_1$,


435 
\ldots,~$P_n$: one for each recursive set.


436 


437 
\item There is no major premise such as $x\in R_i$. Instead, the conclusion


438 
refers to all the recursive sets:


439 
\[ (\forall z.z\in R_1\imp P_1(z))\conj\cdots\conj


440 
(\forall z.z\in R_n\imp P_n(z))


441 
\]


442 
Proving the premises simultaneously establishes $P_i(z)$ for $z\in


443 
R_i$ and $i=1$, \ldots,~$n$.


444 


445 
\item If the domain of some $R_i$ is the Cartesian product


446 
$A_1\times\cdots\times A_m$, then the corresponding predicate $P_i$ takes $m$


447 
arguments and the corresponding conjunct of the conclusion is


448 
\[ (\forall z_1\ldots z_m.\pair{z_1,\ldots,z_m}\in R_i\imp P_i(z_1,\ldots,z_m))


449 
\]


450 
\end{itemize}


451 
The last point above simplifies reasoning about inductively defined


452 
relations. It eliminates the need to express properties of $z_1$,


453 
\ldots,~$z_m$ as properties of the tuple $\pair{z_1,\ldots,z_m}$.


454 


455 
\subsection{Coinduction}\label{coindsec}


456 
A coinductive definition yields a primitive coinduction rule, with no


457 
refinements such as those for the induction rules. (Experience may suggest


458 
refinements later.) Consider the codatatype of lazy lists as an example. For


459 
suitable definitions of $\LNil$ and $\LCons$, lazy lists may be defined as the


460 
greatest fixedpoint satisfying the rules


461 
\[ \LNil\in\llist(A) \qquad


462 
\infer[()]{\LCons(a,l)\in\llist(A)}{a\in A & l\in\llist(A)}


463 
\]


464 
The $()$ tag stresses that this is a coinductive definition. A suitable


465 
domain for $\llist(A)$ is $\quniv(A)$, a set closed under variant forms of


466 
sum and product for representing infinite data structures


467 
(\S\ref{datasec}). Coinductive definitions use these variant sums and


468 
products.


469 


470 
The package derives an {\tt unfold} theorem similar to that for $\Fin(A)$.


471 
Then it proves the theorem {\tt co\_induct}, which expresses that $\llist(A)$


472 
is the greatest solution to this equation contained in $\quniv(A)$:


473 
\[ \infer{a\in\llist(A)}{a\in X & X\sbs \quniv(A) &


474 
\infer*{z=\LNil\disj \bigl(\exists a\,l.\,


475 
\begin{array}[t]{@{}l}


476 
z=\LCons(a,l) \conj a\in A \conj{}\\


477 
l\in X\un\llist(A) \bigr)


478 
\end{array} }{[z\in X]_z}}


479 
\]


480 
Having $X\un\llist(A)$ instead of simply $X$ in the third premise above


481 
represents a slight strengthening of the greatest fixedpoint property. I


482 
discuss several forms of coinduction rules elsewhere~\cite{paulsoncoind}.


483 


484 


485 
\section{Examples of inductive and coinductive definitions}\label{indegsec}


486 
This section presents several examples: the finite set operator,


487 
lists of $n$ elements, bisimulations on lazy lists, the wellfounded part


488 
of a relation, and the primitive recursive functions.


489 


490 
\subsection{The finite set operator}


491 
The definition of finite sets has been discussed extensively above. Here


492 
is the corresponding ML invocation (note that $\cons(a,b)$ abbreviates


493 
$\{a\}\un b$ in Isabelle/ZF):


494 
\begin{ttbox}


495 
structure Fin = Inductive_Fun


496 
(val thy = Arith.thy addconsts [(["Fin"],"i=>i")];


497 
val rec_doms = [("Fin","Pow(A)")];


498 
val sintrs =


499 
["0 : Fin(A)",


500 
"[ a: A; b: Fin(A) ] ==> cons(a,b) : Fin(A)"];


501 
val monos = [];


502 
val con_defs = [];


503 
val type_intrs = [empty_subsetI, cons_subsetI, PowI]


504 
val type_elims = [make_elim PowD]);


505 
\end{ttbox}


506 
The parent theory is obtained from {\tt Arith.thy} by adding the unary


507 
function symbol~$\Fin$. Its domain is specified as $\pow(A)$, where $A$ is


508 
the parameter appearing in the introduction rules. For typechecking, the


509 
package supplies the introduction rules:


510 
\[ \emptyset\sbs A \qquad


511 
\infer{\{a\}\un B\sbs C}{a\in C & B\sbs C}


512 
\]


513 
A further introduction rule and an elimination rule express the two


514 
directions of the equivalence $A\in\pow(B)\bimp A\sbs B$. Typechecking


515 
involves mostly introduction rules. When the package returns, we can refer


516 
to the $\Fin(A)$ introduction rules as {\tt Fin.intrs}, the induction rule


517 
as {\tt Fin.induct}, and so forth.


518 


519 
\subsection{Lists of $n$ elements}\label{listnsec}


520 
This has become a standard example in the


521 
literature. Following PaulinMohring~\cite{paulin92}, we could attempt to


522 
define a new datatype $\listn(A,n)$, for lists of length~$n$, as an $n$indexed


523 
family of sets. But her introduction rules


524 
\[ {\tt Niln}\in\listn(A,0) \qquad


525 
\infer{{\tt Consn}(n,a,l)\in\listn(A,\succ(n))}


526 
{n\in\nat & a\in A & l\in\listn(A,n)}


527 
\]


528 
are not acceptable to the inductive definition package:


529 
$\listn$ occurs with three different parameter lists in the definition.


530 


531 
\begin{figure}


532 
\begin{small}


533 
\begin{verbatim}


534 
structure ListN = Inductive_Fun


535 
(val thy = ListFn.thy addconsts [(["listn"],"i=>i")];


536 
val rec_doms = [("listn", "nat*list(A)")];


537 
val sintrs =


538 
["<0,Nil> : listn(A)",


539 
"[ a: A; <n,l> : listn(A) ] ==> <succ(n), Cons(a,l)> : listn(A)"];


540 
val monos = [];


541 
val con_defs = [];


542 
val type_intrs = nat_typechecks@List.intrs@[SigmaI]


543 
val type_elims = [SigmaE2]);


544 
\end{verbatim}


545 
\end{small}


546 
\hrule


547 
\caption{Defining lists of $n$ elements} \label{listnfig}


548 
\end{figure}


549 


550 
There is an obvious way of handling this particular example, which may suggest


551 
a general approach to varying parameters. Here, we can take advantage of an


552 
existing datatype definition of $\lst(A)$, with constructors $\Nil$


553 
and~$\Cons$. Then incorporate the number~$n$ into the inductive set itself,


554 
defining $\listn(A)$ as a relation. It consists of pairs $\pair{n,l}$ such


555 
that $n\in\nat$ and~$l\in\lst(A)$ and $l$ has length~$n$. In fact,


556 
$\listn(A)$ turns out to be the converse of the length function on~$\lst(A)$.


557 
The Isabelle/ZF introduction rules are


558 
\[ \pair{0,\Nil}\in\listn(A) \qquad


559 
\infer{\pair{\succ(n),\Cons(a,l)}\in\listn(A)}


560 
{a\in A & \pair{n,l}\in\listn(A)}


561 
\]


562 
Figure~\ref{listnfig} presents the ML invocation. A theory of lists,


563 
extended with a declaration of $\listn$, is the parent theory. The domain


564 
is specified as $\nat\times\lst(A)$. The typechecking rules include those


565 
for 0, $\succ$, $\Nil$ and $\Cons$. Because $\listn(A)$ is a set of pairs,


566 
typechecking also requires introduction and elimination rules to express


567 
both directions of the equivalence $\pair{a,b}\in A\times B \bimp a\in A


568 
\conj b\in B$.


569 


570 
The package returns introduction, elimination and induction rules for


571 
$\listn$. The basic induction rule, {\tt ListN.induct}, is


572 
\[ \infer{P(x)}{x\in\listn(A) & P(\pair{0,\Nil}) &


573 
\infer*{P(\pair{\succ(n),\Cons(a,l)})}


574 
{[a\in A & \pair{n,l}\in\listn(A) & P(\pair{n,l})]_{a,l,n}}}


575 
\]


576 
This rule requires the induction formula to be a


577 
unary property of pairs,~$P(\pair{n,l})$. The alternative rule, {\tt


578 
ListN.mutual\_induct}, uses a binary property instead:


579 
\[ \infer{\forall n\,l. \pair{n,l}\in\listn(A) \imp P(\pair{n,l})}


580 
{P(0,\Nil) &


581 
\infer*{P(\succ(n),\Cons(a,l))}


582 
{[a\in A & \pair{n,l}\in\listn(A) & P(n,l)]_{a,l,n}}}


583 
\]


584 
It is now a simple matter to prove theorems about $\listn(A)$, such as


585 
\[ \forall l\in\lst(A). \pair{\length(l),\, l}\in\listn(A) \]


586 
\[ \listn(A)``\{n\} = \{l\in\lst(A). \length(l)=n\} \]


587 
This latter result  here $r``A$ denotes the image of $A$ under $r$


588 
 asserts that the inductive definition agrees with the obvious notion of


589 
$n$element list.


590 


591 
Unlike in Coq, the definition does not declare a new datatype. A `list of


592 
$n$ elements' really is a list, and is subject to list operators such


593 
as append. For example, a trivial induction yields


594 
\[ \infer{\pair{m\mathbin{+} m,\, l@l'}\in\listn(A)}


595 
{\pair{m,l}\in\listn(A) & \pair{m',l'}\in\listn(A)}


596 
\]


597 
where $+$ here denotes addition on the natural numbers and @ denotes append.


598 


599 
\ifCADE\typeout{****Omitting mk_cases from CADE version!}


600 
\else


601 
\subsection{A demonstration of {\tt mk\_cases}}


602 
The elimination rule, {\tt ListN.elim}, is cumbersome:


603 
\[ \infer{Q}{x\in\listn(A) &


604 
\infer*{Q}{[x = \pair{0,\Nil}]} &


605 
\infer*{Q}


606 
{\left[\begin{array}{l}


607 
x = \pair{\succ(n),\Cons(a,l)} \\


608 
a\in A \\


609 
\pair{n,l}\in\listn(A)


610 
\end{array} \right]_{a,l,n}}}


611 
\]


612 
The function {\tt ListN.mk\_cases} generates simplified instances of this


613 
rule. It works by freeness reasoning on the list constructors.


614 
If $x$ is $\pair{i,\Nil}$ or $\pair{i,\Cons(a,l)}$ then {\tt ListN.mk\_cases}


615 
deduces the corresponding form of~$i$. For example,


616 
\begin{ttbox}


617 
ListN.mk_cases List.con_defs "<i,Cons(a,l)> : listn(A)"


618 
\end{ttbox}


619 
yields the rule


620 
\[ \infer{Q}{\pair{i, \Cons(a,l)}\in\listn(A) &


621 
\infer*{Q}


622 
{\left[\begin{array}{l}


623 
i = \succ(n) \\ a\in A \\ \pair{n,l}\in\listn(A)


624 
\end{array} \right]_{n}}}


625 
\]


626 
The package also has builtin rules for freeness reasoning about $0$


627 
and~$\succ$. So if $x$ is $\pair{0,l}$ or $\pair{\succ(i),l}$, then {\tt


628 
ListN.mk\_cases} can similarly deduce the corresponding form of~$l$.


629 


630 
The function {\tt mk\_cases} is also useful with datatype definitions


631 
themselves. The version from the definition of lists, namely {\tt


632 
List.mk\_cases}, can prove the rule


633 
\[ \infer{Q}{\Cons(a,l)\in\lst(A) &


634 
& \infer*{Q}{[a\in A &l\in\lst(A)]} }


635 
\]


636 
The most important uses of {\tt mk\_cases} concern inductive definitions of


637 
evaluation relations. Then {\tt mk\_cases} supports the kind of backward


638 
inference typical of hand proofs, for example to prove that the evaluation


639 
relation is functional.


640 
\fi %CADE


641 


642 
\subsection{A coinductive definition: bisimulations on lazy lists}


643 
This example anticipates the definition of the codatatype $\llist(A)$, which


644 
consists of lazy lists over~$A$. Its constructors are $\LNil$ and $\LCons$,


645 
satisfying the introduction rules shown in~\S\ref{coindsec}.


646 
Because $\llist(A)$ is defined as a greatest fixedpoint and uses the variant


647 
pairing and injection operators, it contains nonwellfounded elements such as


648 
solutions to $\LCons(a,l)=l$.


649 


650 
The next step in the development of lazy lists is to define a coinduction


651 
principle for proving equalities. This is done by showing that the equality


652 
relation on lazy lists is the greatest fixedpoint of some monotonic


653 
operation. The usual approach~\cite{pitts94} is to define some notion of


654 
bisimulation for lazy lists, define equivalence to be the greatest


655 
bisimulation, and finally to prove that two lazy lists are equivalent if and


656 
only if they are equal. The coinduction rule for equivalence then yields a


657 
coinduction principle for equalities.


658 


659 
A binary relation $R$ on lazy lists is a {\bf bisimulation} provided $R\sbs


660 
R^+$, where $R^+$ is the relation


661 
\[ \{\pair{\LNil;\LNil}\} \un


662 
\{\pair{\LCons(a,l);\LCons(a,l')} . a\in A \conj \pair{l;l'}\in R\}.


663 
\]


664 
Variant pairs are used, $\pair{l;l'}$ instead of $\pair{l,l'}$, because this


665 
is a coinductive definition.


666 


667 
A pair of lazy lists are {\bf equivalent} if they belong to some bisimulation.


668 
Equivalence can be coinductively defined as the greatest fixedpoint for the


669 
introduction rules


670 
\[ \pair{\LNil;\LNil} \in\lleq(A) \qquad


671 
\infer[()]{\pair{\LCons(a,l);\LCons(a,l')} \in\lleq(A)}


672 
{a\in A & \pair{l;l'}\in \lleq(A)}


673 
\]


674 
To make this coinductive definition, we invoke \verbCo_Inductive_Fun:


675 
\begin{ttbox}


676 
structure LList_Eq = Co_Inductive_Fun


677 
(val thy = LList.thy addconsts [(["lleq"],"i=>i")];


678 
val rec_doms = [("lleq", "llist(A) <*> llist(A)")];


679 
val sintrs =


680 
["<LNil; LNil> : lleq(A)",


681 
"[ a:A; <l;l'>: lleq(A) ] ==> <LCons(a,l); LCons(a,l')>: lleq(A)"];


682 
val monos = [];


683 
val con_defs = [];


684 
val type_intrs = LList.intrs@[QSigmaI];


685 
val type_elims = [QSigmaE2]);


686 
\end{ttbox}


687 
Again, {\tt addconsts} declares a constant for $\lleq$ in the parent theory.


688 
The domain of $\lleq(A)$ is $\llist(A)\otimes\llist(A)$, where $\otimes$


689 
denotes the variant Cartesian product. The typechecking rules include the


690 
introduction rules for lazy lists as well as rules expressinve both


691 
definitions of the equivalence


692 
\[ \pair{a;b}\in A\otimes B \bimp a\in A \conj b\in B. \]


693 


694 
The package returns the introduction rules and the elimination rule, as


695 
usual. But instead of induction rules, it returns a coinduction rule.


696 
The rule is too big to display in the usual notation; its conclusion is


697 
$a\in\lleq(A)$ and its premises are $a\in X$, $X\sbs \llist(A)\otimes\llist(A)$


698 
and


699 
\[ \infer*{z=\pair{\LNil;\LNil}\disj \bigl(\exists a\,l\,l'.\,


700 
\begin{array}[t]{@{}l}


701 
z=\pair{\LCons(a,l);\LCons(a,l')} \conj a\in A \conj{}\\


702 
\pair{l;l'}\in X\un\lleq(A) \bigr)


703 
\end{array} }{[z\in X]_z}


704 
\]


705 
Thus if $a\in X$, where $X$ is a bisimulation contained in the


706 
domain of $\lleq(A)$, then $a\in\lleq(A)$. It is easy to show that


707 
$\lleq(A)$ is reflexive: the equality relation is a bisimulation. And


708 
$\lleq(A)$ is symmetric: its converse is a bisimulation. But showing that


709 
$\lleq(A)$ coincides with the equality relation takes considerable work.


710 


711 
\subsection{The accessible part of a relation}\label{accsec}


712 
Let $\prec$ be a binary relation on~$D$; in short, $(\prec)\sbs D\times D$.


713 
The {\bf accessible} or {\bf wellfounded} part of~$\prec$, written


714 
$\acc(\prec)$, is essentially that subset of~$D$ for which $\prec$ admits


715 
no infinite decreasing chains~\cite{aczel77}. Formally, $\acc(\prec)$ is


716 
inductively defined to be the least set that contains $a$ if it contains


717 
all $\prec$predecessors of~$a$, for $a\in D$. Thus we need an


718 
introduction rule of the form


719 
%%%%\[ \infer{a\in\acc(\prec)}{\infer*{y\in\acc(\prec)}{[y\prec a]_y}} \]


720 
\[ \infer{a\in\acc(\prec)}{\forall y.y\prec a\imp y\in\acc(\prec)} \]


721 
PaulinMohring treats this example in Coq~\cite{paulin92}, but it causes


722 
difficulties for other systems. Its premise does not conform to


723 
the structure of introduction rules for HOL's inductive definition


724 
package~\cite{camilleri92}. It is also unacceptable to Isabelle package


725 
(\S\ref{introsec}), but fortunately can be transformed into one of the


726 
form $t\in M(R)$.


727 


728 
The powerset operator is monotonic, and $t\in\pow(R)$ is equivalent to


729 
$t\sbs R$. This in turn is equivalent to $\forall y\in t. y\in R$. To


730 
express $\forall y.y\prec a\imp y\in\acc(\prec)$ we need only find a


731 
term~$t$ such that $y\in t$ if and only if $y\prec a$. A suitable $t$ is


732 
the inverse image of~$\{a\}$ under~$\prec$.


733 


734 
The ML invocation below follows this approach. Here $r$ is~$\prec$ and


735 
$\field(r)$ refers to~$D$, the domain of $\acc(r)$. Finally $r^{}``\{a\}$


736 
denotes the inverse image of~$\{a\}$ under~$r$. The package is supplied


737 
the theorem {\tt Pow\_mono}, which asserts that $\pow$ is monotonic.


738 
\begin{ttbox}


739 
structure Acc = Inductive_Fun


740 
(val thy = WF.thy addconsts [(["acc"],"i=>i")];


741 
val rec_doms = [("acc", "field(r)")];


742 
val sintrs =


743 
["[ r``\{a\} : Pow(acc(r)); a : field(r) ] ==> a : acc(r)"];


744 
val monos = [Pow_mono];


745 
val con_defs = [];


746 
val type_intrs = [];


747 
val type_elims = []);


748 
\end{ttbox}


749 
The Isabelle theory proceeds to prove facts about $\acc(\prec)$. For


750 
instance, $\prec$ is wellfounded if and only if its field is contained in


751 
$\acc(\prec)$.


752 


753 
As mentioned in~\S\ref{basicindsec}, a premise of the form $t\in M(R)$


754 
gives rise to an unusual induction hypothesis. Let us examine the


755 
induction rule, {\tt Acc.induct}:


756 
\[ \infer{P(x)}{x\in\acc(r) &


757 
\infer*{P(a)}{[r^{}``\{a\}\in\pow(\{z\in\acc(r).P(z)\}) &


758 
a\in\field(r)]_a}}


759 
\]


760 
The strange induction hypothesis is equivalent to


761 
$\forall y. \pair{y,a}\in r\imp y\in\acc(r)\conj P(y)$.


762 
Therefore the rule expresses wellfounded induction on the accessible part


763 
of~$\prec$.


764 


765 
The use of inverse image is not essential. The Isabelle package can accept


766 
introduction rules with arbitrary premises of the form $\forall


767 
\vec{y}.P(\vec{y})\imp f(\vec{y})\in R$. The premise can be expressed


768 
equivalently as


769 
\[ \{z\in D. P(\vec{y}) \conj z=f(\vec{y})\} \]


770 
provided $f(\vec{y})\in D$ for all $\vec{y}$ such that~$P(\vec{y})$. The


771 
following section demonstrates another use of the premise $t\in M(R)$,


772 
where $M=\lst$.


773 


774 
\subsection{The primitive recursive functions}\label{primrecsec}


775 
The primitive recursive functions are traditionally defined inductively, as


776 
a subset of the functions over the natural numbers. One difficulty is that


777 
functions of all arities are taken together, but this is easily


778 
circumvented by regarding them as functions on lists. Another difficulty,


779 
the notion of composition, is less easily circumvented.


780 


781 
Here is a more precise definition. Letting $\vec{x}$ abbreviate


782 
$x_0,\ldots,x_{n1}$, we can write lists such as $[\vec{x}]$,


783 
$[y+1,\vec{x}]$, etc. A function is {\bf primitive recursive} if it


784 
belongs to the least set of functions in $\lst(\nat)\to\nat$ containing


785 
\begin{itemize}


786 
\item The {\bf successor} function $\SC$, such that $\SC[y,\vec{x}]=y+1$.


787 
\item All {\bf constant} functions $\CONST(k)$, such that


788 
$\CONST(k)[\vec{x}]=k$.


789 
\item All {\bf projection} functions $\PROJ(i)$, such that


790 
$\PROJ(i)[\vec{x}]=x_i$ if $0\leq i<n$.


791 
\item All {\bf compositions} $\COMP(g,[f_0,\ldots,f_{m1}])$,


792 
where $g$ and $f_0$, \ldots, $f_{m1}$ are primitive recursive,


793 
such that


794 
\begin{eqnarray*}


795 
\COMP(g,[f_0,\ldots,f_{m1}])[\vec{x}] & = &


796 
g[f_0[\vec{x}],\ldots,f_{m1}[\vec{x}]].


797 
\end{eqnarray*}


798 


799 
\item All {\bf recursions} $\PREC(f,g)$, where $f$ and $g$ are primitive


800 
recursive, such that


801 
\begin{eqnarray*}


802 
\PREC(f,g)[0,\vec{x}] & = & f[\vec{x}] \\


803 
\PREC(f,g)[y+1,\vec{x}] & = & g[\PREC(f,g)[y,\vec{x}],\, y,\, \vec{x}].


804 
\end{eqnarray*}


805 
\end{itemize}


806 
Composition is awkward because it combines not two functions, as is usual,


807 
but $m+1$ functions. In her proof that Ackermann's function is not


808 
primitive recursive, Nora Szasz was unable to formalize this definition


809 
directly~\cite{szasz93}. So she generalized primitive recursion to


810 
tuplevalued functions. This modified the inductive definition such that


811 
each operation on primitive recursive functions combined just two functions.


812 


813 
\begin{figure}


814 
\begin{ttbox}


815 
structure Primrec = Inductive_Fun


816 
(val thy = Primrec0.thy;


817 
val rec_doms = [("primrec", "list(nat)>nat")];


818 
val ext = None


819 
val sintrs =


820 
["SC : primrec",


821 
"k: nat ==> CONST(k) : primrec",


822 
"i: nat ==> PROJ(i) : primrec",


823 
"[ g: primrec; fs: list(primrec) ] ==> COMP(g,fs): primrec",


824 
"[ f: primrec; g: primrec ] ==> PREC(f,g): primrec"];


825 
val monos = [list_mono];


826 
val con_defs = [SC_def,CONST_def,PROJ_def,COMP_def,PREC_def];


827 
val type_intrs = pr0_typechecks


828 
val type_elims = []);


829 
\end{ttbox}


830 
\hrule


831 
\caption{Inductive definition of the primitive recursive functions}


832 
\label{primrecfig}


833 
\end{figure}


834 
\def\fs{{\it fs}}


835 
Szasz was using ALF, but Coq and HOL would also have problems accepting


836 
this definition. Isabelle's package accepts it easily since


837 
$[f_0,\ldots,f_{m1}]$ is a list of primitive recursive functions and


838 
$\lst$ is monotonic. There are five introduction rules, one for each of


839 
the five forms of primitive recursive function. Note the one for $\COMP$:


840 
\[ \infer{\COMP(g,\fs)\in\primrec}{g\in\primrec & \fs\in\lst(\primrec)} \]


841 
The induction rule for $\primrec$ has one case for each introduction rule.


842 
Due to the use of $\lst$ as a monotone operator, the composition case has


843 
an unusual induction hypothesis:


844 
\[ \infer*{P(\COMP(g,\fs))}


845 
{[g\in\primrec & \fs\in\lst(\{z\in\primrec.P(x)\})]_{\fs,g}} \]


846 
The hypothesis states that $\fs$ is a list of primitive recursive functions


847 
satisfying the induction formula. Proving the $\COMP$ case typically requires


848 
structural induction on lists, yielding two subcases: either $\fs=\Nil$ or


849 
else $\fs=\Cons(f,\fs')$, where $f\in\primrec$, $P(f)$, and $\fs'$ is


850 
another list of primitive recursive functions satisfying~$P$.


851 


852 
Figure~\ref{primrecfig} presents the ML invocation. Theory {\tt


853 
Primrec0.thy} defines the constants $\SC$, etc.; their definitions


854 
consist of routine list programming and are omitted. The Isabelle theory


855 
goes on to formalize Ackermann's function and prove that it is not


856 
primitive recursive, using the induction rule {\tt Primrec.induct}. The


857 
proof follows Szasz's excellent account.


858 


859 
ALF and Coq treat inductive definitions as datatypes, with a new


860 
constructor for each introduction rule. This forced Szasz to define a


861 
small programming language for the primitive recursive functions, and then


862 
define their semantics. But the Isabelle/ZF formulation defines the


863 
primitive recursive functions directly as a subset of the function set


864 
$\lst(\nat)\to\nat$. This saves a step and conforms better to mathematical


865 
tradition.


866 


867 


868 
\section{Datatypes and codatatypes}\label{datasec}


869 
A (co)datatype definition is a (co)inductive definition with automatically


870 
defined constructors and case analysis operator. The package proves that the


871 
case operator inverts the constructors, and can also prove freeness theorems


872 
involving any pair of constructors.


873 


874 


875 
\subsection{Constructors and their domain}


876 
Conceptually, our two forms of definition are distinct: a (co)inductive


877 
definition selects a subset of an existing set, but a (co)datatype


878 
definition creates a new set. But the package reduces the latter to the


879 
former. A set having strong closure properties must serve as the domain


880 
of the (co)inductive definition. Constructing this set requires some


881 
theoretical effort. Consider first datatypes and then codatatypes.


882 


883 
Isabelle/ZF defines the standard notion of Cartesian product $A\times B$,


884 
containing ordered pairs $\pair{a,b}$. Now the $m$tuple


885 
$\pair{x_1,\ldots\,x_m}$ is the empty set~$\emptyset$ if $m=0$, simply


886 
$x_1$ if $m=1$, and $\pair{x_1,\pair{x_2,\ldots\,x_m}}$ if $m\geq2$.


887 
Isabelle/ZF also defines the disjoint sum $A+B$, containing injections


888 
$\Inl(a)\equiv\pair{0,a}$ and $\Inr(b)\equiv\pair{1,b}$.


889 


890 
A datatype constructor $\Con(x_1,\ldots\,x_m)$ is defined to be


891 
$h(\pair{x_1,\ldots\,x_m})$, where $h$ is composed of $\Inl$ and~$\Inr$.


892 
In a mutually recursive definition, all constructors for the set~$R_i$ have


893 
the outer form~$h_{i,n}$, where $h_{i,n}$ is the injection described


894 
in~\S\ref{mutualsec}. Further nested injections ensure that the


895 
constructors for~$R_i$ are pairwise distinct.


896 


897 
Isabelle/ZF defines the set $\univ(A)$, which contains~$A$ and


898 
furthermore contains $\pair{a,b}$, $\Inl(a)$ and $\Inr(b)$ for $a$,


899 
$b\in\univ(A)$. In a typical datatype definition with set parameters


900 
$A_1$, \ldots, $A_k$, a suitable domain for all the recursive sets is


901 
$\univ(A_1\un\cdots\un A_k)$. This solves the problem for


902 
datatypes~\cite[\S4.2]{paulsonsetII}.


903 


904 
The standard pairs and injections can only yield wellfounded


905 
constructions. This eases the (manual!) definition of recursive functions


906 
over datatypes. But they are unsuitable for codatatypes, which typically


907 
contain nonwellfounded objects.


908 


909 
To support codatatypes, Isabelle/ZF defines a variant notion of ordered


910 
pair, written~$\pair{a;b}$. It also defines the corresponding variant


911 
notion of Cartesian product $A\otimes B$, variant injections $\QInl(a)$


912 
and~$\QInr(b)$, and variant disjoint sum $A\oplus B$. Finally it defines


913 
the set $\quniv(A)$, which contains~$A$ and furthermore contains


914 
$\pair{a;b}$, $\QInl(a)$ and $\QInr(b)$ for $a$, $b\in\quniv(A)$. In a


915 
typical codatatype definition with set parameters $A_1$, \ldots, $A_k$, a


916 
suitable domain is $\quniv(A_1\un\cdots\un A_k)$. This approach is an


917 
alternative to adopting an AntiFoundation


918 
Axiom~\cite{aczel88}.\footnote{No reference is available. Variant pairs


919 
are defined by $\pair{a;b}\equiv a+b \equiv (\{0\}\times a) \un (\{1\}\times


920 
b)$, where $\times$ is the Cartesian product for standard ordered pairs. Now


921 
$\pair{a;b}$ is injective and monotonic in its two arguments.


922 
Nonwellfounded constructions, such as infinite lists, are constructed


923 
as least fixedpoints; the bounding set typically has the form


924 
$\univ(a_1\un\cdots\un a_k)$, where $a_1$, \ldots, $a_k$ are specified


925 
elements of the construction.}


926 


927 


928 
\subsection{The case analysis operator}


929 
The (co)datatype package automatically defines a case analysis operator,


930 
called {\tt$R$\_case}. A mutually recursive definition still has only


931 
one operator, called {\tt$R_1$\_\ldots\_$R_n$\_case}. The case operator is


932 
analogous to those for products and sums.


933 


934 
Datatype definitions employ standard products and sums, whose operators are


935 
$\split$ and $\case$ and satisfy the equations


936 
\begin{eqnarray*}


937 
\split(f,\pair{x,y}) & = & f(x,y) \\


938 
\case(f,g,\Inl(x)) & = & f(x) \\


939 
\case(f,g,\Inr(y)) & = & g(y)


940 
\end{eqnarray*}


941 
Suppose the datatype has $k$ constructors $\Con_1$, \ldots,~$\Con_k$. Then


942 
its case operator takes $k+1$ arguments and satisfies an equation for each


943 
constructor:


944 
\begin{eqnarray*}


945 
R\hbox{\_case}(f_1,\ldots,f_k, {\tt Con}_i(\vec{x})) & = & f_i(\vec{x}),


946 
\qquad i = 1, \ldots, k


947 
\end{eqnarray*}


948 
Note that if $f$ and $g$ have metatype $i\To i$ then so do $\split(f)$ and


949 
$\case(f,g)$. This works because $\split$ and $\case$ operate on their


950 
last argument. They are easily combined to make complex case analysis


951 
operators. Here are two examples:


952 
\begin{itemize}


953 
\item $\split(\lambda x.\split(f(x)))$ performs case analysis for


954 
$A\times (B\times C)$, as is easily verified:


955 
\begin{eqnarray*}


956 
\split(\lambda x.\split(f(x)), \pair{a,b,c})


957 
& = & (\lambda x.\split(f(x))(a,\pair{b,c}) \\


958 
& = & \split(f(a), \pair{b,c}) \\


959 
& = & f(a,b,c)


960 
\end{eqnarray*}


961 


962 
\item $\case(f,\case(g,h))$ performs case analysis for $A+(B+C)$; let us


963 
verify one of the three equations:


964 
\begin{eqnarray*}


965 
\case(f,\case(g,h), \Inr(\Inl(b)))


966 
& = & \case(g,h,\Inl(b)) \\


967 
& = & g(b)


968 
\end{eqnarray*}


969 
\end{itemize}


970 
Codatatype definitions are treated in precisely the same way. They express


971 
case operators using those for the variant products and sums, namely


972 
$\qsplit$ and~$\qcase$.


973 


974 


975 
\ifCADE The package has processed all the datatypes discussed in my earlier


976 
paper~\cite{paulsonsetII} and the codatatype of lazy lists. Space


977 
limitations preclude discussing these examples here, but they are


978 
distributed with Isabelle.


979 
\typeout{****Omitting datatype examples from CADE version!} \else


980 


981 
To see how constructors and the case analysis operator are defined, let us


982 
examine some examples. These include lists and trees/forests, which I have


983 
discussed extensively in another paper~\cite{paulsonsetII}.


984 


985 
\begin{figure}


986 
\begin{ttbox}


987 
structure List = Datatype_Fun


988 
(val thy = Univ.thy;


989 
val rec_specs =


990 
[("list", "univ(A)",


991 
[(["Nil"], "i"),


992 
(["Cons"], "[i,i]=>i")])];


993 
val rec_styp = "i=>i";


994 
val ext = None


995 
val sintrs =


996 
["Nil : list(A)",


997 
"[ a: A; l: list(A) ] ==> Cons(a,l) : list(A)"];


998 
val monos = [];


999 
val type_intrs = datatype_intrs


1000 
val type_elims = datatype_elims);


1001 
\end{ttbox}


1002 
\hrule


1003 
\caption{Defining the datatype of lists} \label{listfig}


1004 


1005 
\medskip


1006 
\begin{ttbox}


1007 
structure LList = Co_Datatype_Fun


1008 
(val thy = QUniv.thy;


1009 
val rec_specs =


1010 
[("llist", "quniv(A)",


1011 
[(["LNil"], "i"),


1012 
(["LCons"], "[i,i]=>i")])];


1013 
val rec_styp = "i=>i";


1014 
val ext = None


1015 
val sintrs =


1016 
["LNil : llist(A)",


1017 
"[ a: A; l: llist(A) ] ==> LCons(a,l) : llist(A)"];


1018 
val monos = [];


1019 
val type_intrs = co_datatype_intrs


1020 
val type_elims = co_datatype_elims);


1021 
\end{ttbox}


1022 
\hrule


1023 
\caption{Defining the codatatype of lazy lists} \label{llistfig}


1024 
\end{figure}


1025 


1026 
\subsection{Example: lists and lazy lists}


1027 
Figures \ref{listfig} and~\ref{llistfig} present the ML definitions of


1028 
lists and lazy lists, respectively. They highlight the (many) similarities


1029 
and (few) differences between datatype and codatatype definitions.


1030 


1031 
Each form of list has two constructors, one for the empty list and one for


1032 
adding an element to a list. Each takes a parameter, defining the set of


1033 
lists over a given set~$A$. Each uses the appropriate domain from a


1034 
Isabelle/ZF theory:


1035 
\begin{itemize}


1036 
\item $\lst(A)$ specifies domain $\univ(A)$ and parent theory {\tt Univ.thy}.


1037 


1038 
\item $\llist(A)$ specifies domain $\quniv(A)$ and parent theory {\tt


1039 
QUniv.thy}.


1040 
\end{itemize}


1041 


1042 
Since $\lst(A)$ is a datatype, it enjoys a structural rule, {\tt List.induct}:


1043 
\[ \infer{P(x)}{x\in\lst(A) & P(\Nil)


1044 
& \infer*{P(\Cons(a,l))}{[a\in A & l\in\lst(A) & P(l)]_{a,l}} }


1045 
\]


1046 
Induction and freeness yield the law $l\not=\Cons(a,l)$. To strengthen this,


1047 
Isabelle/ZF defines the rank of a set and proves that the standard pairs and


1048 
injections have greater rank than their components. An immediate consequence,


1049 
which justifies structural recursion on lists \cite[\S4.3]{paulsonsetII},


1050 
is


1051 
\[ \rank(l) < \rank(\Cons(a,l)). \]


1052 


1053 
Since $\llist(A)$ is a codatatype, it has no induction rule. Instead it has


1054 
the coinduction rule shown in \S\ref{coindsec}. Since variant pairs and


1055 
injections are monotonic and need not have greater rank than their


1056 
components, fixedpoint operators can create cyclic constructions. For


1057 
example, the definition


1058 
\begin{eqnarray*}


1059 
\lconst(a) & \equiv & \lfp(\univ(a), \lambda l. \LCons(a,l))


1060 
\end{eqnarray*}


1061 
yields $\lconst(a) = \LCons(a,\lconst(a))$.


1062 


1063 
\medskip


1064 
It may be instructive to examine the definitions of the constructors and


1065 
case operator for $\lst(A)$. The definitions for $\llist(A)$ are similar.


1066 
The list constructors are defined as follows:


1067 
\begin{eqnarray*}


1068 
\Nil & = & \Inl(\emptyset) \\


1069 
\Cons(a,l) & = & \Inr(\pair{a,l})


1070 
\end{eqnarray*}


1071 
The operator $\lstcase$ performs case analysis on these two alternatives:


1072 
\begin{eqnarray*}


1073 
\lstcase(c,h) & \equiv & \case(\lambda u.c, \split(h))


1074 
\end{eqnarray*}


1075 
Let us verify the two equations:


1076 
\begin{eqnarray*}


1077 
\lstcase(c, h, \Nil) & = &


1078 
\case(\lambda u.c, \split(h), \Inl(\emptyset)) \\


1079 
& = & (\lambda u.c)(\emptyset) \\


1080 
& = & c.\\[1ex]


1081 
\lstcase(c, h, \Cons(x,y)) & = &


1082 
\case(\lambda u.c, \split(h), \Inr(\pair{x,y})) \\


1083 
& = & \split(h, \pair{x,y}) \\


1084 
& = & h(x,y).


1085 
\end{eqnarray*}


1086 


1087 
\begin{figure}


1088 
\begin{small}


1089 
\begin{verbatim}


1090 
structure TF = Datatype_Fun


1091 
(val thy = Univ.thy;


1092 
val rec_specs =


1093 
[("tree", "univ(A)",


1094 
[(["Tcons"], "[i,i]=>i")]),


1095 
("forest", "univ(A)",


1096 
[(["Fnil"], "i"),


1097 
(["Fcons"], "[i,i]=>i")])];


1098 
val rec_styp = "i=>i";


1099 
val ext = None


1100 
val sintrs =


1101 
["[ a:A; f: forest(A) ] ==> Tcons(a,f) : tree(A)",


1102 
"Fnil : forest(A)",


1103 
"[ t: tree(A); f: forest(A) ] ==> Fcons(t,f) : forest(A)"];


1104 
val monos = [];


1105 
val type_intrs = datatype_intrs


1106 
val type_elims = datatype_elims);


1107 
\end{verbatim}


1108 
\end{small}


1109 
\hrule


1110 
\caption{Defining the datatype of trees and forests} \label{tffig}


1111 
\end{figure}


1112 


1113 


1114 
\subsection{Example: mutual recursion}


1115 
In the mutually recursive trees/forests~\cite[\S4.5]{paulsonsetII}, trees


1116 
have the one constructor $\Tcons$, while forests have the two constructors


1117 
$\Fnil$ and~$\Fcons$. Figure~\ref{tffig} presents the ML


1118 
definition. It has much in common with that of $\lst(A)$, including its


1119 
use of $\univ(A)$ for the domain and {\tt Univ.thy} for the parent theory.


1120 
The three introduction rules define the mutual recursion. The


1121 
distinguishing feature of this example is its two induction rules.


1122 


1123 
The basic induction rule is called {\tt TF.induct}:


1124 
\[ \infer{P(x)}{x\in\TF(A) &


1125 
\infer*{P(\Tcons(a,f))}


1126 
{\left[\begin{array}{l} a\in A \\


1127 
f\in\forest(A) \\ P(f)


1128 
\end{array}


1129 
\right]_{a,f}}


1130 
& P(\Fnil)


1131 
& \infer*{P(\Fcons(a,l))}


1132 
{\left[\begin{array}{l} t\in\tree(A) \\ P(t) \\


1133 
f\in\forest(A) \\ P(f)


1134 
\end{array}


1135 
\right]_{t,f}} }


1136 
\]


1137 
This rule establishes a single predicate for $\TF(A)$, the union of the


1138 
recursive sets.


1139 


1140 
Although such reasoning is sometimes useful


1141 
\cite[\S4.5]{paulsonsetII}, a proper mutual induction rule should establish


1142 
separate predicates for $\tree(A)$ and $\forest(A)$. The package calls this


1143 
rule {\tt TF.mutual\_induct}. Observe the usage of $P$ and $Q$ in the


1144 
induction hypotheses:


1145 
\[ \infer{(\forall z. z\in\tree(A)\imp P(z)) \conj


1146 
(\forall z. z\in\forest(A)\imp Q(z))}


1147 
{\infer*{P(\Tcons(a,f))}


1148 
{\left[\begin{array}{l} a\in A \\


1149 
f\in\forest(A) \\ Q(f)


1150 
\end{array}


1151 
\right]_{a,f}}


1152 
& Q(\Fnil)


1153 
& \infer*{Q(\Fcons(a,l))}


1154 
{\left[\begin{array}{l} t\in\tree(A) \\ P(t) \\


1155 
f\in\forest(A) \\ Q(f)


1156 
\end{array}


1157 
\right]_{t,f}} }


1158 
\]


1159 
As mentioned above, the package does not define a structural recursion


1160 
operator. I have described elsewhere how this is done


1161 
\cite[\S4.5]{paulsonsetII}.


1162 


1163 
Both forest constructors have the form $\Inr(\cdots)$,


1164 
while the tree constructor has the form $\Inl(\cdots)$. This pattern would


1165 
hold regardless of how many tree or forest constructors there were.


1166 
\begin{eqnarray*}


1167 
\Tcons(a,l) & = & \Inl(\pair{a,l}) \\


1168 
\Fnil & = & \Inr(\Inl(\emptyset)) \\


1169 
\Fcons(a,l) & = & \Inr(\Inr(\pair{a,l}))


1170 
\end{eqnarray*}


1171 
There is only one case operator; it works on the union of the trees and


1172 
forests:


1173 
\begin{eqnarray*}


1174 
{\tt tree\_forest\_case}(f,c,g) & \equiv &


1175 
\case(\split(f),\, \case(\lambda u.c, \split(g)))


1176 
\end{eqnarray*}


1177 


1178 
\begin{figure}


1179 
\begin{small}


1180 
\begin{verbatim}


1181 
structure Data = Datatype_Fun


1182 
(val thy = Univ.thy;


1183 
val rec_specs =


1184 
[("data", "univ(A Un B)",


1185 
[(["Con0"], "i"),


1186 
(["Con1"], "i=>i"),


1187 
(["Con2"], "[i,i]=>i"),


1188 
(["Con3"], "[i,i,i]=>i")])];


1189 
val rec_styp = "[i,i]=>i";


1190 
val ext = None


1191 
val sintrs =


1192 
["Con0 : data(A,B)",


1193 
"[ a: A ] ==> Con1(a) : data(A,B)",


1194 
"[ a: A; b: B ] ==> Con2(a,b) : data(A,B)",


1195 
"[ a: A; b: B; d: data(A,B) ] ==> Con3(a,b,d) : data(A,B)"];


1196 
val monos = [];


1197 
val type_intrs = datatype_intrs


1198 
val type_elims = datatype_elims);


1199 
\end{verbatim}


1200 
\end{small}


1201 
\hrule


1202 
\caption{Defining the fourconstructor sample datatype} \label{datafig}


1203 
\end{figure}


1204 


1205 
\subsection{A fourconstructor datatype}


1206 
Finally let us consider a fairly general datatype. It has four


1207 
constructors $\Con_0$, $\Con_1$\ $\Con_2$ and $\Con_3$, with the


1208 
corresponding arities. Figure~\ref{datafig} presents the ML definition.


1209 
Because this datatype has two set parameters, $A$ and~$B$, it specifies


1210 
$\univ(A\un B)$ as its domain. The structural induction rule has four


1211 
minor premises, one per constructor:


1212 
\[ \infer{P(x)}{x\in\data(A,B) &


1213 
P(\Con_0) &


1214 
\infer*{P(\Con_1(a))}{[a\in A]_a} &


1215 
\infer*{P(\Con_2(a,b))}


1216 
{\left[\begin{array}{l} a\in A \\ b\in B \end{array}


1217 
\right]_{a,b}} &


1218 
\infer*{P(\Con_3(a,b,d))}


1219 
{\left[\begin{array}{l} a\in A \\ b\in B \\


1220 
d\in\data(A,B) \\ P(d)


1221 
\end{array}


1222 
\right]_{a,b,d}} }


1223 
\]


1224 


1225 
The constructor definitions are


1226 
\begin{eqnarray*}


1227 
\Con_0 & = & \Inl(\Inl(\emptyset)) \\


1228 
\Con_1(a) & = & \Inl(\Inr(a)) \\


1229 
\Con_2(a,b) & = & \Inr(\Inl(\pair{a,b})) \\


1230 
\Con_3(a,b,c) & = & \Inr(\Inr(\pair{a,b,c})).


1231 
\end{eqnarray*}


1232 
The case operator is


1233 
\begin{eqnarray*}


1234 
{\tt data\_case}(f_0,f_1,f_2,f_3) & \equiv &


1235 
\case(\begin{array}[t]{@{}l}


1236 
\case(\lambda u.f_0,\; f_1),\, \\


1237 
\case(\split(f_2),\; \split(\lambda v.\split(f_3(v)))) )


1238 
\end{array}


1239 
\end{eqnarray*}


1240 
This may look cryptic, but the case equations are trivial to verify.


1241 


1242 
In the constructor definitions, the injections are balanced. A more naive


1243 
approach is to define $\Con_3(a,b,c)$ as


1244 
$\Inr(\Inr(\Inr(\pair{a,b,c})))$; instead, each constructor has two


1245 
injections. The difference here is small. But the ZF examples include a


1246 
60element enumeration type, where each constructor has 5 or~6 injections.


1247 
The naive approach would require 1 to~59 injections; the definitions would be


1248 
quadratic in size. It is like the difference between the binary and unary


1249 
numeral systems.


1250 


1251 
The package returns the constructor and case operator definitions as the


1252 
theorem list \verbcon_defs. The head of this list defines the case


1253 
operator and the tail defines the constructors.


1254 


1255 
The package returns the case equations, such as


1256 
\begin{eqnarray*}


1257 
{\tt data\_case}(f_0,f_1,f_2,f_3,\Con_3(a,b,c)) & = & f_3(a,b,c),


1258 
\end{eqnarray*}


1259 
as the theorem list \verbcase_eqns. There is one equation per constructor.


1260 


1261 
\subsection{Proving freeness theorems}


1262 
There are two kinds of freeness theorems:


1263 
\begin{itemize}


1264 
\item {\bf injectiveness} theorems, such as


1265 
\[ \Con_2(a,b) = \Con_2(a',b') \bimp a=a' \conj b=b' \]


1266 


1267 
\item {\bf distinctness} theorems, such as


1268 
\[ \Con_1(a) \not= \Con_2(a',b') \]


1269 
\end{itemize}


1270 
Since the number of such theorems is quadratic in the number of constructors,


1271 
the package does not attempt to prove them all. Instead it returns tools for


1272 
proving desired theorems  either explicitly or `on the fly' during


1273 
simplification or classical reasoning.


1274 


1275 
The theorem list \verbfree_iffs enables the simplifier to perform freeness


1276 
reasoning. This works by incremental unfolding of constructors that appear in


1277 
equations. The theorem list contains logical equivalences such as


1278 
\begin{eqnarray*}


1279 
\Con_0=c & \bimp & c=\Inl(\Inl(\emptyset)) \\


1280 
\Con_1(a)=c & \bimp & c=\Inl(\Inr(a)) \\


1281 
& \vdots & \\


1282 
\Inl(a)=\Inl(b) & \bimp & a=b \\


1283 
\Inl(a)=\Inr(b) & \bimp & \bot \\


1284 
\pair{a,b} = \pair{a',b'} & \bimp & a=a' \conj b=b'


1285 
\end{eqnarray*}


1286 
For example, these rewrite $\Con_1(a)=\Con_1(b)$ to $a=b$ in four steps.


1287 


1288 
The theorem list \verbfree_SEs enables the classical


1289 
reasoner to perform similar replacements. It consists of elimination rules


1290 
to replace $\Con_0=c$ by $c=\Inl(\Inl(\emptyset))$, and so forth, in the


1291 
assumptions.


1292 


1293 
Such incremental unfolding combines freeness reasoning with other proof


1294 
steps. It has the unfortunate sideeffect of unfolding definitions of


1295 
constructors in contexts such as $\exists x.\Con_1(a)=x$, where they should


1296 
be left alone. Calling the Isabelle tactic {\tt fold\_tac con\_defs}


1297 
restores the defined constants.


1298 
\fi %CADE


1299 


1300 
\section{Conclusions and future work}


1301 
The fixedpoint approach makes it easy to implement a uniquely powerful


1302 
package for inductive and coinductive definitions. It is efficient: it


1303 
processes most definitions in seconds and even a 60constructor datatype


1304 
requires only two minutes. It is also simple: the package consists of


1305 
under 1100 lines (35K bytes) of Standard ML code. The first working


1306 
version took under a week to code.


1307 


1308 
The approach is not restricted to set theory. It should be suitable for


1309 
any logic that has some notion of set and the KnasterTarski Theorem.


1310 
Indeed, Melham's inductive definition package for the HOL


1311 
system~\cite{camilleri92} implicitly proves this theorem.


1312 


1313 
Datatype and codatatype definitions furthermore require a particular set


1314 
closed under a suitable notion of ordered pair. I intend to use the


1315 
Isabelle/ZF package as the basis for a higherorder logic one, using


1316 
Isabelle/HOL\@. The necessary theory is already


1317 
mechanizeds~\cite{paulsoncoind}. HOL represents sets by unary predicates;


1318 
defining the corresponding types may cause complication.


1319 


1320 


1321 
\bibliographystyle{plain}


1322 
\bibliography{atp,theory,funprog,isabelle}


1323 
%%%%%\doendnotes


1324 


1325 
\ifCADE\typeout{****Omitting appendices from CADE version!}


1326 
\else


1327 
\newpage


1328 
\appendix


1329 
\section{Inductive and coinductive definitions: users guide}


1330 
The ML functors \verbInductive_Fun and \verbCo_Inductive_Fun build


1331 
inductive and coinductive definitions, respectively. This section describes


1332 
how to invoke them.


1333 


1334 
\subsection{The result structure}


1335 
Many of the result structure's components have been discussed


1336 
in~\S\ref{basicsec}; others are selfexplanatory.


1337 
\begin{description}


1338 
\item[\tt thy] is the new theory containing the recursive sets.


1339 


1340 
\item[\tt defs] is the list of definitions of the recursive sets.


1341 


1342 
\item[\tt bnd\_mono] is a monotonicity theorem for the fixedpoint operator.


1343 


1344 
\item[\tt unfold] is a fixedpoint equation for the recursive set (the union of


1345 
the recursive sets, in the case of mutual recursion).


1346 


1347 
\item[\tt dom\_subset] is a theorem stating inclusion in the domain.


1348 


1349 
\item[\tt intrs] is the list of introduction rules, now proved as theorems, for


1350 
the recursive sets.


1351 


1352 
\item[\tt elim] is the elimination rule.


1353 


1354 
\item[\tt mk\_cases] is a function to create simplified instances of {\tt


1355 
elim}, using freeness reasoning on some underlying datatype.


1356 
\end{description}


1357 


1358 
For an inductive definition, the result structure contains two induction rules,


1359 
{\tt induct} and \verbmutual_induct. For a coinductive definition, it


1360 
contains the rule \verbco_induct.


1361 


1362 
\begin{figure}


1363 
\begin{ttbox}


1364 
sig


1365 
val thy : theory


1366 
val defs : thm list


1367 
val bnd_mono : thm


1368 
val unfold : thm


1369 
val dom_subset : thm


1370 
val intrs : thm list


1371 
val elim : thm


1372 
val mk_cases : thm list > string > thm


1373 
{\it(Inductive definitions only)}


1374 
val induct : thm


1375 
val mutual_induct: thm


1376 
{\it(Coinductive definitions only)}


1377 
val co_induct : thm


1378 
end


1379 
\end{ttbox}


1380 
\hrule


1381 
\caption{The result of a (co)inductive definition} \label{defresultfig}


1382 
\end{figure}


1383 


1384 
Figure~\ref{defresultfig} summarizes the two result signatures,


1385 
specifying the types of all these components.


1386 


1387 
\begin{figure}


1388 
\begin{ttbox}


1389 
sig


1390 
val thy : theory


1391 
val rec_doms : (string*string) list


1392 
val sintrs : string list


1393 
val monos : thm list


1394 
val con_defs : thm list


1395 
val type_intrs : thm list


1396 
val type_elims : thm list


1397 
end


1398 
\end{ttbox}


1399 
\hrule


1400 
\caption{The argument of a (co)inductive definition} \label{defargfig}


1401 
\end{figure}


1402 


1403 
\subsection{The argument structure}


1404 
Both \verbInductive_Fun and \verbCo_Inductive_Fun take the same argument


1405 
structure (Figure~\ref{defargfig}). Its components are as follows:


1406 
\begin{description}


1407 
\item[\tt thy] is the definition's parent theory, which {\it must\/}


1408 
declare constants for the recursive sets.


1409 


1410 
\item[\tt rec\_doms] is a list of pairs, associating the name of each recursive


1411 
set with its domain.


1412 


1413 
\item[\tt sintrs] specifies the desired introduction rules as strings.


1414 


1415 
\item[\tt monos] consists of monotonicity theorems for each operator applied


1416 
to a recursive set in the introduction rules.


1417 


1418 
\item[\tt con\_defs] contains definitions of constants appearing in the


1419 
introduction rules. The (co)datatype package supplies the constructors'


1420 
definitions here. Most direct calls of \verbInductive_Fun or


1421 
\verbCo_Inductive_Fun pass the empty list; one exception is the primitive


1422 
recursive functions example (\S\ref{primrecsec}).


1423 


1424 
\item[\tt type\_intrs] consists of introduction rules for typechecking the


1425 
definition, as discussed in~\S\ref{basicsec}. They are applied using


1426 
depthfirst search; you can trace the proof by setting


1427 
\verbtrace_DEPTH_FIRST := true.


1428 


1429 
\item[\tt type\_elims] consists of elimination rules for typechecking the


1430 
definition. They are presumed to be `safe' and are applied as much as


1431 
possible, prior to the {\tt type\_intrs} search.


1432 
\end{description}


1433 
The package has a few notable restrictions:


1434 
\begin{itemize}


1435 
\item The parent theory, {\tt thy}, must declare the recursive sets as


1436 
constants. You can extend a theory with new constants using {\tt


1437 
addconsts}, as illustrated in~\S\ref{indegsec}. If the inductive


1438 
definition also requires new concrete syntax, then it is simpler to


1439 
express the parent theory using a theory file. It is often convenient to


1440 
define an infix syntax for relations, say $a\prec b$ for $\pair{a,b}\in


1441 
R$.


1442 


1443 
\item The names of the recursive sets must be identifiers, not infix


1444 
operators.


1445 


1446 
\item Sideconditions must not be conjunctions. However, an introduction rule


1447 
may contain any number of sideconditions.


1448 
\end{itemize}


1449 


1450 


1451 
\section{Datatype and codatatype definitions: users guide}


1452 
The ML functors \verbDatatype_Fun and \verbCo_Datatype_Fun define datatypes


1453 
and codatatypes, invoking \verbDatatype_Fun and


1454 
\verbCo_Datatype_Fun to make the underlying (co)inductive definitions.


1455 


1456 


1457 
\subsection{The result structure}


1458 
The result structure extends that of (co)inductive definitions


1459 
(Figure~\ref{defresultfig}) with several additional items:


1460 
\begin{ttbox}


1461 
val con_thy : theory


1462 
val con_defs : thm list


1463 
val case_eqns : thm list


1464 
val free_iffs : thm list


1465 
val free_SEs : thm list


1466 
val mk_free : string > thm


1467 
\end{ttbox}


1468 
Most of these have been discussed in~\S\ref{datasec}. Here is a summary:


1469 
\begin{description}


1470 
\item[\tt con\_thy] is a new theory containing definitions of the


1471 
(co)datatype's constructors and case operator. It also declares the


1472 
recursive sets as constants, so that it may serve as the parent


1473 
theory for the (co)inductive definition.


1474 


1475 
\item[\tt con\_defs] is a list of definitions: the case operator followed by


1476 
the constructors. This theorem list can be supplied to \verbmk_cases, for


1477 
example.


1478 


1479 
\item[\tt case\_eqns] is a list of equations, stating that the case operator


1480 
inverts each constructor.


1481 


1482 
\item[\tt free\_iffs] is a list of logical equivalences to perform freeness


1483 
reasoning by rewriting. A typical application has the form


1484 
\begin{ttbox}


1485 
by (asm_simp_tac (ZF_ss addsimps free_iffs) 1);


1486 
\end{ttbox}


1487 


1488 
\item[\tt free\_SEs] is a list of `safe' elimination rules to perform freeness


1489 
reasoning. It can be supplied to \verberesolve_tac or to the classical


1490 
reasoner:


1491 
\begin{ttbox}


1492 
by (fast_tac (ZF_cs addSEs free_SEs) 1);


1493 
\end{ttbox}


1494 


1495 
\item[\tt mk\_free] is a function to prove freeness properties, specified as


1496 
strings. The theorems can be expressed in various forms, such as logical


1497 
equivalences or elimination rules.


1498 
\end{description}


1499 


1500 
The result structure also inherits everything from the underlying


1501 
(co)inductive definition, such as the introduction rules, elimination rule,


1502 
and induction/coinduction rule.


1503 


1504 


1505 
\begin{figure}


1506 
\begin{ttbox}


1507 
sig


1508 
val thy : theory


1509 
val rec_specs : (string * string * (string list*string)list) list


1510 
val rec_styp : string


1511 
val ext : Syntax.sext option


1512 
val sintrs : string list


1513 
val monos : thm list


1514 
val type_intrs: thm list


1515 
val type_elims: thm list


1516 
end


1517 
\end{ttbox}


1518 
\hrule


1519 
\caption{The argument of a (co)datatype definition} \label{dataargfig}


1520 
\end{figure}


1521 


1522 
\subsection{The argument structure}


1523 
Both (co)datatype functors take the same argument structure


1524 
(Figure~\ref{dataargfig}). It does not extend that for (co)inductive


1525 
definitions, but shares several components and passes them uninterpreted to


1526 
\verbDatatype_Fun or


1527 
\verbCo_Datatype_Fun. The new components are as follows:


1528 
\begin{description}


1529 
\item[\tt thy] is the (co)datatype's parent theory. It {\it must not\/}


1530 
declare constants for the recursive sets. Recall that (co)inductive


1531 
definitions have the opposite restriction.


1532 


1533 
\item[\tt rec\_specs] is a list of triples of the form ({\it recursive set\/},


1534 
{\it domain\/}, {\it constructors\/}) for each mutually recursive set. {\it


1535 
Constructors\/} is a list of the form (names, type). See the discussion and


1536 
examples in~\S\ref{datasec}.


1537 


1538 
\item[\tt rec\_styp] is the common metatype of the mutually recursive sets,


1539 
specified as a string. They must all have the same type because all must


1540 
take the same parameters.


1541 


1542 
\item[\tt ext] is an optional syntax extension, usually omitted by writing


1543 
{\tt None}. You can supply mixfix syntax for the constructors by supplying


1544 
\begin{ttbox}


1545 
Some (Syntax.simple_sext [{\it mixfix declarations\/}])


1546 
\end{ttbox}


1547 
\end{description}


1548 
The choice of domain is usually simple. Isabelle/ZF defines the set


1549 
$\univ(A)$, which contains~$A$ and is closed under the standard Cartesian


1550 
products and disjoint sums \cite[\S4.2]{paulsonsetII}. In a typical


1551 
datatype definition with set parameters $A_1$, \ldots, $A_k$, a suitable


1552 
domain for all the recursive sets is $\univ(A_1\un\cdots\un A_k)$. For a


1553 
codatatype definition, the set


1554 
$\quniv(A)$ contains~$A$ and is closed under the variant Cartesian products


1555 
and disjoint sums; the appropropriate domain is


1556 
$\quniv(A_1\un\cdots\un A_k)$.


1557 


1558 
The {\tt sintrs} specify the introduction rules, which govern the recursive


1559 
structure of the datatype. Introduction rules may involve monotone operators


1560 
and sideconditions to express things that go beyond the usual notion of


1561 
datatype. The theorem lists {\tt monos}, {\tt type\_intrs} and {\tt


1562 
type\_elims} should contain precisely what is needed for the underlying


1563 
(co)inductive definition. Isabelle/ZF defines theorem lists that can be


1564 
defined for the latter two components:


1565 
\begin{itemize}


1566 
\item {\tt datatype\_intrs} and {\tt datatype\_elims} are typechecking rules


1567 
for $\univ(A)$.


1568 
\item {\tt co\_datatype\_intrs} and {\tt co\_datatype\_elims} are typechecking


1569 
rules for $\quniv(A)$.


1570 
\end{itemize}


1571 
In typical definitions, these theorem lists need not be supplemented with


1572 
other theorems.


1573 


1574 
The constructor definitions' righthand sides can overlap. A


1575 
simple example is the datatype for the combinators, whose constructors are


1576 
\begin{eqnarray*}


1577 
{\tt K} & \equiv & \Inl(\emptyset) \\


1578 
{\tt S} & \equiv & \Inr(\Inl(\emptyset)) \\


1579 
p{\tt\#}q & \equiv & \Inr(\Inl(\pair{p,q}))


1580 
\end{eqnarray*}


1581 
Unlike in previous versions of Isabelle, \verbfold_tac now ensures that the


1582 
longest righthand sides are folded first.


1583 


1584 
\fi


1585 
\end{document}
