author  lcp 
Wed, 01 Dec 1993 13:00:04 +0100  
changeset 179  ceb948cefb93 
parent 130  c035b6b9eafc 
child 181  9c2db771f224 
permissions  rwrr 
130  1 
\documentstyle[12pt,a4,proof,lcp,alltt,amssymbols]{article} 
179  2 
\hyphenation{datatype} 
130  3 
%CADE version should use 11pt and the Springer style 
103  4 
\newif\ifCADE 
5 
\CADEfalse 

6 

130  7 
\title{A Fixedpoint Approach to Implementing (Co)Inductive 
8 
Definitions\thanks{Jim Grundy and Simon Thompson made detailed comments on 

9 
a draft. Research funded by the SERC (grants GR/G53279, 

103  10 
GR/H40570) and by the ESPRIT Basic Research Action 6453 `Types'.}} 
11 

12 
\author{{\em Lawrence C. Paulson}\\ 

13 
Computer Laboratory, University of Cambridge} 

14 
\date{\today} 

15 
\setcounter{secnumdepth}{2} \setcounter{tocdepth}{2} 

16 

17 
\def\picture #1 by #2 (#3){% pictures: width by height (name) 

18 
\message{Picture #3} 

19 
\vbox to #2{\hrule width #1 height 0pt depth 0pt 

20 
\vfill \special{picture #3}}} 

21 

22 

23 
\newcommand\sbs{\subseteq} 

24 
\newcommand\List[1]{\lbrakk#1\rbrakk} 

25 
\let\To=\Rightarrow 

26 
\newcommand\Var[1]{{?\!#1}} 

27 

28 

29 
%%%\newcommand\Pow{{\tt Pow}} 

30 
\let\pow=\wp 

31 
\newcommand\RepFun{{\tt RepFun}} 

32 
\newcommand\pair[1]{\langle#1\rangle} 

33 
\newcommand\cons{{\tt cons}} 

34 
\def\succ{{\tt succ}} 

35 
\newcommand\split{{\tt split}} 

36 
\newcommand\fst{{\tt fst}} 

37 
\newcommand\snd{{\tt snd}} 

38 
\newcommand\converse{{\tt converse}} 

39 
\newcommand\domain{{\tt domain}} 

40 
\newcommand\range{{\tt range}} 

41 
\newcommand\field{{\tt field}} 

42 
\newcommand\bndmono{\hbox{\tt bnd\_mono}} 

43 
\newcommand\lfp{{\tt lfp}} 

44 
\newcommand\gfp{{\tt gfp}} 

45 
\newcommand\id{{\tt id}} 

46 
\newcommand\trans{{\tt trans}} 

47 
\newcommand\wf{{\tt wf}} 

48 
\newcommand\wfrec{\hbox{\tt wfrec}} 

49 
\newcommand\nat{{\tt nat}} 

50 
\newcommand\natcase{\hbox{\tt nat\_case}} 

51 
\newcommand\transrec{{\tt transrec}} 

52 
\newcommand\rank{{\tt rank}} 

53 
\newcommand\univ{{\tt univ}} 

54 
\newcommand\Vrec{{\tt Vrec}} 

55 
\newcommand\Inl{{\tt Inl}} 

56 
\newcommand\Inr{{\tt Inr}} 

57 
\newcommand\case{{\tt case}} 

58 
\newcommand\lst{{\tt list}} 

59 
\newcommand\Nil{{\tt Nil}} 

60 
\newcommand\Cons{{\tt Cons}} 

61 
\newcommand\lstcase{\hbox{\tt list\_case}} 

62 
\newcommand\lstrec{\hbox{\tt list\_rec}} 

63 
\newcommand\length{{\tt length}} 

64 
\newcommand\listn{{\tt listn}} 

65 
\newcommand\acc{{\tt acc}} 

66 
\newcommand\primrec{{\tt primrec}} 

67 
\newcommand\SC{{\tt SC}} 

68 
\newcommand\CONST{{\tt CONST}} 

69 
\newcommand\PROJ{{\tt PROJ}} 

70 
\newcommand\COMP{{\tt COMP}} 

71 
\newcommand\PREC{{\tt PREC}} 

72 

73 
\newcommand\quniv{{\tt quniv}} 

74 
\newcommand\llist{{\tt llist}} 

75 
\newcommand\LNil{{\tt LNil}} 

76 
\newcommand\LCons{{\tt LCons}} 

77 
\newcommand\lconst{{\tt lconst}} 

78 
\newcommand\lleq{{\tt lleq}} 

79 
\newcommand\map{{\tt map}} 

80 
\newcommand\term{{\tt term}} 

81 
\newcommand\Apply{{\tt Apply}} 

82 
\newcommand\termcase{{\tt term\_case}} 

83 
\newcommand\rev{{\tt rev}} 

84 
\newcommand\reflect{{\tt reflect}} 

85 
\newcommand\tree{{\tt tree}} 

86 
\newcommand\forest{{\tt forest}} 

87 
\newcommand\Part{{\tt Part}} 

88 
\newcommand\TF{{\tt tree\_forest}} 

89 
\newcommand\Tcons{{\tt Tcons}} 

90 
\newcommand\Fcons{{\tt Fcons}} 

91 
\newcommand\Fnil{{\tt Fnil}} 

92 
\newcommand\TFcase{\hbox{\tt TF\_case}} 

93 
\newcommand\Fin{{\tt Fin}} 

94 
\newcommand\QInl{{\tt QInl}} 

95 
\newcommand\QInr{{\tt QInr}} 

96 
\newcommand\qsplit{{\tt qsplit}} 

97 
\newcommand\qcase{{\tt qcase}} 

98 
\newcommand\Con{{\tt Con}} 

99 
\newcommand\data{{\tt data}} 

100 

101 
\binperiod %%%treat . like a binary operator 

102 

103 
\begin{document} 

104 
\pagestyle{empty} 

105 
\begin{titlepage} 

106 
\maketitle 

107 
\begin{abstract} 

108 
Several theorem provers provide commands for formalizing recursive 

179  109 
data\types or inductively defined sets. This paper presents a new 
110 
approach, based on fixedpoint definitions. It is unusually general: it 

111 
admits all provably monotone inductive definitions. It is conceptually 

112 
simple, which has allowed the easy implementation of mutual recursion and 

113 
other conveniences. It also handles coinductive definitions: simply 

114 
replace the least fixedpoint by a greatest fixedpoint. This represents 

115 
the first automated support for coinductive definitions. 

130  116 

117 
The method has been implemented in Isabelle's formalization of ZF set 

179  118 
theory. It should be applicable to any logic in which the KnasterTarski 
119 
Theorem can be proved. The paper briefly describes a method of 

120 
formalizing nonwellfounded data structures in standard ZF set theory. 

103  121 

122 
Examples include lists of $n$ elements, the accessible part of a relation 

123 
and the set of primitive recursive functions. One example of a 

130  124 
coinductive definition is bisimulations for lazy lists. \ifCADE\else 
103  125 
Recursive datatypes are examined in detail, as well as one example of a 
130  126 
{\bf codatatype}: lazy lists. The appendices are simple user's manuals 
103  127 
for this Isabelle/ZF package.\fi 
128 
\end{abstract} 

129 
% 

130 
\begin{center} Copyright \copyright{} \number\year{} by Lawrence C. Paulson 

131 
\end{center} 

132 
\thispagestyle{empty} 

133 
\end{titlepage} 

134 

135 
\tableofcontents 

136 
\cleardoublepage 

130  137 
\pagenumbering{arabic}\pagestyle{headings} 
103  138 

139 
\section{Introduction} 

140 
Several theorem provers provide commands for formalizing recursive data 

141 
structures, like lists and trees. Examples include Boyer and Moore's shell 

142 
principle~\cite{bm79} and Melham's recursive type package for the HOL 

143 
system~\cite{melham89}. Such data structures are called {\bf datatypes} 

144 
below, by analogy with {\tt datatype} definitions in Standard~ML\@. 

145 

130  146 
A datatype is but one example of an {\bf inductive definition}. This 
103  147 
specifies the least set closed under given rules~\cite{aczel77}. The 
148 
collection of theorems in a logic is inductively defined. A structural 

149 
operational semantics~\cite{hennessy90} is an inductive definition of a 

150 
reduction or evaluation relation on programs. A few theorem provers 

151 
provide commands for formalizing inductive definitions; these include 

152 
Coq~\cite{paulin92} and again the HOL system~\cite{camilleri92}. 

153 

130  154 
The dual notion is that of a {\bf coinductive definition}. This specifies 
103  155 
the greatest set closed under given rules. Important examples include 
156 
using bisimulation relations to formalize equivalence of 

157 
processes~\cite{milner89} or lazy functional programs~\cite{abramsky90}. 

158 
Other examples include lazy lists and other infinite data structures; these 

130  159 
are called {\bf codatatypes} below. 
103  160 

161 
Most existing implementations of datatype and inductive definitions accept 

130  162 
an artificially narrow class of inputs, and are not easily extended. The 
103  163 
shell principle and Coq's inductive definition rules are built into the 
164 
underlying logic. Melham's packages derive datatypes and inductive 

165 
definitions from specialized constructions in higherorder logic. 

166 

167 
This paper describes a package based on a fixedpoint approach. Least 

168 
fixedpoints yield inductive definitions; greatest fixedpoints yield 

130  169 
coinductive definitions. The package is uniquely powerful: 
103  170 
\begin{itemize} 
171 
\item It accepts the largest natural class of inductive definitions, namely 

130  172 
all that are provably monotone. 
103  173 
\item It accepts a wide class of datatype definitions. 
130  174 
\item It handles coinductive and codatatype definitions. Most of 
175 
the discussion below applies equally to inductive and coinductive 

103  176 
definitions, and most of the code is shared. To my knowledge, this is 
130  177 
the only package supporting coinductive definitions. 
103  178 
\item Definitions may be mutually recursive. 
179 
\end{itemize} 

180 
The package is implemented in Isabelle~\cite{isabelleintro}, using ZF set 

181 
theory \cite{paulsonsetI,paulsonsetII}. However, the fixedpoint 

182 
approach is independent of Isabelle. The recursion equations are specified 

183 
as introduction rules for the mutually recursive sets. The package 

184 
transforms these rules into a mapping over sets, and attempts to prove that 

185 
the mapping is monotonic and welltyped. If successful, the package 

186 
makes fixedpoint definitions and proves the introduction, elimination and 

130  187 
(co)induction rules. The package consists of several Standard ML 
103  188 
functors~\cite{paulson91}; it accepts its argument and returns its result 
189 
as ML structures. 

190 

191 
Most datatype packages equip the new datatype with some means of expressing 

192 
recursive functions. This is the main thing lacking from my package. Its 

193 
fixedpoint operators define recursive sets, not recursive functions. But 

194 
the Isabelle/ZF theory provides wellfounded recursion and other logical 

195 
tools to simplify this task~\cite{paulsonsetII}. 

196 

130  197 
{\bf Outline.} \S2 briefly introduces the least and greatest fixedpoint 
198 
operators. \S3 discusses the form of introduction rules, mutual recursion and 

199 
other points common to inductive and coinductive definitions. \S4 discusses 

200 
induction and coinduction rules separately. \S5 presents several examples, 

201 
including a coinductive definition. \S6 describes datatype definitions, while 

202 
\S7 draws brief conclusions. \ifCADE\else The appendices are simple user's 

203 
manuals for this Isabelle/ZF package.\fi 

103  204 

205 
Most of the definitions and theorems shown below have been generated by the 

206 
package. I have renamed some variables to improve readability. 

207 

208 
\section{Fixedpoint operators} 

209 
In set theory, the least and greatest fixedpoint operators are defined as 

210 
follows: 

211 
\begin{eqnarray*} 

212 
\lfp(D,h) & \equiv & \inter\{X\sbs D. h(X)\sbs X\} \\ 

213 
\gfp(D,h) & \equiv & \union\{X\sbs D. X\sbs h(X)\} 

214 
\end{eqnarray*} 

130  215 
Let $D$ be a set. Say that $h$ is {\bf bounded by}~$D$ if $h(D)\sbs D$, and 
216 
{\bf monotone below~$D$} if 

103  217 
$h(A)\sbs h(B)$ for all $A$ and $B$ such that $A\sbs B\sbs D$. If $h$ is 
218 
bounded by~$D$ and monotone then both operators yield fixedpoints: 

219 
\begin{eqnarray*} 

220 
\lfp(D,h) & = & h(\lfp(D,h)) \\ 

221 
\gfp(D,h) & = & h(\gfp(D,h)) 

222 
\end{eqnarray*} 

223 
These equations are instances of the KnasterTarski theorem, which states 

224 
that every monotonic function over a complete lattice has a 

225 
fixedpoint~\cite{davey&priestley}. It is obvious from their definitions 

226 
that $\lfp$ must be the least fixedpoint, and $\gfp$ the greatest. 

227 

228 
This fixedpoint theory is simple. The KnasterTarski theorem is easy to 

229 
prove. Showing monotonicity of~$h$ is trivial, in typical cases. We must 

230 
also exhibit a bounding set~$D$ for~$h$. Sometimes this is trivial, as 

179  231 
when a set of `theorems' is (co)inductively defined over some previously 
232 
existing set of `formulae.' But defining the bounding set for 

130  233 
(co)datatype definitions requires some effort; see~\S\ref{univsec} below. 
103  234 

235 

130  236 
\section{Elements of an inductive or coinductive definition}\label{basicsec} 
237 
Consider a (co)inductive definition of the sets $R_1$, \ldots,~$R_n$, in 

103  238 
mutual recursion. They will be constructed from previously existing sets 
239 
$D_1$, \ldots,~$D_n$, respectively, which are called their {\bf domains}. 

240 
The construction yields not $R_i\sbs D_i$ but $R_i\sbs D_1+\cdots+D_n$, where 

130  241 
$R_i$ is contained in the image of~$D_i$ under an 
242 
injection. Reasons for this are discussed 

243 
elsewhere~\cite[\S4.5]{paulsonsetII}. 

103  244 

245 
The definition may involve arbitrary parameters $\vec{p}=p_1$, 

246 
\ldots,~$p_k$. Each recursive set then has the form $R_i(\vec{p})$. The 

247 
parameters must be identical every time they occur within a definition. This 

248 
would appear to be a serious restriction compared with other systems such as 

249 
Coq~\cite{paulin92}. For instance, we cannot define the lists of 

250 
$n$ elements as the set $\listn(A,n)$ using rules where the parameter~$n$ 

130  251 
varies. \S\ref{listnsec} describes how to express this set using the 
252 
inductive definition package. 

103  253 

254 
To avoid clutter below, the recursive sets are shown as simply $R_i$ 

255 
instead of $R_i(\vec{p})$. 

256 

257 
\subsection{The form of the introduction rules}\label{introsec} 

258 
The body of the definition consists of the desired introduction rules, 

259 
specified as strings. The conclusion of each rule must have the form $t\in 

260 
R_i$, where $t$ is any term. Premises typically have the same form, but 

261 
they can have the more general form $t\in M(R_i)$ or express arbitrary 

262 
sideconditions. 

263 

264 
The premise $t\in M(R_i)$ is permitted if $M$ is a monotonic operator on 

265 
sets, satisfying the rule 

266 
\[ \infer{M(A)\sbs M(B)}{A\sbs B} \] 

130  267 
The user must supply the package with monotonicity rules for all such premises. 
103  268 

269 
Because any monotonic $M$ may appear in premises, the criteria for an 

270 
acceptable definition is semantic rather than syntactic. A suitable choice 

271 
of~$M$ and~$t$ can express a lot. The powerset operator $\pow$ is 

272 
monotone, and the premise $t\in\pow(R)$ expresses $t\sbs R$; see 

130  273 
\S\ref{accsec} for an example. The `list of' operator is monotone, as is 
274 
easily proved by induction. The premise $t\in\lst(R)$ avoids having to 

275 
encode the effect of~$\lst(R)$ using mutual recursion; see \S\ref{primrecsec} 

276 
and also my earlier paper~\cite[\S4.4]{paulsonsetII}. 

103  277 

278 
Introduction rules may also contain {\bf sideconditions}. These are 

279 
premises consisting of arbitrary formulae not mentioning the recursive 

280 
sets. Sideconditions typically involve typechecking. One example is the 

281 
premise $a\in A$ in the following rule from the definition of lists: 

282 
\[ \infer{\Cons(a,l)\in\lst(A)}{a\in A & l\in\lst(A)} \] 

283 

284 
\subsection{The fixedpoint definitions} 

285 
The package translates the list of desired introduction rules into a fixedpoint 

286 
definition. Consider, as a running example, the finite set operator 

287 
$\Fin(A)$: the set of all finite subsets of~$A$. It can be 

288 
defined as the least set closed under the rules 

289 
\[ \emptyset\in\Fin(A) \qquad 

290 
\infer{\{a\}\un b\in\Fin(A)}{a\in A & b\in\Fin(A)} 

291 
\] 

292 

130  293 
The domain in a (co)inductive definition must be some existing set closed 
103  294 
under the rules. A suitable domain for $\Fin(A)$ is $\pow(A)$, the set of all 
295 
subsets of~$A$. The package generates the definition 

296 
\begin{eqnarray*} 

297 
\Fin(A) & \equiv & \lfp(\pow(A), \; 

298 
\begin{array}[t]{r@{\,}l} 

299 
\lambda X. \{z\in\pow(A). & z=\emptyset \disj{} \\ 

300 
&(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in X)\}) 

301 
\end{array} 

130  302 
\end{eqnarray*} 
103  303 
The contribution of each rule to the definition of $\Fin(A)$ should be 
130  304 
obvious. A coinductive definition is similar but uses $\gfp$ instead 
103  305 
of~$\lfp$. 
306 

307 
The package must prove that the fixedpoint operator is applied to a 

308 
monotonic function. If the introduction rules have the form described 

309 
above, and if the package is supplied a monotonicity theorem for every 

310 
$t\in M(R_i)$ premise, then this proof is trivial.\footnote{Due to the 

311 
presence of logical connectives in the fixedpoint's body, the 

312 
monotonicity proof requires some unusual rules. These state that the 

130  313 
connectives $\conj$, $\disj$ and $\exists$ preserve monotonicity with respect 
314 
to the partial ordering on unary predicates given by $P\sqsubseteq Q$ if and 

103  315 
only if $\forall x.P(x)\imp Q(x)$.} 
316 

130  317 
The result structure contains the definitions of the recursive sets as a theorem 
318 
list called {\tt defs}. It also contains, as the theorem {\tt unfold}, a 

103  319 
fixedpoint equation such as 
320 
\begin{eqnarray*} 

321 
\Fin(A) & = & 

322 
\begin{array}[t]{r@{\,}l} 

323 
\{z\in\pow(A). & z=\emptyset \disj{} \\ 

324 
&(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in \Fin(A))\} 

325 
\end{array} 

326 
\end{eqnarray*} 

130  327 
It also contains, as the theorem {\tt dom\_subset}, an inclusion such as 
103  328 
$\Fin(A)\sbs\pow(A)$. 
329 

330 

331 
\subsection{Mutual recursion} \label{mutualsec} 

130  332 
In a mutually recursive definition, the domain of the fixedpoint construction 
103  333 
is the disjoint sum of the domain~$D_i$ of each~$R_i$, for $i=1$, 
334 
\ldots,~$n$. The package uses the injections of the 

335 
binary disjoint sum, typically $\Inl$ and~$\Inr$, to express injections 

130  336 
$h_{1n}$, \ldots, $h_{nn}$ for the $n$ary disjoint sum $D_1+\cdots+D_n$. 
103  337 

338 
As discussed elsewhere \cite[\S4.5]{paulsonsetII}, Isabelle/ZF defines the 

339 
operator $\Part$ to support mutual recursion. The set $\Part(A,h)$ 

340 
contains those elements of~$A$ having the form~$h(z)$: 

341 
\begin{eqnarray*} 

342 
\Part(A,h) & \equiv & \{x\in A. \exists z. x=h(z)\}. 

343 
\end{eqnarray*} 

344 
For mutually recursive sets $R_1$, \ldots,~$R_n$ with 

345 
$n>1$, the package makes $n+1$ definitions. The first defines a set $R$ using 

346 
a fixedpoint operator. The remaining $n$ definitions have the form 

347 
\begin{eqnarray*} 

130  348 
R_i & \equiv & \Part(R,h_{in}), \qquad i=1,\ldots, n. 
103  349 
\end{eqnarray*} 
350 
It follows that $R=R_1\un\cdots\un R_n$, where the $R_i$ are pairwise disjoint. 

351 

352 

353 
\subsection{Proving the introduction rules} 

130  354 
The user supplies the package with the desired form of the introduction 
103  355 
rules. Once it has derived the theorem {\tt unfold}, it attempts 
130  356 
to prove those rules. From the user's point of view, this is the 
103  357 
trickiest stage; the proofs often fail. The task is to show that the domain 
358 
$D_1+\cdots+D_n$ of the combined set $R_1\un\cdots\un R_n$ is 

359 
closed under all the introduction rules. This essentially involves replacing 

360 
each~$R_i$ by $D_1+\cdots+D_n$ in each of the introduction rules and 

361 
attempting to prove the result. 

362 

363 
Consider the $\Fin(A)$ example. After substituting $\pow(A)$ for $\Fin(A)$ 

364 
in the rules, the package must prove 

365 
\[ \emptyset\in\pow(A) \qquad 

366 
\infer{\{a\}\un b\in\pow(A)}{a\in A & b\in\pow(A)} 

367 
\] 

368 
Such proofs can be regarded as typechecking the definition. The user 

369 
supplies the package with typechecking rules to apply. Usually these are 

370 
general purpose rules from the ZF theory. They could however be rules 

371 
specifically proved for a particular inductive definition; sometimes this is 

372 
the easiest way to get the definition through! 

373 

130  374 
The result structure contains the introduction rules as the theorem list {\tt 
375 
intrs}. 

103  376 

377 
\subsection{The elimination rule} 

130  378 
The elimination rule, called {\tt elim}, performs case analysis: a 
379 
case for each introduction rule. The elimination rule 

380 
for $\Fin(A)$ is 

103  381 
\[ \infer{Q}{x\in\Fin(A) & \infer*{Q}{[x=\emptyset]} 
382 
& \infer*{Q}{[x=\{a\}\un b & a\in A &b\in\Fin(A)]_{a,b}} } 

383 
\] 

130  384 
This rule states that if $x\in\Fin(A)$ then either $x=\emptyset$ or else 
385 
$x=\{a\}\un b$ for some $a\in A$ and $b\in\Fin(A)$; it is a simple consequence 

386 
of {\tt unfold}. 

387 

388 
\ifCADE\typeout{****Omitting mk_cases from CADE version!}\else 

103  389 
The package also returns a function {\tt mk\_cases}, for generating simplified 
390 
instances of the elimination rule. However, {\tt mk\_cases} only works for 

391 
datatypes and for inductive definitions involving datatypes, such as an 

392 
inductively defined relation between lists. It instantiates {\tt elim} 

393 
with a usersupplied term, then simplifies the cases using the freeness of 

394 
the underlying datatype. 

130  395 
See \S\ref{mkcases} for an example. 
396 
\fi 

103  397 

130  398 
\section{Induction and coinduction rules} 
399 
Here we must consider inductive and coinductive definitions separately. 

103  400 
For an inductive definition, the package returns an induction rule derived 
401 
directly from the properties of least fixedpoints, as well as a modified 

402 
rule for mutual recursion and inductively defined relations. For a 

130  403 
coinductive definition, the package returns a basic coinduction rule. 
103  404 

405 
\subsection{The basic induction rule}\label{basicindsec} 

130  406 
The basic rule, called {\tt induct}, is appropriate in most situations. 
103  407 
For inductive definitions, it is strong rule induction~\cite{camilleri92}; for 
408 
datatype definitions (see below), it is just structural induction. 

409 

410 
The induction rule for an inductively defined set~$R$ has the following form. 

411 
The major premise is $x\in R$. There is a minor premise for each 

412 
introduction rule: 

413 
\begin{itemize} 

414 
\item If the introduction rule concludes $t\in R_i$, then the minor premise 

415 
is~$P(t)$. 

416 

417 
\item The minor premise's eigenvariables are precisely the introduction 

130  418 
rule's free variables that are not parameters of~$R$. For instance, the 
419 
eigenvariables in the $\Fin(A)$ rule below are $a$ and $b$, but not~$A$. 

103  420 

421 
\item If the introduction rule has a premise $t\in R_i$, then the minor 

422 
premise discharges the assumption $t\in R_i$ and the induction 

423 
hypothesis~$P(t)$. If the introduction rule has a premise $t\in M(R_i)$ 

424 
then the minor premise discharges the single assumption 

425 
\[ t\in M(\{z\in R_i. P(z)\}). \] 

426 
Because $M$ is monotonic, this assumption implies $t\in M(R_i)$. The 

427 
occurrence of $P$ gives the effect of an induction hypothesis, which may be 

428 
exploited by appealing to properties of~$M$. 

429 
\end{itemize} 

130  430 
The induction rule for $\Fin(A)$ resembles the elimination rule shown above, 
431 
but includes an induction hypothesis: 

103  432 
\[ \infer{P(x)}{x\in\Fin(A) & P(\emptyset) 
433 
& \infer*{P(\{a\}\un b)}{[a\in A & b\in\Fin(A) & P(b)]_{a,b}} } 

434 
\] 

435 
Stronger induction rules often suggest themselves. In the case of 

436 
$\Fin(A)$, the Isabelle/ZF theory proceeds to derive a rule whose third 

437 
premise discharges the extra assumption $a\not\in b$. Most special induction 

438 
rules must be proved manually, but the package proves a rule for mutual 

439 
induction and inductive relations. 

440 

441 
\subsection{Mutual induction} 

442 
The mutual induction rule is called {\tt 

443 
mutual\_induct}. It differs from the basic rule in several respects: 

444 
\begin{itemize} 

445 
\item Instead of a single predicate~$P$, it uses $n$ predicates $P_1$, 

446 
\ldots,~$P_n$: one for each recursive set. 

447 

448 
\item There is no major premise such as $x\in R_i$. Instead, the conclusion 

449 
refers to all the recursive sets: 

450 
\[ (\forall z.z\in R_1\imp P_1(z))\conj\cdots\conj 

451 
(\forall z.z\in R_n\imp P_n(z)) 

452 
\] 

453 
Proving the premises simultaneously establishes $P_i(z)$ for $z\in 

454 
R_i$ and $i=1$, \ldots,~$n$. 

455 

456 
\item If the domain of some $R_i$ is the Cartesian product 

457 
$A_1\times\cdots\times A_m$, then the corresponding predicate $P_i$ takes $m$ 

458 
arguments and the corresponding conjunct of the conclusion is 

459 
\[ (\forall z_1\ldots z_m.\pair{z_1,\ldots,z_m}\in R_i\imp P_i(z_1,\ldots,z_m)) 

460 
\] 

461 
\end{itemize} 

462 
The last point above simplifies reasoning about inductively defined 

463 
relations. It eliminates the need to express properties of $z_1$, 

464 
\ldots,~$z_m$ as properties of the tuple $\pair{z_1,\ldots,z_m}$. 

465 

130  466 
\subsection{Coinduction}\label{coindsec} 
467 
A coinductive definition yields a primitive coinduction rule, with no 

103  468 
refinements such as those for the induction rules. (Experience may suggest 
130  469 
refinements later.) Consider the codatatype of lazy lists as an example. For 
103  470 
suitable definitions of $\LNil$ and $\LCons$, lazy lists may be defined as the 
471 
greatest fixedpoint satisfying the rules 

472 
\[ \LNil\in\llist(A) \qquad 

473 
\infer[()]{\LCons(a,l)\in\llist(A)}{a\in A & l\in\llist(A)} 

474 
\] 

130  475 
The $()$ tag stresses that this is a coinductive definition. A suitable 
103  476 
domain for $\llist(A)$ is $\quniv(A)$, a set closed under variant forms of 
477 
sum and product for representing infinite data structures 

130  478 
(see~\S\ref{univsec}). Coinductive definitions use these variant sums and 
103  479 
products. 
480 

481 
The package derives an {\tt unfold} theorem similar to that for $\Fin(A)$. 

482 
Then it proves the theorem {\tt co\_induct}, which expresses that $\llist(A)$ 

483 
is the greatest solution to this equation contained in $\quniv(A)$: 

130  484 
\[ \infer{x\in\llist(A)}{x\in X & X\sbs \quniv(A) & 
103  485 
\infer*{z=\LNil\disj \bigl(\exists a\,l.\, 
486 
\begin{array}[t]{@{}l} 

487 
z=\LCons(a,l) \conj a\in A \conj{}\\ 

488 
l\in X\un\llist(A) \bigr) 

489 
\end{array} }{[z\in X]_z}} 

490 
\] 

130  491 
This rule complements the introduction rules; it provides a means of showing 
492 
$x\in\llist(A)$ when $x$ is infinite. For instance, if $x=\LCons(0,x)$ then 

493 
applying the rule with $X=\{x\}$ proves $x\in\llist(\nat)$. Here $\nat$ 

494 
is the set of natural numbers. 

495 

103  496 
Having $X\un\llist(A)$ instead of simply $X$ in the third premise above 
497 
represents a slight strengthening of the greatest fixedpoint property. I 

130  498 
discuss several forms of coinduction rules elsewhere~\cite{paulsoncoind}. 
103  499 

500 

130  501 
\section{Examples of inductive and coinductive definitions}\label{indegsec} 
103  502 
This section presents several examples: the finite set operator, 
503 
lists of $n$ elements, bisimulations on lazy lists, the wellfounded part 

504 
of a relation, and the primitive recursive functions. 

505 

506 
\subsection{The finite set operator} 

507 
The definition of finite sets has been discussed extensively above. Here 

508 
is the corresponding ML invocation (note that $\cons(a,b)$ abbreviates 

509 
$\{a\}\un b$ in Isabelle/ZF): 

510 
\begin{ttbox} 

511 
structure Fin = Inductive_Fun 

512 
(val thy = Arith.thy addconsts [(["Fin"],"i=>i")]; 

513 
val rec_doms = [("Fin","Pow(A)")]; 

514 
val sintrs = 

515 
["0 : Fin(A)", 

516 
"[ a: A; b: Fin(A) ] ==> cons(a,b) : Fin(A)"]; 

517 
val monos = []; 

518 
val con_defs = []; 

130  519 
val type_intrs = [empty_subsetI, cons_subsetI, PowI]; 
103  520 
val type_elims = [make_elim PowD]); 
521 
\end{ttbox} 

522 
The parent theory is obtained from {\tt Arith.thy} by adding the unary 

523 
function symbol~$\Fin$. Its domain is specified as $\pow(A)$, where $A$ is 

524 
the parameter appearing in the introduction rules. For typechecking, the 

525 
package supplies the introduction rules: 

526 
\[ \emptyset\sbs A \qquad 

527 
\infer{\{a\}\un B\sbs C}{a\in C & B\sbs C} 

528 
\] 

529 
A further introduction rule and an elimination rule express the two 

530 
directions of the equivalence $A\in\pow(B)\bimp A\sbs B$. Typechecking 

531 
involves mostly introduction rules. When the package returns, we can refer 

532 
to the $\Fin(A)$ introduction rules as {\tt Fin.intrs}, the induction rule 

533 
as {\tt Fin.induct}, and so forth. 

534 

535 
\subsection{Lists of $n$ elements}\label{listnsec} 

179  536 
This has become a standard example of an inductive definition. Following 
537 
PaulinMohring~\cite{paulin92}, we could attempt to define a new datatype 

538 
$\listn(A,n)$, for lists of length~$n$, as an $n$indexed family of sets. 

539 
But her introduction rules 

103  540 
\[ {\tt Niln}\in\listn(A,0) \qquad 
541 
\infer{{\tt Consn}(n,a,l)\in\listn(A,\succ(n))} 

542 
{n\in\nat & a\in A & l\in\listn(A,n)} 

543 
\] 

544 
are not acceptable to the inductive definition package: 

545 
$\listn$ occurs with three different parameter lists in the definition. 

546 

547 
\begin{figure} 

548 
\begin{small} 

549 
\begin{verbatim} 

550 
structure ListN = Inductive_Fun 

551 
(val thy = ListFn.thy addconsts [(["listn"],"i=>i")]; 

552 
val rec_doms = [("listn", "nat*list(A)")]; 

553 
val sintrs = 

554 
["<0,Nil> : listn(A)", 

130  555 
"[ a: A; <n,l> : listn(A) ] ==> 
556 
<succ(n), Cons(a,l)> : listn(A)"]; 

103  557 
val monos = []; 
558 
val con_defs = []; 

130  559 
val type_intrs = nat_typechecks @ List.intrs @ [SigmaI]; 
103  560 
val type_elims = [SigmaE2]); 
561 
\end{verbatim} 

562 
\end{small} 

563 
\hrule 

564 
\caption{Defining lists of $n$ elements} \label{listnfig} 

565 
\end{figure} 

566 

567 
There is an obvious way of handling this particular example, which may suggest 

568 
a general approach to varying parameters. Here, we can take advantage of an 

569 
existing datatype definition of $\lst(A)$, with constructors $\Nil$ 

570 
and~$\Cons$. Then incorporate the number~$n$ into the inductive set itself, 

571 
defining $\listn(A)$ as a relation. It consists of pairs $\pair{n,l}$ such 

572 
that $n\in\nat$ and~$l\in\lst(A)$ and $l$ has length~$n$. In fact, 

573 
$\listn(A)$ turns out to be the converse of the length function on~$\lst(A)$. 

574 
The Isabelle/ZF introduction rules are 

575 
\[ \pair{0,\Nil}\in\listn(A) \qquad 

576 
\infer{\pair{\succ(n),\Cons(a,l)}\in\listn(A)} 

577 
{a\in A & \pair{n,l}\in\listn(A)} 

578 
\] 

579 
Figure~\ref{listnfig} presents the ML invocation. A theory of lists, 

580 
extended with a declaration of $\listn$, is the parent theory. The domain 

581 
is specified as $\nat\times\lst(A)$. The typechecking rules include those 

582 
for 0, $\succ$, $\Nil$ and $\Cons$. Because $\listn(A)$ is a set of pairs, 

583 
typechecking also requires introduction and elimination rules to express 

584 
both directions of the equivalence $\pair{a,b}\in A\times B \bimp a\in A 

585 
\conj b\in B$. 

586 

587 
The package returns introduction, elimination and induction rules for 

588 
$\listn$. The basic induction rule, {\tt ListN.induct}, is 

589 
\[ \infer{P(x)}{x\in\listn(A) & P(\pair{0,\Nil}) & 

590 
\infer*{P(\pair{\succ(n),\Cons(a,l)})} 

591 
{[a\in A & \pair{n,l}\in\listn(A) & P(\pair{n,l})]_{a,l,n}}} 

592 
\] 

593 
This rule requires the induction formula to be a 

594 
unary property of pairs,~$P(\pair{n,l})$. The alternative rule, {\tt 

595 
ListN.mutual\_induct}, uses a binary property instead: 

130  596 
\[ \infer{\forall n\,l. \pair{n,l}\in\listn(A) \imp P(n,l)} 
103  597 
{P(0,\Nil) & 
598 
\infer*{P(\succ(n),\Cons(a,l))} 

599 
{[a\in A & \pair{n,l}\in\listn(A) & P(n,l)]_{a,l,n}}} 

600 
\] 

601 
It is now a simple matter to prove theorems about $\listn(A)$, such as 

602 
\[ \forall l\in\lst(A). \pair{\length(l),\, l}\in\listn(A) \] 

603 
\[ \listn(A)``\{n\} = \{l\in\lst(A). \length(l)=n\} \] 

130  604 
This latter result  here $r``X$ denotes the image of $X$ under $r$ 
103  605 
 asserts that the inductive definition agrees with the obvious notion of 
606 
$n$element list. 

607 

608 
Unlike in Coq, the definition does not declare a new datatype. A `list of 

130  609 
$n$ elements' really is a list and is subject to list operators such 
610 
as append (concatenation). For example, a trivial induction on 

611 
$\pair{m,l}\in\listn(A)$ yields 

103  612 
\[ \infer{\pair{m\mathbin{+} m,\, l@l'}\in\listn(A)} 
613 
{\pair{m,l}\in\listn(A) & \pair{m',l'}\in\listn(A)} 

614 
\] 

615 
where $+$ here denotes addition on the natural numbers and @ denotes append. 

616 

617 
\ifCADE\typeout{****Omitting mk_cases from CADE version!} 

618 
\else 

130  619 
\subsection{A demonstration of {\tt mk\_cases}}\label{mkcases} 
103  620 
The elimination rule, {\tt ListN.elim}, is cumbersome: 
621 
\[ \infer{Q}{x\in\listn(A) & 

622 
\infer*{Q}{[x = \pair{0,\Nil}]} & 

623 
\infer*{Q} 

624 
{\left[\begin{array}{l} 

625 
x = \pair{\succ(n),\Cons(a,l)} \\ 

626 
a\in A \\ 

627 
\pair{n,l}\in\listn(A) 

628 
\end{array} \right]_{a,l,n}}} 

629 
\] 

179  630 
The ML function {\tt ListN.mk\_cases} generates simplified instances of 
631 
this rule. It works by freeness reasoning on the list constructors: 

632 
$\Cons(a,l)$ is injective in its two arguments and differs from~$\Nil$. If 

633 
$x$ is $\pair{i,\Nil}$ or $\pair{i,\Cons(a,l)}$ then {\tt ListN.mk\_cases} 

634 
deduces the corresponding form of~$i$. For example, 

103  635 
\begin{ttbox} 
636 
ListN.mk_cases List.con_defs "<i,Cons(a,l)> : listn(A)" 

637 
\end{ttbox} 

130  638 
yields a rule with only two premises: 
103  639 
\[ \infer{Q}{\pair{i, \Cons(a,l)}\in\listn(A) & 
640 
\infer*{Q} 

641 
{\left[\begin{array}{l} 

642 
i = \succ(n) \\ a\in A \\ \pair{n,l}\in\listn(A) 

643 
\end{array} \right]_{n}}} 

644 
\] 

645 
The package also has builtin rules for freeness reasoning about $0$ 

646 
and~$\succ$. So if $x$ is $\pair{0,l}$ or $\pair{\succ(i),l}$, then {\tt 

647 
ListN.mk\_cases} can similarly deduce the corresponding form of~$l$. 

648 

649 
The function {\tt mk\_cases} is also useful with datatype definitions 

650 
themselves. The version from the definition of lists, namely {\tt 

651 
List.mk\_cases}, can prove the rule 

652 
\[ \infer{Q}{\Cons(a,l)\in\lst(A) & 

653 
& \infer*{Q}{[a\in A &l\in\lst(A)]} } 

654 
\] 

655 
The most important uses of {\tt mk\_cases} concern inductive definitions of 

130  656 
evaluation relations. Then {\tt mk\_cases} supports case analysis on 
657 
possible evaluations, for example to prove that evaluation is 

658 
functional. 

103  659 
\fi %CADE 
660 

130  661 
\subsection{A coinductive definition: bisimulations on lazy lists} 
662 
This example anticipates the definition of the codatatype $\llist(A)$, which 

663 
consists of finite and infinite lists over~$A$. Its constructors are $\LNil$ 

664 
and 

665 
$\LCons$, satisfying the introduction rules shown in~\S\ref{coindsec}. 

103  666 
Because $\llist(A)$ is defined as a greatest fixedpoint and uses the variant 
667 
pairing and injection operators, it contains nonwellfounded elements such as 

668 
solutions to $\LCons(a,l)=l$. 

669 

130  670 
The next step in the development of lazy lists is to define a coinduction 
103  671 
principle for proving equalities. This is done by showing that the equality 
672 
relation on lazy lists is the greatest fixedpoint of some monotonic 

673 
operation. The usual approach~\cite{pitts94} is to define some notion of 

674 
bisimulation for lazy lists, define equivalence to be the greatest 

675 
bisimulation, and finally to prove that two lazy lists are equivalent if and 

130  676 
only if they are equal. The coinduction rule for equivalence then yields a 
677 
coinduction principle for equalities. 

103  678 

679 
A binary relation $R$ on lazy lists is a {\bf bisimulation} provided $R\sbs 

680 
R^+$, where $R^+$ is the relation 

130  681 
\[ \{\pair{\LNil,\LNil}\} \un 
682 
\{\pair{\LCons(a,l),\LCons(a,l')} . a\in A \conj \pair{l,l'}\in R\}. 

103  683 
\] 
684 

685 
A pair of lazy lists are {\bf equivalent} if they belong to some bisimulation. 

130  686 
Equivalence can be coinductively defined as the greatest fixedpoint for the 
103  687 
introduction rules 
130  688 
\[ \pair{\LNil,\LNil} \in\lleq(A) \qquad 
689 
\infer[()]{\pair{\LCons(a,l),\LCons(a,l')} \in\lleq(A)} 

690 
{a\in A & \pair{l,l'}\in \lleq(A)} 

103  691 
\] 
130  692 
To make this coinductive definition, we invoke \verbCoInductive_Fun: 
103  693 
\begin{ttbox} 
130  694 
structure LList_Eq = CoInductive_Fun 
103  695 
(val thy = LList.thy addconsts [(["lleq"],"i=>i")]; 
130  696 
val rec_doms = [("lleq", "llist(A) * llist(A)")]; 
103  697 
val sintrs = 
130  698 
["<LNil, LNil> : lleq(A)", 
699 
"[ a:A; <l,l'>: lleq(A) ] ==> 

700 
<LCons(a,l), LCons(a,l')>: lleq(A)"]; 

103  701 
val monos = []; 
702 
val con_defs = []; 

130  703 
val type_intrs = LList.intrs @ [SigmaI]; 
704 
val type_elims = [SigmaE2]); 

103  705 
\end{ttbox} 
706 
Again, {\tt addconsts} declares a constant for $\lleq$ in the parent theory. 

130  707 
The domain of $\lleq(A)$ is $\llist(A)\times\llist(A)$. The typechecking 
708 
rules include the introduction rules for lazy lists as well as rules 

709 
for both directions of the equivalence 

710 
$\pair{a,b}\in A\times B \bimp a\in A \conj b\in B$. 

103  711 

712 
The package returns the introduction rules and the elimination rule, as 

130  713 
usual. But instead of induction rules, it returns a coinduction rule. 
103  714 
The rule is too big to display in the usual notation; its conclusion is 
130  715 
$x\in\lleq(A)$ and its premises are $x\in X$, 
716 
${X\sbs\llist(A)\times\llist(A)}$ and 

717 
\[ \infer*{z=\pair{\LNil,\LNil}\disj \bigl(\exists a\,l\,l'.\, 

103  718 
\begin{array}[t]{@{}l} 
130  719 
z=\pair{\LCons(a,l),\LCons(a,l')} \conj a\in A \conj{}\\ 
720 
\pair{l,l'}\in X\un\lleq(A) \bigr) 

103  721 
\end{array} }{[z\in X]_z} 
722 
\] 

130  723 
Thus if $x\in X$, where $X$ is a bisimulation contained in the 
724 
domain of $\lleq(A)$, then $x\in\lleq(A)$. It is easy to show that 

103  725 
$\lleq(A)$ is reflexive: the equality relation is a bisimulation. And 
726 
$\lleq(A)$ is symmetric: its converse is a bisimulation. But showing that 

130  727 
$\lleq(A)$ coincides with the equality relation takes some work. 
103  728 

729 
\subsection{The accessible part of a relation}\label{accsec} 

730 
Let $\prec$ be a binary relation on~$D$; in short, $(\prec)\sbs D\times D$. 

731 
The {\bf accessible} or {\bf wellfounded} part of~$\prec$, written 

732 
$\acc(\prec)$, is essentially that subset of~$D$ for which $\prec$ admits 

733 
no infinite decreasing chains~\cite{aczel77}. Formally, $\acc(\prec)$ is 

734 
inductively defined to be the least set that contains $a$ if it contains 

735 
all $\prec$predecessors of~$a$, for $a\in D$. Thus we need an 

736 
introduction rule of the form 

737 
\[ \infer{a\in\acc(\prec)}{\forall y.y\prec a\imp y\in\acc(\prec)} \] 

738 
PaulinMohring treats this example in Coq~\cite{paulin92}, but it causes 

739 
difficulties for other systems. Its premise does not conform to 

740 
the structure of introduction rules for HOL's inductive definition 

741 
package~\cite{camilleri92}. It is also unacceptable to Isabelle package 

130  742 
(\S\ref{introsec}), but fortunately can be transformed into the acceptable 
103  743 
form $t\in M(R)$. 
744 

745 
The powerset operator is monotonic, and $t\in\pow(R)$ is equivalent to 

746 
$t\sbs R$. This in turn is equivalent to $\forall y\in t. y\in R$. To 

747 
express $\forall y.y\prec a\imp y\in\acc(\prec)$ we need only find a 

748 
term~$t$ such that $y\in t$ if and only if $y\prec a$. A suitable $t$ is 

749 
the inverse image of~$\{a\}$ under~$\prec$. 

750 

751 
The ML invocation below follows this approach. Here $r$ is~$\prec$ and 

130  752 
$\field(r)$ refers to~$D$, the domain of $\acc(r)$. (The field of a 
753 
relation is the union of its domain and range.) Finally 

754 
$r^{}``\{a\}$ denotes the inverse image of~$\{a\}$ under~$r$. The package is 

755 
supplied the theorem {\tt Pow\_mono}, which asserts that $\pow$ is monotonic. 

103  756 
\begin{ttbox} 
757 
structure Acc = Inductive_Fun 

758 
(val thy = WF.thy addconsts [(["acc"],"i=>i")]; 

759 
val rec_doms = [("acc", "field(r)")]; 

760 
val sintrs = 

130  761 
["[ r``\{a\}: Pow(acc(r)); a: field(r) ] ==> a: acc(r)"]; 
103  762 
val monos = [Pow_mono]; 
763 
val con_defs = []; 

764 
val type_intrs = []; 

765 
val type_elims = []); 

766 
\end{ttbox} 

767 
The Isabelle theory proceeds to prove facts about $\acc(\prec)$. For 

768 
instance, $\prec$ is wellfounded if and only if its field is contained in 

769 
$\acc(\prec)$. 

770 

771 
As mentioned in~\S\ref{basicindsec}, a premise of the form $t\in M(R)$ 

772 
gives rise to an unusual induction hypothesis. Let us examine the 

773 
induction rule, {\tt Acc.induct}: 

774 
\[ \infer{P(x)}{x\in\acc(r) & 

775 
\infer*{P(a)}{[r^{}``\{a\}\in\pow(\{z\in\acc(r).P(z)\}) & 

776 
a\in\field(r)]_a}} 

777 
\] 

778 
The strange induction hypothesis is equivalent to 

779 
$\forall y. \pair{y,a}\in r\imp y\in\acc(r)\conj P(y)$. 

780 
Therefore the rule expresses wellfounded induction on the accessible part 

781 
of~$\prec$. 

782 

783 
The use of inverse image is not essential. The Isabelle package can accept 

784 
introduction rules with arbitrary premises of the form $\forall 

785 
\vec{y}.P(\vec{y})\imp f(\vec{y})\in R$. The premise can be expressed 

786 
equivalently as 

130  787 
\[ \{z\in D. P(\vec{y}) \conj z=f(\vec{y})\} \in \pow(R) \] 
103  788 
provided $f(\vec{y})\in D$ for all $\vec{y}$ such that~$P(\vec{y})$. The 
789 
following section demonstrates another use of the premise $t\in M(R)$, 

790 
where $M=\lst$. 

791 

792 
\subsection{The primitive recursive functions}\label{primrecsec} 

793 
The primitive recursive functions are traditionally defined inductively, as 

794 
a subset of the functions over the natural numbers. One difficulty is that 

795 
functions of all arities are taken together, but this is easily 

796 
circumvented by regarding them as functions on lists. Another difficulty, 

797 
the notion of composition, is less easily circumvented. 

798 

799 
Here is a more precise definition. Letting $\vec{x}$ abbreviate 

800 
$x_0,\ldots,x_{n1}$, we can write lists such as $[\vec{x}]$, 

801 
$[y+1,\vec{x}]$, etc. A function is {\bf primitive recursive} if it 

802 
belongs to the least set of functions in $\lst(\nat)\to\nat$ containing 

803 
\begin{itemize} 

804 
\item The {\bf successor} function $\SC$, such that $\SC[y,\vec{x}]=y+1$. 

805 
\item All {\bf constant} functions $\CONST(k)$, such that 

806 
$\CONST(k)[\vec{x}]=k$. 

807 
\item All {\bf projection} functions $\PROJ(i)$, such that 

808 
$\PROJ(i)[\vec{x}]=x_i$ if $0\leq i<n$. 

809 
\item All {\bf compositions} $\COMP(g,[f_0,\ldots,f_{m1}])$, 

810 
where $g$ and $f_0$, \ldots, $f_{m1}$ are primitive recursive, 

811 
such that 

812 
\begin{eqnarray*} 

813 
\COMP(g,[f_0,\ldots,f_{m1}])[\vec{x}] & = & 

814 
g[f_0[\vec{x}],\ldots,f_{m1}[\vec{x}]]. 

815 
\end{eqnarray*} 

816 

817 
\item All {\bf recursions} $\PREC(f,g)$, where $f$ and $g$ are primitive 

818 
recursive, such that 

819 
\begin{eqnarray*} 

820 
\PREC(f,g)[0,\vec{x}] & = & f[\vec{x}] \\ 

821 
\PREC(f,g)[y+1,\vec{x}] & = & g[\PREC(f,g)[y,\vec{x}],\, y,\, \vec{x}]. 

822 
\end{eqnarray*} 

823 
\end{itemize} 

824 
Composition is awkward because it combines not two functions, as is usual, 

825 
but $m+1$ functions. In her proof that Ackermann's function is not 

826 
primitive recursive, Nora Szasz was unable to formalize this definition 

827 
directly~\cite{szasz93}. So she generalized primitive recursion to 

828 
tuplevalued functions. This modified the inductive definition such that 

829 
each operation on primitive recursive functions combined just two functions. 

830 

831 
\begin{figure} 

130  832 
\begin{small}\begin{verbatim} 
103  833 
structure Primrec = Inductive_Fun 
834 
(val thy = Primrec0.thy; 

835 
val rec_doms = [("primrec", "list(nat)>nat")]; 

130  836 
val ext = None; 
103  837 
val sintrs = 
838 
["SC : primrec", 

839 
"k: nat ==> CONST(k) : primrec", 

840 
"i: nat ==> PROJ(i) : primrec", 

130  841 
"[ g: primrec; fs: list(primrec) ] ==> COMP(g,fs): primrec", 
842 
"[ f: primrec; g: primrec ] ==> PREC(f,g): primrec"]; 

103  843 
val monos = [list_mono]; 
844 
val con_defs = [SC_def,CONST_def,PROJ_def,COMP_def,PREC_def]; 

130  845 
val type_intrs = pr0_typechecks; 
103  846 
val type_elims = []); 
130  847 
\end{verbatim}\end{small} 
103  848 
\hrule 
849 
\caption{Inductive definition of the primitive recursive functions} 

850 
\label{primrecfig} 

851 
\end{figure} 

852 
\def\fs{{\it fs}} 

853 
Szasz was using ALF, but Coq and HOL would also have problems accepting 

854 
this definition. Isabelle's package accepts it easily since 

855 
$[f_0,\ldots,f_{m1}]$ is a list of primitive recursive functions and 

856 
$\lst$ is monotonic. There are five introduction rules, one for each of 

857 
the five forms of primitive recursive function. Note the one for $\COMP$: 

858 
\[ \infer{\COMP(g,\fs)\in\primrec}{g\in\primrec & \fs\in\lst(\primrec)} \] 

859 
The induction rule for $\primrec$ has one case for each introduction rule. 

860 
Due to the use of $\lst$ as a monotone operator, the composition case has 

861 
an unusual induction hypothesis: 

862 
\[ \infer*{P(\COMP(g,\fs))} 

130  863 
{[g\in\primrec & \fs\in\lst(\{z\in\primrec.P(z)\})]_{\fs,g}} \] 
103  864 
The hypothesis states that $\fs$ is a list of primitive recursive functions 
865 
satisfying the induction formula. Proving the $\COMP$ case typically requires 

866 
structural induction on lists, yielding two subcases: either $\fs=\Nil$ or 

867 
else $\fs=\Cons(f,\fs')$, where $f\in\primrec$, $P(f)$, and $\fs'$ is 

868 
another list of primitive recursive functions satisfying~$P$. 

869 

870 
Figure~\ref{primrecfig} presents the ML invocation. Theory {\tt 

871 
Primrec0.thy} defines the constants $\SC$, etc.; their definitions 

872 
consist of routine list programming and are omitted. The Isabelle theory 

873 
goes on to formalize Ackermann's function and prove that it is not 

874 
primitive recursive, using the induction rule {\tt Primrec.induct}. The 

875 
proof follows Szasz's excellent account. 

876 

877 
ALF and Coq treat inductive definitions as datatypes, with a new 

878 
constructor for each introduction rule. This forced Szasz to define a 

879 
small programming language for the primitive recursive functions, and then 

880 
define their semantics. But the Isabelle/ZF formulation defines the 

881 
primitive recursive functions directly as a subset of the function set 

882 
$\lst(\nat)\to\nat$. This saves a step and conforms better to mathematical 

883 
tradition. 

884 

885 

130  886 
\section{Datatypes and codatatypes}\label{datasec} 
887 
A (co)datatype definition is a (co)inductive definition with automatically 

888 
defined constructors and a case analysis operator. The package proves that the 

103  889 
case operator inverts the constructors, and can also prove freeness theorems 
890 
involving any pair of constructors. 

891 

892 

130  893 
\subsection{Constructors and their domain}\label{univsec} 
894 
Conceptually, our two forms of definition are distinct: a (co)inductive 

895 
definition selects a subset of an existing set, but a (co)datatype 

103  896 
definition creates a new set. But the package reduces the latter to the 
897 
former. A set having strong closure properties must serve as the domain 

130  898 
of the (co)inductive definition. Constructing this set requires some 
899 
theoretical effort. Consider first datatypes and then codatatypes. 

103  900 

901 
Isabelle/ZF defines the standard notion of Cartesian product $A\times B$, 

902 
containing ordered pairs $\pair{a,b}$. Now the $m$tuple 

903 
$\pair{x_1,\ldots\,x_m}$ is the empty set~$\emptyset$ if $m=0$, simply 

904 
$x_1$ if $m=1$, and $\pair{x_1,\pair{x_2,\ldots\,x_m}}$ if $m\geq2$. 

905 
Isabelle/ZF also defines the disjoint sum $A+B$, containing injections 

906 
$\Inl(a)\equiv\pair{0,a}$ and $\Inr(b)\equiv\pair{1,b}$. 

907 

908 
A datatype constructor $\Con(x_1,\ldots\,x_m)$ is defined to be 

909 
$h(\pair{x_1,\ldots\,x_m})$, where $h$ is composed of $\Inl$ and~$\Inr$. 

910 
In a mutually recursive definition, all constructors for the set~$R_i$ have 

130  911 
the outer form~$h_{in}$, where $h_{in}$ is the injection described 
103  912 
in~\S\ref{mutualsec}. Further nested injections ensure that the 
913 
constructors for~$R_i$ are pairwise distinct. 

914 

915 
Isabelle/ZF defines the set $\univ(A)$, which contains~$A$ and 

916 
furthermore contains $\pair{a,b}$, $\Inl(a)$ and $\Inr(b)$ for $a$, 

917 
$b\in\univ(A)$. In a typical datatype definition with set parameters 

918 
$A_1$, \ldots, $A_k$, a suitable domain for all the recursive sets is 

919 
$\univ(A_1\un\cdots\un A_k)$. This solves the problem for 

920 
datatypes~\cite[\S4.2]{paulsonsetII}. 

921 

922 
The standard pairs and injections can only yield wellfounded 

923 
constructions. This eases the (manual!) definition of recursive functions 

130  924 
over datatypes. But they are unsuitable for codatatypes, which typically 
103  925 
contain nonwellfounded objects. 
926 

130  927 
To support codatatypes, Isabelle/ZF defines a variant notion of ordered 
103  928 
pair, written~$\pair{a;b}$. It also defines the corresponding variant 
929 
notion of Cartesian product $A\otimes B$, variant injections $\QInl(a)$ 

930 
and~$\QInr(b)$, and variant disjoint sum $A\oplus B$. Finally it defines 

931 
the set $\quniv(A)$, which contains~$A$ and furthermore contains 

932 
$\pair{a;b}$, $\QInl(a)$ and $\QInr(b)$ for $a$, $b\in\quniv(A)$. In a 

130  933 
typical codatatype definition with set parameters $A_1$, \ldots, $A_k$, a 
934 
suitable domain is $\quniv(A_1\un\cdots\un A_k)$. This approach using 

935 
standard ZF set theory\footnote{No reference is available. Variant pairs 

936 
are defined by $\pair{a;b}\equiv (a+b) \equiv (\{0\}\times a) \un 

937 
(\{1\}\times b)$, where $\times$ is the Cartesian product for standard 

938 
ordered pairs. Now 

103  939 
$\pair{a;b}$ is injective and monotonic in its two arguments. 
940 
Nonwellfounded constructions, such as infinite lists, are constructed 

941 
as least fixedpoints; the bounding set typically has the form 

942 
$\univ(a_1\un\cdots\un a_k)$, where $a_1$, \ldots, $a_k$ are specified 

943 
elements of the construction.} 

130  944 
is an alternative to adopting Aczel's AntiFoundation 
945 
Axiom~\cite{aczel88}. 

103  946 

947 
\subsection{The case analysis operator} 

130  948 
The (co)datatype package automatically defines a case analysis operator, 
179  949 
called {\tt$R$\_case}. A mutually recursive definition still has only one 
950 
operator, whose name combines those of the recursive sets: it is called 

951 
{\tt$R_1$\_\ldots\_$R_n$\_case}. The case operator is analogous to those 

952 
for products and sums. 

103  953 

954 
Datatype definitions employ standard products and sums, whose operators are 

955 
$\split$ and $\case$ and satisfy the equations 

956 
\begin{eqnarray*} 

957 
\split(f,\pair{x,y}) & = & f(x,y) \\ 

958 
\case(f,g,\Inl(x)) & = & f(x) \\ 

959 
\case(f,g,\Inr(y)) & = & g(y) 

960 
\end{eqnarray*} 

961 
Suppose the datatype has $k$ constructors $\Con_1$, \ldots,~$\Con_k$. Then 

962 
its case operator takes $k+1$ arguments and satisfies an equation for each 

963 
constructor: 

964 
\begin{eqnarray*} 

965 
R\hbox{\_case}(f_1,\ldots,f_k, {\tt Con}_i(\vec{x})) & = & f_i(\vec{x}), 

966 
\qquad i = 1, \ldots, k 

967 
\end{eqnarray*} 

130  968 
The case operator's definition takes advantage of Isabelle's representation 
969 
of syntax in the typed $\lambda$calculus; it could readily be adapted to a 

970 
theorem prover for higherorder logic. If $f$ and~$g$ have metatype 

971 
$i\To i$ then so do $\split(f)$ and 

972 
$\case(f,g)$. This works because $\split$ and $\case$ operate on their last 

973 
argument. They are easily combined to make complex case analysis 

103  974 
operators. Here are two examples: 
975 
\begin{itemize} 

976 
\item $\split(\lambda x.\split(f(x)))$ performs case analysis for 

977 
$A\times (B\times C)$, as is easily verified: 

978 
\begin{eqnarray*} 

979 
\split(\lambda x.\split(f(x)), \pair{a,b,c}) 

980 
& = & (\lambda x.\split(f(x))(a,\pair{b,c}) \\ 

981 
& = & \split(f(a), \pair{b,c}) \\ 

982 
& = & f(a,b,c) 

983 
\end{eqnarray*} 

984 

985 
\item $\case(f,\case(g,h))$ performs case analysis for $A+(B+C)$; let us 

986 
verify one of the three equations: 

987 
\begin{eqnarray*} 

988 
\case(f,\case(g,h), \Inr(\Inl(b))) 

989 
& = & \case(g,h,\Inl(b)) \\ 

990 
& = & g(b) 

991 
\end{eqnarray*} 

992 
\end{itemize} 

130  993 
Codatatype definitions are treated in precisely the same way. They express 
103  994 
case operators using those for the variant products and sums, namely 
995 
$\qsplit$ and~$\qcase$. 

996 

997 

998 
\ifCADE The package has processed all the datatypes discussed in my earlier 

130  999 
paper~\cite{paulsonsetII} and the codatatype of lazy lists. Space 
103  1000 
limitations preclude discussing these examples here, but they are 
1001 
distributed with Isabelle. 

1002 
\typeout{****Omitting datatype examples from CADE version!} \else 

1003 

1004 
To see how constructors and the case analysis operator are defined, let us 

1005 
examine some examples. These include lists and trees/forests, which I have 

1006 
discussed extensively in another paper~\cite{paulsonsetII}. 

1007 

1008 
\begin{figure} 

1009 
\begin{ttbox} 

1010 
structure List = Datatype_Fun 

1011 
(val thy = Univ.thy; 

1012 
val rec_specs = 

1013 
[("list", "univ(A)", 

1014 
[(["Nil"], "i"), 

1015 
(["Cons"], "[i,i]=>i")])]; 

1016 
val rec_styp = "i=>i"; 

130  1017 
val ext = None; 
103  1018 
val sintrs = 
1019 
["Nil : list(A)", 

1020 
"[ a: A; l: list(A) ] ==> Cons(a,l) : list(A)"]; 

1021 
val monos = []; 

130  1022 
val type_intrs = datatype_intrs; 
103  1023 
val type_elims = datatype_elims); 
1024 
\end{ttbox} 

1025 
\hrule 

1026 
\caption{Defining the datatype of lists} \label{listfig} 

1027 

1028 
\medskip 

1029 
\begin{ttbox} 

130  1030 
structure LList = CoDatatype_Fun 
103  1031 
(val thy = QUniv.thy; 
1032 
val rec_specs = 

1033 
[("llist", "quniv(A)", 

1034 
[(["LNil"], "i"), 

1035 
(["LCons"], "[i,i]=>i")])]; 

1036 
val rec_styp = "i=>i"; 

130  1037 
val ext = None; 
103  1038 
val sintrs = 
1039 
["LNil : llist(A)", 

1040 
"[ a: A; l: llist(A) ] ==> LCons(a,l) : llist(A)"]; 

1041 
val monos = []; 

130  1042 
val type_intrs = codatatype_intrs; 
1043 
val type_elims = codatatype_elims); 

103  1044 
\end{ttbox} 
1045 
\hrule 

130  1046 
\caption{Defining the codatatype of lazy lists} \label{llistfig} 
103  1047 
\end{figure} 
1048 

1049 
\subsection{Example: lists and lazy lists} 

1050 
Figures \ref{listfig} and~\ref{llistfig} present the ML definitions of 

1051 
lists and lazy lists, respectively. They highlight the (many) similarities 

130  1052 
and (few) differences between datatype and codatatype definitions. 
103  1053 

1054 
Each form of list has two constructors, one for the empty list and one for 

1055 
adding an element to a list. Each takes a parameter, defining the set of 

1056 
lists over a given set~$A$. Each uses the appropriate domain from a 

1057 
Isabelle/ZF theory: 

1058 
\begin{itemize} 

1059 
\item $\lst(A)$ specifies domain $\univ(A)$ and parent theory {\tt Univ.thy}. 

1060 

1061 
\item $\llist(A)$ specifies domain $\quniv(A)$ and parent theory {\tt 

1062 
QUniv.thy}. 

1063 
\end{itemize} 

1064 

130  1065 
Since $\lst(A)$ is a datatype, it enjoys a structural induction rule, {\tt 
1066 
List.induct}: 

103  1067 
\[ \infer{P(x)}{x\in\lst(A) & P(\Nil) 
1068 
& \infer*{P(\Cons(a,l))}{[a\in A & l\in\lst(A) & P(l)]_{a,l}} } 

1069 
\] 

1070 
Induction and freeness yield the law $l\not=\Cons(a,l)$. To strengthen this, 

1071 
Isabelle/ZF defines the rank of a set and proves that the standard pairs and 

1072 
injections have greater rank than their components. An immediate consequence, 

1073 
which justifies structural recursion on lists \cite[\S4.3]{paulsonsetII}, 

1074 
is 

1075 
\[ \rank(l) < \rank(\Cons(a,l)). \] 

1076 

130  1077 
Since $\llist(A)$ is a codatatype, it has no induction rule. Instead it has 
1078 
the coinduction rule shown in \S\ref{coindsec}. Since variant pairs and 

103  1079 
injections are monotonic and need not have greater rank than their 
1080 
components, fixedpoint operators can create cyclic constructions. For 

1081 
example, the definition 

1082 
\begin{eqnarray*} 

1083 
\lconst(a) & \equiv & \lfp(\univ(a), \lambda l. \LCons(a,l)) 

1084 
\end{eqnarray*} 

1085 
yields $\lconst(a) = \LCons(a,\lconst(a))$. 

1086 

1087 
\medskip 

1088 
It may be instructive to examine the definitions of the constructors and 

1089 
case operator for $\lst(A)$. The definitions for $\llist(A)$ are similar. 

1090 
The list constructors are defined as follows: 

1091 
\begin{eqnarray*} 

1092 
\Nil & = & \Inl(\emptyset) \\ 

1093 
\Cons(a,l) & = & \Inr(\pair{a,l}) 

1094 
\end{eqnarray*} 

1095 
The operator $\lstcase$ performs case analysis on these two alternatives: 

1096 
\begin{eqnarray*} 

1097 
\lstcase(c,h) & \equiv & \case(\lambda u.c, \split(h)) 

1098 
\end{eqnarray*} 

1099 
Let us verify the two equations: 

1100 
\begin{eqnarray*} 

1101 
\lstcase(c, h, \Nil) & = & 

1102 
\case(\lambda u.c, \split(h), \Inl(\emptyset)) \\ 

1103 
& = & (\lambda u.c)(\emptyset) \\ 

130  1104 
& = & c\\[1ex] 
103  1105 
\lstcase(c, h, \Cons(x,y)) & = & 
1106 
\case(\lambda u.c, \split(h), \Inr(\pair{x,y})) \\ 

1107 
& = & \split(h, \pair{x,y}) \\ 

130  1108 
& = & h(x,y) 
103  1109 
\end{eqnarray*} 
1110 

1111 
\begin{figure} 

1112 
\begin{small} 

1113 
\begin{verbatim} 

1114 
structure TF = Datatype_Fun 

1115 
(val thy = Univ.thy; 

1116 
val rec_specs = 

1117 
[("tree", "univ(A)", 

1118 
[(["Tcons"], "[i,i]=>i")]), 

1119 
("forest", "univ(A)", 

1120 
[(["Fnil"], "i"), 

1121 
(["Fcons"], "[i,i]=>i")])]; 

1122 
val rec_styp = "i=>i"; 

130  1123 
val ext = None; 
103  1124 
val sintrs = 
1125 
["[ a:A; f: forest(A) ] ==> Tcons(a,f) : tree(A)", 

1126 
"Fnil : forest(A)", 

1127 
"[ t: tree(A); f: forest(A) ] ==> Fcons(t,f) : forest(A)"]; 

1128 
val monos = []; 

130  1129 
val type_intrs = datatype_intrs; 
103  1130 
val type_elims = datatype_elims); 
1131 
\end{verbatim} 

1132 
\end{small} 

1133 
\hrule 

1134 
\caption{Defining the datatype of trees and forests} \label{tffig} 

1135 
\end{figure} 

1136 

1137 

1138 
\subsection{Example: mutual recursion} 

130  1139 
In mutually recursive trees and forests~\cite[\S4.5]{paulsonsetII}, trees 
103  1140 
have the one constructor $\Tcons$, while forests have the two constructors 
1141 
$\Fnil$ and~$\Fcons$. Figure~\ref{tffig} presents the ML 

1142 
definition. It has much in common with that of $\lst(A)$, including its 

1143 
use of $\univ(A)$ for the domain and {\tt Univ.thy} for the parent theory. 

1144 
The three introduction rules define the mutual recursion. The 

1145 
distinguishing feature of this example is its two induction rules. 

1146 

1147 
The basic induction rule is called {\tt TF.induct}: 

1148 
\[ \infer{P(x)}{x\in\TF(A) & 

1149 
\infer*{P(\Tcons(a,f))} 

1150 
{\left[\begin{array}{l} a\in A \\ 

1151 
f\in\forest(A) \\ P(f) 

1152 
\end{array} 

1153 
\right]_{a,f}} 

1154 
& P(\Fnil) 

130  1155 
& \infer*{P(\Fcons(t,f))} 
103  1156 
{\left[\begin{array}{l} t\in\tree(A) \\ P(t) \\ 
1157 
f\in\forest(A) \\ P(f) 

1158 
\end{array} 

1159 
\right]_{t,f}} } 

1160 
\] 

1161 
This rule establishes a single predicate for $\TF(A)$, the union of the 

1162 
recursive sets. 

1163 

1164 
Although such reasoning is sometimes useful 

1165 
\cite[\S4.5]{paulsonsetII}, a proper mutual induction rule should establish 

1166 
separate predicates for $\tree(A)$ and $\forest(A)$. The package calls this 

1167 
rule {\tt TF.mutual\_induct}. Observe the usage of $P$ and $Q$ in the 

1168 
induction hypotheses: 

1169 
\[ \infer{(\forall z. z\in\tree(A)\imp P(z)) \conj 

1170 
(\forall z. z\in\forest(A)\imp Q(z))} 

1171 
{\infer*{P(\Tcons(a,f))} 

1172 
{\left[\begin{array}{l} a\in A \\ 

1173 
f\in\forest(A) \\ Q(f) 

1174 
\end{array} 

1175 
\right]_{a,f}} 

1176 
& Q(\Fnil) 

130  1177 
& \infer*{Q(\Fcons(t,f))} 
103  1178 
{\left[\begin{array}{l} t\in\tree(A) \\ P(t) \\ 
1179 
f\in\forest(A) \\ Q(f) 

1180 
\end{array} 

1181 
\right]_{t,f}} } 

1182 
\] 

1183 
As mentioned above, the package does not define a structural recursion 

1184 
operator. I have described elsewhere how this is done 

1185 
\cite[\S4.5]{paulsonsetII}. 

1186 

1187 
Both forest constructors have the form $\Inr(\cdots)$, 

1188 
while the tree constructor has the form $\Inl(\cdots)$. This pattern would 

1189 
hold regardless of how many tree or forest constructors there were. 

1190 
\begin{eqnarray*} 

1191 
\Tcons(a,l) & = & \Inl(\pair{a,l}) \\ 

1192 
\Fnil & = & \Inr(\Inl(\emptyset)) \\ 

1193 
\Fcons(a,l) & = & \Inr(\Inr(\pair{a,l})) 

1194 
\end{eqnarray*} 

1195 
There is only one case operator; it works on the union of the trees and 

1196 
forests: 

1197 
\begin{eqnarray*} 

1198 
{\tt tree\_forest\_case}(f,c,g) & \equiv & 

1199 
\case(\split(f),\, \case(\lambda u.c, \split(g))) 

1200 
\end{eqnarray*} 

1201 

1202 
\begin{figure} 

1203 
\begin{small} 

1204 
\begin{verbatim} 

1205 
structure Data = Datatype_Fun 

1206 
(val thy = Univ.thy; 

1207 
val rec_specs = 

1208 
[("data", "univ(A Un B)", 

1209 
[(["Con0"], "i"), 

1210 
(["Con1"], "i=>i"), 

1211 
(["Con2"], "[i,i]=>i"), 

1212 
(["Con3"], "[i,i,i]=>i")])]; 

1213 
val rec_styp = "[i,i]=>i"; 

130  1214 
val ext = None; 
103  1215 
val sintrs = 
1216 
["Con0 : data(A,B)", 

1217 
"[ a: A ] ==> Con1(a) : data(A,B)", 

1218 
"[ a: A; b: B ] ==> Con2(a,b) : data(A,B)", 

1219 
"[ a: A; b: B; d: data(A,B) ] ==> Con3(a,b,d) : data(A,B)"]; 

1220 
val monos = []; 

130  1221 
val type_intrs = datatype_intrs; 
103  1222 
val type_elims = datatype_elims); 
1223 
\end{verbatim} 

1224 
\end{small} 

1225 
\hrule 

1226 
\caption{Defining the fourconstructor sample datatype} \label{datafig} 

1227 
\end{figure} 

1228 

1229 
\subsection{A fourconstructor datatype} 

1230 
Finally let us consider a fairly general datatype. It has four 

130  1231 
constructors $\Con_0$, \ldots, $\Con_3$, with the 
103  1232 
corresponding arities. Figure~\ref{datafig} presents the ML definition. 
1233 
Because this datatype has two set parameters, $A$ and~$B$, it specifies 

1234 
$\univ(A\un B)$ as its domain. The structural induction rule has four 

1235 
minor premises, one per constructor: 

1236 
\[ \infer{P(x)}{x\in\data(A,B) & 

1237 
P(\Con_0) & 

1238 
\infer*{P(\Con_1(a))}{[a\in A]_a} & 

1239 
\infer*{P(\Con_2(a,b))} 

1240 
{\left[\begin{array}{l} a\in A \\ b\in B \end{array} 

1241 
\right]_{a,b}} & 

1242 
\infer*{P(\Con_3(a,b,d))} 

1243 
{\left[\begin{array}{l} a\in A \\ b\in B \\ 

1244 
d\in\data(A,B) \\ P(d) 

1245 
\end{array} 

1246 
\right]_{a,b,d}} } 

1247 
\] 

1248 

1249 
The constructor definitions are 

1250 
\begin{eqnarray*} 

1251 
\Con_0 & = & \Inl(\Inl(\emptyset)) \\ 

1252 
\Con_1(a) & = & \Inl(\Inr(a)) \\ 

1253 
\Con_2(a,b) & = & \Inr(\Inl(\pair{a,b})) \\ 

1254 
\Con_3(a,b,c) & = & \Inr(\Inr(\pair{a,b,c})). 

1255 
\end{eqnarray*} 

1256 
The case operator is 

1257 
\begin{eqnarray*} 

1258 
{\tt data\_case}(f_0,f_1,f_2,f_3) & \equiv & 

1259 
\case(\begin{array}[t]{@{}l} 

1260 
\case(\lambda u.f_0,\; f_1),\, \\ 

1261 
\case(\split(f_2),\; \split(\lambda v.\split(f_3(v)))) ) 

1262 
\end{array} 

1263 
\end{eqnarray*} 

1264 
This may look cryptic, but the case equations are trivial to verify. 

1265 

1266 
In the constructor definitions, the injections are balanced. A more naive 

1267 
approach is to define $\Con_3(a,b,c)$ as 

1268 
$\Inr(\Inr(\Inr(\pair{a,b,c})))$; instead, each constructor has two 

1269 
injections. The difference here is small. But the ZF examples include a 

1270 
60element enumeration type, where each constructor has 5 or~6 injections. 

1271 
The naive approach would require 1 to~59 injections; the definitions would be 

1272 
quadratic in size. It is like the difference between the binary and unary 

1273 
numeral systems. 

1274 

130  1275 
The result structure contains the case operator and constructor definitions as 
1276 
the theorem list \verbcon_defs. It contains the case equations, such as 

103  1277 
\begin{eqnarray*} 
1278 
{\tt data\_case}(f_0,f_1,f_2,f_3,\Con_3(a,b,c)) & = & f_3(a,b,c), 

1279 
\end{eqnarray*} 

1280 
as the theorem list \verbcase_eqns. There is one equation per constructor. 

1281 

1282 
\subsection{Proving freeness theorems} 

1283 
There are two kinds of freeness theorems: 

1284 
\begin{itemize} 

1285 
\item {\bf injectiveness} theorems, such as 

1286 
\[ \Con_2(a,b) = \Con_2(a',b') \bimp a=a' \conj b=b' \] 

1287 

1288 
\item {\bf distinctness} theorems, such as 

1289 
\[ \Con_1(a) \not= \Con_2(a',b') \] 

1290 
\end{itemize} 

1291 
Since the number of such theorems is quadratic in the number of constructors, 

1292 
the package does not attempt to prove them all. Instead it returns tools for 

1293 
proving desired theorems  either explicitly or `on the fly' during 

1294 
simplification or classical reasoning. 

1295 

1296 
The theorem list \verbfree_iffs enables the simplifier to perform freeness 

1297 
reasoning. This works by incremental unfolding of constructors that appear in 

1298 
equations. The theorem list contains logical equivalences such as 

1299 
\begin{eqnarray*} 

1300 
\Con_0=c & \bimp & c=\Inl(\Inl(\emptyset)) \\ 

1301 
\Con_1(a)=c & \bimp & c=\Inl(\Inr(a)) \\ 

1302 
& \vdots & \\ 

1303 
\Inl(a)=\Inl(b) & \bimp & a=b \\ 

130  1304 
\Inl(a)=\Inr(b) & \bimp & {\tt False} \\ 
103  1305 
\pair{a,b} = \pair{a',b'} & \bimp & a=a' \conj b=b' 
1306 
\end{eqnarray*} 

1307 
For example, these rewrite $\Con_1(a)=\Con_1(b)$ to $a=b$ in four steps. 

1308 

1309 
The theorem list \verbfree_SEs enables the classical 

1310 
reasoner to perform similar replacements. It consists of elimination rules 

1311 
to replace $\Con_0=c$ by $c=\Inl(\Inl(\emptyset))$, and so forth, in the 

1312 
assumptions. 

1313 

1314 
Such incremental unfolding combines freeness reasoning with other proof 

1315 
steps. It has the unfortunate sideeffect of unfolding definitions of 

1316 
constructors in contexts such as $\exists x.\Con_1(a)=x$, where they should 

1317 
be left alone. Calling the Isabelle tactic {\tt fold\_tac con\_defs} 

1318 
restores the defined constants. 

1319 
\fi %CADE 

1320 

1321 
\section{Conclusions and future work} 

130  1322 
The fixedpoint approach makes it easy to implement a powerful 
1323 
package for inductive and coinductive definitions. It is efficient: it 

103  1324 
processes most definitions in seconds and even a 60constructor datatype 
1325 
requires only two minutes. It is also simple: the package consists of 

1326 
under 1100 lines (35K bytes) of Standard ML code. The first working 

1327 
version took under a week to code. 

1328 

1329 
The approach is not restricted to set theory. It should be suitable for 

1330 
any logic that has some notion of set and the KnasterTarski Theorem. 

1331 
Indeed, Melham's inductive definition package for the HOL 

1332 
system~\cite{camilleri92} implicitly proves this theorem. 

1333 

130  1334 
Datatype and codatatype definitions furthermore require a particular set 
103  1335 
closed under a suitable notion of ordered pair. I intend to use the 
1336 
Isabelle/ZF package as the basis for a higherorder logic one, using 

1337 
Isabelle/HOL\@. The necessary theory is already 

130  1338 
mechanized~\cite{paulsoncoind}. HOL represents sets by unary predicates; 
103  1339 
defining the corresponding types may cause complication. 
1340 

1341 

1342 
\bibliographystyle{plain} 

1343 
\bibliography{atp,theory,funprog,isabelle} 

1344 
%%%%%\doendnotes 

1345 

1346 
\ifCADE\typeout{****Omitting appendices from CADE version!} 

1347 
\else 

1348 
\newpage 

1349 
\appendix 

130  1350 
\section{Inductive and coinductive definitions: users guide} 
1351 
The ML functors \verbInductive_Fun and \verbCoInductive_Fun build 

1352 
inductive and coinductive definitions, respectively. This section describes 

103  1353 
how to invoke them. 
1354 

1355 
\subsection{The result structure} 

1356 
Many of the result structure's components have been discussed 

1357 
in~\S\ref{basicsec}; others are selfexplanatory. 

1358 
\begin{description} 

1359 
\item[\tt thy] is the new theory containing the recursive sets. 

1360 

1361 
\item[\tt defs] is the list of definitions of the recursive sets. 

1362 

1363 
\item[\tt bnd\_mono] is a monotonicity theorem for the fixedpoint operator. 

1364 

1365 
\item[\tt unfold] is a fixedpoint equation for the recursive set (the union of 

1366 
the recursive sets, in the case of mutual recursion). 

1367 

1368 
\item[\tt dom\_subset] is a theorem stating inclusion in the domain. 

1369 

1370 
\item[\tt intrs] is the list of introduction rules, now proved as theorems, for 

1371 
the recursive sets. 

1372 

1373 
\item[\tt elim] is the elimination rule. 

1374 

1375 
\item[\tt mk\_cases] is a function to create simplified instances of {\tt 

1376 
elim}, using freeness reasoning on some underlying datatype. 

1377 
\end{description} 

1378 

1379 
For an inductive definition, the result structure contains two induction rules, 

130  1380 
{\tt induct} and \verbmutual_induct. For a coinductive definition, it 
1381 
contains the rule \verbcoinduct. 

1382 

1383 
Figure~\ref{defresultfig} summarizes the two result signatures, 

1384 
specifying the types of all these components. 

103  1385 

1386 
\begin{figure} 

1387 
\begin{ttbox} 

1388 
sig 

1389 
val thy : theory 

1390 
val defs : thm list 

1391 
val bnd_mono : thm 

1392 
val unfold : thm 

1393 
val dom_subset : thm 

1394 
val intrs : thm list 

1395 
val elim : thm 

1396 
val mk_cases : thm list > string > thm 

1397 
{\it(Inductive definitions only)} 

1398 
val induct : thm 

1399 
val mutual_induct: thm 

130  1400 
{\it(Coinductive definitions only)} 
1401 
val coinduct : thm 

103  1402 
end 
1403 
\end{ttbox} 

1404 
\hrule 

130  1405 
\caption{The result of a (co)inductive definition} \label{defresultfig} 
103  1406 

130  1407 
\medskip 
103  1408 
\begin{ttbox} 
1409 
sig 

1410 
val thy : theory 

1411 
val rec_doms : (string*string) list 

1412 
val sintrs : string list 

1413 
val monos : thm list 

1414 
val con_defs : thm list 

1415 
val type_intrs : thm list 

1416 
val type_elims : thm list 

1417 
end 

1418 
\end{ttbox} 

1419 
\hrule 

130  1420 
\caption{The argument of a (co)inductive definition} \label{defargfig} 
103  1421 
\end{figure} 
1422 

1423 
\subsection{The argument structure} 

130  1424 
Both \verbInductive_Fun and \verbCoInductive_Fun take the same argument 
103  1425 
structure (Figure~\ref{defargfig}). Its components are as follows: 
1426 
\begin{description} 

1427 
\item[\tt thy] is the definition's parent theory, which {\it must\/} 

1428 
declare constants for the recursive sets. 

1429 

1430 
\item[\tt rec\_doms] is a list of pairs, associating the name of each recursive 

1431 
set with its domain. 

1432 

1433 
\item[\tt sintrs] specifies the desired introduction rules as strings. 

1434 

1435 
\item[\tt monos] consists of monotonicity theorems for each operator applied 

1436 
to a recursive set in the introduction rules. 

1437 

1438 
\item[\tt con\_defs] contains definitions of constants appearing in the 

130  1439 
introduction rules. The (co)datatype package supplies the constructors' 
103  1440 
definitions here. Most direct calls of \verbInductive_Fun or 
130  1441 
\verbCoInductive_Fun pass the empty list; one exception is the primitive 
103  1442 
recursive functions example (\S\ref{primrecsec}). 
1443 

1444 
\item[\tt type\_intrs] consists of introduction rules for typechecking the 

1445 
definition, as discussed in~\S\ref{basicsec}. They are applied using 

1446 
depthfirst search; you can trace the proof by setting 

1447 
\verbtrace_DEPTH_FIRST := true. 

1448 

1449 
\item[\tt type\_elims] consists of elimination rules for typechecking the 

1450 
definition. They are presumed to be `safe' and are applied as much as 

1451 
possible, prior to the {\tt type\_intrs} search. 

1452 
\end{description} 

1453 
The package has a few notable restrictions: 

1454 
\begin{itemize} 

1455 
\item The parent theory, {\tt thy}, must declare the recursive sets as 

1456 
constants. You can extend a theory with new constants using {\tt 

1457 
addconsts}, as illustrated in~\S\ref{indegsec}. If the inductive 

1458 
definition also requires new concrete syntax, then it is simpler to 

1459 
express the parent theory using a theory file. It is often convenient to 

1460 
define an infix syntax for relations, say $a\prec b$ for $\pair{a,b}\in 

1461 
R$. 

1462 

1463 
\item The names of the recursive sets must be identifiers, not infix 

1464 
operators. 

1465 

1466 
\item Sideconditions must not be conjunctions. However, an introduction rule 

1467 
may contain any number of sideconditions. 

1468 
\end{itemize} 

1469 

1470 

130  1471 
\section{Datatype and codatatype definitions: users guide} 
1472 
The ML functors \verbDatatype_Fun and \verbCoDatatype_Fun define datatypes 

1473 
and codatatypes, invoking \verbDatatype_Fun and 

1474 
\verbCoDatatype_Fun to make the underlying (co)inductive definitions. 

103  1475 

1476 

1477 
\subsection{The result structure} 

130  1478 
The result structure extends that of (co)inductive definitions 
103  1479 
(Figure~\ref{defresultfig}) with several additional items: 
1480 
\begin{ttbox} 

1481 
val con_thy : theory 

1482 
val con_defs : thm list 

1483 
val case_eqns : thm list 

1484 
val free_iffs : thm list 

1485 
val free_SEs : thm list 

1486 
val mk_free : string > thm 

1487 
\end{ttbox} 

1488 
Most of these have been discussed in~\S\ref{datasec}. Here is a summary: 

1489 
\begin{description} 

1490 
\item[\tt con\_thy] is a new theory containing definitions of the 

130  1491 
(co)datatype's constructors and case operator. It also declares the 
103  1492 
recursive sets as constants, so that it may serve as the parent 
130  1493 
theory for the (co)inductive definition. 
103  1494 

1495 
\item[\tt con\_defs] is a list of definitions: the case operator followed by 

1496 
the constructors. This theorem list can be supplied to \verbmk_cases, for 

1497 
example. 

1498 

1499 
\item[\tt case\_eqns] is a list of equations, stating that the case operator 

1500 
inverts each constructor. 

1501 

1502 
\item[\tt free\_iffs] is a list of logical equivalences to perform freeness 

1503 
reasoning by rewriting. A typical application has the form 

1504 
\begin{ttbox} 

1505 
by (asm_simp_tac (ZF_ss addsimps free_iffs) 1); 

1506 
\end{ttbox} 

1507 

1508 
\item[\tt free\_SEs] is a list of `safe' elimination rules to perform freeness 

1509 
reasoning. It can be supplied to \verberesolve_tac or to the classical 

1510 
reasoner: 

1511 
\begin{ttbox} 

1512 
by (fast_tac (ZF_cs addSEs free_SEs) 1); 

1513 
\end{ttbox} 

1514 

1515 
\item[\tt mk\_free] is a function to prove freeness properties, specified as 

1516 
strings. The theorems can be expressed in various forms, such as logical 

1517 
equivalences or elimination rules. 

1518 
\end{description} 

1519 

1520 
The result structure also inherits everything from the underlying 

130  1521 
(co)inductive definition, such as the introduction rules, elimination rule, 
179  1522 
and (co)induction rule. 
103  1523 

1524 

1525 
\begin{figure} 

1526 
\begin{ttbox} 

1527 
sig 

1528 
val thy : theory 

1529 
val rec_specs : (string * string * (string list*string)list) list 

1530 
val rec_styp : string 

1531 
val ext : Syntax.sext option 

1532 
val sintrs : string list 

1533 
val monos : thm list 

1534 
val type_intrs: thm list 

1535 
val type_elims: thm list 

1536 
end 

1537 
\end{ttbox} 

1538 
\hrule 

130  1539 
\caption{The argument of a (co)datatype definition} \label{dataargfig} 
103  1540 
\end{figure} 
1541 

1542 
\subsection{The argument structure} 

130
c035b6b9eafc
Many edits suggested by Grundy & Thompson
lc 