author | blanchet |
Mon, 31 Jan 2022 16:09:23 +0100 | |
changeset 75020 | b087610592b4 |
parent 69597 | ff784d5a5bfb |
child 76987 | 4c275405faae |
permissions | -rw-r--r-- |
63026 | 1 |
theory Typeclass_Hierarchy |
69422
472af2d7835d
clarified session dependencies: faster build_doc/build_release;
wenzelm
parents:
68484
diff
changeset
|
2 |
imports Setup |
63026 | 3 |
begin |
4 |
||
5 |
section \<open>Introduction\<close> |
|
6 |
||
7 |
text \<open> |
|
8 |
The {Isabelle/HOL} type-class hierarchy entered the stage |
|
63555 | 9 |
in a quite ancient era -- first related \<^emph>\<open>NEWS\<close> entries date |
63026 | 10 |
back to release {Isabelle99-1}. Since then, there have been |
11 |
ongoing modifications and additions, leading to ({Isabelle2016}) |
|
12 |
more than 180 type-classes. This sheer complexity makes access |
|
13 |
and understanding of that type-class hierarchy difficult and |
|
14 |
involved, let alone maintenance. |
|
15 |
||
16 |
The purpose of this primer is to shed some light on this, |
|
17 |
not only on the mere ingredients, but also on the design |
|
18 |
principles which have evolved and proven useful over time. |
|
19 |
\<close> |
|
20 |
||
21 |
section \<open>Foundations\<close> |
|
22 |
||
23 |
subsection \<open>Locales and type classes\<close> |
|
24 |
||
25 |
text \<open> |
|
26 |
Some familiarity with the Isabelle module system is assumed: |
|
27 |
defining locales and type-classes, interpreting locales and |
|
28 |
instantiating type-classes, adding relationships between |
|
29 |
locales (@{command sublocale}) and type-classes |
|
30 |
(@{command subclass}). Handy introductions are the |
|
67299 | 31 |
respective tutorials @{cite "isabelle-locale"} |
32 |
@{cite "isabelle-classes"}. |
|
63026 | 33 |
\<close> |
34 |
||
35 |
subsection \<open>Strengths and restrictions of type classes\<close> |
|
36 |
||
37 |
text \<open> |
|
38 |
The primary motivation for using type classes in {Isabelle/HOL} |
|
39 |
always have been numerical types, which form an inclusion chain: |
|
40 |
||
41 |
\begin{center} |
|
69597 | 42 |
\<^typ>\<open>nat\<close> \<open>\<sqsubset>\<close> \<^typ>\<open>int\<close> \<open>\<sqsubset>\<close> \<^typ>\<open>rat\<close> |
43 |
\<open>\<sqsubset>\<close> \<^typ>\<open>real\<close> \<open>\<sqsubset>\<close> \<^typ>\<open>complex\<close> |
|
63026 | 44 |
\end{center} |
45 |
||
69505 | 46 |
\noindent The inclusion \<open>\<sqsubset>\<close> means that any value of the numerical |
63026 | 47 |
type to the left hand side mathematically can be transferred |
48 |
to the numerical type on the right hand side. |
|
49 |
||
50 |
How to accomplish this given the quite restrictive type system |
|
67299 | 51 |
of {Isabelle/HOL}? Paulson @{cite "paulson-numerical"} explains |
63026 | 52 |
that each numerical type has some characteristic properties |
69505 | 53 |
which define an characteristic algebraic structure; \<open>\<sqsubset>\<close> |
63026 | 54 |
then corresponds to specialization of the corresponding |
55 |
characteristic algebraic structures. These algebraic structures |
|
56 |
are expressed using algebraic type classes and embeddings |
|
57 |
of numerical types into them: |
|
58 |
||
59 |
\begin{center}\begin{tabular}{lccc} |
|
69597 | 60 |
\<^term>\<open>of_nat\<close> \<open>::\<close> & \<^typ>\<open>nat\<close> & \<open>\<Rightarrow>\<close> & @{typ [source] "'a::semiring_1"} \\ |
69505 | 61 |
& \<open>\<sqinter>\<close> & & \<open>\<up>\<close> \\ |
69597 | 62 |
\<^term>\<open>of_int\<close> \<open>::\<close> & \<^typ>\<open>int\<close> & \<open>\<Rightarrow>\<close> & @{typ [source] "'a::ring_1"} \\ |
69505 | 63 |
& \<open>\<sqinter>\<close> & & \<open>\<up>\<close> \\ |
69597 | 64 |
\<^term>\<open>of_rat\<close> \<open>::\<close> & \<^typ>\<open>rat\<close> & \<open>\<Rightarrow>\<close> & @{typ [source] "'a::field_char_0"} \\ |
69505 | 65 |
& \<open>\<sqinter>\<close> & & \<open>\<up>\<close> \\ |
69597 | 66 |
\<^term>\<open>of_real\<close> \<open>::\<close> & \<^typ>\<open>real\<close> & \<open>\<Rightarrow>\<close> & @{typ [source] "'a::real_algebra_1"} \\ |
69505 | 67 |
& \<open>\<sqinter>\<close> \\ |
69597 | 68 |
& \<^typ>\<open>complex\<close> |
63026 | 69 |
\end{tabular}\end{center} |
70 |
||
69505 | 71 |
\noindent \<open>d \<leftarrow> c\<close> means that \<open>c\<close> is subclass of \<open>d\<close>. |
72 |
Hence each characteristic embedding \<open>of_num\<close> can transform |
|
73 |
any value of type \<open>num\<close> to any numerical type further |
|
63026 | 74 |
up in the inclusion chain. |
75 |
||
76 |
This canonical example exhibits key strengths of type classes: |
|
77 |
||
78 |
\<^item> Sharing of operations and facts among different |
|
79 |
types, hence also sharing of notation and names: there |
|
69505 | 80 |
is only one plus operation using infix syntax \<open>+\<close>, |
81 |
only one zero written \<open>0\<close>, and neutrality |
|
63555 | 82 |
(@{thm (frugal_sorts) add_0_left [all, no_vars]} and |
83 |
@{thm (frugal_sorts) add_0_right [all, no_vars]}) |
|
63026 | 84 |
is referred to |
85 |
uniformly by names @{fact add_0_left} and @{fact add_0_right}. |
|
86 |
||
87 |
\<^item> Proof tool setups are shared implicitly: |
|
88 |
@{fact add_0} and @{fact add_0_right} are simplification |
|
89 |
rules by default. |
|
90 |
||
91 |
\<^item> Hence existing proofs about particular numerical |
|
92 |
types are often easy to generalize to algebraic structures, |
|
93 |
given that they do not depend on specific properties of |
|
94 |
those numerical types. |
|
95 |
||
96 |
Considerable restrictions include: |
|
97 |
||
98 |
\<^item> Type class operations are restricted to one |
|
99 |
type parameter; this is insufficient e.g. for expressing |
|
100 |
a unified power operation adequately (see \secref{sec:power}). |
|
101 |
||
102 |
\<^item> Parameters are fixed over the whole type class |
|
103 |
hierarchy and cannot be refined in specific situations: |
|
69597 | 104 |
think of integral domains with a predicate \<^term>\<open>is_unit\<close>; |
63026 | 105 |
for natural numbers, this degenerates to the much simpler |
106 |
@{term [source] "HOL.equal (1::nat)"} but facts |
|
69597 | 107 |
refer to \<^term>\<open>is_unit\<close> nonetheless. |
63026 | 108 |
|
109 |
\<^item> Type classes are not apt for meta-theory. There |
|
110 |
is no practically usable way to express that the units |
|
111 |
of an integral domain form a multiplicative group using |
|
69505 | 112 |
type classes. But see session \<open>HOL-Algebra\<close> |
63026 | 113 |
which provides locales with an explicit carrier. |
114 |
\<close> |
|
115 |
||
116 |
||
117 |
subsection \<open>Navigating the hierarchy\<close> |
|
118 |
||
119 |
text \<open> |
|
120 |
An indispensable tool to inspect the class hierarchy is |
|
121 |
@{command class_deps} which displays the graph of classes, |
|
122 |
optionally showing the logical content for each class also. |
|
123 |
Optional parameters restrict the graph to a particular segment |
|
124 |
which is useful to get a graspable view. See |
|
67299 | 125 |
the Isar reference manual @{cite "isabelle-isar-ref"} for details. |
63026 | 126 |
\<close> |
127 |
||
128 |
||
129 |
section \<open>The hierarchy\<close> |
|
130 |
||
63555 | 131 |
subsection \<open>Syntactic type classes \label{sec:syntactic-type-class}\<close> |
63026 | 132 |
|
133 |
text \<open> |
|
134 |
At the top of the hierarchy there are a couple of syntactic type |
|
135 |
classes, ie. classes with operations but with no axioms, |
|
136 |
most notably: |
|
137 |
||
69597 | 138 |
\<^item> @{command class} \<^class>\<open>plus\<close> with @{term [source] "(a::'a::plus) + b"} |
63026 | 139 |
|
69597 | 140 |
\<^item> @{command class} \<^class>\<open>zero\<close> with @{term [source] "0::'a::zero"} |
63026 | 141 |
|
69597 | 142 |
\<^item> @{command class} \<^class>\<open>times\<close> with @{term [source] "(a::'a::times) * b"} |
63026 | 143 |
|
69597 | 144 |
\<^item> @{command class} \<^class>\<open>one\<close> with @{term [source] "1::'a::one"} |
63026 | 145 |
|
146 |
\noindent Before the introduction of the @{command class} statement in |
|
67299 | 147 |
Isabelle @{cite "Haftmann-Wenzel:2006:classes"} it was impossible |
63026 | 148 |
to define operations with associated axioms in the same class, |
149 |
hence there were always pairs of syntactic and logical type |
|
150 |
classes. This restriction is lifted nowadays, but there are |
|
151 |
still reasons to maintain syntactic type classes: |
|
152 |
||
153 |
\<^item> Syntactic type classes allow generic notation to be used |
|
154 |
regardless of a particular logical interpretation; e.g. |
|
69505 | 155 |
although multiplication \<open>*\<close> is usually associative, |
63026 | 156 |
there are examples where it is not (e.g. octonions), and |
69505 | 157 |
leaving \<open>*\<close> without axioms allows to re-use this |
63026 | 158 |
syntax by means of type class instantiation also for such |
159 |
exotic examples. |
|
160 |
||
161 |
\<^item> Type classes might share operations but not necessarily |
|
69597 | 162 |
axioms on them, e.g. \<^term>\<open>gcd\<close> (see \secref{sec:gcd}). |
63026 | 163 |
Hence their common base is a syntactic type class. |
164 |
||
165 |
\noindent However syntactic type classes should only be used with striking |
|
166 |
cause. Otherwise there is risk for confusion if the notation |
|
167 |
suggests properties which do not hold without particular |
|
168 |
constraints. This can be illustrated using numerals |
|
169 |
(see \secref{sec:numerals}): @{lemma "2 + 2 = 4" by simp} is |
|
170 |
provable without further ado, and this also meets the typical |
|
171 |
expectation towards a numeral notation; in more ancient releases |
|
69597 | 172 |
numerals were purely syntactic and \<^prop>\<open>2 + 2 = 4\<close> was |
63026 | 173 |
not provable without particular type constraints. |
174 |
\<close> |
|
175 |
||
63555 | 176 |
|
177 |
subsection \<open>Additive and multiplicative semigroups and monoids\<close> |
|
178 |
||
179 |
text \<open> |
|
180 |
In common literature, notation for semigroups and monoids |
|
69505 | 181 |
is either multiplicative \<open>(*, 1)\<close> or additive |
182 |
\<open>(+, 0)\<close> with underlying properties isomorphic. |
|
63555 | 183 |
In {Isabelle/HOL}, this is accomplished using the following |
184 |
abstract setup: |
|
185 |
||
69597 | 186 |
\<^item> A \<^locale>\<open>semigroup\<close> introduces an abstract binary |
63555 | 187 |
associative operation. |
188 |
||
69597 | 189 |
\<^item> A \<^locale>\<open>monoid\<close> is an extension of \<^locale>\<open>semigroup\<close> |
63555 | 190 |
with a neutral element. |
191 |
||
69597 | 192 |
\<^item> Both \<^locale>\<open>semigroup\<close> and \<^locale>\<open>monoid\<close> provide |
69505 | 193 |
dedicated syntax for their operations \<open>(\<^bold>*, \<^bold>1)\<close>. |
63555 | 194 |
This syntax is not visible on the global theory level |
195 |
but only for abstract reasoning inside the respective |
|
196 |
locale. |
|
197 |
||
198 |
\<^item> Concrete global syntax is added building on existing |
|
199 |
syntactic type classes \secref{sec:syntactic-type-class} |
|
200 |
using the following classes: |
|
201 |
||
69597 | 202 |
\<^item> @{command class} \<^class>\<open>semigroup_mult\<close> = \<^class>\<open>times\<close> |
63555 | 203 |
|
69597 | 204 |
\<^item> @{command class} \<^class>\<open>monoid_mult\<close> = \<^class>\<open>one\<close> + \<^class>\<open>semigroup_mult\<close> |
63555 | 205 |
|
69597 | 206 |
\<^item> @{command class} \<^class>\<open>semigroup_add\<close> = \<^class>\<open>plus\<close> |
63555 | 207 |
|
69597 | 208 |
\<^item> @{command class} \<^class>\<open>monoid_add\<close> = \<^class>\<open>zero\<close> + \<^class>\<open>semigroup_add\<close> |
63555 | 209 |
|
69597 | 210 |
Locales \<^locale>\<open>semigroup\<close> and \<^locale>\<open>monoid\<close> are |
63555 | 211 |
interpreted (using @{command sublocale}) into their |
69505 | 212 |
corresponding type classes, with prefixes \<open>add\<close> |
69597 | 213 |
and \<open>mult\<close>; hence facts derived in \<^locale>\<open>semigroup\<close> |
214 |
and \<^locale>\<open>monoid\<close> are propagated simultaneously to |
|
63555 | 215 |
\<^emph>\<open>both\<close> using a consistent naming policy, ie. |
216 |
||
217 |
\<^item> @{fact semigroup.assoc}: @{thm (frugal_sorts) semigroup.assoc [all, no_vars]} |
|
218 |
||
219 |
\<^item> @{fact mult.assoc}: @{thm (frugal_sorts) mult.assoc [all, no_vars]} |
|
220 |
||
221 |
\<^item> @{fact add.assoc}: @{thm (frugal_sorts) add.assoc [all, no_vars]} |
|
222 |
||
223 |
\<^item> @{fact monoid.right_neutral}: @{thm (frugal_sorts) monoid.right_neutral [all, no_vars]} |
|
224 |
||
225 |
\<^item> @{fact mult.right_neutral}: @{thm (frugal_sorts) mult.right_neutral [all, no_vars]} |
|
226 |
||
227 |
\<^item> @{fact add.right_neutral}: @{thm (frugal_sorts) add.right_neutral [all, no_vars]} |
|
228 |
||
69597 | 229 |
\<^item> Note that the syntax in \<^locale>\<open>semigroup\<close> and \<^locale>\<open>monoid\<close> |
63555 | 230 |
is bold; this avoids clashes when writing properties |
231 |
inside one of these locales in presence of that global |
|
232 |
concrete type class syntax. |
|
233 |
||
234 |
\noindent That hierarchy extends in a straightforward manner |
|
235 |
to abelian semigroups and commutative monoids\footnote{The |
|
236 |
designation \<^emph>\<open>abelian\<close> is quite standard concerning |
|
237 |
(semi)groups, but not for monoids}: |
|
238 |
||
69597 | 239 |
\<^item> Locales \<^locale>\<open>abel_semigroup\<close> and \<^locale>\<open>comm_monoid\<close> |
63555 | 240 |
add commutativity as property. |
241 |
||
242 |
\<^item> Concrete syntax emerges through |
|
243 |
||
69597 | 244 |
\<^item> @{command class} \<^class>\<open>ab_semigroup_add\<close> = \<^class>\<open>semigroup_add\<close> |
63555 | 245 |
|
69597 | 246 |
\<^item> @{command class} \<^class>\<open>ab_semigroup_mult\<close> = \<^class>\<open>semigroup_mult\<close> |
63555 | 247 |
|
69597 | 248 |
\<^item> @{command class} \<^class>\<open>comm_monoid_add\<close> = \<^class>\<open>zero\<close> + \<^class>\<open>ab_semigroup_add\<close> |
63555 | 249 |
|
69597 | 250 |
\<^item> @{command class} \<^class>\<open>comm_monoid_mult\<close> = \<^class>\<open>one\<close> + \<^class>\<open>ab_semigroup_mult\<close> |
63555 | 251 |
|
252 |
and corresponding interpretation of the locales above, yielding |
|
253 |
||
254 |
\<^item> @{fact abel_semigroup.commute}: @{thm (frugal_sorts) abel_semigroup.commute [all, no_vars]} |
|
255 |
||
256 |
\<^item> @{fact mult.commute}: @{thm (frugal_sorts) mult.commute [all, no_vars]} |
|
257 |
||
258 |
\<^item> @{fact add.commute}: @{thm (frugal_sorts) add.commute [all, no_vars]} |
|
259 |
\<close> |
|
260 |
||
261 |
paragraph \<open>Named collection of theorems\<close> |
|
262 |
||
263 |
text \<open> |
|
264 |
Locale interpretation interacts smoothly with named collections of |
|
265 |
theorems as introduced by command @{command named_theorems}. In our |
|
266 |
example, rules concerning associativity and commutativity are no |
|
267 |
simplification rules by default since they desired orientation may |
|
268 |
vary depending on the situation. However, there is a collection |
|
269 |
@{fact ac_simps} where facts @{fact abel_semigroup.assoc}, |
|
270 |
@{fact abel_semigroup.commute} and @{fact abel_semigroup.left_commute} |
|
271 |
are declared as members. Due to interpretation, also |
|
272 |
@{fact mult.assoc}, @{fact mult.commute} and @{fact mult.left_commute} |
|
273 |
are also members of @{fact ac_simps}, as any corresponding facts |
|
69597 | 274 |
stemming from interpretation of \<^locale>\<open>abel_semigroup\<close>. |
63555 | 275 |
Hence adding @{fact ac_simps} to the simplification rules for |
276 |
a single method call uses all associativity and commutativity |
|
277 |
rules known by means of interpretation. |
|
278 |
\<close> |
|
279 |
||
280 |
||
281 |
subsection \<open>Additive and multiplicative groups\<close> |
|
282 |
||
283 |
text \<open> |
|
284 |
The hierarchy for inverse group operations takes into account |
|
285 |
that there are weaker algebraic structures with only a partially |
|
286 |
inverse operation. E. g. the natural numbers have bounded |
|
69597 | 287 |
subtraction \<^term>\<open>m - (n::nat)\<close> which is only an inverse |
288 |
operation if \<^term>\<open>m \<ge> (n::nat)\<close>; unary minus \<open>-\<close> |
|
63555 | 289 |
is pointless on the natural numbers. |
290 |
||
291 |
Hence for both additive and multiplicative notation there |
|
292 |
are syntactic classes for inverse operations, both unary |
|
293 |
and binary: |
|
294 |
||
69597 | 295 |
\<^item> @{command class} \<^class>\<open>minus\<close> with @{term [source] "(a::'a::minus) - b"} |
63555 | 296 |
|
69597 | 297 |
\<^item> @{command class} \<^class>\<open>uminus\<close> with @{term [source] "- a::'a::uminus"} |
63555 | 298 |
|
69597 | 299 |
\<^item> @{command class} \<^class>\<open>divide\<close> with @{term [source] "(a::'a::divide) div b"} |
63555 | 300 |
|
69597 | 301 |
\<^item> @{command class} \<^class>\<open>inverse\<close> = \<^class>\<open>divide\<close> with @{term [source] "inverse a::'a::inverse"} |
63555 | 302 |
\\ and @{term [source] "(a::'a::inverse) / b"} |
303 |
||
69597 | 304 |
\noindent Here \<^class>\<open>inverse\<close> specializes the ``partial'' syntax |
63555 | 305 |
@{term [source] "a div b"} to the more specific |
306 |
@{term [source] "a / b"}. |
|
307 |
||
308 |
Semantic properties are added by |
|
309 |
||
69597 | 310 |
\<^item> @{command class} \<^class>\<open>cancel_ab_semigroup_add\<close> = \<^class>\<open>ab_semigroup_add\<close> + \<^class>\<open>minus\<close> |
63555 | 311 |
|
69597 | 312 |
\<^item> @{command class} \<^class>\<open>cancel_comm_monoid_add\<close> = \<^class>\<open>cancel_ab_semigroup_add\<close> + \<^class>\<open>comm_monoid_add\<close> |
63555 | 313 |
|
314 |
\noindent which specify a minimal binary partially inverse operation as |
|
315 |
||
316 |
\<^item> @{fact add_diff_cancel_left'}: @{thm (frugal_sorts) add_diff_cancel_left' [all, no_vars]} |
|
317 |
||
318 |
\<^item> @{fact diff_diff_add}: @{thm (frugal_sorts) diff_diff_add [all, no_vars]} |
|
319 |
||
320 |
\noindent which in turn allow to derive facts like |
|
321 |
||
322 |
\<^item> @{fact add_left_imp_eq}: @{thm (frugal_sorts) add_left_imp_eq [all, no_vars]} |
|
323 |
||
324 |
\noindent The total inverse operation is established as follows: |
|
325 |
||
69597 | 326 |
\<^item> Locale \<^locale>\<open>group\<close> extends the abstract hierarchy with |
63555 | 327 |
the inverse operation. |
328 |
||
329 |
\<^item> The concrete additive inverse operation emerges through |
|
330 |
||
69597 | 331 |
\<^item> @{command class} \<^class>\<open>group_add\<close> = \<^class>\<open>minus\<close> + \<^class>\<open>uminus\<close> + \<^class>\<open>monoid_add\<close> (in \<^theory>\<open>HOL.Groups\<close>) \\ |
63555 | 332 |
|
69597 | 333 |
\<^item> @{command class} \<^class>\<open>ab_group_add\<close> = \<^class>\<open>minus\<close> + \<^class>\<open>uminus\<close> + \<^class>\<open>comm_monoid_add\<close> (in \<^theory>\<open>HOL.Groups\<close>) |
63555 | 334 |
|
69597 | 335 |
and corresponding interpretation of locale \<^locale>\<open>group\<close>, yielding e.g. |
63555 | 336 |
|
337 |
\<^item> @{fact group.left_inverse}: @{thm (frugal_sorts) group.left_inverse [all, no_vars]} |
|
338 |
||
339 |
\<^item> @{fact add.left_inverse}: @{thm (frugal_sorts) add.left_inverse [all, no_vars]} |
|
340 |
||
341 |
\noindent There is no multiplicative counterpart. Why? In rings, |
|
342 |
the multiplicative group excludes the zero element, hence |
|
343 |
the inverse operation is not total. See further \secref{sec:rings}. |
|
344 |
\<close> |
|
345 |
||
346 |
paragraph \<open>Mitigating against redundancy by default simplification rules\<close> |
|
347 |
||
348 |
text \<open> |
|
349 |
Inverse operations imposes some redundancy on the type class |
|
350 |
hierarchy: in a group with a total inverse operation, the |
|
351 |
unary operation is simpler and more primitive than the binary |
|
352 |
one; but we cannot eliminate the binary one in favour of |
|
353 |
a mere syntactic abbreviation since the binary one is vital |
|
354 |
to express a partial inverse operation. |
|
355 |
||
356 |
This is mitigated by providing suitable default simplification |
|
357 |
rules: expression involving the unary inverse operation are |
|
358 |
simplified to binary inverse operation whenever appropriate. |
|
359 |
The rationale is that simplification is a central device in |
|
360 |
explorative proving, where proof obligation remaining after certain |
|
361 |
default proof steps including simplification are inspected |
|
362 |
to get an idea what is missing to finish a proof. When |
|
363 |
preferable normal forms are encoded into |
|
364 |
default simplification rules, proof obligations after simplification |
|
365 |
are normalized and hence more proof-friendly. |
|
366 |
\<close> |
|
367 |
||
63026 | 368 |
end |