author | wenzelm |
Fri, 22 Jun 2018 20:31:49 +0200 | |
changeset 68484 | 59793df7f853 |
parent 67299 | ba52a058942f |
child 69422 | 472af2d7835d |
permissions | -rw-r--r-- |
63026 | 1 |
theory Typeclass_Hierarchy |
66453
cc19f7ca2ed6
session-qualified theory imports: isabelle imports -U -i -d '~~/src/Benchmarks' -a;
wenzelm
parents:
63555
diff
changeset
|
2 |
imports Typeclass_Hierarchy_Basics.Setup |
63026 | 3 |
begin |
4 |
||
5 |
section \<open>Introduction\<close> |
|
6 |
||
7 |
text \<open> |
|
8 |
The {Isabelle/HOL} type-class hierarchy entered the stage |
|
63555 | 9 |
in a quite ancient era -- first related \<^emph>\<open>NEWS\<close> entries date |
63026 | 10 |
back to release {Isabelle99-1}. Since then, there have been |
11 |
ongoing modifications and additions, leading to ({Isabelle2016}) |
|
12 |
more than 180 type-classes. This sheer complexity makes access |
|
13 |
and understanding of that type-class hierarchy difficult and |
|
14 |
involved, let alone maintenance. |
|
15 |
||
16 |
The purpose of this primer is to shed some light on this, |
|
17 |
not only on the mere ingredients, but also on the design |
|
18 |
principles which have evolved and proven useful over time. |
|
19 |
\<close> |
|
20 |
||
21 |
section \<open>Foundations\<close> |
|
22 |
||
23 |
subsection \<open>Locales and type classes\<close> |
|
24 |
||
25 |
text \<open> |
|
26 |
Some familiarity with the Isabelle module system is assumed: |
|
27 |
defining locales and type-classes, interpreting locales and |
|
28 |
instantiating type-classes, adding relationships between |
|
29 |
locales (@{command sublocale}) and type-classes |
|
30 |
(@{command subclass}). Handy introductions are the |
|
67299 | 31 |
respective tutorials @{cite "isabelle-locale"} |
32 |
@{cite "isabelle-classes"}. |
|
63026 | 33 |
\<close> |
34 |
||
35 |
subsection \<open>Strengths and restrictions of type classes\<close> |
|
36 |
||
37 |
text \<open> |
|
38 |
The primary motivation for using type classes in {Isabelle/HOL} |
|
39 |
always have been numerical types, which form an inclusion chain: |
|
40 |
||
41 |
\begin{center} |
|
42 |
@{typ nat} @{text \<sqsubset>} @{typ int} @{text \<sqsubset>} @{typ rat} |
|
43 |
@{text \<sqsubset>} @{typ real} @{text \<sqsubset>} @{typ complex} |
|
44 |
\end{center} |
|
45 |
||
46 |
\noindent The inclusion @{text \<sqsubset>} means that any value of the numerical |
|
47 |
type to the left hand side mathematically can be transferred |
|
48 |
to the numerical type on the right hand side. |
|
49 |
||
50 |
How to accomplish this given the quite restrictive type system |
|
67299 | 51 |
of {Isabelle/HOL}? Paulson @{cite "paulson-numerical"} explains |
63026 | 52 |
that each numerical type has some characteristic properties |
53 |
which define an characteristic algebraic structure; @{text \<sqsubset>} |
|
54 |
then corresponds to specialization of the corresponding |
|
55 |
characteristic algebraic structures. These algebraic structures |
|
56 |
are expressed using algebraic type classes and embeddings |
|
57 |
of numerical types into them: |
|
58 |
||
59 |
\begin{center}\begin{tabular}{lccc} |
|
60 |
@{term of_nat} @{text "::"} & @{typ nat} & @{text \<Rightarrow>} & @{typ [source] "'a::semiring_1"} \\ |
|
61 |
& @{text \<sqinter>} & & @{text \<up>} \\ |
|
62 |
@{term of_int} @{text "::"} & @{typ int} & @{text \<Rightarrow>} & @{typ [source] "'a::ring_1"} \\ |
|
63 |
& @{text \<sqinter>} & & @{text \<up>} \\ |
|
64 |
@{term of_rat} @{text "::"} & @{typ rat} & @{text \<Rightarrow>} & @{typ [source] "'a::field_char_0"} \\ |
|
65 |
& @{text \<sqinter>} & & @{text \<up>} \\ |
|
66 |
@{term of_real} @{text "::"} & @{typ real} & @{text \<Rightarrow>} & @{typ [source] "'a::real_algebra_1"} \\ |
|
67 |
& @{text \<sqinter>} \\ |
|
68 |
& @{typ complex} |
|
69 |
\end{tabular}\end{center} |
|
70 |
||
71 |
\noindent @{text "d \<leftarrow> c"} means that @{text c} is subclass of @{text d}. |
|
72 |
Hence each characteristic embedding @{text of_num} can transform |
|
73 |
any value of type @{text num} to any numerical type further |
|
74 |
up in the inclusion chain. |
|
75 |
||
76 |
This canonical example exhibits key strengths of type classes: |
|
77 |
||
78 |
\<^item> Sharing of operations and facts among different |
|
79 |
types, hence also sharing of notation and names: there |
|
80 |
is only one plus operation using infix syntax @{text "+"}, |
|
81 |
only one zero written @{text 0}, and neutrality |
|
63555 | 82 |
(@{thm (frugal_sorts) add_0_left [all, no_vars]} and |
83 |
@{thm (frugal_sorts) add_0_right [all, no_vars]}) |
|
63026 | 84 |
is referred to |
85 |
uniformly by names @{fact add_0_left} and @{fact add_0_right}. |
|
86 |
||
87 |
\<^item> Proof tool setups are shared implicitly: |
|
88 |
@{fact add_0} and @{fact add_0_right} are simplification |
|
89 |
rules by default. |
|
90 |
||
91 |
\<^item> Hence existing proofs about particular numerical |
|
92 |
types are often easy to generalize to algebraic structures, |
|
93 |
given that they do not depend on specific properties of |
|
94 |
those numerical types. |
|
95 |
||
96 |
Considerable restrictions include: |
|
97 |
||
98 |
\<^item> Type class operations are restricted to one |
|
99 |
type parameter; this is insufficient e.g. for expressing |
|
100 |
a unified power operation adequately (see \secref{sec:power}). |
|
101 |
||
102 |
\<^item> Parameters are fixed over the whole type class |
|
103 |
hierarchy and cannot be refined in specific situations: |
|
104 |
think of integral domains with a predicate @{term is_unit}; |
|
105 |
for natural numbers, this degenerates to the much simpler |
|
106 |
@{term [source] "HOL.equal (1::nat)"} but facts |
|
107 |
refer to @{term is_unit} nonetheless. |
|
108 |
||
109 |
\<^item> Type classes are not apt for meta-theory. There |
|
110 |
is no practically usable way to express that the units |
|
111 |
of an integral domain form a multiplicative group using |
|
112 |
type classes. But see session @{text "HOL-Algebra"} |
|
113 |
which provides locales with an explicit carrier. |
|
114 |
\<close> |
|
115 |
||
116 |
||
117 |
subsection \<open>Navigating the hierarchy\<close> |
|
118 |
||
119 |
text \<open> |
|
120 |
An indispensable tool to inspect the class hierarchy is |
|
121 |
@{command class_deps} which displays the graph of classes, |
|
122 |
optionally showing the logical content for each class also. |
|
123 |
Optional parameters restrict the graph to a particular segment |
|
124 |
which is useful to get a graspable view. See |
|
67299 | 125 |
the Isar reference manual @{cite "isabelle-isar-ref"} for details. |
63026 | 126 |
\<close> |
127 |
||
128 |
||
129 |
section \<open>The hierarchy\<close> |
|
130 |
||
63555 | 131 |
subsection \<open>Syntactic type classes \label{sec:syntactic-type-class}\<close> |
63026 | 132 |
|
133 |
text \<open> |
|
134 |
At the top of the hierarchy there are a couple of syntactic type |
|
135 |
classes, ie. classes with operations but with no axioms, |
|
136 |
most notably: |
|
137 |
||
63555 | 138 |
\<^item> @{command class} @{class plus} with @{term [source] "(a::'a::plus) + b"} |
63026 | 139 |
|
63555 | 140 |
\<^item> @{command class} @{class zero} with @{term [source] "0::'a::zero"} |
63026 | 141 |
|
63555 | 142 |
\<^item> @{command class} @{class times} with @{term [source] "(a::'a::times) * b"} |
63026 | 143 |
|
63555 | 144 |
\<^item> @{command class} @{class one} with @{term [source] "1::'a::one"} |
63026 | 145 |
|
146 |
\noindent Before the introduction of the @{command class} statement in |
|
67299 | 147 |
Isabelle @{cite "Haftmann-Wenzel:2006:classes"} it was impossible |
63026 | 148 |
to define operations with associated axioms in the same class, |
149 |
hence there were always pairs of syntactic and logical type |
|
150 |
classes. This restriction is lifted nowadays, but there are |
|
151 |
still reasons to maintain syntactic type classes: |
|
152 |
||
153 |
\<^item> Syntactic type classes allow generic notation to be used |
|
154 |
regardless of a particular logical interpretation; e.g. |
|
155 |
although multiplication @{text "*"} is usually associative, |
|
156 |
there are examples where it is not (e.g. octonions), and |
|
157 |
leaving @{text "*"} without axioms allows to re-use this |
|
158 |
syntax by means of type class instantiation also for such |
|
159 |
exotic examples. |
|
160 |
||
161 |
\<^item> Type classes might share operations but not necessarily |
|
162 |
axioms on them, e.g. @{term gcd} (see \secref{sec:gcd}). |
|
163 |
Hence their common base is a syntactic type class. |
|
164 |
||
165 |
\noindent However syntactic type classes should only be used with striking |
|
166 |
cause. Otherwise there is risk for confusion if the notation |
|
167 |
suggests properties which do not hold without particular |
|
168 |
constraints. This can be illustrated using numerals |
|
169 |
(see \secref{sec:numerals}): @{lemma "2 + 2 = 4" by simp} is |
|
170 |
provable without further ado, and this also meets the typical |
|
171 |
expectation towards a numeral notation; in more ancient releases |
|
172 |
numerals were purely syntactic and @{prop "2 + 2 = 4"} was |
|
173 |
not provable without particular type constraints. |
|
174 |
\<close> |
|
175 |
||
63555 | 176 |
|
177 |
subsection \<open>Additive and multiplicative semigroups and monoids\<close> |
|
178 |
||
179 |
text \<open> |
|
180 |
In common literature, notation for semigroups and monoids |
|
181 |
is either multiplicative @{text "(*, 1)"} or additive |
|
182 |
@{text "(+, 0)"} with underlying properties isomorphic. |
|
183 |
In {Isabelle/HOL}, this is accomplished using the following |
|
184 |
abstract setup: |
|
185 |
||
186 |
\<^item> A @{locale semigroup} introduces an abstract binary |
|
187 |
associative operation. |
|
188 |
||
189 |
\<^item> A @{locale monoid} is an extension of @{locale semigroup} |
|
190 |
with a neutral element. |
|
191 |
||
192 |
\<^item> Both @{locale semigroup} and @{locale monoid} provide |
|
193 |
dedicated syntax for their operations @{text "(\<^bold>*, \<^bold>1)"}. |
|
194 |
This syntax is not visible on the global theory level |
|
195 |
but only for abstract reasoning inside the respective |
|
196 |
locale. |
|
197 |
||
198 |
\<^item> Concrete global syntax is added building on existing |
|
199 |
syntactic type classes \secref{sec:syntactic-type-class} |
|
200 |
using the following classes: |
|
201 |
||
202 |
\<^item> @{command class} @{class semigroup_mult} = @{class times} |
|
203 |
||
204 |
\<^item> @{command class} @{class monoid_mult} = @{class one} + @{class semigroup_mult} |
|
205 |
||
206 |
\<^item> @{command class} @{class semigroup_add} = @{class plus} |
|
207 |
||
208 |
\<^item> @{command class} @{class monoid_add} = @{class zero} + @{class semigroup_add} |
|
209 |
||
210 |
Locales @{locale semigroup} and @{locale monoid} are |
|
211 |
interpreted (using @{command sublocale}) into their |
|
212 |
corresponding type classes, with prefixes @{text add} |
|
213 |
and @{text mult}; hence facts derived in @{locale semigroup} |
|
214 |
and @{locale monoid} are propagated simultaneously to |
|
215 |
\<^emph>\<open>both\<close> using a consistent naming policy, ie. |
|
216 |
||
217 |
\<^item> @{fact semigroup.assoc}: @{thm (frugal_sorts) semigroup.assoc [all, no_vars]} |
|
218 |
||
219 |
\<^item> @{fact mult.assoc}: @{thm (frugal_sorts) mult.assoc [all, no_vars]} |
|
220 |
||
221 |
\<^item> @{fact add.assoc}: @{thm (frugal_sorts) add.assoc [all, no_vars]} |
|
222 |
||
223 |
\<^item> @{fact monoid.right_neutral}: @{thm (frugal_sorts) monoid.right_neutral [all, no_vars]} |
|
224 |
||
225 |
\<^item> @{fact mult.right_neutral}: @{thm (frugal_sorts) mult.right_neutral [all, no_vars]} |
|
226 |
||
227 |
\<^item> @{fact add.right_neutral}: @{thm (frugal_sorts) add.right_neutral [all, no_vars]} |
|
228 |
||
229 |
\<^item> Note that the syntax in @{locale semigroup} and @{locale monoid} |
|
230 |
is bold; this avoids clashes when writing properties |
|
231 |
inside one of these locales in presence of that global |
|
232 |
concrete type class syntax. |
|
233 |
||
234 |
\noindent That hierarchy extends in a straightforward manner |
|
235 |
to abelian semigroups and commutative monoids\footnote{The |
|
236 |
designation \<^emph>\<open>abelian\<close> is quite standard concerning |
|
237 |
(semi)groups, but not for monoids}: |
|
238 |
||
239 |
\<^item> Locales @{locale abel_semigroup} and @{locale comm_monoid} |
|
240 |
add commutativity as property. |
|
241 |
||
242 |
\<^item> Concrete syntax emerges through |
|
243 |
||
244 |
\<^item> @{command class} @{class ab_semigroup_add} = @{class semigroup_add} |
|
245 |
||
246 |
\<^item> @{command class} @{class ab_semigroup_mult} = @{class semigroup_mult} |
|
247 |
||
248 |
\<^item> @{command class} @{class comm_monoid_add} = @{class zero} + @{class ab_semigroup_add} |
|
249 |
||
250 |
\<^item> @{command class} @{class comm_monoid_mult} = @{class one} + @{class ab_semigroup_mult} |
|
251 |
||
252 |
and corresponding interpretation of the locales above, yielding |
|
253 |
||
254 |
\<^item> @{fact abel_semigroup.commute}: @{thm (frugal_sorts) abel_semigroup.commute [all, no_vars]} |
|
255 |
||
256 |
\<^item> @{fact mult.commute}: @{thm (frugal_sorts) mult.commute [all, no_vars]} |
|
257 |
||
258 |
\<^item> @{fact add.commute}: @{thm (frugal_sorts) add.commute [all, no_vars]} |
|
259 |
\<close> |
|
260 |
||
261 |
paragraph \<open>Named collection of theorems\<close> |
|
262 |
||
263 |
text \<open> |
|
264 |
Locale interpretation interacts smoothly with named collections of |
|
265 |
theorems as introduced by command @{command named_theorems}. In our |
|
266 |
example, rules concerning associativity and commutativity are no |
|
267 |
simplification rules by default since they desired orientation may |
|
268 |
vary depending on the situation. However, there is a collection |
|
269 |
@{fact ac_simps} where facts @{fact abel_semigroup.assoc}, |
|
270 |
@{fact abel_semigroup.commute} and @{fact abel_semigroup.left_commute} |
|
271 |
are declared as members. Due to interpretation, also |
|
272 |
@{fact mult.assoc}, @{fact mult.commute} and @{fact mult.left_commute} |
|
273 |
are also members of @{fact ac_simps}, as any corresponding facts |
|
274 |
stemming from interpretation of @{locale abel_semigroup}. |
|
275 |
Hence adding @{fact ac_simps} to the simplification rules for |
|
276 |
a single method call uses all associativity and commutativity |
|
277 |
rules known by means of interpretation. |
|
278 |
\<close> |
|
279 |
||
280 |
||
281 |
subsection \<open>Additive and multiplicative groups\<close> |
|
282 |
||
283 |
text \<open> |
|
284 |
The hierarchy for inverse group operations takes into account |
|
285 |
that there are weaker algebraic structures with only a partially |
|
286 |
inverse operation. E. g. the natural numbers have bounded |
|
287 |
subtraction @{term "m - (n::nat)"} which is only an inverse |
|
288 |
operation if @{term "m \<ge> (n::nat)"}; unary minus @{text "-"} |
|
289 |
is pointless on the natural numbers. |
|
290 |
||
291 |
Hence for both additive and multiplicative notation there |
|
292 |
are syntactic classes for inverse operations, both unary |
|
293 |
and binary: |
|
294 |
||
295 |
\<^item> @{command class} @{class minus} with @{term [source] "(a::'a::minus) - b"} |
|
296 |
||
297 |
\<^item> @{command class} @{class uminus} with @{term [source] "- a::'a::uminus"} |
|
298 |
||
299 |
\<^item> @{command class} @{class divide} with @{term [source] "(a::'a::divide) div b"} |
|
300 |
||
301 |
\<^item> @{command class} @{class inverse} = @{class divide} with @{term [source] "inverse a::'a::inverse"} |
|
302 |
\\ and @{term [source] "(a::'a::inverse) / b"} |
|
303 |
||
304 |
\noindent Here @{class inverse} specializes the ``partial'' syntax |
|
305 |
@{term [source] "a div b"} to the more specific |
|
306 |
@{term [source] "a / b"}. |
|
307 |
||
308 |
Semantic properties are added by |
|
309 |
||
310 |
\<^item> @{command class} @{class cancel_ab_semigroup_add} = @{class ab_semigroup_add} + @{class minus} |
|
311 |
||
312 |
\<^item> @{command class} @{class cancel_comm_monoid_add} = @{class cancel_ab_semigroup_add} + @{class comm_monoid_add} |
|
313 |
||
314 |
\noindent which specify a minimal binary partially inverse operation as |
|
315 |
||
316 |
\<^item> @{fact add_diff_cancel_left'}: @{thm (frugal_sorts) add_diff_cancel_left' [all, no_vars]} |
|
317 |
||
318 |
\<^item> @{fact diff_diff_add}: @{thm (frugal_sorts) diff_diff_add [all, no_vars]} |
|
319 |
||
320 |
\noindent which in turn allow to derive facts like |
|
321 |
||
322 |
\<^item> @{fact add_left_imp_eq}: @{thm (frugal_sorts) add_left_imp_eq [all, no_vars]} |
|
323 |
||
324 |
\noindent The total inverse operation is established as follows: |
|
325 |
||
326 |
\<^item> Locale @{locale group} extends the abstract hierarchy with |
|
327 |
the inverse operation. |
|
328 |
||
329 |
\<^item> The concrete additive inverse operation emerges through |
|
330 |
||
68484 | 331 |
\<^item> @{command class} @{class group_add} = @{class minus} + @{class uminus} + @{class monoid_add} (in @{theory HOL.Groups}) \\ |
63555 | 332 |
|
68484 | 333 |
\<^item> @{command class} @{class ab_group_add} = @{class minus} + @{class uminus} + @{class comm_monoid_add} (in @{theory HOL.Groups}) |
63555 | 334 |
|
335 |
and corresponding interpretation of locale @{locale group}, yielding e.g. |
|
336 |
||
337 |
\<^item> @{fact group.left_inverse}: @{thm (frugal_sorts) group.left_inverse [all, no_vars]} |
|
338 |
||
339 |
\<^item> @{fact add.left_inverse}: @{thm (frugal_sorts) add.left_inverse [all, no_vars]} |
|
340 |
||
341 |
\noindent There is no multiplicative counterpart. Why? In rings, |
|
342 |
the multiplicative group excludes the zero element, hence |
|
343 |
the inverse operation is not total. See further \secref{sec:rings}. |
|
344 |
\<close> |
|
345 |
||
346 |
paragraph \<open>Mitigating against redundancy by default simplification rules\<close> |
|
347 |
||
348 |
text \<open> |
|
349 |
Inverse operations imposes some redundancy on the type class |
|
350 |
hierarchy: in a group with a total inverse operation, the |
|
351 |
unary operation is simpler and more primitive than the binary |
|
352 |
one; but we cannot eliminate the binary one in favour of |
|
353 |
a mere syntactic abbreviation since the binary one is vital |
|
354 |
to express a partial inverse operation. |
|
355 |
||
356 |
This is mitigated by providing suitable default simplification |
|
357 |
rules: expression involving the unary inverse operation are |
|
358 |
simplified to binary inverse operation whenever appropriate. |
|
359 |
The rationale is that simplification is a central device in |
|
360 |
explorative proving, where proof obligation remaining after certain |
|
361 |
default proof steps including simplification are inspected |
|
362 |
to get an idea what is missing to finish a proof. When |
|
363 |
preferable normal forms are encoded into |
|
364 |
default simplification rules, proof obligations after simplification |
|
365 |
are normalized and hence more proof-friendly. |
|
366 |
\<close> |
|
367 |
||
63026 | 368 |
end |