theory Refinement
imports Setup
begin

section {* Program and datatype refinement \label{sec:refinement} *}

text {*
  Code generation by shallow embedding (cf.~\secref{sec:principle})
  allows code equations and datatype constructors to be chosen freely,
  provided that some very basic syntactic properties are met; this
  flexibility opens up mechanisms for refinement which can extend
  the scope and quality of generated code dramatically.
*}


subsection {* Program refinement *}

text {*
  Program refinement works by choosing appropriate code equations
  explicitly (cf.~\secref{sec:equations}); as an example, we use Fibonacci
  numbers:
*}

fun %quote fib :: "nat \<Rightarrow> nat" where
    "fib 0 = 0"
  | "fib (Suc 0) = Suc 0"
  | "fib (Suc (Suc n)) = fib n + fib (Suc n)"

text {*
  \noindent The runtime of the corresponding code grows exponentially due
  to two recursive calls:
*}

text %quote {*@{code_stmts fib (consts) fib (Haskell)}*}

text {*
  \noindent A more efficient implementation would use dynamic
  programming, e.g.~sharing of common intermediate results between
  recursive calls. This idea is expressed by an auxiliary operation
  which computes a Fibonacci number and its successor simultaneously:
*}

definition %quote fib_step :: "nat \<Rightarrow> nat \<times> nat" where
  "fib_step n = (fib (Suc n), fib n)"

text {*
  \noindent This operation can be implemented by recursion using
  dynamic programming:
*}

lemma %quote [code]:
  "fib_step 0 = (Suc 0, 0)"
  "fib_step (Suc n) = (let (m, q) = fib_step n in (m + q, m))"
  by (simp_all add: fib_step_def)

text {*
  \noindent What remains is to implement @{const fib} by @{const
  fib_step} as follows:
*}

lemma %quote [code]:
  "fib 0 = 0"
  "fib (Suc n) = fst (fib_step n)"
  by (simp_all add: fib_step_def)

text {*
  \noindent The resulting code shows only linear growth of runtime:
*}

text %quote {*@{code_stmts fib (consts) fib fib_step (Haskell)}*}
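
text {*
  \noindent As a quick sanity check (an illustrative addition, not
  part of the original development), the refinement can be exercised
  in two ways. The lemma name @{text "fib_step_snd"} is hypothetical
  and chosen only for this sketch. Logically, the second component of
  @{const fib_step} is the Fibonacci number itself; operationally,
  the @{command value} command is expected to evaluate
  @{term "fib 10"} to 55 via the refined code equations:
*}

lemma %quote fib_step_snd:
  "snd (fib_step n) = fib n"
  by (simp add: fib_step_def)

value %quote "fib 10"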


subsection {* Datatype refinement *}

text {*
  Selecting specific code equations \emph{and} datatype constructors
  leads to datatype refinement. As an example, we will develop an
  alternative representation of the queue example given in
  \secref{sec:queue_example}. The amortised representation is
  convenient for generating code but exposes its \qt{implementation}
  details, which may be cumbersome when proving theorems about it.
  Therefore, here is a simple, straightforward representation of
  queues:
*}

datatype %quote 'a queue = Queue "'a list"

definition %quote empty :: "'a queue" where
  "empty = Queue []"

primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
  "enqueue x (Queue xs) = Queue (xs @ [x])"

fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
    "dequeue (Queue []) = (None, Queue [])"
  | "dequeue (Queue (x # xs)) = (Some x, Queue xs)"

text {*
  \noindent This we can use directly for proving; for executing,
  we provide an alternative characterisation:
*}

definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where
  "AQueue xs ys = Queue (ys @ rev xs)"

code_datatype %quote AQueue

text {*
  \noindent Here we introduce a \qt{constructor} @{const "AQueue"} which
  is defined in terms of @{text "Queue"} and interprets its arguments
  according to what the \emph{content} of an amortised queue is supposed
  to be.

  The prerequisite for datatype constructors is purely syntactic: a
  constructor must be of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text
  "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in
  @{text "\<tau>"}; then @{text "\<kappa>"} is its corresponding datatype. The
  HOL datatype package by default registers any new datatype with its
  constructors, but this may be changed using @{command_def
  code_datatype}; the currently chosen constructors can be inspected
  using the @{command print_codesetup} command.

  Equipped with this, we are able to prove the following equations
  for our primitive queue operations which \qt{implement} the simple
  queues in an amortised fashion:
*}

lemma %quote empty_AQueue [code]:
  "empty = AQueue [] []"
  unfolding AQueue_def empty_def by simp

lemma %quote enqueue_AQueue [code]:
  "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"
  unfolding AQueue_def by simp

lemma %quote dequeue_AQueue [code]:
  "dequeue (AQueue xs []) =
    (if xs = [] then (None, AQueue [] [])
    else dequeue (AQueue [] (rev xs)))"
  "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
  unfolding AQueue_def by simp_all

text {*
  \noindent For completeness, we provide a substitute for the
  @{text case} combinator on queues:
*}

lemma %quote queue_case_AQueue [code]:
  "queue_case f (AQueue xs ys) = f (ys @ rev xs)"
  unfolding AQueue_def by simp

text {*
  \noindent The resulting code looks as expected:
*}

text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
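
text {*
  \noindent The division of labour between proving and executing can
  be illustrated by a small example (an addition for illustration;
  the concrete queue and the expected result are assumptions of this
  sketch, not taken from the original development). A proof works
  directly on the simple representation @{text "Queue"}:
*}

lemma %quote
  "fst (dequeue (enqueue (2::nat) (enqueue 1 empty))) = Some 1"
  by (simp add: empty_def)

text {*
  \noindent Evaluating the same term is expected to run through the
  amortised code equations for @{const AQueue} instead, and to yield
  the same result @{term "Some (1::nat)"}:
*}

value %quote "fst (dequeue (enqueue (2::nat) (enqueue 1 empty)))"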

text {*
  The same techniques can also be applied to types which are not
  specified as datatypes, e.g.~type @{typ int} is originally specified
  as a quotient type by means of @{command_def typedef}, but for code
  generation constants allowing construction of binary numeral values
  are used as constructors for @{typ int}.

  This approach, however, fails if the representation of a type demands
  invariants; this issue is discussed in the next section.
*}


subsection {* Datatype refinement involving invariants *}

text {*
  Datatype representations involving invariants require a dedicated
  setup for the type and its primitive operations. As a running
  example, we implement a type @{text "'a dlist"} of lists consisting
  of distinct elements.

  The first step is to decide which representation should implement
  the abstract type (in our example @{text "'a dlist"}).
  Here we choose @{text "'a list"}. Then a conversion from the concrete
  type to the abstract type must be specified, here:
*}

text %quote {*
  @{term_type Dlist}
*}

text {*
  \noindent Next follows the specification of a suitable \emph{projection},
  i.e.~a conversion from abstract to concrete type:
*}

text %quote {*
  @{term_type list_of_dlist}
*}

text {*
  \noindent This projection must be specified such that the following
  \emph{abstract datatype certificate} can be proven:
*}

lemma %quote [code abstype]:
  "Dlist (list_of_dlist dxs) = dxs"
  by (fact Dlist_list_of_dlist)

text {*
  \noindent Note that so far the invariant on representations
  (@{term_type distinct}) has never been mentioned explicitly;
  it is only referred to implicitly: all values in the
  set @{term "{xs. list_of_dlist (Dlist xs) = xs}"} are invariant,
  and in our example this is exactly @{term "{xs. distinct xs}"}.

  The primitive operations on @{typ "'a dlist"} are specified
  indirectly using the projection @{const list_of_dlist}. For
  the empty @{text "dlist"}, @{const Dlist.empty}, we finally want
  the code equation
*}

text %quote {*
  @{term "Dlist.empty = Dlist []"}
*}

text {*
  \noindent This we have to prove indirectly as follows:
*}

lemma %quote [code abstract]:
  "list_of_dlist Dlist.empty = []"
  by (fact list_of_dlist_empty)

text {*
  \noindent This equation logically encodes both the desired code
  equation and the fact that the expression to which @{const Dlist} is
  applied obeys the implicit invariant. Equations for insertion and
  removal are similar:
*}

lemma %quote [code abstract]:
  "list_of_dlist (Dlist.insert x dxs) = List.insert x (list_of_dlist dxs)"
  by (fact list_of_dlist_insert)

lemma %quote [code abstract]:
  "list_of_dlist (Dlist.remove x dxs) = remove1 x (list_of_dlist dxs)"
  by (fact list_of_dlist_remove)

text {*
  \noindent Then the corresponding code is as follows:
*}

text %quote {*
  @{code_stmts Dlist.empty Dlist.insert Dlist.remove list_of_dlist (Haskell)}
*} (*(types) dlist (consts) dempty dinsert dremove list_of List.member insert remove *)
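
text {*
  \noindent As a small illustration (an addition, not part of the
  original development; it assumes that the operations of theory
  @{text "Dlist"} are in scope exactly as used above), the effect of
  the invariant can be observed through the projection. Since
  @{const List.insert} only adds an element which is not yet
  present, inserting the same element twice into the empty
  @{text "dlist"} is expected to yield the one-element list
  @{term "[1::nat]"}:
*}

value %quote "list_of_dlist (Dlist.insert (1::nat) (Dlist.insert 1 Dlist.empty))"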

text {*
  Typical data structures implemented by representations involving
  invariants are available in the library, e.g.~theories @{theory
  Fset} and @{theory Mapping} specify sets (type @{typ "'a fset"}) and
  key-value-mappings (type @{typ "('a, 'b) mapping"}) respectively;
  these can be implemented by distinct lists as presented here as an
  example (theory @{theory Dlist}) and red-black-trees respectively
  (theory @{theory RBT}).
*}

end