1231 |
1231 |
1232 I have written a paper discussing the treatment of lazy lists; it also |
1232 I have written a paper discussing the treatment of lazy lists; it also |
1233 covers finite lists~\cite{paulson-coind}. |
1233 covers finite lists~\cite{paulson-coind}. |
1234 |
1234 |
1235 |
1235 |
|
1236 \section{Datatype declarations} |
|
1237 \index{*datatype|(} |
|
1238 |
|
1239 \underscoreon |
|
1240 |
|
1241 It is often necessary to extend a theory with \ML-like datatypes. This |
|
1242 extension consists of the new type, declarations of its constructors and |
|
1243 rules that describe the new type. The theory definition section {\tt |
|
1244 datatype} represents a compact way of doing this. |
|
1245 |
|
1246 |
|
1247 \subsection{Foundations} |
|
1248 |
|
1249 A datatype declaration has the following general structure: |
|
1250 \[ \mbox{\tt datatype}~ (\alpha_1,\dots,\alpha_n)t ~=~ |
|
1251 c_1(\tau_{11},\dots,\tau_{1k_1}) ~\mid~ \dots ~\mid~ |
|
1252 c_m(\tau_{m1},\dots,\tau_{mk_m}) |
|
1253 \] |
|
1254 where $\alpha_i$ are type variables, $c_i$ are distinct constructor names and |
|
1255 $\tau_{ij}$ are one of the following: |
|
1256 \begin{itemize} |
|
1257 \item type variables $\alpha_1,\dots,\alpha_n$, |
|
1258 \item types $(\beta_1,\dots,\beta_l)s$ where $s$ is a previously declared |
|
1259 type or type synonym and $\{\beta_1,\dots,\beta_l\} \subseteq |
|
1260 \{\alpha_1,\dots,\alpha_n\}$, |
|
1261 \item the newly defined type $(\alpha_1,\dots,\alpha_n)t$ \footnote{This |
|
1262 makes it a recursive type. To ensure that the new type is not empty at |
|
1263 least one constructor must consist of only non-recursive type |
|
1264 components.} |
|
1265 \end{itemize} |
|
1266 The constructors are automatically defined as functions of their respective |
|
1267 type: |
|
1268 \[ c_j : [\tau_{j1},\dots,\tau_{jk_j}] \To (\alpha_1,\dots,\alpha_n)t \] |
|
1269 These functions have certain {\em freeness} properties: |
|
1270 \begin{description} |
|
1271 \item[\tt inject] They are injective: |
|
1272 \[ (c_j(x_1,\dots,x_{k_j}) = c_j(y_1,\dots,y_{k_j})) = |
|
1273 (x_1 = y_1 \land \dots \land x_{k_j} = y_{k_j}) |
|
1274 \] |
|
1275 \item[\tt ineq] They are distinct: |
|
1276 \[ c_i(x_1,\dots,x_{k_i}) \neq c_j(y_1,\dots,y_{k_j}) \qquad |
|
1277 \mbox{for all}~ i \neq j. |
|
1278 \] |
|
1279 \end{description} |
|
1280 Because the number of inequalities is quadratic in the number of |
|
1281 constructors, a different method is used if their number exceeds |
|
1282 a certain value, currently 4. In that case every constructor is mapped to a |
|
1283 natural number |
|
1284 \[ |
|
1285 \begin{array}{lcl} |
|
1286 \mbox{\it t\_ord}(c_1(x_1,\dots,x_{k_1})) & = & 0 \\ |
|
1287 & \vdots & \\ |
|
1288 \mbox{\it t\_ord}(c_m(x_1,\dots,x_{k_m})) & = & m-1 |
|
1289 \end{array} |
|
1290 \] |
|
1291 and inequality of constructors is expressed by: |
|
1292 \[ |
|
1293 \mbox{\it t\_ord}(x) \neq \mbox{\it t\_ord}(y) \Imp x \neq y. |
|
1294 \] |
|
1295 In addition a structural induction axiom {\tt induct} is provided: |
|
1296 \[ |
|
1297 \infer{P(x)} |
|
1298 {\begin{array}{lcl} |
|
1299 \Forall x_1\dots x_{k_1}. |
|
1300 \List{P(x_{r_{11}}); \dots; P(x_{r_{1l_1}})} & |
|
1301 \Imp & P(c_1(x_1,\dots,x_{k_1})) \\ |
|
1302 & \vdots & \\ |
|
1303 \Forall x_1\dots x_{k_m}. |
|
1304 \List{P(x_{r_{m1}}); \dots; P(x_{r_{ml_m}})} & |
|
1305 \Imp & P(c_m(x_1,\dots,x_{k_m})) |
|
1306 \end{array}} |
|
1307 \] |
|
1308 where $\{r_{j1},\dots,r_{jl_j}\} = \{i \in \{1,\dots k_j\} ~\mid~ \tau_{ji} |
|
1309 = (\alpha_1,\dots,\alpha_n)t \}$, i.e.\ the property $P$ can be assumed for |
|
1310 all arguments of the recursive type. |
|
1311 |
|
1312 The type also comes with an \ML-like {\tt case}-construct: |
|
1313 \[ |
|
1314 \begin{array}{rrcl} |
|
1315 \mbox{\tt case}~e~\mbox{\tt of} & c_1(y_{11},\dots,y_{1k_1}) & \To & e_1 \\ |
|
1316 \vdots \\ |
|
1317 \mid & c_m(y_{m1},\dots,y_{mk_m}) & \To & e_m |
|
1318 \end{array} |
|
1319 \] |
|
1320 In contrast to \ML, {\em all} constructors must be present, their order is |
|
1321 fixed, and nested patterns are not supported. |
|
1322 |
|
1323 |
|
1324 \subsection{Defining datatypes} |
|
1325 |
|
1326 A datatype is defined in a theory definition file using the keyword {\tt |
|
1327 datatype}. The definition following {\tt datatype} must conform to the |
|
1328 syntax of {\em typedecl} specified in Fig.~\ref{datatype-grammar} and must |
|
1329 obey the rules in the previous section. As a result the theory is extended |
|
1330 with the new type, the constructors, and the theorems listed in the previous |
|
1331 section. |
|
1332 |
|
1333 \begin{figure} |
|
1334 \begin{rail} |
|
1335 typedecl : typevarlist id '=' (cons + '|') |
|
1336 ; |
|
1337 cons : (id | string) ( () | '(' (typ + ',') ')' ) ( () | mixfix ) |
|
1338 ; |
|
1339 typ : typevarlist id |
|
1340 | tid |
|
1341 ; |
|
1342 typevarlist : () | tid | '(' (tid + ',') ')' |
|
1343 ; |
|
1344 \end{rail} |
|
1345 \caption{Syntax of datatype declarations} |
|
1346 \label{datatype-grammar} |
|
1347 \end{figure} |
|
1348 |
|
1349 Reading the theory file produces a structure which, in addtion to the usual |
|
1350 components, contains a structure named $t$ for each datatype $t$ defined in |
|
1351 the file\footnote{Otherwise multiple datatypes in the same theory file would |
|
1352 lead to name clashes.}. Each structure $t$ contains the following elements: |
|
1353 \begin{ttbox} |
|
1354 val induct : thm |
|
1355 val inject : thm list |
|
1356 val ineq : thm list |
|
1357 val cases : thm list |
|
1358 val simps : thm list |
|
1359 val induct_tac : string -> int -> tactic |
|
1360 \end{ttbox} |
|
1361 {\tt induct}, {\tt inject} and {\tt ineq} contain the theorems described |
|
1362 above. For convenience {\tt ineq} contains inequalities in both directions. |
|
1363 \begin{warn} |
|
1364 If there are five or more constructors, the {\em t\_ord} scheme is used for |
|
1365 {\tt ineq}. In this case the theory {\tt Arith} must be contained |
|
1366 in the current theory, if necessary by adding it explicitly. |
|
1367 \end{warn} |
|
1368 The reduction rules of the {\tt case}-construct can be found in {\tt cases}. |
|
1369 All theorems from {\tt inject}, {\tt ineq} and {\tt cases} are combined in |
|
1370 {\tt simps} for use with the simplifier. The tactic ``{\verb$induct_tac$~{\em |
|
1371 var i}\/}'' applies structural induction over variable {\em var} to |
|
1372 subgoal {\em i}. |
|
1373 |
|
1374 |
|
1375 \subsection{Examples} |
|
1376 |
|
1377 \subsubsection{The datatype $\alpha~list$} |
|
1378 |
|
1379 We want to define the type $\alpha~list$.\footnote{Of course there is |
|
1380 a list type in HOL already. But for now let us assume that we have to define |
|
1381 a new type.} To do this we have to build a new theory that contains the |
|
1382 type definition. We start from {\tt HOL}. |
|
1383 \begin{ttbox} |
|
1384 MyList = HOL + |
|
1385 datatype 'a list = Nil | Cons ('a, 'a list) |
|
1386 end |
|
1387 \end{ttbox} |
|
1388 After loading the theory with \verb$use_thy "MyList"$, we can prove |
|
1389 $Cons(x,xs)\neq xs$ using the new theory. First we build a suitable simpset |
|
1390 for the simplifier: |
|
1391 |
|
1392 \begin{ttbox} |
|
1393 val mylist_ss = HOL_ss addsimps MyList.list.simps; |
|
1394 goal MyList.thy "!x. Cons(x,xs) ~= xs"; |
|
1395 {\out Level 0} |
|
1396 {\out ! x. Cons(x, xs) ~= xs} |
|
1397 {\out 1. ! x. Cons(x, xs) ~= xs} |
|
1398 \end{ttbox} |
|
1399 This can be proved by the structural induction tactic: |
|
1400 \begin{ttbox} |
|
1401 by (MyList.list.induct_tac "xs" 1); |
|
1402 {\out Level 1} |
|
1403 {\out ! x. Cons(x, xs) ~= xs} |
|
1404 {\out 1. ! x. Cons(x, Nil) ~= Nil} |
|
1405 {\out 2. !!a list.} |
|
1406 {\out ! x. Cons(x, list) ~= list ==>} |
|
1407 {\out ! x. Cons(x, Cons(a, list)) ~= Cons(a, list)} |
|
1408 \end{ttbox} |
|
1409 The first subgoal can be proved with the simplifier and the {\tt ineq} axiom |
|
1410 that is part of \verb$mylist_ss$. |
|
1411 \begin{ttbox} |
|
1412 by (simp_tac mylist_ss 1); |
|
1413 {\out Level 2} |
|
1414 {\out ! x. Cons(x, xs) ~= xs} |
|
1415 {\out 1. !!a list.} |
|
1416 {\out ! x. Cons(x, list) ~= list ==>} |
|
1417 {\out ! x. Cons(x, Cons(a, list)) ~= Cons(a, list)} |
|
1418 \end{ttbox} |
|
1419 Using the {\tt inject} axiom we can quickly prove the remaining goal. |
|
1420 \begin{ttbox} |
|
1421 by (asm_simp_tac mylist_ss 1); |
|
1422 {\out Level 3} |
|
1423 {\out ! x. Cons(x, xs) ~= xs} |
|
1424 {\out No subgoals!} |
|
1425 \end{ttbox} |
|
1426 Because both subgoals were proved by almost the same tactic we could have |
|
1427 done that in one step using |
|
1428 \begin{ttbox} |
|
1429 by (ALLGOALS (asm_simp_tac mylist_ss)); |
|
1430 \end{ttbox} |
|
1431 |
|
1432 |
|
1433 \subsubsection{The datatype $\alpha~list$ with mixfix syntax} |
|
1434 |
|
1435 In this example we define the type $\alpha~list$ again but this time we want |
|
1436 to write {\tt []} instead of {\tt Nil} and we want to use the infix operator |
|
1437 \verb|#| instead of {\tt Cons}. To do this we simply add mixfix annotations |
|
1438 after the constructor declarations as follows: |
|
1439 \begin{ttbox} |
|
1440 MyList = HOL + |
|
1441 datatype 'a list = "[]" ("[]") |
|
1442 | "#" ('a, 'a list) (infixr 70) |
|
1443 end |
|
1444 \end{ttbox} |
|
1445 Now the theorem in the previous example can be written \verb|x#xs ~= xs|. The |
|
1446 proof is the same. |
|
1447 |
|
1448 \subsubsection{Defining functions on datatypes} |
|
1449 |
|
1450 Normally one wants to define functions dealing with a newly defined type and |
|
1451 prove properties of these functions. As an example let us define \verb|@| for |
|
1452 concatenation and {\tt ttl} for the (total) tail of a list, i.e.\ the list |
|
1453 without its first element. We do this in theory \verb|List_fun| which is an |
|
1454 extension of {\tt MyList}: |
|
1455 \begin{ttbox} |
|
1456 List_fun = MyList + |
|
1457 consts ttl :: "'a list => 'a list" |
|
1458 "@" :: "['a list,'a list] => 'a list" (infixr 60) |
|
1459 rules |
|
1460 ttl_def "ttl (l) == case l of [] => [] | y#ys => ys" |
|
1461 |
|
1462 app_Nil "[] @ ys = ys" |
|
1463 app_Cons "(x#xs) @ ys = x#(xs @ ys)" |
|
1464 end |
|
1465 \end{ttbox} |
|
1466 Let us have a look at the use of {\tt case} in the definition of {\tt ttl}. |
|
1467 The expression {\tt case e of [] => [] | y\#ys => ys} evaluates to {\tt []} |
|
1468 if \verb|e=[]| and to {\tt ys} if \verb|e=y#ys|. These properties are |
|
1469 trivial to derive by simplification: |
|
1470 \begin{ttbox} |
|
1471 val mylist_ss = HOL_ss addsimps MyList.list.simps; |
|
1472 |
|
1473 goalw List_fun.thy [List_fun.ttl_def] "ttl([]) = []"; |
|
1474 by (simp_tac mylist_ss 1); |
|
1475 val ttl_Nil = result(); |
|
1476 |
|
1477 goalw List_fun.thy [List_fun.ttl_def] "ttl(y#ys) = ys"; |
|
1478 by (simp_tac mylist_ss 1); |
|
1479 val ttl_Cons = result(); |
|
1480 |
|
1481 val list_fun_ss = mylist_ss addsimps |
|
1482 [ttl_Nil, ttl_Cons, List_fun.app_Nil, List_fun.app_Cons]; |
|
1483 \end{ttbox} |
|
1484 Note that we include the derived properties in our simpset, not the original |
|
1485 definition of {\tt ttl}. The former are better behaved because they only |
|
1486 apply if the argument is already a constructor. |
|
1487 |
|
1488 One could have defined \verb$@$ with a single {\tt case}-construct |
|
1489 \begin{ttbox} |
|
1490 app_def "x @ y == case x of [] => y | z#zs => z # (zs @ y)" |
|
1491 \end{ttbox} |
|
1492 and derived \verb$app_Nil$ and \verb$app_Cons$ from it. But \verb$app_def$ is |
|
1493 not easy to work with: the left-hand side matches a subterm of the right-hand |
|
1494 side, which causes the simplifier to loop. |
|
1495 |
|
1496 Here is a simple proof of the associativity of \verb$@$: |
|
1497 \begin{ttbox} |
|
1498 goal List_fun.thy "(x @ y) @ z = x @ (y @ z)"; |
|
1499 by (MyList.list.induct_tac "x" 1); |
|
1500 by (simp_tac list_fun_ss 1); |
|
1501 by (asm_simp_tac list_fun_ss 1); |
|
1502 \end{ttbox} |
|
1503 The last two steps can again be combined using {\tt ALLGOALS}. |
|
1504 |
|
1505 |
|
1506 \subsubsection{A datatype for weekdays} |
|
1507 |
|
1508 This example shows a datatype that consists of more than four constructors: |
|
1509 \begin{ttbox} |
|
1510 Days = Arith + |
|
1511 datatype days = Mo | Tu | We | Th | Fr | Sa | So |
|
1512 end |
|
1513 \end{ttbox} |
|
1514 Because there are more than four constructors, the theory must be based on |
|
1515 {\tt Arith}. Inequality is defined via a function \verb|days_ord|. Although |
|
1516 the expression \verb|Mo ~= Tu| is not directly contained in {\tt ineq}, it |
|
1517 can be proved by the simplifier if \verb$arith_ss$ is used: |
|
1518 \begin{ttbox} |
|
1519 val days_ss = arith_ss addsimps Days.days.simps; |
|
1520 |
|
1521 goal Days.thy "Mo ~= Tu"; |
|
1522 by (simp_tac days_ss 1); |
|
1523 \end{ttbox} |
|
1524 Note that usually it is not necessary to derive these inequalities explicitly |
|
1525 because the simplifier will dispose of them automatically. |
|
1526 |
|
1527 \index{*datatype|)} |
|
1528 \underscoreoff |
|
1529 |
|
1530 |
|
1531 |
1236 \section{The examples directories} |
1532 \section{The examples directories} |
1237 Directory {\tt HOL/Subst} contains Martin Coen's mechanisation of a theory of |
1533 Directory {\tt HOL/Subst} contains Martin Coen's mechanisation of a theory of |
1238 substitutions and unifiers. It is based on Paulson's previous |
1534 substitutions and unifiers. It is based on Paulson's previous |
1239 mechanisation in {\LCF}~\cite{paulson85} of Manna and Waldinger's |
1535 mechanisation in {\LCF}~\cite{paulson85} of Manna and Waldinger's |
1240 theory~\cite{mw81}. |
1536 theory~\cite{mw81}. |