summary |
shortlog |
changelog |
graph |
tags |
branches |
files |
changeset |
file |
revisions |
annotate |
diff |
raw

src/HOL/Unix/Unix.thy

author | wenzelm |

Sat Nov 04 15:24:40 2017 +0100 (20 months ago) | |

changeset 67003 | 49850a679c2c |

parent 66453 | cc19f7ca2ed6 |

child 67443 | 3abf6a722518 |

permissions | -rw-r--r-- |

more robust sorted_entries;

1 (* Title: HOL/Unix/Unix.thy

2 Author: Markus Wenzel, TU Muenchen

3 *)

5 section \<open>Unix file-systems \label{sec:unix-file-system}\<close>

7 theory Unix

8 imports

9 Nested_Environment

10 "HOL-Library.Sublist"

11 begin

13 text \<open>

14 We give a simple mathematical model of the basic structures underlying the

15 Unix file-system, together with a few fundamental operations that could be

16 imagined to be performed internally by the Unix kernel. This forms the basis

17 for the set of Unix system-calls to be introduced later (see

18 \secref{sec:unix-trans}), which are the actual interface offered to

19 processes running in user-space.

21 \<^medskip>

22 Basically, any Unix file is either a \<^emph>\<open>plain file\<close> or a \<^emph>\<open>directory\<close>,

23 consisting of some \<^emph>\<open>content\<close> plus \<^emph>\<open>attributes\<close>. The content of a plain

24 file is plain text. The content of a directory is a mapping from names to

25 further files.\<^footnote>\<open>In fact, this is the only way that names get associated with

26 files. In Unix files do \<^emph>\<open>not\<close> have a name in itself. Even more, any number

27 of names may be associated with the very same file due to \<^emph>\<open>hard links\<close>

28 (although this is excluded from our model).\<close> Attributes include information

29 to control various ways to access the file (read, write etc.).

31 Our model will be quite liberal in omitting excessive detail that is easily

32 seen to be ``irrelevant'' for the aspects of Unix file-systems to be

33 discussed here. First of all, we ignore character and block special files,

34 pipes, sockets, hard links, symbolic links, and mount points.

35 \<close>

38 subsection \<open>Names\<close>

40 text \<open>

41 User ids and file name components shall be represented by natural numbers

42 (without loss of generality). We do not bother about encoding of actual

43 names (e.g.\ strings), nor a mapping between user names and user ids as

44 would be present in a reality.

45 \<close>

47 type_synonym uid = nat

48 type_synonym name = nat

49 type_synonym path = "name list"

52 subsection \<open>Attributes\<close>

54 text \<open>

55 Unix file attributes mainly consist of \<^emph>\<open>owner\<close> information and a number of

56 \<^emph>\<open>permission\<close> bits which control access for ``user'', ``group'', and

57 ``others'' (see the Unix man pages \<open>chmod(2)\<close> and \<open>stat(2)\<close> for more

58 details).

60 \<^medskip>

61 Our model of file permissions only considers the ``others'' part. The

62 ``user'' field may be omitted without loss of overall generality, since the

63 owner is usually able to change it anyway by performing \<open>chmod\<close>.\<^footnote>\<open>The

64 inclined Unix expert may try to figure out some exotic arrangements of a

65 real-world Unix file-system such that the owner of a file is unable to apply

66 the \<open>chmod\<close> system call.\<close> We omit ``group'' permissions as a genuine

67 simplification as we just do not intend to discuss a model of multiple

68 groups and group membership, but pretend that everyone is member of a single

69 global group.\<^footnote>\<open>A general HOL model of user group structures and related

70 issues is given in @{cite "Naraschewski:2001"}.\<close>

71 \<close>

73 datatype perm =

74 Readable

75 | Writable

76 | Executable \<comment> "(ignored)"

78 type_synonym perms = "perm set"

80 record att =

81 owner :: uid

82 others :: perms

84 text \<open>

85 For plain files @{term Readable} and @{term Writable} specify read and write

86 access to the actual content, i.e.\ the string of text stored here. For

87 directories @{term Readable} determines if the set of entry names may be

88 accessed, and @{term Writable} controls the ability to create or delete any

89 entries (both plain files or sub-directories).

91 As another simplification, we ignore the @{term Executable} permission

92 altogether. In reality it would indicate executable plain files (also known

93 as ``binaries''), or control actual lookup of directory entries (recall that

94 mere directory browsing is controlled via @{term Readable}). Note that the

95 latter means that in order to perform any file-system operation whatsoever,

96 all directories encountered on the path would have to grant @{term

97 Executable}. We ignore this detail and pretend that all directories give

98 @{term Executable} permission to anybody.

99 \<close>

102 subsection \<open>Files\<close>

104 text \<open>

105 In order to model the general tree structure of a Unix file-system we use

106 the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"} from the

107 standard library of Isabelle/HOL @{cite "Bauer-et-al:2002:HOL-Library"}.

108 This type provides constructors @{term Val} and @{term Env} as follows:

110 \<^medskip>

111 {\def\isastyleminor{\isastyle}

112 \begin{tabular}{l}

113 @{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\

114 @{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\

115 \end{tabular}}

116 \<^medskip>

118 Here the parameter @{typ 'a} refers to plain values occurring at leaf

119 positions, parameter @{typ 'b} to information kept with inner branch nodes,

120 and parameter @{typ 'c} to the branching type of the tree structure. For our

121 purpose we use the type instance with @{typ "att \<times> string"} (representing

122 plain files), @{typ att} (for attributes of directory nodes), and @{typ

123 name} (for the index type of directory nodes).

124 \<close>

126 type_synonym "file" = "(att \<times> string, att, name) env"

128 text \<open>

129 \<^medskip>

130 The HOL library also provides @{term lookup} and @{term update} operations

131 for general tree structures with the subsequent primitive recursive

132 characterizations.

134 \<^medskip>

135 {\def\isastyleminor{\isastyle}

136 \begin{tabular}{l}

137 @{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\

138 @{term [source]

139 "update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\

140 \end{tabular}}

142 @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}

143 @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}

145 Several further properties of these operations are proven in @{cite

146 "Bauer-et-al:2002:HOL-Library"}. These will be routinely used later on

147 without further notice.

149 \<^bigskip>

150 Apparently, the elements of type @{typ "file"} contain an @{typ att}

151 component in either case. We now define a few auxiliary operations to

152 manipulate this field uniformly, following the conventions for record types

153 in Isabelle/HOL @{cite "Nipkow-et-al:2000:HOL"}.

154 \<close>

156 definition

157 "attributes file =

158 (case file of

159 Val (att, text) \<Rightarrow> att

160 | Env att dir \<Rightarrow> att)"

162 definition

163 "map_attributes f file =

164 (case file of

165 Val (att, text) \<Rightarrow> Val (f att, text)

166 | Env att dir \<Rightarrow> Env (f att) dir)"

168 lemma [simp]: "attributes (Val (att, text)) = att"

169 by (simp add: attributes_def)

171 lemma [simp]: "attributes (Env att dir) = att"

172 by (simp add: attributes_def)

174 lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"

175 by (cases "file") (simp_all add: attributes_def map_attributes_def

176 split_tupled_all)

178 lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"

179 by (simp add: map_attributes_def)

181 lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"

182 by (simp add: map_attributes_def)

185 subsection \<open>Initial file-systems\<close>

187 text \<open>

188 Given a set of \<^emph>\<open>known users\<close> a file-system shall be initialized by

189 providing an empty home directory for each user, with read-only access for

190 everyone else. (Note that we may directly use the user id as home directory

191 name, since both types have been identified.) Certainly, the very root

192 directory is owned by the super user (who has user id 0).

193 \<close>

195 definition

196 "init users =

197 Env \<lparr>owner = 0, others = {Readable}\<rparr>

198 (\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)

199 else None)"

202 subsection \<open>Accessing file-systems\<close>

204 text \<open>

205 The main internal file-system operation is access of a file by a user,

206 requesting a certain set of permissions. The resulting @{typ "file option"}

207 indicates if a file had been present at the corresponding @{typ path} and if

208 access was granted according to the permissions recorded within the

209 file-system.

211 Note that by the rules of Unix file-system security (e.g.\ @{cite

212 "Tanenbaum:1992"}) both the super-user and owner may always access a file

213 unconditionally (in our simplified model).

214 \<close>

216 definition

217 "access root path uid perms =

218 (case lookup root path of

219 None \<Rightarrow> None

220 | Some file \<Rightarrow>

221 if uid = 0

222 \<or> uid = owner (attributes file)

223 \<or> perms \<subseteq> others (attributes file)

224 then Some file

225 else None)"

227 text \<open>

228 \<^medskip>

229 Successful access to a certain file is the main prerequisite for

230 system-calls to be applicable (cf.\ \secref{sec:unix-trans}). Any

231 modification of the file-system is then performed using the basic @{term

232 update} operation.

234 \<^medskip>

235 We see that @{term access} is just a wrapper for the basic @{term lookup}

236 function, with additional checking of attributes. Subsequently we establish

237 a few auxiliary facts that stem from the primitive @{term lookup} used

238 within @{term access}.

239 \<close>

241 lemma access_empty_lookup: "access root path uid {} = lookup root path"

242 by (simp add: access_def split: option.splits)

244 lemma access_some_lookup:

245 "access root path uid perms = Some file \<Longrightarrow>

246 lookup root path = Some file"

247 by (simp add: access_def split: option.splits if_splits)

249 lemma access_update_other:

250 assumes parallel: "path' \<parallel> path"

251 shows "access (update path' opt root) path uid perms = access root path uid perms"

252 proof -

253 from parallel obtain y z xs ys zs where

254 "y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"

255 by (blast dest: parallel_decomp)

256 then have "lookup (update path' opt root) path = lookup root path"

257 by (blast intro: lookup_update_other)

258 then show ?thesis by (simp only: access_def)

259 qed

262 section \<open>File-system transitions \label{sec:unix-trans}\<close>

264 subsection \<open>Unix system calls \label{sec:unix-syscall}\<close>

266 text \<open>

267 According to established operating system design (cf.\ @{cite

268 "Tanenbaum:1992"}) user space processes may only initiate system operations

269 by a fixed set of \<^emph>\<open>system-calls\<close>. This enables the kernel to enforce

270 certain security policies in the first place.\<^footnote>\<open>Incidently, this is the very

271 same principle employed by any ``LCF-style'' theorem proving system

272 according to Milner's principle of ``correctness by construction'', such as

273 Isabelle/HOL itself.\<close>

275 \<^medskip>

276 In our model of Unix we give a fixed datatype \<open>operation\<close> for the syntax of

277 system-calls, together with an inductive definition of file-system state

278 transitions of the form \<open>root \<midarrow>x\<rightarrow> root'\<close> for the operational semantics.

279 \<close>

281 datatype operation =

282 Read uid string path

283 | Write uid string path

284 | Chmod uid perms path

285 | Creat uid perms path

286 | Unlink uid path

287 | Mkdir uid perms path

288 | Rmdir uid path

289 | Readdir uid "name set" path

291 text \<open>

292 The @{typ uid} field of an operation corresponds to the \<^emph>\<open>effective user id\<close>

293 of the underlying process, although our model never mentions processes

294 explicitly. The other parameters are provided as arguments by the caller;

295 the @{term path} one is common to all kinds of system-calls.

296 \<close>

298 primrec uid_of :: "operation \<Rightarrow> uid"

299 where

300 "uid_of (Read uid text path) = uid"

301 | "uid_of (Write uid text path) = uid"

302 | "uid_of (Chmod uid perms path) = uid"

303 | "uid_of (Creat uid perms path) = uid"

304 | "uid_of (Unlink uid path) = uid"

305 | "uid_of (Mkdir uid path perms) = uid"

306 | "uid_of (Rmdir uid path) = uid"

307 | "uid_of (Readdir uid names path) = uid"

309 primrec path_of :: "operation \<Rightarrow> path"

310 where

311 "path_of (Read uid text path) = path"

312 | "path_of (Write uid text path) = path"

313 | "path_of (Chmod uid perms path) = path"

314 | "path_of (Creat uid perms path) = path"

315 | "path_of (Unlink uid path) = path"

316 | "path_of (Mkdir uid perms path) = path"

317 | "path_of (Rmdir uid path) = path"

318 | "path_of (Readdir uid names path) = path"

320 text \<open>

321 \<^medskip>

322 Note that we have omitted explicit \<open>Open\<close> and \<open>Close\<close> operations, pretending

323 that @{term Read} and @{term Write} would already take care of this behind

324 the scenes. Thus we have basically treated actual sequences of real

325 system-calls \<open>open\<close>--\<open>read\<close>/\<open>write\<close>--\<open>close\<close> as atomic.

327 In principle, this could make big a difference in a model with explicit

328 concurrent processes. On the other hand, even on a real Unix system the

329 exact scheduling of concurrent \<open>open\<close> and \<open>close\<close> operations does \<^emph>\<open>not\<close>

330 directly affect the success of corresponding \<open>read\<close> or \<open>write\<close>. Unix allows

331 several processes to have files opened at the same time, even for writing!

332 Certainly, the result from reading the contents later may be hard to

333 predict, but the system-calls involved here will succeed in any case.

335 \<^bigskip>

336 The operational semantics of system calls is now specified via transitions

337 of the file-system configuration. This is expressed as an inductive relation

338 (although there is no actual recursion involved here).

339 \<close>

341 inductive transition :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"

342 ("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)

343 where

344 read:

345 "root \<midarrow>(Read uid text path)\<rightarrow> root"

346 if "access root path uid {Readable} = Some (Val (att, text))"

347 | "write":

348 "root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root"

349 if "access root path uid {Writable} = Some (Val (att, text'))"

350 | chmod:

351 "root \<midarrow>(Chmod uid perms path)\<rightarrow>

352 update path (Some (map_attributes (others_update (\<lambda>_. perms)) file)) root"

353 if "access root path uid {} = Some file" and "uid = 0 \<or> uid = owner (attributes file)"

354 | creat:

355 "root \<midarrow>(Creat uid perms path)\<rightarrow>

356 update path (Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root"

357 if "path = parent_path @ [name]"

358 and "access root parent_path uid {Writable} = Some (Env att parent)"

359 and "access root path uid {} = None"

360 | unlink:

361 "root \<midarrow>(Unlink uid path)\<rightarrow> update path None root"

362 if "path = parent_path @ [name]"

363 and "access root parent_path uid {Writable} = Some (Env att parent)"

364 and "access root path uid {} = Some (Val plain)"

365 | mkdir:

366 "root \<midarrow>(Mkdir uid perms path)\<rightarrow>

367 update path (Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root"

368 if "path = parent_path @ [name]"

369 and "access root parent_path uid {Writable} = Some (Env att parent)"

370 and "access root path uid {} = None"

371 | rmdir:

372 "root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root"

373 if "path = parent_path @ [name]"

374 and "access root parent_path uid {Writable} = Some (Env att parent)"

375 and "access root path uid {} = Some (Env att' empty)"

376 | readdir:

377 "root \<midarrow>(Readdir uid names path)\<rightarrow> root"

378 if "access root path uid {Readable} = Some (Env att dir)"

379 and "names = dom dir"

381 text \<open>

382 \<^medskip>

383 Certainly, the above specification is central to the whole formal

384 development. Any of the results to be established later on are only

385 meaningful to the outside world if this transition system provides an

386 adequate model of real Unix systems. This kind of ``reality-check'' of a

387 formal model is the well-known problem of \<^emph>\<open>validation\<close>.

389 If in doubt, one may consider to compare our definition with the informal

390 specifications given the corresponding Unix man pages, or even peek at an

391 actual implementation such as @{cite "Torvalds-et-al:Linux"}. Another common

392 way to gain confidence into the formal model is to run simple simulations

393 (see \secref{sec:unix-examples}), and check the results with that of

394 experiments performed on a real Unix system.

395 \<close>

398 subsection \<open>Basic properties of single transitions \label{sec:unix-single-trans}\<close>

400 text \<open>

401 The transition system \<open>root \<midarrow>x\<rightarrow> root'\<close> defined above determines a unique

402 result @{term root'} from given @{term root} and @{term x} (this is holds

403 rather trivially, since there is even only one clause for each operation).

404 This uniqueness statement will simplify our subsequent development to some

405 extent, since we only have to reason about a partial function rather than a

406 general relation.

407 \<close>

409 theorem transition_uniq:

410 assumes root': "root \<midarrow>x\<rightarrow> root'"

411 and root'': "root \<midarrow>x\<rightarrow> root''"

412 shows "root' = root''"

413 using root''

414 proof cases

415 case read

416 with root' show ?thesis by cases auto

417 next

418 case "write"

419 with root' show ?thesis by cases auto

420 next

421 case chmod

422 with root' show ?thesis by cases auto

423 next

424 case creat

425 with root' show ?thesis by cases auto

426 next

427 case unlink

428 with root' show ?thesis by cases auto

429 next

430 case mkdir

431 with root' show ?thesis by cases auto

432 next

433 case rmdir

434 with root' show ?thesis by cases auto

435 next

436 case readdir

437 with root' show ?thesis by cases blast+

438 qed

440 text \<open>

441 \<^medskip>

442 Apparently, file-system transitions are \<^emph>\<open>type-safe\<close> in the sense that the

443 result of transforming an actual directory yields again a directory.

444 \<close>

446 theorem transition_type_safe:

447 assumes tr: "root \<midarrow>x\<rightarrow> root'"

448 and inv: "\<exists>att dir. root = Env att dir"

449 shows "\<exists>att dir. root' = Env att dir"

450 proof (cases "path_of x")

451 case Nil

452 with tr inv show ?thesis

453 by cases (auto simp add: access_def split: if_splits)

454 next

455 case Cons

456 from tr obtain opt where

457 "root' = root \<or> root' = update (path_of x) opt root"

458 by cases auto

459 with inv Cons show ?thesis

460 by (auto simp add: update_eq split: list.splits)

461 qed

463 text \<open>

464 The previous result may be seen as the most basic invariant on the

465 file-system state that is enforced by any proper kernel implementation. So

466 user processes --- being bound to the system-call interface --- may never

467 mess up a file-system such that the root becomes a plain file instead of a

468 directory, which would be a strange situation indeed.

469 \<close>

472 subsection \<open>Iterated transitions\<close>

474 text \<open>

475 Iterated system transitions via finite sequences of system operations are

476 modeled inductively as follows. In a sense, this relation describes the

477 cumulative effect of the sequence of system-calls issued by a number of

478 running processes over a finite amount of time.

479 \<close>

481 inductive transitions :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool" ("_ \<Midarrow>_\<Rightarrow> _" [90, 1000, 90] 100)

482 where

483 nil: "root \<Midarrow>[]\<Rightarrow> root"

484 | cons: "root \<Midarrow>(x # xs)\<Rightarrow> root''" if "root \<midarrow>x\<rightarrow> root'" and "root' \<Midarrow>xs\<Rightarrow> root''"

486 text \<open>

487 \<^medskip>

488 We establish a few basic facts relating iterated transitions with single

489 ones, according to the recursive structure of lists.

490 \<close>

492 lemma transitions_nil_eq: "root \<Midarrow>[]\<Rightarrow> root' \<longleftrightarrow> root = root'"

493 proof

494 assume "root \<Midarrow>[]\<Rightarrow> root'"

495 then show "root = root'" by cases simp_all

496 next

497 assume "root = root'"

498 then show "root \<Midarrow>[]\<Rightarrow> root'" by (simp only: transitions.nil)

499 qed

501 lemma transitions_cons_eq:

502 "root \<Midarrow>(x # xs)\<Rightarrow> root'' \<longleftrightarrow> (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root'')"

503 proof

504 assume "root \<Midarrow>(x # xs)\<Rightarrow> root''"

505 then show "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"

506 by cases auto

507 next

508 assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"

509 then show "root \<Midarrow>(x # xs)\<Rightarrow> root''"

510 by (blast intro: transitions.cons)

511 qed

513 text \<open>

514 The next two rules show how to ``destruct'' known transition sequences. Note

515 that the second one actually relies on the uniqueness property of the basic

516 transition system (see \secref{sec:unix-single-trans}).

517 \<close>

519 lemma transitions_nilD: "root \<Midarrow>[]\<Rightarrow> root' \<Longrightarrow> root' = root"

520 by (simp add: transitions_nil_eq)

522 lemma transitions_consD:

523 assumes list: "root \<Midarrow>(x # xs)\<Rightarrow> root''"

524 and hd: "root \<midarrow>x\<rightarrow> root'"

525 shows "root' \<Midarrow>xs\<Rightarrow> root''"

526 proof -

527 from list obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' \<Midarrow>xs\<Rightarrow> root''"

528 by cases simp_all

529 from r' hd have "r' = root'" by (rule transition_uniq)

530 with root'' show "root' \<Midarrow>xs\<Rightarrow> root''" by simp

531 qed

533 text \<open>

534 \<^medskip>

535 The following fact shows how an invariant @{term Q} of single transitions

536 with property @{term P} may be transferred to iterated transitions. The

537 proof is rather obvious by rule induction over the definition of @{term

538 "root \<Midarrow>xs\<Rightarrow> root'"}.

539 \<close>

541 lemma transitions_invariant:

542 assumes r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

543 and trans: "root \<Midarrow>xs\<Rightarrow> root'"

544 shows "Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"

545 using trans

546 proof induct

547 case nil

548 show ?case by (rule nil.prems)

549 next

550 case (cons root x root' xs root'')

551 note P = \<open>\<forall>x \<in> set (x # xs). P x\<close>

552 then have "P x" by simp

553 with \<open>root \<midarrow>x\<rightarrow> root'\<close> and \<open>Q root\<close> have Q': "Q root'" by (rule r)

554 from P have "\<forall>x \<in> set xs. P x" by simp

555 with Q' show "Q root''" by (rule cons.hyps)

556 qed

558 text \<open>

559 As an example of applying the previous result, we transfer the basic

560 type-safety property (see \secref{sec:unix-single-trans}) from single

561 transitions to iterated ones, which is a rather obvious result indeed.

562 \<close>

564 theorem transitions_type_safe:

565 assumes "root \<Midarrow>xs\<Rightarrow> root'"

566 and "\<exists>att dir. root = Env att dir"

567 shows "\<exists>att dir. root' = Env att dir"

568 using transition_type_safe and assms

569 proof (rule transitions_invariant)

570 show "\<forall>x \<in> set xs. True" by blast

571 qed

574 section \<open>Executable sequences\<close>

576 text \<open>

577 An inductively defined relation such as the one of \<open>root \<midarrow>x\<rightarrow> root'\<close> (see

578 \secref{sec:unix-syscall}) has two main aspects. First of all, the resulting

579 system admits a certain set of transition rules (introductions) as given in

580 the specification. Furthermore, there is an explicit least-fixed-point

581 construction involved, which results in induction (and case-analysis) rules

582 to eliminate known transitions in an exhaustive manner.

584 \<^medskip>

585 Subsequently, we explore our transition system in an experimental style,

586 mainly using the introduction rules with basic algebraic properties of the

587 underlying structures. The technique closely resembles that of Prolog

588 combined with functional evaluation in a very simple manner.

590 Just as the ``closed-world assumption'' is left implicit in Prolog, we do

591 not refer to induction over the whole transition system here. So this is

592 still purely positive reasoning about possible executions; exhaustive

593 reasoning will be employed only later on (see \secref{sec:unix-bogosity}),

594 when we shall demonstrate that certain behavior is \<^emph>\<open>not\<close> possible.

595 \<close>

598 subsection \<open>Possible transitions\<close>

600 text \<open>

601 Rather obviously, a list of system operations can be executed within a

602 certain state if there is a result state reached by an iterated transition.

603 \<close>

605 definition "can_exec root xs \<longleftrightarrow> (\<exists>root'. root \<Midarrow>xs\<Rightarrow> root')"

607 lemma can_exec_nil: "can_exec root []"

608 unfolding can_exec_def by (blast intro: transitions.intros)

610 lemma can_exec_cons:

611 "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"

612 unfolding can_exec_def by (blast intro: transitions.intros)

614 text \<open>

615 \<^medskip>

616 In case that we already know that a certain sequence can be executed we may

617 destruct it backwardly into individual transitions.

618 \<close>

620 lemma can_exec_snocD: "can_exec root (xs @ [y])

621 \<Longrightarrow> \<exists>root' root''. root \<Midarrow>xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"

622 proof (induct xs arbitrary: root)

623 case Nil

624 then show ?case

625 by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

626 next

627 case (Cons x xs)

628 from \<open>can_exec root ((x # xs) @ [y])\<close> obtain r root'' where

629 x: "root \<midarrow>x\<rightarrow> r" and

630 xs_y: "r \<Midarrow>(xs @ [y])\<Rightarrow> root''"

631 by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

632 from xs_y Cons.hyps obtain root' r'

633 where xs: "r \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"

634 unfolding can_exec_def by blast

635 from x xs have "root \<Midarrow>(x # xs)\<Rightarrow> root'"

636 by (rule transitions.cons)

637 with y show ?case by blast

638 qed

641 subsection \<open>Example executions \label{sec:unix-examples}\<close>

643 text \<open>

644 We are now ready to perform a few experiments within our formal model of

645 Unix system-calls. The common technique is to alternate introduction rules

646 of the transition system (see \secref{sec:unix-trans}), and steps to solve

647 any emerging side conditions using algebraic properties of the underlying

648 file-system structures (see \secref{sec:unix-file-system}).

649 \<close>

651 lemmas eval = access_def init_def

653 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

654 [Mkdir u perms [u, name]]"

655 apply (rule can_exec_cons)

656 \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Cons})\<close>

657 apply (rule mkdir)

658 \<comment> \<open>back-chain @{term Mkdir}\<close>

659 apply (force simp add: eval)+

660 \<comment> \<open>solve preconditions of @{term Mkdir}\<close>

661 apply (simp add: eval)

662 \<comment> \<open>peek at resulting dir (optional)\<close>

663 txt \<open>@{subgoals [display]}\<close>

664 apply (rule can_exec_nil)

665 \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Nil})\<close>

666 done

668 text \<open>

669 By inspecting the result shown just before the final step above, we may gain

670 confidence that our specification of Unix system-calls actually makes sense.

671 Further common errors are usually exhibited when preconditions of transition

672 rules fail unexpectedly.

674 \<^medskip>

675 Here are a few further experiments, using the same techniques as before.

676 \<close>

678 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

679 [Creat u perms [u, name],

680 Unlink u [u, name]]"

681 apply (rule can_exec_cons)

682 apply (rule creat)

683 apply (force simp add: eval)+

684 apply (simp add: eval)

685 apply (rule can_exec_cons)

686 apply (rule unlink)

687 apply (force simp add: eval)+

688 apply (simp add: eval)

689 txt \<open>peek at result: @{subgoals [display]}\<close>

690 apply (rule can_exec_nil)

691 done

693 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow>

694 Readable \<in> perms\<^sub>2 \<Longrightarrow> name\<^sub>3 \<noteq> name\<^sub>4 \<Longrightarrow>

695 can_exec (init users)

696 [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

697 Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

698 Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

699 Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>4],

700 Readdir u {name\<^sub>3, name\<^sub>4} [u, name\<^sub>1, name\<^sub>2]]"

701 apply (rule can_exec_cons, rule transition.intros,

702 (force simp add: eval)+, (simp add: eval)?)+

703 txt \<open>peek at result: @{subgoals [display]}\<close>

704 apply (rule can_exec_nil)

705 done

707 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow> Readable \<in> perms\<^sub>3 \<Longrightarrow>

708 can_exec (init users)

709 [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

710 Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

711 Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

712 Write u' ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

713 Read u ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

714 apply (rule can_exec_cons, rule transition.intros,

715 (force simp add: eval)+, (simp add: eval)?)+

716 txt \<open>peek at result: @{subgoals [display]}\<close>

717 apply (rule can_exec_nil)

718 done

721 section \<open>Odd effects --- treated formally \label{sec:unix-bogosity}\<close>

723 text \<open>

724 We are now ready to give a completely formal treatment of the slightly odd

725 situation discussed in the introduction (see \secref{sec:unix-intro}): the

726 file-system can easily reach a state where a user is unable to remove his

727 very own directory, because it is still populated by items placed there by

728 another user in an uncouth manner.

729 \<close>

732 subsection \<open>The general procedure \label{sec:unix-inv-procedure}\<close>

734 text \<open>

735 The following theorem expresses the general procedure we are following to

736 achieve the main result.

737 \<close>

739 theorem general_procedure:

740 assumes cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"

741 and init_inv: "\<And>root. init users \<Midarrow>bs\<Rightarrow> root \<Longrightarrow> Q root"

742 and preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

743 and init_result: "init users \<Midarrow>bs\<Rightarrow> root"

744 shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"

745 proof -

746 {

747 fix xs

748 assume Ps: "\<forall>x \<in> set xs. P x"

749 assume can_exec: "can_exec root (xs @ [y])"

750 then obtain root' root'' where xs: "root \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"

751 by (blast dest: can_exec_snocD)

752 from init_result have "Q root" by (rule init_inv)

753 from preserve_inv xs this Ps have "Q root'"

754 by (rule transitions_invariant)

755 from this y have False by (rule cannot_y)

756 }

757 then show ?thesis by blast

758 qed

760 text \<open>

761 Here @{prop "P x"} refers to the restriction on file-system operations that

762 are admitted after having reached the critical configuration; according to

763 the problem specification this will become @{prop "uid_of x = user\<^sub>1"} later

764 on. Furthermore, @{term y} refers to the operations we claim to be

765 impossible to perform afterwards, we will take @{term Rmdir} later. Moreover

766 @{term Q} is a suitable (auxiliary) invariant over the file-system; choosing

767 @{term Q} adequately is very important to make the proof work (see

768 \secref{sec:unix-inv-lemmas}).

769 \<close>

772 subsection \<open>The particular situation\<close>

774 text \<open>

775 We introduce a few global declarations and axioms to describe our particular

776 situation considered here. Thus we avoid excessive use of local parameters

777 in the subsequent development.

778 \<close>

780 context

781 fixes users :: "uid set"

782 and user\<^sub>1 :: uid

783 and user\<^sub>2 :: uid

784 and name\<^sub>1 :: name

785 and name\<^sub>2 :: name

786 and name\<^sub>3 :: name

787 and perms\<^sub>1 :: perms

788 and perms\<^sub>2 :: perms

789 assumes user\<^sub>1_known: "user\<^sub>1 \<in> users"

790 and user\<^sub>1_not_root: "user\<^sub>1 \<noteq> 0"

791 and users_neq: "user\<^sub>1 \<noteq> user\<^sub>2"

792 and perms\<^sub>1_writable: "Writable \<in> perms\<^sub>1"

793 and perms\<^sub>2_not_writable: "Writable \<notin> perms\<^sub>2"

794 notes facts = user\<^sub>1_known user\<^sub>1_not_root users_neq

795 perms\<^sub>1_writable perms\<^sub>2_not_writable

796 begin

798 definition

799 "bogus =

800 [Mkdir user\<^sub>1 perms\<^sub>1 [user\<^sub>1, name\<^sub>1],

801 Mkdir user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2],

802 Creat user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

804 definition "bogus_path = [user\<^sub>1, name\<^sub>1, name\<^sub>2]"

806 text \<open>

807 The @{term bogus} operations are the ones that lead into the uncouth

808 situation; @{term bogus_path} is the key position within the file-system

809 where things go awry.

810 \<close>

813 subsection \<open>Invariance lemmas \label{sec:unix-inv-lemmas}\<close>

815 text \<open>

816 The following invariant over the root file-system describes the bogus

817 situation in an abstract manner: located at a certain @{term path} within

818 the file-system is a non-empty directory that is neither owned nor writable

819 by @{term user\<^sub>1}.

820 \<close>

822 definition

823 "invariant root path \<longleftrightarrow>

824 (\<exists>att dir.

825 access root path user\<^sub>1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>

826 user\<^sub>1 \<noteq> owner att \<and>

827 access root path user\<^sub>1 {Writable} = None)"

829 text \<open>

830 Following the general procedure (see \secref{sec:unix-inv-procedure}) we

831 will now establish the three key lemmas required to yield the final result.

833 \<^enum> The invariant is sufficiently strong to entail the pathological case

834 that @{term user\<^sub>1} is unable to remove the (owned) directory at @{term

835 "[user\<^sub>1, name\<^sub>1]"}.

837 \<^enum> The invariant does hold after having executed the @{term bogus} list of

838 operations (starting with an initial file-system configuration).

840 \<^enum> The invariant is preserved by any file-system operation performed by

841 @{term user\<^sub>1} alone, without any help by other users.

843 As the invariant appears both as assumptions and conclusions in the course

844 of proof, its formulation is rather critical for the whole development to

845 work out properly. In particular, the third step is very sensitive to the

846 invariant being either too strong or too weak. Moreover the invariant has to

847 be sufficiently abstract, lest the proof become cluttered by confusing

848 detail.

850 \<^medskip>

851 The first two lemmas are technically straight forward --- we just have to

852 inspect rather special cases.

853 \<close>

855 lemma cannot_rmdir:

856 assumes inv: "invariant root bogus_path"

857 and rmdir: "root \<midarrow>(Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1])\<rightarrow> root'"

858 shows False

859 proof -

860 from inv obtain "file" where "access root bogus_path user\<^sub>1 {} = Some file"

861 unfolding invariant_def by blast

862 moreover

863 from rmdir obtain att where "access root [user\<^sub>1, name\<^sub>1] user\<^sub>1 {} = Some (Env att empty)"

864 by cases auto

865 then have "access root ([user\<^sub>1, name\<^sub>1] @ [name\<^sub>2]) user\<^sub>1 {} = empty name\<^sub>2"

866 by (simp only: access_empty_lookup lookup_append_some) simp

867 ultimately show False by (simp add: bogus_path_def)

868 qed

870 lemma

871 assumes init: "init users \<Midarrow>bogus\<Rightarrow> root"

872 shows init_invariant: "invariant root bogus_path"

873 supply eval = facts access_def init_def

874 using init

875 apply (unfold bogus_def bogus_path_def)

876 apply (drule transitions_consD, rule transition.intros,

877 (force simp add: eval)+, (simp add: eval)?)+

878 \<comment> "evaluate all operations"

879 apply (drule transitions_nilD)

880 \<comment> "reach final result"

881 apply (simp add: invariant_def eval)

882 \<comment> "check the invariant"

883 done

885 text \<open>

886 \<^medskip>

887 At last we are left with the main effort to show that the ``bogosity''

888 invariant is preserved by any file-system operation performed by @{term

889 user\<^sub>1} alone. Note that this holds for any @{term path} given, the

890 particular @{term bogus_path} is not required here.

891 \<close>

893 lemma preserve_invariant:

894 assumes tr: "root \<midarrow>x\<rightarrow> root'"

895 and inv: "invariant root path"

896 and uid: "uid_of x = user\<^sub>1"

897 shows "invariant root' path"

898 proof -

899 from inv obtain att dir where

900 inv1: "access root path user\<^sub>1 {} = Some (Env att dir)" and

901 inv2: "dir \<noteq> empty" and

902 inv3: "user\<^sub>1 \<noteq> owner att" and

903 inv4: "access root path user\<^sub>1 {Writable} = None"

904 by (auto simp add: invariant_def)

906 from inv1 have lookup: "lookup root path = Some (Env att dir)"

907 by (simp only: access_empty_lookup)

909 from inv1 inv3 inv4 and user\<^sub>1_not_root

910 have not_writable: "Writable \<notin> others att"

911 by (auto simp add: access_def split: option.splits)

913 show ?thesis

914 proof cases

915 assume "root' = root"

916 with inv show "invariant root' path" by (simp only:)

917 next

918 assume changed: "root' \<noteq> root"

919 with tr obtain opt where root': "root' = update (path_of x) opt root"

920 by cases auto

921 show ?thesis

922 proof (rule prefix_cases)

923 assume "path_of x \<parallel> path"

924 with inv root'

925 have "\<And>perms. access root' path user\<^sub>1 perms = access root path user\<^sub>1 perms"

926 by (simp only: access_update_other)

927 with inv show "invariant root' path"

928 by (simp only: invariant_def)

929 next

930 assume "prefix (path_of x) path"

931 then obtain ys where path: "path = path_of x @ ys" ..

933 show ?thesis

934 proof (cases ys)

935 assume "ys = []"

936 with tr uid inv2 inv3 lookup changed path and user\<^sub>1_not_root

937 have False

938 by cases (auto simp add: access_empty_lookup dest: access_some_lookup)

939 then show ?thesis ..

940 next

941 fix z zs assume ys: "ys = z # zs"

942 have "lookup root' path = lookup root path"

943 proof -

944 from inv2 lookup path ys

945 have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"

946 by (simp only:)

947 then obtain att' dir' file' where

948 look': "lookup root (path_of x) = Some (Env att' dir')" and

949 dir': "dir' z = Some file'" and

950 file': "lookup file' zs = Some (Env att dir)"

951 by (blast dest: lookup_some_upper)

953 from tr uid changed look' dir' obtain att'' where

954 look'': "lookup root' (path_of x) = Some (Env att'' dir')"

955 by cases (auto simp add: access_empty_lookup lookup_update_some

956 dest: access_some_lookup)

957 with dir' file' have "lookup root' (path_of x @ z # zs) =

958 Some (Env att dir)"

959 by (simp add: lookup_append_some)

960 with look path ys show ?thesis

961 by simp

962 qed

963 with inv show "invariant root' path"

964 by (simp only: invariant_def access_def)

965 qed

966 next

967 assume "strict_prefix path (path_of x)"

968 then obtain y ys where path: "path_of x = path @ y # ys" ..

970 obtain dir' where

971 lookup': "lookup root' path = Some (Env att dir')" and

972 inv2': "dir' \<noteq> empty"

973 proof (cases ys)

974 assume "ys = []"

975 with path have parent: "path_of x = path @ [y]" by simp

976 with tr uid inv4 changed obtain "file" where

977 "root' = update (path_of x) (Some file) root"

978 by cases auto

979 with lookup parent have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"

980 by (simp only: update_append_some update_cons_nil_env)

981 moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp

982 ultimately show ?thesis ..

983 next

984 fix z zs assume ys: "ys = z # zs"

985 with lookup root' path

986 have "lookup root' path = Some (update (y # ys) opt (Env att dir))"

987 by (simp only: update_append_some)

988 also obtain file' where

989 "update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"

990 proof -

991 have "dir y \<noteq> None"

992 proof -

993 have "dir y = lookup (Env att dir) [y]"

994 by (simp split: option.splits)

995 also from lookup have "\<dots> = lookup root (path @ [y])"

996 by (simp only: lookup_append_some)

997 also have "\<dots> \<noteq> None"

998 proof -

999 from ys obtain us u where rev_ys: "ys = us @ [u]"

1000 by (cases ys rule: rev_cases) simp

1001 with tr path

1002 have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>

1003 lookup root ((path @ [y]) @ us) \<noteq> None"

1004 by cases (auto dest: access_some_lookup)

1005 then show ?thesis

1006 by (fastforce dest!: lookup_some_append)

1007 qed

1008 finally show ?thesis .

1009 qed

1010 with ys show ?thesis using that by auto

1011 qed

1012 also have "dir(y\<mapsto>file') \<noteq> empty" by simp

1013 ultimately show ?thesis ..

1014 qed

1016 from lookup' have inv1': "access root' path user\<^sub>1 {} = Some (Env att dir')"

1017 by (simp only: access_empty_lookup)

1019 from inv3 lookup' and not_writable user\<^sub>1_not_root

1020 have "access root' path user\<^sub>1 {Writable} = None"

1021 by (simp add: access_def)

1022 with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast

1023 qed

1024 qed

1025 qed

1028 subsection \<open>Putting it all together \label{sec:unix-main-result}\<close>

1030 text \<open>

1031 The main result is now imminent, just by composing the three invariance

1032 lemmas (see \secref{sec:unix-inv-lemmas}) according the the overall

1033 procedure (see \secref{sec:unix-inv-procedure}).

1034 \<close>

1036 corollary

1037 assumes bogus: "init users \<Midarrow>bogus\<Rightarrow> root"

1038 shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user\<^sub>1) \<and>

1039 can_exec root (xs @ [Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1]]))"

1040 proof -

1041 from cannot_rmdir init_invariant preserve_invariant

1042 and bogus show ?thesis by (rule general_procedure)

1043 qed

1045 end

1047 end