src/HOL/Unix/Unix.thy
 author wenzelm Mon Aug 31 21:28:08 2015 +0200 (2015-08-31) changeset 61070 b72a990adfe2 parent 60374 6b858199f240 child 61893 4121cc8479f8 permissions -rw-r--r--
prefer symbols;
     1 (*  Title:      HOL/Unix/Unix.thy

     2     Author:     Markus Wenzel, TU Muenchen

     3 *)

     4

     5 section \<open>Unix file-systems \label{sec:unix-file-system}\<close>

     6

     7 theory Unix

     8 imports

     9   Nested_Environment

    10   "~~/src/HOL/Library/Sublist"

    11 begin

    12

    13 text \<open>

    14   We give a simple mathematical model of the basic structures

    15   underlying the Unix file-system, together with a few fundamental

    16   operations that could be imagined to be performed internally by the

    17   Unix kernel.  This forms the basis for the set of Unix system-calls

    18   to be introduced later (see \secref{sec:unix-trans}), which are the

    19   actual interface offered to processes running in user-space.

    20

    21   \medskip Basically, any Unix file is either a \emph{plain file} or a

    22   \emph{directory}, consisting of some \emph{content} plus

    23   \emph{attributes}.  The content of a plain file is plain text.  The

    24   content of a directory is a mapping from names to further

    25   files.\footnote{In fact, this is the only way that names get

    26   associated with files.  In Unix files do \emph{not} have a name in

    27   itself.  Even more, any number of names may be associated with the

    28   very same file due to \emph{hard links} (although this is excluded

    29   from our model).}  Attributes include information to control various

    30   ways to access the file (read, write etc.).

    31

    32   Our model will be quite liberal in omitting excessive detail that is

    33   easily seen to be irrelevant'' for the aspects of Unix

    34   file-systems to be discussed here.  First of all, we ignore

    35   character and block special files, pipes, sockets, hard links,

    36   symbolic links, and mount points.

    37 \<close>

    38

    39

    40 subsection \<open>Names\<close>

    41

    42 text \<open>

    43   User ids and file name components shall be represented by natural

    44   numbers (without loss of generality).  We do not bother about

    45   encoding of actual names (e.g.\ strings), nor a mapping between user

    46   names and user ids as would be present in a reality.

    47 \<close>

    48

    49 type_synonym uid = nat

    50 type_synonym name = nat

    51 type_synonym path = "name list"

    52

    53

    54 subsection \<open>Attributes\<close>

    55

    56 text \<open>

    57   Unix file attributes mainly consist of \emph{owner} information and

    58   a number of \emph{permission} bits which control access for

    59   user'', group'', and others'' (see the Unix man pages @{text

    60   "chmod(2)"} and @{text "stat(2)"} for more details).

    61

    62   \medskip Our model of file permissions only considers the others''

    63   part.  The user'' field may be omitted without loss of overall

    64   generality, since the owner is usually able to change it anyway by

    65   performing @{text chmod}.\footnote{The inclined Unix expert may try

    66   to figure out some exotic arrangements of a real-world Unix

    67   file-system such that the owner of a file is unable to apply the

    68   @{text chmod} system call.}  We omit group'' permissions as a

    69   genuine simplification as we just do not intend to discuss a model

    70   of multiple groups and group membership, but pretend that everyone

    71   is member of a single global group.\footnote{A general HOL model of

    72   user group structures and related issues is given in

    73   @{cite "Naraschewski:2001"}.}

    74 \<close>

    75

    76 datatype perm =

    77     Readable

    78   | Writable

    79   | Executable    -- "(ignored)"

    80

    81 type_synonym perms = "perm set"

    82

    83 record att =

    84   owner :: uid

    85   others :: perms

    86

    87 text \<open>

    88   For plain files @{term Readable} and @{term Writable} specify read

    89   and write access to the actual content, i.e.\ the string of text

    90   stored here.  For directories @{term Readable} determines if the set

    91   of entry names may be accessed, and @{term Writable} controls the

    92   ability to create or delete any entries (both plain files or

    93   sub-directories).

    94

    95   As another simplification, we ignore the @{term Executable}

    96   permission altogether.  In reality it would indicate executable

    97   plain files (also known as binaries''), or control actual lookup

    98   of directory entries (recall that mere directory browsing is

    99   controlled via @{term Readable}).  Note that the latter means that

   100   in order to perform any file-system operation whatsoever, all

   101   directories encountered on the path would have to grant @{term

   102   Executable}.  We ignore this detail and pretend that all directories

   103   give @{term Executable} permission to anybody.

   104 \<close>

   105

   106

   107 subsection \<open>Files\<close>

   108

   109 text \<open>

   110   In order to model the general tree structure of a Unix file-system

   111   we use the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"}

   112   from the standard library of Isabelle/HOL

   113   @{cite "Bauer-et-al:2002:HOL-Library"}.  This type provides

   114   constructors @{term Val} and @{term Env} as follows:

   115

   116   \medskip

   117   {\def\isastyleminor{\isastyle}

   118   \begin{tabular}{l}

   119   @{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\

   120   @{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\

   121   \end{tabular}}

   122   \medskip

   123

   124   Here the parameter @{typ 'a} refers to plain values occurring at

   125   leaf positions, parameter @{typ 'b} to information kept with inner

   126   branch nodes, and parameter @{typ 'c} to the branching type of the

   127   tree structure.  For our purpose we use the type instance with @{typ

   128   "att \<times> string"} (representing plain files), @{typ att} (for

   129   attributes of directory nodes), and @{typ name} (for the index type

   130   of directory nodes).

   131 \<close>

   132

   133 type_synonym "file" = "(att \<times> string, att, name) env"

   134

   135 text \<open>

   136   \medskip The HOL library also provides @{term lookup} and @{term

   137   update} operations for general tree structures with the subsequent

   138   primitive recursive characterizations.

   139

   140   \medskip

   141   {\def\isastyleminor{\isastyle}

   142   \begin{tabular}{l}

   143   @{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\

   144   @{term [source]

   145   "update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\

   146   \end{tabular}}

   147

   148   @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}

   149   @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}

   150

   151   Several further properties of these operations are proven in

   152   @{cite "Bauer-et-al:2002:HOL-Library"}.  These will be routinely used

   153   later on without further notice.

   154

   155   \bigskip Apparently, the elements of type @{typ "file"} contain an

   156   @{typ att} component in either case.  We now define a few auxiliary

   157   operations to manipulate this field uniformly, following the

   158   conventions for record types in Isabelle/HOL

   159   @{cite "Nipkow-et-al:2000:HOL"}.

   160 \<close>

   161

   162 definition

   163   "attributes file =

   164     (case file of

   165       Val (att, text) \<Rightarrow> att

   166     | Env att dir \<Rightarrow> att)"

   167

   168 definition

   169   "map_attributes f file =

   170     (case file of

   171       Val (att, text) \<Rightarrow> Val (f att, text)

   172     | Env att dir \<Rightarrow> Env (f att) dir)"

   173

   174 lemma [simp]: "attributes (Val (att, text)) = att"

   175   by (simp add: attributes_def)

   176

   177 lemma [simp]: "attributes (Env att dir) = att"

   178   by (simp add: attributes_def)

   179

   180 lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"

   181   by (cases "file") (simp_all add: attributes_def map_attributes_def

   182     split_tupled_all)

   183

   184 lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"

   185   by (simp add: map_attributes_def)

   186

   187 lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"

   188   by (simp add: map_attributes_def)

   189

   190

   191 subsection \<open>Initial file-systems\<close>

   192

   193 text \<open>

   194   Given a set of \emph{known users} a file-system shall be initialized

   195   by providing an empty home directory for each user, with read-only

   196   access for everyone else.  (Note that we may directly use the user

   197   id as home directory name, since both types have been identified.)

   198   Certainly, the very root directory is owned by the super user (who

   199   has user id 0).

   200 \<close>

   201

   202 definition

   203   "init users =

   204     Env \<lparr>owner = 0, others = {Readable}\<rparr>

   205       (\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)

   206         else None)"

   207

   208

   209 subsection \<open>Accessing file-systems\<close>

   210

   211 text \<open>

   212   The main internal file-system operation is access of a file by a

   213   user, requesting a certain set of permissions.  The resulting @{typ

   214   "file option"} indicates if a file had been present at the

   215   corresponding @{typ path} and if access was granted according to the

   216   permissions recorded within the file-system.

   217

   218   Note that by the rules of Unix file-system security (e.g.\

   219   @{cite "Tanenbaum:1992"}) both the super-user and owner may always

   220   access a file unconditionally (in our simplified model).

   221 \<close>

   222

   223 definition

   224   "access root path uid perms =

   225     (case lookup root path of

   226       None \<Rightarrow> None

   227     | Some file \<Rightarrow>

   228         if uid = 0

   229           \<or> uid = owner (attributes file)

   230           \<or> perms \<subseteq> others (attributes file)

   231         then Some file

   232         else None)"

   233

   234 text \<open>

   235   \medskip Successful access to a certain file is the main

   236   prerequisite for system-calls to be applicable (cf.\

   237   \secref{sec:unix-trans}).  Any modification of the file-system is

   238   then performed using the basic @{term update} operation.

   239

   240   \medskip We see that @{term access} is just a wrapper for the basic

   241   @{term lookup} function, with additional checking of

   242   attributes. Subsequently we establish a few auxiliary facts that

   243   stem from the primitive @{term lookup} used within @{term access}.

   244 \<close>

   245

   246 lemma access_empty_lookup: "access root path uid {} = lookup root path"

   247   by (simp add: access_def split: option.splits)

   248

   249 lemma access_some_lookup:

   250   "access root path uid perms = Some file \<Longrightarrow>

   251     lookup root path = Some file"

   252   by (simp add: access_def split: option.splits if_splits)

   253

   254 lemma access_update_other:

   255   assumes parallel: "path' \<parallel> path"

   256   shows "access (update path' opt root) path uid perms = access root path uid perms"

   257 proof -

   258   from parallel obtain y z xs ys zs where

   259       "y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"

   260     by (blast dest: parallel_decomp)

   261   then have "lookup (update path' opt root) path = lookup root path"

   262     by (blast intro: lookup_update_other)

   263   then show ?thesis by (simp only: access_def)

   264 qed

   265

   266

   267 section \<open>File-system transitions \label{sec:unix-trans}\<close>

   268

   269 subsection \<open>Unix system calls \label{sec:unix-syscall}\<close>

   270

   271 text \<open>

   272   According to established operating system design (cf.\

   273   @{cite "Tanenbaum:1992"}) user space processes may only initiate system

   274   operations by a fixed set of \emph{system-calls}.  This enables the

   275   kernel to enforce certain security policies in the first

   276   place.\footnote{Incidently, this is the very same principle employed

   277   by any LCF-style'' theorem proving system according to Milner's

   278   principle of correctness by construction'', such as Isabelle/HOL

   279   itself.}

   280

   281   \medskip In our model of Unix we give a fixed datatype @{text

   282   operation} for the syntax of system-calls, together with an

   283   inductive definition of file-system state transitions of the form

   284   @{text "root \<midarrow>x\<rightarrow> root'"} for the operational semantics.

   285 \<close>

   286

   287 datatype operation =

   288     Read uid string path

   289   | Write uid string path

   290   | Chmod uid perms path

   291   | Creat uid perms path

   292   | Unlink uid path

   293   | Mkdir uid perms path

   294   | Rmdir uid path

   295   | Readdir uid "name set" path

   296

   297 text \<open>

   298   The @{typ uid} field of an operation corresponds to the

   299   \emph{effective user id} of the underlying process, although our

   300   model never mentions processes explicitly.  The other parameters are

   301   provided as arguments by the caller; the @{term path} one is common

   302   to all kinds of system-calls.

   303 \<close>

   304

   305 primrec uid_of :: "operation \<Rightarrow> uid"

   306 where

   307   "uid_of (Read uid text path) = uid"

   308 | "uid_of (Write uid text path) = uid"

   309 | "uid_of (Chmod uid perms path) = uid"

   310 | "uid_of (Creat uid perms path) = uid"

   311 | "uid_of (Unlink uid path) = uid"

   312 | "uid_of (Mkdir uid path perms) = uid"

   313 | "uid_of (Rmdir uid path) = uid"

   314 | "uid_of (Readdir uid names path) = uid"

   315

   316 primrec path_of :: "operation \<Rightarrow> path"

   317 where

   318   "path_of (Read uid text path) = path"

   319 | "path_of (Write uid text path) = path"

   320 | "path_of (Chmod uid perms path) = path"

   321 | "path_of (Creat uid perms path) = path"

   322 | "path_of (Unlink uid path) = path"

   323 | "path_of (Mkdir uid perms path) = path"

   324 | "path_of (Rmdir uid path) = path"

   325 | "path_of (Readdir uid names path) = path"

   326

   327 text \<open>

   328   \medskip Note that we have omitted explicit @{text Open} and @{text

   329   Close} operations, pretending that @{term Read} and @{term Write}

   330   would already take care of this behind the scenes.  Thus we have

   331   basically treated actual sequences of real system-calls @{text

   332   "open"}--@{text read}/@{text write}--@{text close} as atomic.

   333

   334   In principle, this could make big a difference in a model with

   335   explicit concurrent processes.  On the other hand, even on a real

   336   Unix system the exact scheduling of concurrent @{text "open"} and

   337   @{text close} operations does \emph{not} directly affect the success

   338   of corresponding @{text read} or @{text write}.  Unix allows several

   339   processes to have files opened at the same time, even for writing!

   340   Certainly, the result from reading the contents later may be hard to

   341   predict, but the system-calls involved here will succeed in any

   342   case.

   343

   344   \bigskip The operational semantics of system calls is now specified

   345   via transitions of the file-system configuration.  This is expressed

   346   as an inductive relation (although there is no actual recursion

   347   involved here).

   348 \<close>

   349

   350 inductive transition :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"

   351   ("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)

   352 where

   353   read:

   354     "access root path uid {Readable} = Some (Val (att, text)) \<Longrightarrow>

   355       root \<midarrow>(Read uid text path)\<rightarrow> root" |

   356   "write":

   357     "access root path uid {Writable} = Some (Val (att, text')) \<Longrightarrow>

   358       root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root" |

   359

   360   chmod:

   361     "access root path uid {} = Some file \<Longrightarrow>

   362       uid = 0 \<or> uid = owner (attributes file) \<Longrightarrow>

   363       root \<midarrow>(Chmod uid perms path)\<rightarrow> update path

   364         (Some (map_attributes (others_update (\<lambda>_. perms)) file)) root" |

   365

   366   creat:

   367     "path = parent_path @ [name] \<Longrightarrow>

   368       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>

   369       access root path uid {} = None \<Longrightarrow>

   370       root \<midarrow>(Creat uid perms path)\<rightarrow> update path

   371         (Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root" |

   372   unlink:

   373     "path = parent_path @ [name] \<Longrightarrow>

   374       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>

   375       access root path uid {} = Some (Val plain) \<Longrightarrow>

   376       root \<midarrow>(Unlink uid path)\<rightarrow> update path None root" |

   377

   378   mkdir:

   379     "path = parent_path @ [name] \<Longrightarrow>

   380       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>

   381       access root path uid {} = None \<Longrightarrow>

   382       root \<midarrow>(Mkdir uid perms path)\<rightarrow> update path

   383         (Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root" |

   384   rmdir:

   385     "path = parent_path @ [name] \<Longrightarrow>

   386       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>

   387       access root path uid {} = Some (Env att' empty) \<Longrightarrow>

   388       root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root" |

   389

   390   readdir:

   391     "access root path uid {Readable} = Some (Env att dir) \<Longrightarrow>

   392       names = dom dir \<Longrightarrow>

   393       root \<midarrow>(Readdir uid names path)\<rightarrow> root"

   394

   395 text \<open>

   396   \medskip Certainly, the above specification is central to the whole

   397   formal development.  Any of the results to be established later on

   398   are only meaningful to the outside world if this transition system

   399   provides an adequate model of real Unix systems.  This kind of

   400   reality-check'' of a formal model is the well-known problem of

   401   \emph{validation}.

   402

   403   If in doubt, one may consider to compare our definition with the

   404   informal specifications given the corresponding Unix man pages, or

   405   even peek at an actual implementation such as

   406   @{cite "Torvalds-et-al:Linux"}.  Another common way to gain confidence

   407   into the formal model is to run simple simulations (see

   408   \secref{sec:unix-examples}), and check the results with that of

   409   experiments performed on a real Unix system.

   410 \<close>

   411

   412

   413 subsection \<open>Basic properties of single transitions \label{sec:unix-single-trans}\<close>

   414

   415 text \<open>

   416   The transition system @{text "root \<midarrow>x\<rightarrow> root'"} defined above

   417   determines a unique result @{term root'} from given @{term root} and

   418   @{term x} (this is holds rather trivially, since there is even only

   419   one clause for each operation).  This uniqueness statement will

   420   simplify our subsequent development to some extent, since we only

   421   have to reason about a partial function rather than a general

   422   relation.

   423 \<close>

   424

   425 theorem transition_uniq:

   426   assumes root': "root \<midarrow>x\<rightarrow> root'"

   427     and root'': "root \<midarrow>x\<rightarrow> root''"

   428   shows "root' = root''"

   429   using root''

   430 proof cases

   431   case read

   432   with root' show ?thesis by cases auto

   433 next

   434   case "write"

   435   with root' show ?thesis by cases auto

   436 next

   437   case chmod

   438   with root' show ?thesis by cases auto

   439 next

   440   case creat

   441   with root' show ?thesis by cases auto

   442 next

   443   case unlink

   444   with root' show ?thesis by cases auto

   445 next

   446   case mkdir

   447   with root' show ?thesis by cases auto

   448 next

   449   case rmdir

   450   with root' show ?thesis by cases auto

   451 next

   452   case readdir

   453   with root' show ?thesis by cases blast+

   454 qed

   455

   456 text \<open>

   457   \medskip Apparently, file-system transitions are \emph{type-safe} in

   458   the sense that the result of transforming an actual directory yields

   459   again a directory.

   460 \<close>

   461

   462 theorem transition_type_safe:

   463   assumes tr: "root \<midarrow>x\<rightarrow> root'"

   464     and inv: "\<exists>att dir. root = Env att dir"

   465   shows "\<exists>att dir. root' = Env att dir"

   466 proof (cases "path_of x")

   467   case Nil

   468   with tr inv show ?thesis

   469     by cases (auto simp add: access_def split: if_splits)

   470 next

   471   case Cons

   472   from tr obtain opt where

   473       "root' = root \<or> root' = update (path_of x) opt root"

   474     by cases auto

   475   with inv Cons show ?thesis

   476     by (auto simp add: update_eq split: list.splits)

   477 qed

   478

   479 text \<open>

   480   The previous result may be seen as the most basic invariant on the

   481   file-system state that is enforced by any proper kernel

   482   implementation.  So user processes --- being bound to the

   483   system-call interface --- may never mess up a file-system such that

   484   the root becomes a plain file instead of a directory, which would be

   485   a strange situation indeed.

   486 \<close>

   487

   488

   489 subsection \<open>Iterated transitions\<close>

   490

   491 text \<open>

   492   Iterated system transitions via finite sequences of system

   493   operations are modeled inductively as follows.  In a sense, this

   494   relation describes the cumulative effect of the sequence of

   495   system-calls issued by a number of running processes over a finite

   496   amount of time.

   497 \<close>

   498

   499 inductive transitions :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool"

   500   ("_ =_\<Rightarrow> _" [90, 1000, 90] 100)

   501 where

   502   nil: "root =[]\<Rightarrow> root"

   503 | cons: "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> root' =xs\<Rightarrow> root'' \<Longrightarrow> root =(x # xs)\<Rightarrow> root''"

   504

   505 text \<open>

   506   \medskip We establish a few basic facts relating iterated

   507   transitions with single ones, according to the recursive structure

   508   of lists.

   509 \<close>

   510

   511 lemma transitions_nil_eq: "root =[]\<Rightarrow> root' = (root = root')"

   512 proof

   513   assume "root =[]\<Rightarrow> root'"

   514   then show "root = root'" by cases simp_all

   515 next

   516   assume "root = root'"

   517   then show "root =[]\<Rightarrow> root'" by (simp only: transitions.nil)

   518 qed

   519

   520 lemma transitions_cons_eq:

   521   "root =(x # xs)\<Rightarrow> root'' = (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root'')"

   522 proof

   523   assume "root =(x # xs)\<Rightarrow> root''"

   524   then show "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"

   525     by cases auto

   526 next

   527   assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"

   528   then show "root =(x # xs)\<Rightarrow> root''"

   529     by (blast intro: transitions.cons)

   530 qed

   531

   532 text \<open>

   533   The next two rules show how to destruct'' known transition

   534   sequences.  Note that the second one actually relies on the

   535   uniqueness property of the basic transition system (see

   536   \secref{sec:unix-single-trans}).

   537 \<close>

   538

   539 lemma transitions_nilD: "root =[]\<Rightarrow> root' \<Longrightarrow> root' = root"

   540   by (simp add: transitions_nil_eq)

   541

   542 lemma transitions_consD:

   543   assumes list: "root =(x # xs)\<Rightarrow> root''"

   544     and hd: "root \<midarrow>x\<rightarrow> root'"

   545   shows "root' =xs\<Rightarrow> root''"

   546 proof -

   547   from list obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' =xs\<Rightarrow> root''"

   548     by cases simp_all

   549   from r' hd have "r' = root'" by (rule transition_uniq)

   550   with root'' show "root' =xs\<Rightarrow> root''" by simp

   551 qed

   552

   553 text \<open>

   554   \medskip The following fact shows how an invariant @{term Q} of

   555   single transitions with property @{term P} may be transferred to

   556   iterated transitions.  The proof is rather obvious by rule induction

   557   over the definition of @{term "root =xs\<Rightarrow> root'"}.

   558 \<close>

   559

   560 lemma transitions_invariant:

   561   assumes r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

   562     and trans: "root =xs\<Rightarrow> root'"

   563   shows "Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"

   564   using trans

   565 proof induct

   566   case nil

   567   show ?case by (rule nil.prems)

   568 next

   569   case (cons root x root' xs root'')

   570   note P = \<open>\<forall>x \<in> set (x # xs). P x\<close>

   571   then have "P x" by simp

   572   with \<open>root \<midarrow>x\<rightarrow> root'\<close> and \<open>Q root\<close> have Q': "Q root'" by (rule r)

   573   from P have "\<forall>x \<in> set xs. P x" by simp

   574   with Q' show "Q root''" by (rule cons.hyps)

   575 qed

   576

   577 text \<open>

   578   As an example of applying the previous result, we transfer the basic

   579   type-safety property (see \secref{sec:unix-single-trans}) from

   580   single transitions to iterated ones, which is a rather obvious

   581   result indeed.

   582 \<close>

   583

   584 theorem transitions_type_safe:

   585   assumes "root =xs\<Rightarrow> root'"

   586     and "\<exists>att dir. root = Env att dir"

   587   shows "\<exists>att dir. root' = Env att dir"

   588   using transition_type_safe and assms

   589 proof (rule transitions_invariant)

   590   show "\<forall>x \<in> set xs. True" by blast

   591 qed

   592

   593

   594 section \<open>Executable sequences\<close>

   595

   596 text \<open>

   597   An inductively defined relation such as the one of @{text "root \<midarrow>x\<rightarrow>

   598   root'"} (see \secref{sec:unix-syscall}) has two main aspects.  First

   599   of all, the resulting system admits a certain set of transition

   600   rules (introductions) as given in the specification.  Furthermore,

   601   there is an explicit least-fixed-point construction involved, which

   602   results in induction (and case-analysis) rules to eliminate known

   603   transitions in an exhaustive manner.

   604

   605   \medskip Subsequently, we explore our transition system in an

   606   experimental style, mainly using the introduction rules with basic

   607   algebraic properties of the underlying structures.  The technique

   608   closely resembles that of Prolog combined with functional evaluation

   609   in a very simple manner.

   610

   611   Just as the closed-world assumption'' is left implicit in Prolog,

   612   we do not refer to induction over the whole transition system here.

   613   So this is still purely positive reasoning about possible

   614   executions; exhaustive reasoning will be employed only later on (see

   615   \secref{sec:unix-bogosity}), when we shall demonstrate that certain

   616   behavior is \emph{not} possible.

   617 \<close>

   618

   619

   620 subsection \<open>Possible transitions\<close>

   621

   622 text \<open>

   623   Rather obviously, a list of system operations can be executed within

   624   a certain state if there is a result state reached by an iterated

   625   transition.

   626 \<close>

   627

   628 definition "can_exec root xs = (\<exists>root'. root =xs\<Rightarrow> root')"

   629

   630 lemma can_exec_nil: "can_exec root []"

   631   unfolding can_exec_def by (blast intro: transitions.intros)

   632

   633 lemma can_exec_cons:

   634     "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"

   635   unfolding can_exec_def by (blast intro: transitions.intros)

   636

   637 text \<open>

   638   \medskip In case that we already know that a certain sequence can be

   639   executed we may destruct it backwardly into individual transitions.

   640 \<close>

   641

   642 lemma can_exec_snocD: "can_exec root (xs @ [y])

   643     \<Longrightarrow> \<exists>root' root''. root =xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"

   644 proof (induct xs arbitrary: root)

   645   case Nil

   646   then show ?case

   647     by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

   648 next

   649   case (Cons x xs)

   650   from \<open>can_exec root ((x # xs) @ [y])\<close> obtain r root'' where

   651       x: "root \<midarrow>x\<rightarrow> r" and

   652       xs_y: "r =(xs @ [y])\<Rightarrow> root''"

   653     by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

   654   from xs_y Cons.hyps obtain root' r' where xs: "r =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"

   655     unfolding can_exec_def by blast

   656   from x xs have "root =(x # xs)\<Rightarrow> root'"

   657     by (rule transitions.cons)

   658   with y show ?case by blast

   659 qed

   660

   661

   662 subsection \<open>Example executions \label{sec:unix-examples}\<close>

   663

   664 text \<open>

   665   We are now ready to perform a few experiments within our formal

   666   model of Unix system-calls.  The common technique is to alternate

   667   introduction rules of the transition system (see

   668   \secref{sec:unix-trans}), and steps to solve any emerging side

   669   conditions using algebraic properties of the underlying file-system

   670   structures (see \secref{sec:unix-file-system}).

   671 \<close>

   672

   673 lemmas eval = access_def init_def

   674

   675 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

   676     [Mkdir u perms [u, name]]"

   677   apply (rule can_exec_cons)

   678     -- \<open>back-chain @{term can_exec} (of @{term [source] Cons})\<close>

   679   apply (rule mkdir)

   680     -- \<open>back-chain @{term Mkdir}\<close>

   681   apply (force simp add: eval)+

   682     -- \<open>solve preconditions of @{term Mkdir}\<close>

   683   apply (simp add: eval)

   684     -- \<open>peek at resulting dir (optional)\<close>

   685   txt \<open>@{subgoals [display]}\<close>

   686   apply (rule can_exec_nil)

   687     -- \<open>back-chain @{term can_exec} (of @{term [source] Nil})\<close>

   688   done

   689

   690 text \<open>

   691   By inspecting the result shown just before the final step above, we

   692   may gain confidence that our specification of Unix system-calls

   693   actually makes sense.  Further common errors are usually exhibited

   694   when preconditions of transition rules fail unexpectedly.

   695

   696   \medskip Here are a few further experiments, using the same

   697   techniques as before.

   698 \<close>

   699

   700 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

   701    [Creat u perms [u, name],

   702     Unlink u [u, name]]"

   703   apply (rule can_exec_cons)

   704   apply (rule creat)

   705   apply (force simp add: eval)+

   706   apply (simp add: eval)

   707   apply (rule can_exec_cons)

   708   apply (rule unlink)

   709   apply (force simp add: eval)+

   710   apply (simp add: eval)

   711   txt \<open>peek at result: @{subgoals [display]}\<close>

   712   apply (rule can_exec_nil)

   713   done

   714

   715 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow>

   716   Readable \<in> perms\<^sub>2 \<Longrightarrow> name\<^sub>3 \<noteq> name\<^sub>4 \<Longrightarrow>

   717   can_exec (init users)

   718    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

   719     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

   720     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   721     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>4],

   722     Readdir u {name\<^sub>3, name\<^sub>4} [u, name\<^sub>1, name\<^sub>2]]"

   723   apply (rule can_exec_cons, rule transition.intros,

   724     (force simp add: eval)+, (simp add: eval)?)+

   725   txt \<open>peek at result: @{subgoals [display]}\<close>

   726   apply (rule can_exec_nil)

   727   done

   728

   729 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow> Readable \<in> perms\<^sub>3 \<Longrightarrow>

   730   can_exec (init users)

   731    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

   732     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

   733     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   734     Write u' ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   735     Read u ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

   736   apply (rule can_exec_cons, rule transition.intros,

   737     (force simp add: eval)+, (simp add: eval)?)+

   738   txt \<open>peek at result: @{subgoals [display]}\<close>

   739   apply (rule can_exec_nil)

   740   done

   741

   742

   743 section \<open>Odd effects --- treated formally \label{sec:unix-bogosity}\<close>

   744

   745 text \<open>

   746   We are now ready to give a completely formal treatment of the

   747   slightly odd situation discussed in the introduction (see

   748   \secref{sec:unix-intro}): the file-system can easily reach a state

   749   where a user is unable to remove his very own directory, because it

   750   is still populated by items placed there by another user in an

   751   uncouth manner.

   752 \<close>

   753

   754 subsection \<open>The general procedure \label{sec:unix-inv-procedure}\<close>

   755

   756 text \<open>

   757   The following theorem expresses the general procedure we are

   758   following to achieve the main result.

   759 \<close>

   760

   761 theorem general_procedure:

   762   assumes cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"

   763     and init_inv: "\<And>root. init users =bs\<Rightarrow> root \<Longrightarrow> Q root"

   764     and preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

   765     and init_result: "init users =bs\<Rightarrow> root"

   766   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"

   767 proof -

   768   {

   769     fix xs

   770     assume Ps: "\<forall>x \<in> set xs. P x"

   771     assume can_exec: "can_exec root (xs @ [y])"

   772     then obtain root' root'' where

   773         xs: "root =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"

   774       by (blast dest: can_exec_snocD)

   775     from init_result have "Q root" by (rule init_inv)

   776     from preserve_inv xs this Ps have "Q root'"

   777       by (rule transitions_invariant)

   778     from this y have False by (rule cannot_y)

   779   }

   780   then show ?thesis by blast

   781 qed

   782

   783 text \<open>

   784   Here @{prop "P x"} refers to the restriction on file-system

   785   operations that are admitted after having reached the critical

   786   configuration; according to the problem specification this will

   787   become @{prop "uid_of x = user\<^sub>1"} later on.  Furthermore,

   788   @{term y} refers to the operations we claim to be impossible to

   789   perform afterwards, we will take @{term Rmdir} later.  Moreover

   790   @{term Q} is a suitable (auxiliary) invariant over the file-system;

   791   choosing @{term Q} adequately is very important to make the proof

   792   work (see \secref{sec:unix-inv-lemmas}).

   793 \<close>

   794

   795

   796 subsection \<open>The particular situation\<close>

   797

   798 text \<open>

   799   We introduce a few global declarations and axioms to describe our

   800   particular situation considered here.  Thus we avoid excessive use

   801   of local parameters in the subsequent development.

   802 \<close>

   803

   804 context

   805   fixes users :: "uid set"

   806     and user\<^sub>1 :: uid

   807     and user\<^sub>2 :: uid

   808     and name\<^sub>1 :: name

   809     and name\<^sub>2 :: name

   810     and name\<^sub>3 :: name

   811     and perms\<^sub>1 :: perms

   812     and perms\<^sub>2 :: perms

   813   assumes user\<^sub>1_known: "user\<^sub>1 \<in> users"

   814     and user\<^sub>1_not_root: "user\<^sub>1 \<noteq> 0"

   815     and users_neq: "user\<^sub>1 \<noteq> user\<^sub>2"

   816     and perms\<^sub>1_writable: "Writable \<in> perms\<^sub>1"

   817     and perms\<^sub>2_not_writable: "Writable \<notin> perms\<^sub>2"

   818   notes facts = user\<^sub>1_known user\<^sub>1_not_root users_neq

   819     perms\<^sub>1_writable perms\<^sub>2_not_writable

   820 begin

   821

   822 definition

   823   "bogus =

   824      [Mkdir user\<^sub>1 perms\<^sub>1 [user\<^sub>1, name\<^sub>1],

   825       Mkdir user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2],

   826       Creat user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

   827

   828 definition "bogus_path = [user\<^sub>1, name\<^sub>1, name\<^sub>2]"

   829

   830 text \<open>

   831   The @{term bogus} operations are the ones that lead into the uncouth

   832   situation; @{term bogus_path} is the key position within the

   833   file-system where things go awry.

   834 \<close>

   835

   836

   837 subsection \<open>Invariance lemmas \label{sec:unix-inv-lemmas}\<close>

   838

   839 text \<open>

   840   The following invariant over the root file-system describes the

   841   bogus situation in an abstract manner: located at a certain @{term

   842   path} within the file-system is a non-empty directory that is

   843   neither owned nor writable by @{term user\<^sub>1}.

   844 \<close>

   845

   846 definition

   847   "invariant root path =

   848     (\<exists>att dir.

   849       access root path user\<^sub>1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>

   850       user\<^sub>1 \<noteq> owner att \<and>

   851       access root path user\<^sub>1 {Writable} = None)"

   852

   853 text \<open>

   854   Following the general procedure (see

   855   \secref{sec:unix-inv-procedure}) we will now establish the three key

   856   lemmas required to yield the final result.

   857

   858   \begin{enumerate}

   859

   860   \item The invariant is sufficiently strong to entail the

   861   pathological case that @{term user\<^sub>1} is unable to remove the

   862   (owned) directory at @{term "[user\<^sub>1, name\<^sub>1]"}.

   863

   864   \item The invariant does hold after having executed the @{term

   865   bogus} list of operations (starting with an initial file-system

   866   configuration).

   867

   868   \item The invariant is preserved by any file-system operation

   869   performed by @{term user\<^sub>1} alone, without any help by other

   870   users.

   871

   872   \end{enumerate}

   873

   874   As the invariant appears both as assumptions and conclusions in the

   875   course of proof, its formulation is rather critical for the whole

   876   development to work out properly.  In particular, the third step is

   877   very sensitive to the invariant being either too strong or too weak.

   878   Moreover the invariant has to be sufficiently abstract, lest the

   879   proof become cluttered by confusing detail.

   880

   881   \medskip The first two lemmas are technically straight forward ---

   882   we just have to inspect rather special cases.

   883 \<close>

   884

   885 lemma cannot_rmdir:

   886   assumes inv: "invariant root bogus_path"

   887     and rmdir: "root \<midarrow>(Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1])\<rightarrow> root'"

   888   shows False

   889 proof -

   890   from inv obtain "file" where "access root bogus_path user\<^sub>1 {} = Some file"

   891     unfolding invariant_def by blast

   892   moreover

   893   from rmdir obtain att where

   894       "access root [user\<^sub>1, name\<^sub>1] user\<^sub>1 {} = Some (Env att empty)"

   895     by cases auto

   896   then have "access root ([user\<^sub>1, name\<^sub>1] @ [name\<^sub>2]) user\<^sub>1 {} = empty name\<^sub>2"

   897     by (simp only: access_empty_lookup lookup_append_some) simp

   898   ultimately show False by (simp add: bogus_path_def)

   899 qed

   900

   901 lemma

   902   assumes init: "init users =bogus\<Rightarrow> root"

   903   shows init_invariant: "invariant root bogus_path"

   904   supply eval = facts access_def init_def

   905   using init

   906   apply (unfold bogus_def bogus_path_def)

   907   apply (drule transitions_consD, rule transition.intros,

   908       (force simp add: eval)+, (simp add: eval)?)+

   909     -- "evaluate all operations"

   910   apply (drule transitions_nilD)

   911     -- "reach final result"

   912   apply (simp add: invariant_def eval)

   913     -- "check the invariant"

   914   done

   915

   916 text \<open>

   917   \medskip At last we are left with the main effort to show that the

   918   bogosity'' invariant is preserved by any file-system operation

   919   performed by @{term user\<^sub>1} alone.  Note that this holds for

   920   any @{term path} given, the particular @{term bogus_path} is not

   921   required here.

   922 \<close>

   923

   924 lemma preserve_invariant:

   925   assumes tr: "root \<midarrow>x\<rightarrow> root'"

   926     and inv: "invariant root path"

   927     and uid: "uid_of x = user\<^sub>1"

   928   shows "invariant root' path"

   929 proof -

   930   from inv obtain att dir where

   931       inv1: "access root path user\<^sub>1 {} = Some (Env att dir)" and

   932       inv2: "dir \<noteq> empty" and

   933       inv3: "user\<^sub>1 \<noteq> owner att" and

   934       inv4: "access root path user\<^sub>1 {Writable} = None"

   935     by (auto simp add: invariant_def)

   936

   937   from inv1 have lookup: "lookup root path = Some (Env att dir)"

   938     by (simp only: access_empty_lookup)

   939

   940   from inv1 inv3 inv4 and user\<^sub>1_not_root

   941   have not_writable: "Writable \<notin> others att"

   942     by (auto simp add: access_def split: option.splits)

   943

   944   show ?thesis

   945   proof cases

   946     assume "root' = root"

   947     with inv show "invariant root' path" by (simp only:)

   948   next

   949     assume changed: "root' \<noteq> root"

   950     with tr obtain opt where root': "root' = update (path_of x) opt root"

   951       by cases auto

   952     show ?thesis

   953     proof (rule prefixeq_cases)

   954       assume "path_of x \<parallel> path"

   955       with inv root'

   956       have "\<And>perms. access root' path user\<^sub>1 perms = access root path user\<^sub>1 perms"

   957         by (simp only: access_update_other)

   958       with inv show "invariant root' path"

   959         by (simp only: invariant_def)

   960     next

   961       assume "prefixeq (path_of x) path"

   962       then obtain ys where path: "path = path_of x @ ys" ..

   963

   964       show ?thesis

   965       proof (cases ys)

   966         assume "ys = []"

   967         with tr uid inv2 inv3 lookup changed path and user\<^sub>1_not_root

   968         have False

   969           by cases (auto simp add: access_empty_lookup dest: access_some_lookup)

   970         then show ?thesis ..

   971       next

   972         fix z zs assume ys: "ys = z # zs"

   973         have "lookup root' path = lookup root path"

   974         proof -

   975           from inv2 lookup path ys

   976           have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"

   977             by (simp only:)

   978           then obtain att' dir' file' where

   979               look': "lookup root (path_of x) = Some (Env att' dir')" and

   980               dir': "dir' z = Some file'" and

   981               file': "lookup file' zs = Some (Env att dir)"

   982             by (blast dest: lookup_some_upper)

   983

   984           from tr uid changed look' dir' obtain att'' where

   985               look'': "lookup root' (path_of x) = Some (Env att'' dir')"

   986             by cases (auto simp add: access_empty_lookup lookup_update_some

   987               dest: access_some_lookup)

   988           with dir' file' have "lookup root' (path_of x @ z # zs) =

   989               Some (Env att dir)"

   990             by (simp add: lookup_append_some)

   991           with look path ys show ?thesis

   992             by simp

   993         qed

   994         with inv show "invariant root' path"

   995           by (simp only: invariant_def access_def)

   996       qed

   997     next

   998       assume "prefix path (path_of x)"

   999       then obtain y ys where path: "path_of x = path @ y # ys" ..

  1000

  1001       obtain dir' where

  1002         lookup': "lookup root' path = Some (Env att dir')" and

  1003         inv2': "dir' \<noteq> empty"

  1004       proof (cases ys)

  1005         assume "ys = []"

  1006         with path have parent: "path_of x = path @ [y]" by simp

  1007         with tr uid inv4 changed obtain "file" where

  1008             "root' = update (path_of x) (Some file) root"

  1009           by cases auto

  1010         with lookup parent have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"

  1011           by (simp only: update_append_some update_cons_nil_env)

  1012         moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp

  1013         ultimately show ?thesis ..

  1014       next

  1015         fix z zs assume ys: "ys = z # zs"

  1016         with lookup root' path

  1017         have "lookup root' path = Some (update (y # ys) opt (Env att dir))"

  1018           by (simp only: update_append_some)

  1019         also obtain file' where

  1020           "update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"

  1021         proof -

  1022           have "dir y \<noteq> None"

  1023           proof -

  1024             have "dir y = lookup (Env att dir) [y]"

  1025               by (simp split: option.splits)

  1026             also from lookup have "\<dots> = lookup root (path @ [y])"

  1027               by (simp only: lookup_append_some)

  1028             also have "\<dots> \<noteq> None"

  1029             proof -

  1030               from ys obtain us u where rev_ys: "ys = us @ [u]"

  1031                 by (cases ys rule: rev_cases) simp

  1032               with tr path

  1033               have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>

  1034                   lookup root ((path @ [y]) @ us) \<noteq> None"

  1035                 by cases (auto dest: access_some_lookup)

  1036               then show ?thesis

  1037                 by (fastforce dest!: lookup_some_append)

  1038             qed

  1039             finally show ?thesis .

  1040           qed

  1041           with ys show ?thesis using that by auto

  1042         qed

  1043         also have "dir(y\<mapsto>file') \<noteq> empty" by simp

  1044         ultimately show ?thesis ..

  1045       qed

  1046

  1047       from lookup' have inv1': "access root' path user\<^sub>1 {} = Some (Env att dir')"

  1048         by (simp only: access_empty_lookup)

  1049

  1050       from inv3 lookup' and not_writable user\<^sub>1_not_root

  1051       have "access root' path user\<^sub>1 {Writable} = None"

  1052         by (simp add: access_def)

  1053       with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast

  1054     qed

  1055   qed

  1056 qed

  1057

  1058

  1059 subsection \<open>Putting it all together \label{sec:unix-main-result}\<close>

  1060

  1061 text \<open>

  1062   The main result is now imminent, just by composing the three

  1063   invariance lemmas (see \secref{sec:unix-inv-lemmas}) according the the

  1064   overall procedure (see \secref{sec:unix-inv-procedure}).

  1065 \<close>

  1066

  1067 corollary

  1068   assumes bogus: "init users =bogus\<Rightarrow> root"

  1069   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user\<^sub>1) \<and>

  1070     can_exec root (xs @ [Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1]]))"

  1071 proof -

  1072   from cannot_rmdir init_invariant preserve_invariant

  1073     and bogus show ?thesis by (rule general_procedure)

  1074 qed

  1075

  1076 end

  1077

  1078 end