src/HOL/Unix/Unix.thy
author wenzelm
Sat Nov 04 15:24:40 2017 +0100 (20 months ago)
changeset 67003 49850a679c2c
parent 66453 cc19f7ca2ed6
child 67443 3abf6a722518
permissions -rw-r--r--
more robust sorted_entries;
     1 (*  Title:      HOL/Unix/Unix.thy
     2     Author:     Markus Wenzel, TU Muenchen
     3 *)
     4 
     5 section \<open>Unix file-systems \label{sec:unix-file-system}\<close>
     6 
     7 theory Unix
     8   imports
     9     Nested_Environment
    10     "HOL-Library.Sublist"
    11 begin
    12 
    13 text \<open>
    14   We give a simple mathematical model of the basic structures underlying the
    15   Unix file-system, together with a few fundamental operations that could be
    16   imagined to be performed internally by the Unix kernel. This forms the basis
    17   for the set of Unix system-calls to be introduced later (see
    18   \secref{sec:unix-trans}), which are the actual interface offered to
    19   processes running in user-space.
    20 
    21   \<^medskip>
    22   Basically, any Unix file is either a \<^emph>\<open>plain file\<close> or a \<^emph>\<open>directory\<close>,
    23   consisting of some \<^emph>\<open>content\<close> plus \<^emph>\<open>attributes\<close>. The content of a plain
    24   file is plain text. The content of a directory is a mapping from names to
    25   further files.\<^footnote>\<open>In fact, this is the only way that names get associated with
    26   files. In Unix files do \<^emph>\<open>not\<close> have a name in itself. Even more, any number
    27   of names may be associated with the very same file due to \<^emph>\<open>hard links\<close>
    28   (although this is excluded from our model).\<close> Attributes include information
    29   to control various ways to access the file (read, write etc.).
    30 
    31   Our model will be quite liberal in omitting excessive detail that is easily
    32   seen to be ``irrelevant'' for the aspects of Unix file-systems to be
    33   discussed here. First of all, we ignore character and block special files,
    34   pipes, sockets, hard links, symbolic links, and mount points.
    35 \<close>
    36 
    37 
    38 subsection \<open>Names\<close>
    39 
    40 text \<open>
    41   User ids and file name components shall be represented by natural numbers
    42   (without loss of generality). We do not bother about encoding of actual
    43   names (e.g.\ strings), nor a mapping between user names and user ids as
    44   would be present in a reality.
    45 \<close>
    46 
    47 type_synonym uid = nat
    48 type_synonym name = nat
    49 type_synonym path = "name list"
    50 
    51 
    52 subsection \<open>Attributes\<close>
    53 
    54 text \<open>
    55   Unix file attributes mainly consist of \<^emph>\<open>owner\<close> information and a number of
    56   \<^emph>\<open>permission\<close> bits which control access for ``user'', ``group'', and
    57   ``others'' (see the Unix man pages \<open>chmod(2)\<close> and \<open>stat(2)\<close> for more
    58   details).
    59 
    60   \<^medskip>
    61   Our model of file permissions only considers the ``others'' part. The
    62   ``user'' field may be omitted without loss of overall generality, since the
    63   owner is usually able to change it anyway by performing \<open>chmod\<close>.\<^footnote>\<open>The
    64   inclined Unix expert may try to figure out some exotic arrangements of a
    65   real-world Unix file-system such that the owner of a file is unable to apply
    66   the \<open>chmod\<close> system call.\<close> We omit ``group'' permissions as a genuine
    67   simplification as we just do not intend to discuss a model of multiple
    68   groups and group membership, but pretend that everyone is member of a single
    69   global group.\<^footnote>\<open>A general HOL model of user group structures and related
    70   issues is given in @{cite "Naraschewski:2001"}.\<close>
    71 \<close>
    72 
    73 datatype perm =
    74     Readable
    75   | Writable
    76   | Executable    \<comment> "(ignored)"
    77 
    78 type_synonym perms = "perm set"
    79 
    80 record att =
    81   owner :: uid
    82   others :: perms
    83 
    84 text \<open>
    85   For plain files @{term Readable} and @{term Writable} specify read and write
    86   access to the actual content, i.e.\ the string of text stored here. For
    87   directories @{term Readable} determines if the set of entry names may be
    88   accessed, and @{term Writable} controls the ability to create or delete any
    89   entries (both plain files or sub-directories).
    90 
    91   As another simplification, we ignore the @{term Executable} permission
    92   altogether. In reality it would indicate executable plain files (also known
    93   as ``binaries''), or control actual lookup of directory entries (recall that
    94   mere directory browsing is controlled via @{term Readable}). Note that the
    95   latter means that in order to perform any file-system operation whatsoever,
    96   all directories encountered on the path would have to grant @{term
    97   Executable}. We ignore this detail and pretend that all directories give
    98   @{term Executable} permission to anybody.
    99 \<close>
   100 
   101 
   102 subsection \<open>Files\<close>
   103 
   104 text \<open>
   105   In order to model the general tree structure of a Unix file-system we use
   106   the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"} from the
   107   standard library of Isabelle/HOL @{cite "Bauer-et-al:2002:HOL-Library"}.
   108   This type provides constructors @{term Val} and @{term Env} as follows:
   109 
   110   \<^medskip>
   111   {\def\isastyleminor{\isastyle}
   112   \begin{tabular}{l}
   113   @{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\
   114   @{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\
   115   \end{tabular}}
   116   \<^medskip>
   117 
   118   Here the parameter @{typ 'a} refers to plain values occurring at leaf
   119   positions, parameter @{typ 'b} to information kept with inner branch nodes,
   120   and parameter @{typ 'c} to the branching type of the tree structure. For our
   121   purpose we use the type instance with @{typ "att \<times> string"} (representing
   122   plain files), @{typ att} (for attributes of directory nodes), and @{typ
   123   name} (for the index type of directory nodes).
   124 \<close>
   125 
   126 type_synonym "file" = "(att \<times> string, att, name) env"
   127 
   128 text \<open>
   129   \<^medskip>
   130   The HOL library also provides @{term lookup} and @{term update} operations
   131   for general tree structures with the subsequent primitive recursive
   132   characterizations.
   133 
   134   \<^medskip>
   135   {\def\isastyleminor{\isastyle}
   136   \begin{tabular}{l}
   137   @{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\
   138   @{term [source]
   139   "update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\
   140   \end{tabular}}
   141 
   142   @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}
   143   @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}
   144 
   145   Several further properties of these operations are proven in @{cite
   146   "Bauer-et-al:2002:HOL-Library"}. These will be routinely used later on
   147   without further notice.
   148 
   149   \<^bigskip>
   150   Apparently, the elements of type @{typ "file"} contain an @{typ att}
   151   component in either case. We now define a few auxiliary operations to
   152   manipulate this field uniformly, following the conventions for record types
   153   in Isabelle/HOL @{cite "Nipkow-et-al:2000:HOL"}.
   154 \<close>
   155 
   156 definition
   157   "attributes file =
   158     (case file of
   159       Val (att, text) \<Rightarrow> att
   160     | Env att dir \<Rightarrow> att)"
   161 
   162 definition
   163   "map_attributes f file =
   164     (case file of
   165       Val (att, text) \<Rightarrow> Val (f att, text)
   166     | Env att dir \<Rightarrow> Env (f att) dir)"
   167 
   168 lemma [simp]: "attributes (Val (att, text)) = att"
   169   by (simp add: attributes_def)
   170 
   171 lemma [simp]: "attributes (Env att dir) = att"
   172   by (simp add: attributes_def)
   173 
   174 lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"
   175   by (cases "file") (simp_all add: attributes_def map_attributes_def
   176     split_tupled_all)
   177 
   178 lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"
   179   by (simp add: map_attributes_def)
   180 
   181 lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"
   182   by (simp add: map_attributes_def)
   183 
   184 
   185 subsection \<open>Initial file-systems\<close>
   186 
   187 text \<open>
   188   Given a set of \<^emph>\<open>known users\<close> a file-system shall be initialized by
   189   providing an empty home directory for each user, with read-only access for
   190   everyone else. (Note that we may directly use the user id as home directory
   191   name, since both types have been identified.) Certainly, the very root
   192   directory is owned by the super user (who has user id 0).
   193 \<close>
   194 
   195 definition
   196   "init users =
   197     Env \<lparr>owner = 0, others = {Readable}\<rparr>
   198       (\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)
   199         else None)"
   200 
   201 
   202 subsection \<open>Accessing file-systems\<close>
   203 
   204 text \<open>
   205   The main internal file-system operation is access of a file by a user,
   206   requesting a certain set of permissions. The resulting @{typ "file option"}
   207   indicates if a file had been present at the corresponding @{typ path} and if
   208   access was granted according to the permissions recorded within the
   209   file-system.
   210 
   211   Note that by the rules of Unix file-system security (e.g.\ @{cite
   212   "Tanenbaum:1992"}) both the super-user and owner may always access a file
   213   unconditionally (in our simplified model).
   214 \<close>
   215 
   216 definition
   217   "access root path uid perms =
   218     (case lookup root path of
   219       None \<Rightarrow> None
   220     | Some file \<Rightarrow>
   221         if uid = 0
   222           \<or> uid = owner (attributes file)
   223           \<or> perms \<subseteq> others (attributes file)
   224         then Some file
   225         else None)"
   226 
   227 text \<open>
   228   \<^medskip>
   229   Successful access to a certain file is the main prerequisite for
   230   system-calls to be applicable (cf.\ \secref{sec:unix-trans}). Any
   231   modification of the file-system is then performed using the basic @{term
   232   update} operation.
   233 
   234   \<^medskip>
   235   We see that @{term access} is just a wrapper for the basic @{term lookup}
   236   function, with additional checking of attributes. Subsequently we establish
   237   a few auxiliary facts that stem from the primitive @{term lookup} used
   238   within @{term access}.
   239 \<close>
   240 
   241 lemma access_empty_lookup: "access root path uid {} = lookup root path"
   242   by (simp add: access_def split: option.splits)
   243 
   244 lemma access_some_lookup:
   245   "access root path uid perms = Some file \<Longrightarrow>
   246     lookup root path = Some file"
   247   by (simp add: access_def split: option.splits if_splits)
   248 
   249 lemma access_update_other:
   250   assumes parallel: "path' \<parallel> path"
   251   shows "access (update path' opt root) path uid perms = access root path uid perms"
   252 proof -
   253   from parallel obtain y z xs ys zs where
   254       "y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"
   255     by (blast dest: parallel_decomp)
   256   then have "lookup (update path' opt root) path = lookup root path"
   257     by (blast intro: lookup_update_other)
   258   then show ?thesis by (simp only: access_def)
   259 qed
   260 
   261 
   262 section \<open>File-system transitions \label{sec:unix-trans}\<close>
   263 
   264 subsection \<open>Unix system calls \label{sec:unix-syscall}\<close>
   265 
   266 text \<open>
   267   According to established operating system design (cf.\ @{cite
   268   "Tanenbaum:1992"}) user space processes may only initiate system operations
   269   by a fixed set of \<^emph>\<open>system-calls\<close>. This enables the kernel to enforce
   270   certain security policies in the first place.\<^footnote>\<open>Incidently, this is the very
   271   same principle employed by any ``LCF-style'' theorem proving system
   272   according to Milner's principle of ``correctness by construction'', such as
   273   Isabelle/HOL itself.\<close>
   274 
   275   \<^medskip>
   276   In our model of Unix we give a fixed datatype \<open>operation\<close> for the syntax of
   277   system-calls, together with an inductive definition of file-system state
   278   transitions of the form \<open>root \<midarrow>x\<rightarrow> root'\<close> for the operational semantics.
   279 \<close>
   280 
   281 datatype operation =
   282     Read uid string path
   283   | Write uid string path
   284   | Chmod uid perms path
   285   | Creat uid perms path
   286   | Unlink uid path
   287   | Mkdir uid perms path
   288   | Rmdir uid path
   289   | Readdir uid "name set" path
   290 
   291 text \<open>
   292   The @{typ uid} field of an operation corresponds to the \<^emph>\<open>effective user id\<close>
   293   of the underlying process, although our model never mentions processes
   294   explicitly. The other parameters are provided as arguments by the caller;
   295   the @{term path} one is common to all kinds of system-calls.
   296 \<close>
   297 
   298 primrec uid_of :: "operation \<Rightarrow> uid"
   299   where
   300     "uid_of (Read uid text path) = uid"
   301   | "uid_of (Write uid text path) = uid"
   302   | "uid_of (Chmod uid perms path) = uid"
   303   | "uid_of (Creat uid perms path) = uid"
   304   | "uid_of (Unlink uid path) = uid"
   305   | "uid_of (Mkdir uid path perms) = uid"
   306   | "uid_of (Rmdir uid path) = uid"
   307   | "uid_of (Readdir uid names path) = uid"
   308 
   309 primrec path_of :: "operation \<Rightarrow> path"
   310   where
   311     "path_of (Read uid text path) = path"
   312   | "path_of (Write uid text path) = path"
   313   | "path_of (Chmod uid perms path) = path"
   314   | "path_of (Creat uid perms path) = path"
   315   | "path_of (Unlink uid path) = path"
   316   | "path_of (Mkdir uid perms path) = path"
   317   | "path_of (Rmdir uid path) = path"
   318   | "path_of (Readdir uid names path) = path"
   319 
   320 text \<open>
   321   \<^medskip>
   322   Note that we have omitted explicit \<open>Open\<close> and \<open>Close\<close> operations, pretending
   323   that @{term Read} and @{term Write} would already take care of this behind
   324   the scenes. Thus we have basically treated actual sequences of real
   325   system-calls \<open>open\<close>--\<open>read\<close>/\<open>write\<close>--\<open>close\<close> as atomic.
   326 
   327   In principle, this could make big a difference in a model with explicit
   328   concurrent processes. On the other hand, even on a real Unix system the
   329   exact scheduling of concurrent \<open>open\<close> and \<open>close\<close> operations does \<^emph>\<open>not\<close>
   330   directly affect the success of corresponding \<open>read\<close> or \<open>write\<close>. Unix allows
   331   several processes to have files opened at the same time, even for writing!
   332   Certainly, the result from reading the contents later may be hard to
   333   predict, but the system-calls involved here will succeed in any case.
   334 
   335   \<^bigskip>
   336   The operational semantics of system calls is now specified via transitions
   337   of the file-system configuration. This is expressed as an inductive relation
   338   (although there is no actual recursion involved here).
   339 \<close>
   340 
   341 inductive transition :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"
   342   ("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)
   343   where
   344     read:
   345       "root \<midarrow>(Read uid text path)\<rightarrow> root"
   346       if "access root path uid {Readable} = Some (Val (att, text))"
   347   | "write":
   348       "root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root"
   349       if "access root path uid {Writable} = Some (Val (att, text'))"
   350   | chmod:
   351       "root \<midarrow>(Chmod uid perms path)\<rightarrow>
   352         update path (Some (map_attributes (others_update (\<lambda>_. perms)) file)) root"
   353       if "access root path uid {} = Some file" and "uid = 0 \<or> uid = owner (attributes file)"
   354   | creat:
   355       "root \<midarrow>(Creat uid perms path)\<rightarrow>
   356         update path (Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root"
   357       if "path = parent_path @ [name]"
   358       and "access root parent_path uid {Writable} = Some (Env att parent)"
   359       and "access root path uid {} = None"
   360   | unlink:
   361       "root \<midarrow>(Unlink uid path)\<rightarrow> update path None root"
   362       if "path = parent_path @ [name]"
   363       and "access root parent_path uid {Writable} = Some (Env att parent)"
   364       and "access root path uid {} = Some (Val plain)"
   365   | mkdir:
   366       "root \<midarrow>(Mkdir uid perms path)\<rightarrow>
   367         update path (Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root"
   368       if "path = parent_path @ [name]"
   369       and "access root parent_path uid {Writable} = Some (Env att parent)"
   370       and "access root path uid {} = None"
   371   | rmdir:
   372       "root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root"
   373       if "path = parent_path @ [name]"
   374       and "access root parent_path uid {Writable} = Some (Env att parent)"
   375       and "access root path uid {} = Some (Env att' empty)"
   376   | readdir:
   377       "root \<midarrow>(Readdir uid names path)\<rightarrow> root"
   378       if "access root path uid {Readable} = Some (Env att dir)"
   379       and "names = dom dir"
   380 
   381 text \<open>
   382   \<^medskip>
   383   Certainly, the above specification is central to the whole formal
   384   development. Any of the results to be established later on are only
   385   meaningful to the outside world if this transition system provides an
   386   adequate model of real Unix systems. This kind of ``reality-check'' of a
   387   formal model is the well-known problem of \<^emph>\<open>validation\<close>.
   388 
   389   If in doubt, one may consider to compare our definition with the informal
   390   specifications given the corresponding Unix man pages, or even peek at an
   391   actual implementation such as @{cite "Torvalds-et-al:Linux"}. Another common
   392   way to gain confidence into the formal model is to run simple simulations
   393   (see \secref{sec:unix-examples}), and check the results with that of
   394   experiments performed on a real Unix system.
   395 \<close>
   396 
   397 
   398 subsection \<open>Basic properties of single transitions \label{sec:unix-single-trans}\<close>
   399 
   400 text \<open>
   401   The transition system \<open>root \<midarrow>x\<rightarrow> root'\<close> defined above determines a unique
   402   result @{term root'} from given @{term root} and @{term x} (this is holds
   403   rather trivially, since there is even only one clause for each operation).
   404   This uniqueness statement will simplify our subsequent development to some
   405   extent, since we only have to reason about a partial function rather than a
   406   general relation.
   407 \<close>
   408 
   409 theorem transition_uniq:
   410   assumes root': "root \<midarrow>x\<rightarrow> root'"
   411     and root'': "root \<midarrow>x\<rightarrow> root''"
   412   shows "root' = root''"
   413   using root''
   414 proof cases
   415   case read
   416   with root' show ?thesis by cases auto
   417 next
   418   case "write"
   419   with root' show ?thesis by cases auto
   420 next
   421   case chmod
   422   with root' show ?thesis by cases auto
   423 next
   424   case creat
   425   with root' show ?thesis by cases auto
   426 next
   427   case unlink
   428   with root' show ?thesis by cases auto
   429 next
   430   case mkdir
   431   with root' show ?thesis by cases auto
   432 next
   433   case rmdir
   434   with root' show ?thesis by cases auto
   435 next
   436   case readdir
   437   with root' show ?thesis by cases blast+
   438 qed
   439 
   440 text \<open>
   441   \<^medskip>
   442   Apparently, file-system transitions are \<^emph>\<open>type-safe\<close> in the sense that the
   443   result of transforming an actual directory yields again a directory.
   444 \<close>
   445 
   446 theorem transition_type_safe:
   447   assumes tr: "root \<midarrow>x\<rightarrow> root'"
   448     and inv: "\<exists>att dir. root = Env att dir"
   449   shows "\<exists>att dir. root' = Env att dir"
   450 proof (cases "path_of x")
   451   case Nil
   452   with tr inv show ?thesis
   453     by cases (auto simp add: access_def split: if_splits)
   454 next
   455   case Cons
   456   from tr obtain opt where
   457       "root' = root \<or> root' = update (path_of x) opt root"
   458     by cases auto
   459   with inv Cons show ?thesis
   460     by (auto simp add: update_eq split: list.splits)
   461 qed
   462 
   463 text \<open>
   464   The previous result may be seen as the most basic invariant on the
   465   file-system state that is enforced by any proper kernel implementation. So
   466   user processes --- being bound to the system-call interface --- may never
   467   mess up a file-system such that the root becomes a plain file instead of a
   468   directory, which would be a strange situation indeed.
   469 \<close>
   470 
   471 
   472 subsection \<open>Iterated transitions\<close>
   473 
   474 text \<open>
   475   Iterated system transitions via finite sequences of system operations are
   476   modeled inductively as follows. In a sense, this relation describes the
   477   cumulative effect of the sequence of system-calls issued by a number of
   478   running processes over a finite amount of time.
   479 \<close>
   480 
   481 inductive transitions :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool"  ("_ \<Midarrow>_\<Rightarrow> _" [90, 1000, 90] 100)
   482   where
   483     nil: "root \<Midarrow>[]\<Rightarrow> root"
   484   | cons: "root \<Midarrow>(x # xs)\<Rightarrow> root''" if "root \<midarrow>x\<rightarrow> root'" and "root' \<Midarrow>xs\<Rightarrow> root''"
   485 
   486 text \<open>
   487   \<^medskip>
   488   We establish a few basic facts relating iterated transitions with single
   489   ones, according to the recursive structure of lists.
   490 \<close>
   491 
   492 lemma transitions_nil_eq: "root \<Midarrow>[]\<Rightarrow> root' \<longleftrightarrow> root = root'"
   493 proof
   494   assume "root \<Midarrow>[]\<Rightarrow> root'"
   495   then show "root = root'" by cases simp_all
   496 next
   497   assume "root = root'"
   498   then show "root \<Midarrow>[]\<Rightarrow> root'" by (simp only: transitions.nil)
   499 qed
   500 
   501 lemma transitions_cons_eq:
   502   "root \<Midarrow>(x # xs)\<Rightarrow> root'' \<longleftrightarrow> (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root'')"
   503 proof
   504   assume "root \<Midarrow>(x # xs)\<Rightarrow> root''"
   505   then show "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"
   506     by cases auto
   507 next
   508   assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"
   509   then show "root \<Midarrow>(x # xs)\<Rightarrow> root''"
   510     by (blast intro: transitions.cons)
   511 qed
   512 
   513 text \<open>
   514   The next two rules show how to ``destruct'' known transition sequences. Note
   515   that the second one actually relies on the uniqueness property of the basic
   516   transition system (see \secref{sec:unix-single-trans}).
   517 \<close>
   518 
   519 lemma transitions_nilD: "root \<Midarrow>[]\<Rightarrow> root' \<Longrightarrow> root' = root"
   520   by (simp add: transitions_nil_eq)
   521 
   522 lemma transitions_consD:
   523   assumes list: "root \<Midarrow>(x # xs)\<Rightarrow> root''"
   524     and hd: "root \<midarrow>x\<rightarrow> root'"
   525   shows "root' \<Midarrow>xs\<Rightarrow> root''"
   526 proof -
   527   from list obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' \<Midarrow>xs\<Rightarrow> root''"
   528     by cases simp_all
   529   from r' hd have "r' = root'" by (rule transition_uniq)
   530   with root'' show "root' \<Midarrow>xs\<Rightarrow> root''" by simp
   531 qed
   532 
   533 text \<open>
   534   \<^medskip>
   535   The following fact shows how an invariant @{term Q} of single transitions
   536   with property @{term P} may be transferred to iterated transitions. The
   537   proof is rather obvious by rule induction over the definition of @{term
   538   "root \<Midarrow>xs\<Rightarrow> root'"}.
   539 \<close>
   540 
   541 lemma transitions_invariant:
   542   assumes r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"
   543     and trans: "root \<Midarrow>xs\<Rightarrow> root'"
   544   shows "Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"
   545   using trans
   546 proof induct
   547   case nil
   548   show ?case by (rule nil.prems)
   549 next
   550   case (cons root x root' xs root'')
   551   note P = \<open>\<forall>x \<in> set (x # xs). P x\<close>
   552   then have "P x" by simp
   553   with \<open>root \<midarrow>x\<rightarrow> root'\<close> and \<open>Q root\<close> have Q': "Q root'" by (rule r)
   554   from P have "\<forall>x \<in> set xs. P x" by simp
   555   with Q' show "Q root''" by (rule cons.hyps)
   556 qed
   557 
   558 text \<open>
   559   As an example of applying the previous result, we transfer the basic
   560   type-safety property (see \secref{sec:unix-single-trans}) from single
   561   transitions to iterated ones, which is a rather obvious result indeed.
   562 \<close>
   563 
   564 theorem transitions_type_safe:
   565   assumes "root \<Midarrow>xs\<Rightarrow> root'"
   566     and "\<exists>att dir. root = Env att dir"
   567   shows "\<exists>att dir. root' = Env att dir"
   568   using transition_type_safe and assms
   569 proof (rule transitions_invariant)
   570   show "\<forall>x \<in> set xs. True" by blast
   571 qed
   572 
   573 
   574 section \<open>Executable sequences\<close>
   575 
   576 text \<open>
   577   An inductively defined relation such as the one of \<open>root \<midarrow>x\<rightarrow> root'\<close> (see
   578   \secref{sec:unix-syscall}) has two main aspects. First of all, the resulting
   579   system admits a certain set of transition rules (introductions) as given in
   580   the specification. Furthermore, there is an explicit least-fixed-point
   581   construction involved, which results in induction (and case-analysis) rules
   582   to eliminate known transitions in an exhaustive manner.
   583 
   584   \<^medskip>
   585   Subsequently, we explore our transition system in an experimental style,
   586   mainly using the introduction rules with basic algebraic properties of the
   587   underlying structures. The technique closely resembles that of Prolog
   588   combined with functional evaluation in a very simple manner.
   589 
   590   Just as the ``closed-world assumption'' is left implicit in Prolog, we do
   591   not refer to induction over the whole transition system here. So this is
   592   still purely positive reasoning about possible executions; exhaustive
   593   reasoning will be employed only later on (see \secref{sec:unix-bogosity}),
   594   when we shall demonstrate that certain behavior is \<^emph>\<open>not\<close> possible.
   595 \<close>
   596 
   597 
   598 subsection \<open>Possible transitions\<close>
   599 
   600 text \<open>
   601   Rather obviously, a list of system operations can be executed within a
   602   certain state if there is a result state reached by an iterated transition.
   603 \<close>
   604 
   605 definition "can_exec root xs \<longleftrightarrow> (\<exists>root'. root \<Midarrow>xs\<Rightarrow> root')"
   606 
   607 lemma can_exec_nil: "can_exec root []"
   608   unfolding can_exec_def by (blast intro: transitions.intros)
   609 
   610 lemma can_exec_cons:
   611     "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"
   612   unfolding can_exec_def by (blast intro: transitions.intros)
   613 
   614 text \<open>
   615   \<^medskip>
   616   In case that we already know that a certain sequence can be executed we may
   617   destruct it backwardly into individual transitions.
   618 \<close>
   619 
   620 lemma can_exec_snocD: "can_exec root (xs @ [y])
   621     \<Longrightarrow> \<exists>root' root''. root \<Midarrow>xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"
   622 proof (induct xs arbitrary: root)
   623   case Nil
   624   then show ?case
   625     by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
   626 next
   627   case (Cons x xs)
   628   from \<open>can_exec root ((x # xs) @ [y])\<close> obtain r root'' where
   629       x: "root \<midarrow>x\<rightarrow> r" and
   630       xs_y: "r \<Midarrow>(xs @ [y])\<Rightarrow> root''"
   631     by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
   632   from xs_y Cons.hyps obtain root' r'
   633     where xs: "r \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"
   634     unfolding can_exec_def by blast
   635   from x xs have "root \<Midarrow>(x # xs)\<Rightarrow> root'"
   636     by (rule transitions.cons)
   637   with y show ?case by blast
   638 qed
   639 
   640 
   641 subsection \<open>Example executions \label{sec:unix-examples}\<close>
   642 
   643 text \<open>
   644   We are now ready to perform a few experiments within our formal model of
   645   Unix system-calls. The common technique is to alternate introduction rules
   646   of the transition system (see \secref{sec:unix-trans}), and steps to solve
   647   any emerging side conditions using algebraic properties of the underlying
   648   file-system structures (see \secref{sec:unix-file-system}).
   649 \<close>
   650 
   651 lemmas eval = access_def init_def
   652 
   653 theorem "u \<in> users \<Longrightarrow> can_exec (init users)
   654     [Mkdir u perms [u, name]]"
   655   apply (rule can_exec_cons)
   656     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Cons})\<close>
   657   apply (rule mkdir)
   658     \<comment> \<open>back-chain @{term Mkdir}\<close>
   659   apply (force simp add: eval)+
   660     \<comment> \<open>solve preconditions of @{term Mkdir}\<close>
   661   apply (simp add: eval)
   662     \<comment> \<open>peek at resulting dir (optional)\<close>
   663   txt \<open>@{subgoals [display]}\<close>
   664   apply (rule can_exec_nil)
   665     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Nil})\<close>
   666   done
   667 
   668 text \<open>
   669   By inspecting the result shown just before the final step above, we may gain
   670   confidence that our specification of Unix system-calls actually makes sense.
   671   Further common errors are usually exhibited when preconditions of transition
   672   rules fail unexpectedly.
   673 
   674   \<^medskip>
   675   Here are a few further experiments, using the same techniques as before.
   676 \<close>
   677 
   678 theorem "u \<in> users \<Longrightarrow> can_exec (init users)
   679    [Creat u perms [u, name],
   680     Unlink u [u, name]]"
   681   apply (rule can_exec_cons)
   682   apply (rule creat)
   683   apply (force simp add: eval)+
   684   apply (simp add: eval)
   685   apply (rule can_exec_cons)
   686   apply (rule unlink)
   687   apply (force simp add: eval)+
   688   apply (simp add: eval)
   689   txt \<open>peek at result: @{subgoals [display]}\<close>
   690   apply (rule can_exec_nil)
   691   done
   692 
   693 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow>
   694   Readable \<in> perms\<^sub>2 \<Longrightarrow> name\<^sub>3 \<noteq> name\<^sub>4 \<Longrightarrow>
   695   can_exec (init users)
   696    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],
   697     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],
   698     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   699     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>4],
   700     Readdir u {name\<^sub>3, name\<^sub>4} [u, name\<^sub>1, name\<^sub>2]]"
   701   apply (rule can_exec_cons, rule transition.intros,
   702     (force simp add: eval)+, (simp add: eval)?)+
   703   txt \<open>peek at result: @{subgoals [display]}\<close>
   704   apply (rule can_exec_nil)
   705   done
   706 
   707 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow> Readable \<in> perms\<^sub>3 \<Longrightarrow>
   708   can_exec (init users)
   709    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],
   710     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],
   711     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   712     Write u' ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   713     Read u ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"
   714   apply (rule can_exec_cons, rule transition.intros,
   715     (force simp add: eval)+, (simp add: eval)?)+
   716   txt \<open>peek at result: @{subgoals [display]}\<close>
   717   apply (rule can_exec_nil)
   718   done
   719 
   720 
   721 section \<open>Odd effects --- treated formally \label{sec:unix-bogosity}\<close>
   722 
   723 text \<open>
   724   We are now ready to give a completely formal treatment of the slightly odd
   725   situation discussed in the introduction (see \secref{sec:unix-intro}): the
   726   file-system can easily reach a state where a user is unable to remove his
   727   very own directory, because it is still populated by items placed there by
   728   another user in an uncouth manner.
   729 \<close>
   730 
   731 
   732 subsection \<open>The general procedure \label{sec:unix-inv-procedure}\<close>
   733 
   734 text \<open>
   735   The following theorem expresses the general procedure we are following to
   736   achieve the main result.
   737 \<close>
   738 
   739 theorem general_procedure:
   740   assumes cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"
   741     and init_inv: "\<And>root. init users \<Midarrow>bs\<Rightarrow> root \<Longrightarrow> Q root"
   742     and preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"
   743     and init_result: "init users \<Midarrow>bs\<Rightarrow> root"
   744   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"
   745 proof -
   746   {
   747     fix xs
   748     assume Ps: "\<forall>x \<in> set xs. P x"
   749     assume can_exec: "can_exec root (xs @ [y])"
   750     then obtain root' root'' where xs: "root \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"
   751       by (blast dest: can_exec_snocD)
   752     from init_result have "Q root" by (rule init_inv)
   753     from preserve_inv xs this Ps have "Q root'"
   754       by (rule transitions_invariant)
   755     from this y have False by (rule cannot_y)
   756   }
   757   then show ?thesis by blast
   758 qed
   759 
   760 text \<open>
   761   Here @{prop "P x"} refers to the restriction on file-system operations that
   762   are admitted after having reached the critical configuration; according to
   763   the problem specification this will become @{prop "uid_of x = user\<^sub>1"} later
   764   on. Furthermore, @{term y} refers to the operations we claim to be
   765   impossible to perform afterwards, we will take @{term Rmdir} later. Moreover
   766   @{term Q} is a suitable (auxiliary) invariant over the file-system; choosing
   767   @{term Q} adequately is very important to make the proof work (see
   768   \secref{sec:unix-inv-lemmas}).
   769 \<close>
   770 
   771 
   772 subsection \<open>The particular situation\<close>
   773 
   774 text \<open>
   775   We introduce a few global declarations and axioms to describe our particular
   776   situation considered here. Thus we avoid excessive use of local parameters
   777   in the subsequent development.
   778 \<close>
   779 
   780 context
   781   fixes users :: "uid set"
   782     and user\<^sub>1 :: uid
   783     and user\<^sub>2 :: uid
   784     and name\<^sub>1 :: name
   785     and name\<^sub>2 :: name
   786     and name\<^sub>3 :: name
   787     and perms\<^sub>1 :: perms
   788     and perms\<^sub>2 :: perms
   789   assumes user\<^sub>1_known: "user\<^sub>1 \<in> users"
   790     and user\<^sub>1_not_root: "user\<^sub>1 \<noteq> 0"
   791     and users_neq: "user\<^sub>1 \<noteq> user\<^sub>2"
   792     and perms\<^sub>1_writable: "Writable \<in> perms\<^sub>1"
   793     and perms\<^sub>2_not_writable: "Writable \<notin> perms\<^sub>2"
   794   notes facts = user\<^sub>1_known user\<^sub>1_not_root users_neq
   795     perms\<^sub>1_writable perms\<^sub>2_not_writable
   796 begin
   797 
   798 definition
   799   "bogus =
   800      [Mkdir user\<^sub>1 perms\<^sub>1 [user\<^sub>1, name\<^sub>1],
   801       Mkdir user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2],
   802       Creat user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"
   803 
   804 definition "bogus_path = [user\<^sub>1, name\<^sub>1, name\<^sub>2]"
   805 
   806 text \<open>
   807   The @{term bogus} operations are the ones that lead into the uncouth
   808   situation; @{term bogus_path} is the key position within the file-system
   809   where things go awry.
   810 \<close>
   811 
   812 
   813 subsection \<open>Invariance lemmas \label{sec:unix-inv-lemmas}\<close>
   814 
   815 text \<open>
   816   The following invariant over the root file-system describes the bogus
   817   situation in an abstract manner: located at a certain @{term path} within
   818   the file-system is a non-empty directory that is neither owned nor writable
   819   by @{term user\<^sub>1}.
   820 \<close>
   821 
   822 definition
   823   "invariant root path \<longleftrightarrow>
   824     (\<exists>att dir.
   825       access root path user\<^sub>1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>
   826       user\<^sub>1 \<noteq> owner att \<and>
   827       access root path user\<^sub>1 {Writable} = None)"
   828 
   829 text \<open>
   830   Following the general procedure (see \secref{sec:unix-inv-procedure}) we
   831   will now establish the three key lemmas required to yield the final result.
   832 
   833     \<^enum> The invariant is sufficiently strong to entail the pathological case
   834     that @{term user\<^sub>1} is unable to remove the (owned) directory at @{term
   835     "[user\<^sub>1, name\<^sub>1]"}.
   836 
   837     \<^enum> The invariant does hold after having executed the @{term bogus} list of
   838     operations (starting with an initial file-system configuration).
   839 
   840     \<^enum> The invariant is preserved by any file-system operation performed by
   841     @{term user\<^sub>1} alone, without any help by other users.
   842 
   843   As the invariant appears both as assumptions and conclusions in the course
   844   of proof, its formulation is rather critical for the whole development to
   845   work out properly. In particular, the third step is very sensitive to the
   846   invariant being either too strong or too weak. Moreover the invariant has to
   847   be sufficiently abstract, lest the proof become cluttered by confusing
   848   detail.
   849 
   850   \<^medskip>
   851   The first two lemmas are technically straight forward --- we just have to
   852   inspect rather special cases.
   853 \<close>
   854 
   855 lemma cannot_rmdir:
   856   assumes inv: "invariant root bogus_path"
   857     and rmdir: "root \<midarrow>(Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1])\<rightarrow> root'"
   858   shows False
   859 proof -
   860   from inv obtain "file" where "access root bogus_path user\<^sub>1 {} = Some file"
   861     unfolding invariant_def by blast
   862   moreover
   863   from rmdir obtain att where "access root [user\<^sub>1, name\<^sub>1] user\<^sub>1 {} = Some (Env att empty)"
   864     by cases auto
   865   then have "access root ([user\<^sub>1, name\<^sub>1] @ [name\<^sub>2]) user\<^sub>1 {} = empty name\<^sub>2"
   866     by (simp only: access_empty_lookup lookup_append_some) simp
   867   ultimately show False by (simp add: bogus_path_def)
   868 qed
   869 
   870 lemma
   871   assumes init: "init users \<Midarrow>bogus\<Rightarrow> root"
   872   shows init_invariant: "invariant root bogus_path"
   873   supply eval = facts access_def init_def
   874   using init
   875   apply (unfold bogus_def bogus_path_def)
   876   apply (drule transitions_consD, rule transition.intros,
   877       (force simp add: eval)+, (simp add: eval)?)+
   878     \<comment> "evaluate all operations"
   879   apply (drule transitions_nilD)
   880     \<comment> "reach final result"
   881   apply (simp add: invariant_def eval)
   882     \<comment> "check the invariant"
   883   done
   884 
   885 text \<open>
   886   \<^medskip>
   887   At last we are left with the main effort to show that the ``bogosity''
   888   invariant is preserved by any file-system operation performed by @{term
   889   user\<^sub>1} alone. Note that this holds for any @{term path} given, the
   890   particular @{term bogus_path} is not required here.
   891 \<close>
   892 
   893 lemma preserve_invariant:
   894   assumes tr: "root \<midarrow>x\<rightarrow> root'"
   895     and inv: "invariant root path"
   896     and uid: "uid_of x = user\<^sub>1"
   897   shows "invariant root' path"
   898 proof -
   899   from inv obtain att dir where
   900       inv1: "access root path user\<^sub>1 {} = Some (Env att dir)" and
   901       inv2: "dir \<noteq> empty" and
   902       inv3: "user\<^sub>1 \<noteq> owner att" and
   903       inv4: "access root path user\<^sub>1 {Writable} = None"
   904     by (auto simp add: invariant_def)
   905 
   906   from inv1 have lookup: "lookup root path = Some (Env att dir)"
   907     by (simp only: access_empty_lookup)
   908 
   909   from inv1 inv3 inv4 and user\<^sub>1_not_root
   910   have not_writable: "Writable \<notin> others att"
   911     by (auto simp add: access_def split: option.splits)
   912 
   913   show ?thesis
   914   proof cases
   915     assume "root' = root"
   916     with inv show "invariant root' path" by (simp only:)
   917   next
   918     assume changed: "root' \<noteq> root"
   919     with tr obtain opt where root': "root' = update (path_of x) opt root"
   920       by cases auto
   921     show ?thesis
   922     proof (rule prefix_cases)
   923       assume "path_of x \<parallel> path"
   924       with inv root'
   925       have "\<And>perms. access root' path user\<^sub>1 perms = access root path user\<^sub>1 perms"
   926         by (simp only: access_update_other)
   927       with inv show "invariant root' path"
   928         by (simp only: invariant_def)
   929     next
   930       assume "prefix (path_of x) path"
   931       then obtain ys where path: "path = path_of x @ ys" ..
   932 
   933       show ?thesis
   934       proof (cases ys)
   935         assume "ys = []"
   936         with tr uid inv2 inv3 lookup changed path and user\<^sub>1_not_root
   937         have False
   938           by cases (auto simp add: access_empty_lookup dest: access_some_lookup)
   939         then show ?thesis ..
   940       next
   941         fix z zs assume ys: "ys = z # zs"
   942         have "lookup root' path = lookup root path"
   943         proof -
   944           from inv2 lookup path ys
   945           have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"
   946             by (simp only:)
   947           then obtain att' dir' file' where
   948               look': "lookup root (path_of x) = Some (Env att' dir')" and
   949               dir': "dir' z = Some file'" and
   950               file': "lookup file' zs = Some (Env att dir)"
   951             by (blast dest: lookup_some_upper)
   952 
   953           from tr uid changed look' dir' obtain att'' where
   954               look'': "lookup root' (path_of x) = Some (Env att'' dir')"
   955             by cases (auto simp add: access_empty_lookup lookup_update_some
   956               dest: access_some_lookup)
   957           with dir' file' have "lookup root' (path_of x @ z # zs) =
   958               Some (Env att dir)"
   959             by (simp add: lookup_append_some)
   960           with look path ys show ?thesis
   961             by simp
   962         qed
   963         with inv show "invariant root' path"
   964           by (simp only: invariant_def access_def)
   965       qed
   966     next
   967       assume "strict_prefix path (path_of x)"
   968       then obtain y ys where path: "path_of x = path @ y # ys" ..
   969 
   970       obtain dir' where
   971         lookup': "lookup root' path = Some (Env att dir')" and
   972         inv2': "dir' \<noteq> empty"
   973       proof (cases ys)
   974         assume "ys = []"
   975         with path have parent: "path_of x = path @ [y]" by simp
   976         with tr uid inv4 changed obtain "file" where
   977             "root' = update (path_of x) (Some file) root"
   978           by cases auto
   979         with lookup parent have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"
   980           by (simp only: update_append_some update_cons_nil_env)
   981         moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp
   982         ultimately show ?thesis ..
   983       next
   984         fix z zs assume ys: "ys = z # zs"
   985         with lookup root' path
   986         have "lookup root' path = Some (update (y # ys) opt (Env att dir))"
   987           by (simp only: update_append_some)
   988         also obtain file' where
   989           "update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"
   990         proof -
   991           have "dir y \<noteq> None"
   992           proof -
   993             have "dir y = lookup (Env att dir) [y]"
   994               by (simp split: option.splits)
   995             also from lookup have "\<dots> = lookup root (path @ [y])"
   996               by (simp only: lookup_append_some)
   997             also have "\<dots> \<noteq> None"
   998             proof -
   999               from ys obtain us u where rev_ys: "ys = us @ [u]"
  1000                 by (cases ys rule: rev_cases) simp
  1001               with tr path
  1002               have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>
  1003                   lookup root ((path @ [y]) @ us) \<noteq> None"
  1004                 by cases (auto dest: access_some_lookup)
  1005               then show ?thesis
  1006                 by (fastforce dest!: lookup_some_append)
  1007             qed
  1008             finally show ?thesis .
  1009           qed
  1010           with ys show ?thesis using that by auto
  1011         qed
  1012         also have "dir(y\<mapsto>file') \<noteq> empty" by simp
  1013         ultimately show ?thesis ..
  1014       qed
  1015 
  1016       from lookup' have inv1': "access root' path user\<^sub>1 {} = Some (Env att dir')"
  1017         by (simp only: access_empty_lookup)
  1018 
  1019       from inv3 lookup' and not_writable user\<^sub>1_not_root
  1020       have "access root' path user\<^sub>1 {Writable} = None"
  1021         by (simp add: access_def)
  1022       with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast
  1023     qed
  1024   qed
  1025 qed
  1026 
  1027 
  1028 subsection \<open>Putting it all together \label{sec:unix-main-result}\<close>
  1029 
  1030 text \<open>
  1031   The main result is now imminent, just by composing the three invariance
  1032   lemmas (see \secref{sec:unix-inv-lemmas}) according the the overall
  1033   procedure (see \secref{sec:unix-inv-procedure}).
  1034 \<close>
  1035 
  1036 corollary
  1037   assumes bogus: "init users \<Midarrow>bogus\<Rightarrow> root"
  1038   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user\<^sub>1) \<and>
  1039     can_exec root (xs @ [Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1]]))"
  1040 proof -
  1041   from cannot_rmdir init_invariant preserve_invariant
  1042     and bogus show ?thesis by (rule general_procedure)
  1043 qed
  1044 
  1045 end
  1046 
  1047 end