src/HOL/Unix/Unix.thy
author wenzelm
Mon Dec 21 21:48:36 2015 +0100 (2015-12-21)
changeset 61893 4121cc8479f8
parent 60374 6b858199f240
child 62176 1221f2a46945
permissions -rw-r--r--
misc tuning and modernization;
     1 (*  Title:      HOL/Unix/Unix.thy
     2     Author:     Markus Wenzel, TU Muenchen
     3 *)
     4 
     5 section \<open>Unix file-systems \label{sec:unix-file-system}\<close>
     6 
     7 theory Unix
     8 imports
     9   Nested_Environment
    10   "~~/src/HOL/Library/Sublist"
    11 begin
    12 
    13 text \<open>
    14   We give a simple mathematical model of the basic structures underlying the
    15   Unix file-system, together with a few fundamental operations that could be
    16   imagined to be performed internally by the Unix kernel. This forms the basis
    17   for the set of Unix system-calls to be introduced later (see
    18   \secref{sec:unix-trans}), which are the actual interface offered to
    19   processes running in user-space.
    20 
    21   \<^medskip>
    22   Basically, any Unix file is either a \<^emph>\<open>plain file\<close> or a \<^emph>\<open>directory\<close>,
    23   consisting of some \<^emph>\<open>content\<close> plus \<^emph>\<open>attributes\<close>. The content of a plain
    24   file is plain text. The content of a directory is a mapping from names to
    25   further files.\<^footnote>\<open>In fact, this is the only way that names get associated with
    26   files. In Unix files do \<^emph>\<open>not\<close> have a name in itself. Even more, any number
    27   of names may be associated with the very same file due to \<^emph>\<open>hard links\<close>
    28   (although this is excluded from our model).\<close> Attributes include information
    29   to control various ways to access the file (read, write etc.).
    30 
    31   Our model will be quite liberal in omitting excessive detail that is easily
    32   seen to be ``irrelevant'' for the aspects of Unix file-systems to be
    33   discussed here. First of all, we ignore character and block special files,
    34   pipes, sockets, hard links, symbolic links, and mount points.
    35 \<close>
    36 
    37 
    38 subsection \<open>Names\<close>
    39 
    40 text \<open>
    41   User ids and file name components shall be represented by natural numbers
    42   (without loss of generality). We do not bother about encoding of actual
    43   names (e.g.\ strings), nor a mapping between user names and user ids as
    44   would be present in a reality.
    45 \<close>
    46 
    47 type_synonym uid = nat
    48 type_synonym name = nat
    49 type_synonym path = "name list"
    50 
    51 
    52 subsection \<open>Attributes\<close>
    53 
    54 text \<open>
    55   Unix file attributes mainly consist of \<^emph>\<open>owner\<close> information and a number of
    56   \<^emph>\<open>permission\<close> bits which control access for ``user'', ``group'', and
    57   ``others'' (see the Unix man pages \<open>chmod(2)\<close> and \<open>stat(2)\<close> for more
    58   details).
    59 
    60   \<^medskip>
    61   Our model of file permissions only considers the ``others'' part. The
    62   ``user'' field may be omitted without loss of overall generality, since the
    63   owner is usually able to change it anyway by performing \<open>chmod\<close>.\<^footnote>\<open>The
    64   inclined Unix expert may try to figure out some exotic arrangements of a
    65   real-world Unix file-system such that the owner of a file is unable to apply
    66   the \<open>chmod\<close> system call.\<close> We omit ``group'' permissions as a genuine
    67   simplification as we just do not intend to discuss a model of multiple
    68   groups and group membership, but pretend that everyone is member of a single
    69   global group.\<^footnote>\<open>A general HOL model of user group structures and related
    70   issues is given in @{cite "Naraschewski:2001"}.\<close>
    71 \<close>
    72 
    73 datatype perm =
    74     Readable
    75   | Writable
    76   | Executable    \<comment> "(ignored)"
    77 
    78 type_synonym perms = "perm set"
    79 
    80 record att =
    81   owner :: uid
    82   others :: perms
    83 
    84 text \<open>
    85   For plain files @{term Readable} and @{term Writable} specify read and write
    86   access to the actual content, i.e.\ the string of text stored here. For
    87   directories @{term Readable} determines if the set of entry names may be
    88   accessed, and @{term Writable} controls the ability to create or delete any
    89   entries (both plain files or sub-directories).
    90 
    91   As another simplification, we ignore the @{term Executable} permission
    92   altogether. In reality it would indicate executable plain files (also known
    93   as ``binaries''), or control actual lookup of directory entries (recall that
    94   mere directory browsing is controlled via @{term Readable}). Note that the
    95   latter means that in order to perform any file-system operation whatsoever,
    96   all directories encountered on the path would have to grant @{term
    97   Executable}. We ignore this detail and pretend that all directories give
    98   @{term Executable} permission to anybody.
    99 \<close>
   100 
   101 
   102 subsection \<open>Files\<close>
   103 
   104 text \<open>
   105   In order to model the general tree structure of a Unix file-system we use
   106   the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"} from the
   107   standard library of Isabelle/HOL @{cite "Bauer-et-al:2002:HOL-Library"}.
   108   This type provides constructors @{term Val} and @{term Env} as follows:
   109 
   110   \<^medskip>
   111   {\def\isastyleminor{\isastyle}
   112   \begin{tabular}{l}
   113   @{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\
   114   @{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\
   115   \end{tabular}}
   116   \<^medskip>
   117 
   118   Here the parameter @{typ 'a} refers to plain values occurring at leaf
   119   positions, parameter @{typ 'b} to information kept with inner branch nodes,
   120   and parameter @{typ 'c} to the branching type of the tree structure. For our
   121   purpose we use the type instance with @{typ "att \<times> string"} (representing
   122   plain files), @{typ att} (for attributes of directory nodes), and @{typ
   123   name} (for the index type of directory nodes).
   124 \<close>
   125 
   126 type_synonym "file" = "(att \<times> string, att, name) env"
   127 
   128 text \<open>
   129   \<^medskip>
   130   The HOL library also provides @{term lookup} and @{term update} operations
   131   for general tree structures with the subsequent primitive recursive
   132   characterizations.
   133 
   134   \<^medskip>
   135   {\def\isastyleminor{\isastyle}
   136   \begin{tabular}{l}
   137   @{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\
   138   @{term [source]
   139   "update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\
   140   \end{tabular}}
   141 
   142   @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}
   143   @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}
   144 
   145   Several further properties of these operations are proven in @{cite
   146   "Bauer-et-al:2002:HOL-Library"}. These will be routinely used later on
   147   without further notice.
   148 
   149   \<^bigskip>
   150   Apparently, the elements of type @{typ "file"} contain an @{typ att}
   151   component in either case. We now define a few auxiliary operations to
   152   manipulate this field uniformly, following the conventions for record types
   153   in Isabelle/HOL @{cite "Nipkow-et-al:2000:HOL"}.
   154 \<close>
   155 
   156 definition
   157   "attributes file =
   158     (case file of
   159       Val (att, text) \<Rightarrow> att
   160     | Env att dir \<Rightarrow> att)"
   161 
   162 definition
   163   "map_attributes f file =
   164     (case file of
   165       Val (att, text) \<Rightarrow> Val (f att, text)
   166     | Env att dir \<Rightarrow> Env (f att) dir)"
   167 
   168 lemma [simp]: "attributes (Val (att, text)) = att"
   169   by (simp add: attributes_def)
   170 
   171 lemma [simp]: "attributes (Env att dir) = att"
   172   by (simp add: attributes_def)
   173 
   174 lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"
   175   by (cases "file") (simp_all add: attributes_def map_attributes_def
   176     split_tupled_all)
   177 
   178 lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"
   179   by (simp add: map_attributes_def)
   180 
   181 lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"
   182   by (simp add: map_attributes_def)
   183 
   184 
   185 subsection \<open>Initial file-systems\<close>
   186 
   187 text \<open>
   188   Given a set of \<^emph>\<open>known users\<close> a file-system shall be initialized by
   189   providing an empty home directory for each user, with read-only access for
   190   everyone else. (Note that we may directly use the user id as home directory
   191   name, since both types have been identified.) Certainly, the very root
   192   directory is owned by the super user (who has user id 0).
   193 \<close>
   194 
   195 definition
   196   "init users =
   197     Env \<lparr>owner = 0, others = {Readable}\<rparr>
   198       (\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)
   199         else None)"
   200 
   201 
   202 subsection \<open>Accessing file-systems\<close>
   203 
   204 text \<open>
   205   The main internal file-system operation is access of a file by a user,
   206   requesting a certain set of permissions. The resulting @{typ "file option"}
   207   indicates if a file had been present at the corresponding @{typ path} and if
   208   access was granted according to the permissions recorded within the
   209   file-system.
   210 
   211   Note that by the rules of Unix file-system security (e.g.\ @{cite
   212   "Tanenbaum:1992"}) both the super-user and owner may always access a file
   213   unconditionally (in our simplified model).
   214 \<close>
   215 
   216 definition
   217   "access root path uid perms =
   218     (case lookup root path of
   219       None \<Rightarrow> None
   220     | Some file \<Rightarrow>
   221         if uid = 0
   222           \<or> uid = owner (attributes file)
   223           \<or> perms \<subseteq> others (attributes file)
   224         then Some file
   225         else None)"
   226 
   227 text \<open>
   228   \<^medskip>
   229   Successful access to a certain file is the main prerequisite for
   230   system-calls to be applicable (cf.\ \secref{sec:unix-trans}). Any
   231   modification of the file-system is then performed using the basic @{term
   232   update} operation.
   233 
   234   \<^medskip>
   235   We see that @{term access} is just a wrapper for the basic @{term lookup}
   236   function, with additional checking of attributes. Subsequently we establish
   237   a few auxiliary facts that stem from the primitive @{term lookup} used
   238   within @{term access}.
   239 \<close>
   240 
   241 lemma access_empty_lookup: "access root path uid {} = lookup root path"
   242   by (simp add: access_def split: option.splits)
   243 
   244 lemma access_some_lookup:
   245   "access root path uid perms = Some file \<Longrightarrow>
   246     lookup root path = Some file"
   247   by (simp add: access_def split: option.splits if_splits)
   248 
   249 lemma access_update_other:
   250   assumes parallel: "path' \<parallel> path"
   251   shows "access (update path' opt root) path uid perms = access root path uid perms"
   252 proof -
   253   from parallel obtain y z xs ys zs where
   254       "y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"
   255     by (blast dest: parallel_decomp)
   256   then have "lookup (update path' opt root) path = lookup root path"
   257     by (blast intro: lookup_update_other)
   258   then show ?thesis by (simp only: access_def)
   259 qed
   260 
   261 
   262 section \<open>File-system transitions \label{sec:unix-trans}\<close>
   263 
   264 subsection \<open>Unix system calls \label{sec:unix-syscall}\<close>
   265 
   266 text \<open>
   267   According to established operating system design (cf.\ @{cite
   268   "Tanenbaum:1992"}) user space processes may only initiate system operations
   269   by a fixed set of \<^emph>\<open>system-calls\<close>. This enables the kernel to enforce
   270   certain security policies in the first place.\<^footnote>\<open>Incidently, this is the very
   271   same principle employed by any ``LCF-style'' theorem proving system
   272   according to Milner's principle of ``correctness by construction'', such as
   273   Isabelle/HOL itself.\<close>
   274 
   275   \<^medskip>
   276   In our model of Unix we give a fixed datatype \<open>operation\<close> for the syntax of
   277   system-calls, together with an inductive definition of file-system state
   278   transitions of the form \<open>root \<midarrow>x\<rightarrow> root'\<close> for the operational semantics.
   279 \<close>
   280 
   281 datatype operation =
   282     Read uid string path
   283   | Write uid string path
   284   | Chmod uid perms path
   285   | Creat uid perms path
   286   | Unlink uid path
   287   | Mkdir uid perms path
   288   | Rmdir uid path
   289   | Readdir uid "name set" path
   290 
   291 text \<open>
   292   The @{typ uid} field of an operation corresponds to the \<^emph>\<open>effective user id\<close>
   293   of the underlying process, although our model never mentions processes
   294   explicitly. The other parameters are provided as arguments by the caller;
   295   the @{term path} one is common to all kinds of system-calls.
   296 \<close>
   297 
   298 primrec uid_of :: "operation \<Rightarrow> uid"
   299 where
   300   "uid_of (Read uid text path) = uid"
   301 | "uid_of (Write uid text path) = uid"
   302 | "uid_of (Chmod uid perms path) = uid"
   303 | "uid_of (Creat uid perms path) = uid"
   304 | "uid_of (Unlink uid path) = uid"
   305 | "uid_of (Mkdir uid path perms) = uid"
   306 | "uid_of (Rmdir uid path) = uid"
   307 | "uid_of (Readdir uid names path) = uid"
   308 
   309 primrec path_of :: "operation \<Rightarrow> path"
   310 where
   311   "path_of (Read uid text path) = path"
   312 | "path_of (Write uid text path) = path"
   313 | "path_of (Chmod uid perms path) = path"
   314 | "path_of (Creat uid perms path) = path"
   315 | "path_of (Unlink uid path) = path"
   316 | "path_of (Mkdir uid perms path) = path"
   317 | "path_of (Rmdir uid path) = path"
   318 | "path_of (Readdir uid names path) = path"
   319 
   320 text \<open>
   321   \<^medskip>
   322   Note that we have omitted explicit \<open>Open\<close> and \<open>Close\<close> operations, pretending
   323   that @{term Read} and @{term Write} would already take care of this behind
   324   the scenes. Thus we have basically treated actual sequences of real
   325   system-calls \<open>open\<close>--\<open>read\<close>/\<open>write\<close>--\<open>close\<close> as atomic.
   326 
   327   In principle, this could make big a difference in a model with explicit
   328   concurrent processes. On the other hand, even on a real Unix system the
   329   exact scheduling of concurrent \<open>open\<close> and \<open>close\<close> operations does \<^emph>\<open>not\<close>
   330   directly affect the success of corresponding \<open>read\<close> or \<open>write\<close>. Unix allows
   331   several processes to have files opened at the same time, even for writing!
   332   Certainly, the result from reading the contents later may be hard to
   333   predict, but the system-calls involved here will succeed in any case.
   334 
   335   \<^bigskip>
   336   The operational semantics of system calls is now specified via transitions
   337   of the file-system configuration. This is expressed as an inductive relation
   338   (although there is no actual recursion involved here).
   339 \<close>
   340 
   341 inductive transition :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"
   342   ("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)
   343 where
   344   read:
   345     "access root path uid {Readable} = Some (Val (att, text)) \<Longrightarrow>
   346       root \<midarrow>(Read uid text path)\<rightarrow> root" |
   347   "write":
   348     "access root path uid {Writable} = Some (Val (att, text')) \<Longrightarrow>
   349       root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root" |
   350 
   351   chmod:
   352     "access root path uid {} = Some file \<Longrightarrow>
   353       uid = 0 \<or> uid = owner (attributes file) \<Longrightarrow>
   354       root \<midarrow>(Chmod uid perms path)\<rightarrow> update path
   355         (Some (map_attributes (others_update (\<lambda>_. perms)) file)) root" |
   356 
   357   creat:
   358     "path = parent_path @ [name] \<Longrightarrow>
   359       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>
   360       access root path uid {} = None \<Longrightarrow>
   361       root \<midarrow>(Creat uid perms path)\<rightarrow> update path
   362         (Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root" |
   363   unlink:
   364     "path = parent_path @ [name] \<Longrightarrow>
   365       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>
   366       access root path uid {} = Some (Val plain) \<Longrightarrow>
   367       root \<midarrow>(Unlink uid path)\<rightarrow> update path None root" |
   368 
   369   mkdir:
   370     "path = parent_path @ [name] \<Longrightarrow>
   371       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>
   372       access root path uid {} = None \<Longrightarrow>
   373       root \<midarrow>(Mkdir uid perms path)\<rightarrow> update path
   374         (Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root" |
   375   rmdir:
   376     "path = parent_path @ [name] \<Longrightarrow>
   377       access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>
   378       access root path uid {} = Some (Env att' empty) \<Longrightarrow>
   379       root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root" |
   380 
   381   readdir:
   382     "access root path uid {Readable} = Some (Env att dir) \<Longrightarrow>
   383       names = dom dir \<Longrightarrow>
   384       root \<midarrow>(Readdir uid names path)\<rightarrow> root"
   385 
   386 text \<open>
   387   \<^medskip>
   388   Certainly, the above specification is central to the whole formal
   389   development. Any of the results to be established later on are only
   390   meaningful to the outside world if this transition system provides an
   391   adequate model of real Unix systems. This kind of ``reality-check'' of a
   392   formal model is the well-known problem of \<^emph>\<open>validation\<close>.
   393 
   394   If in doubt, one may consider to compare our definition with the informal
   395   specifications given the corresponding Unix man pages, or even peek at an
   396   actual implementation such as @{cite "Torvalds-et-al:Linux"}. Another common
   397   way to gain confidence into the formal model is to run simple simulations
   398   (see \secref{sec:unix-examples}), and check the results with that of
   399   experiments performed on a real Unix system.
   400 \<close>
   401 
   402 
   403 subsection \<open>Basic properties of single transitions \label{sec:unix-single-trans}\<close>
   404 
   405 text \<open>
   406   The transition system \<open>root \<midarrow>x\<rightarrow> root'\<close> defined above determines a unique
   407   result @{term root'} from given @{term root} and @{term x} (this is holds
   408   rather trivially, since there is even only one clause for each operation).
   409   This uniqueness statement will simplify our subsequent development to some
   410   extent, since we only have to reason about a partial function rather than a
   411   general relation.
   412 \<close>
   413 
   414 theorem transition_uniq:
   415   assumes root': "root \<midarrow>x\<rightarrow> root'"
   416     and root'': "root \<midarrow>x\<rightarrow> root''"
   417   shows "root' = root''"
   418   using root''
   419 proof cases
   420   case read
   421   with root' show ?thesis by cases auto
   422 next
   423   case "write"
   424   with root' show ?thesis by cases auto
   425 next
   426   case chmod
   427   with root' show ?thesis by cases auto
   428 next
   429   case creat
   430   with root' show ?thesis by cases auto
   431 next
   432   case unlink
   433   with root' show ?thesis by cases auto
   434 next
   435   case mkdir
   436   with root' show ?thesis by cases auto
   437 next
   438   case rmdir
   439   with root' show ?thesis by cases auto
   440 next
   441   case readdir
   442   with root' show ?thesis by cases blast+
   443 qed
   444 
   445 text \<open>
   446   \<^medskip>
   447   Apparently, file-system transitions are \<^emph>\<open>type-safe\<close> in the sense that the
   448   result of transforming an actual directory yields again a directory.
   449 \<close>
   450 
   451 theorem transition_type_safe:
   452   assumes tr: "root \<midarrow>x\<rightarrow> root'"
   453     and inv: "\<exists>att dir. root = Env att dir"
   454   shows "\<exists>att dir. root' = Env att dir"
   455 proof (cases "path_of x")
   456   case Nil
   457   with tr inv show ?thesis
   458     by cases (auto simp add: access_def split: if_splits)
   459 next
   460   case Cons
   461   from tr obtain opt where
   462       "root' = root \<or> root' = update (path_of x) opt root"
   463     by cases auto
   464   with inv Cons show ?thesis
   465     by (auto simp add: update_eq split: list.splits)
   466 qed
   467 
   468 text \<open>
   469   The previous result may be seen as the most basic invariant on the
   470   file-system state that is enforced by any proper kernel implementation. So
   471   user processes --- being bound to the system-call interface --- may never
   472   mess up a file-system such that the root becomes a plain file instead of a
   473   directory, which would be a strange situation indeed.
   474 \<close>
   475 
   476 
   477 subsection \<open>Iterated transitions\<close>
   478 
   479 text \<open>
   480   Iterated system transitions via finite sequences of system operations are
   481   modeled inductively as follows. In a sense, this relation describes the
   482   cumulative effect of the sequence of system-calls issued by a number of
   483   running processes over a finite amount of time.
   484 \<close>
   485 
   486 inductive transitions :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool"
   487   ("_ =_\<Rightarrow> _" [90, 1000, 90] 100)
   488 where
   489   nil: "root =[]\<Rightarrow> root"
   490 | cons: "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> root' =xs\<Rightarrow> root'' \<Longrightarrow> root =(x # xs)\<Rightarrow> root''"
   491 
   492 text \<open>
   493   \<^medskip>
   494   We establish a few basic facts relating iterated transitions with single
   495   ones, according to the recursive structure of lists.
   496 \<close>
   497 
   498 lemma transitions_nil_eq: "root =[]\<Rightarrow> root' = (root = root')"
   499 proof
   500   assume "root =[]\<Rightarrow> root'"
   501   then show "root = root'" by cases simp_all
   502 next
   503   assume "root = root'"
   504   then show "root =[]\<Rightarrow> root'" by (simp only: transitions.nil)
   505 qed
   506 
   507 lemma transitions_cons_eq:
   508   "root =(x # xs)\<Rightarrow> root'' = (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root'')"
   509 proof
   510   assume "root =(x # xs)\<Rightarrow> root''"
   511   then show "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"
   512     by cases auto
   513 next
   514   assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"
   515   then show "root =(x # xs)\<Rightarrow> root''"
   516     by (blast intro: transitions.cons)
   517 qed
   518 
   519 text \<open>
   520   The next two rules show how to ``destruct'' known transition sequences. Note
   521   that the second one actually relies on the uniqueness property of the basic
   522   transition system (see \secref{sec:unix-single-trans}).
   523 \<close>
   524 
   525 lemma transitions_nilD: "root =[]\<Rightarrow> root' \<Longrightarrow> root' = root"
   526   by (simp add: transitions_nil_eq)
   527 
   528 lemma transitions_consD:
   529   assumes list: "root =(x # xs)\<Rightarrow> root''"
   530     and hd: "root \<midarrow>x\<rightarrow> root'"
   531   shows "root' =xs\<Rightarrow> root''"
   532 proof -
   533   from list obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' =xs\<Rightarrow> root''"
   534     by cases simp_all
   535   from r' hd have "r' = root'" by (rule transition_uniq)
   536   with root'' show "root' =xs\<Rightarrow> root''" by simp
   537 qed
   538 
   539 text \<open>
   540   \<^medskip>
   541   The following fact shows how an invariant @{term Q} of single transitions
   542   with property @{term P} may be transferred to iterated transitions. The
   543   proof is rather obvious by rule induction over the definition of @{term
   544   "root =xs\<Rightarrow> root'"}.
   545 \<close>
   546 
   547 lemma transitions_invariant:
   548   assumes r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"
   549     and trans: "root =xs\<Rightarrow> root'"
   550   shows "Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"
   551   using trans
   552 proof induct
   553   case nil
   554   show ?case by (rule nil.prems)
   555 next
   556   case (cons root x root' xs root'')
   557   note P = \<open>\<forall>x \<in> set (x # xs). P x\<close>
   558   then have "P x" by simp
   559   with \<open>root \<midarrow>x\<rightarrow> root'\<close> and \<open>Q root\<close> have Q': "Q root'" by (rule r)
   560   from P have "\<forall>x \<in> set xs. P x" by simp
   561   with Q' show "Q root''" by (rule cons.hyps)
   562 qed
   563 
   564 text \<open>
   565   As an example of applying the previous result, we transfer the basic
   566   type-safety property (see \secref{sec:unix-single-trans}) from single
   567   transitions to iterated ones, which is a rather obvious result indeed.
   568 \<close>
   569 
   570 theorem transitions_type_safe:
   571   assumes "root =xs\<Rightarrow> root'"
   572     and "\<exists>att dir. root = Env att dir"
   573   shows "\<exists>att dir. root' = Env att dir"
   574   using transition_type_safe and assms
   575 proof (rule transitions_invariant)
   576   show "\<forall>x \<in> set xs. True" by blast
   577 qed
   578 
   579 
   580 section \<open>Executable sequences\<close>
   581 
   582 text \<open>
   583   An inductively defined relation such as the one of \<open>root \<midarrow>x\<rightarrow> root'\<close> (see
   584   \secref{sec:unix-syscall}) has two main aspects. First of all, the resulting
   585   system admits a certain set of transition rules (introductions) as given in
   586   the specification. Furthermore, there is an explicit least-fixed-point
   587   construction involved, which results in induction (and case-analysis) rules
   588   to eliminate known transitions in an exhaustive manner.
   589 
   590   \<^medskip>
   591   Subsequently, we explore our transition system in an experimental style,
   592   mainly using the introduction rules with basic algebraic properties of the
   593   underlying structures. The technique closely resembles that of Prolog
   594   combined with functional evaluation in a very simple manner.
   595 
   596   Just as the ``closed-world assumption'' is left implicit in Prolog, we do
   597   not refer to induction over the whole transition system here. So this is
   598   still purely positive reasoning about possible executions; exhaustive
   599   reasoning will be employed only later on (see \secref{sec:unix-bogosity}),
   600   when we shall demonstrate that certain behavior is \<^emph>\<open>not\<close> possible.
   601 \<close>
   602 
   603 
   604 subsection \<open>Possible transitions\<close>
   605 
   606 text \<open>
   607   Rather obviously, a list of system operations can be executed within a
   608   certain state if there is a result state reached by an iterated transition.
   609 \<close>
   610 
   611 definition "can_exec root xs = (\<exists>root'. root =xs\<Rightarrow> root')"
   612 
   613 lemma can_exec_nil: "can_exec root []"
   614   unfolding can_exec_def by (blast intro: transitions.intros)
   615 
   616 lemma can_exec_cons:
   617     "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"
   618   unfolding can_exec_def by (blast intro: transitions.intros)
   619 
   620 text \<open>
   621   \<^medskip>
   622   In case that we already know that a certain sequence can be executed we may
   623   destruct it backwardly into individual transitions.
   624 \<close>
   625 
   626 lemma can_exec_snocD: "can_exec root (xs @ [y])
   627     \<Longrightarrow> \<exists>root' root''. root =xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"
   628 proof (induct xs arbitrary: root)
   629   case Nil
   630   then show ?case
   631     by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
   632 next
   633   case (Cons x xs)
   634   from \<open>can_exec root ((x # xs) @ [y])\<close> obtain r root'' where
   635       x: "root \<midarrow>x\<rightarrow> r" and
   636       xs_y: "r =(xs @ [y])\<Rightarrow> root''"
   637     by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
   638   from xs_y Cons.hyps obtain root' r'
   639     where xs: "r =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"
   640     unfolding can_exec_def by blast
   641   from x xs have "root =(x # xs)\<Rightarrow> root'"
   642     by (rule transitions.cons)
   643   with y show ?case by blast
   644 qed
   645 
   646 
   647 subsection \<open>Example executions \label{sec:unix-examples}\<close>
   648 
   649 text \<open>
   650   We are now ready to perform a few experiments within our formal model of
   651   Unix system-calls. The common technique is to alternate introduction rules
   652   of the transition system (see \secref{sec:unix-trans}), and steps to solve
   653   any emerging side conditions using algebraic properties of the underlying
   654   file-system structures (see \secref{sec:unix-file-system}).
   655 \<close>
   656 
   657 lemmas eval = access_def init_def
   658 
   659 theorem "u \<in> users \<Longrightarrow> can_exec (init users)
   660     [Mkdir u perms [u, name]]"
   661   apply (rule can_exec_cons)
   662     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Cons})\<close>
   663   apply (rule mkdir)
   664     \<comment> \<open>back-chain @{term Mkdir}\<close>
   665   apply (force simp add: eval)+
   666     \<comment> \<open>solve preconditions of @{term Mkdir}\<close>
   667   apply (simp add: eval)
   668     \<comment> \<open>peek at resulting dir (optional)\<close>
   669   txt \<open>@{subgoals [display]}\<close>
   670   apply (rule can_exec_nil)
   671     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Nil})\<close>
   672   done
   673 
   674 text \<open>
   675   By inspecting the result shown just before the final step above, we may gain
   676   confidence that our specification of Unix system-calls actually makes sense.
   677   Further common errors are usually exhibited when preconditions of transition
   678   rules fail unexpectedly.
   679 
   680   \<^medskip>
   681   Here are a few further experiments, using the same techniques as before.
   682 \<close>
   683 
   684 theorem "u \<in> users \<Longrightarrow> can_exec (init users)
   685    [Creat u perms [u, name],
   686     Unlink u [u, name]]"
   687   apply (rule can_exec_cons)
   688   apply (rule creat)
   689   apply (force simp add: eval)+
   690   apply (simp add: eval)
   691   apply (rule can_exec_cons)
   692   apply (rule unlink)
   693   apply (force simp add: eval)+
   694   apply (simp add: eval)
   695   txt \<open>peek at result: @{subgoals [display]}\<close>
   696   apply (rule can_exec_nil)
   697   done
   698 
   699 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow>
   700   Readable \<in> perms\<^sub>2 \<Longrightarrow> name\<^sub>3 \<noteq> name\<^sub>4 \<Longrightarrow>
   701   can_exec (init users)
   702    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],
   703     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],
   704     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   705     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>4],
   706     Readdir u {name\<^sub>3, name\<^sub>4} [u, name\<^sub>1, name\<^sub>2]]"
   707   apply (rule can_exec_cons, rule transition.intros,
   708     (force simp add: eval)+, (simp add: eval)?)+
   709   txt \<open>peek at result: @{subgoals [display]}\<close>
   710   apply (rule can_exec_nil)
   711   done
   712 
   713 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow> Readable \<in> perms\<^sub>3 \<Longrightarrow>
   714   can_exec (init users)
   715    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],
   716     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],
   717     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   718     Write u' ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],
   719     Read u ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"
   720   apply (rule can_exec_cons, rule transition.intros,
   721     (force simp add: eval)+, (simp add: eval)?)+
   722   txt \<open>peek at result: @{subgoals [display]}\<close>
   723   apply (rule can_exec_nil)
   724   done
   725 
   726 
   727 section \<open>Odd effects --- treated formally \label{sec:unix-bogosity}\<close>
   728 
   729 text \<open>
   730   We are now ready to give a completely formal treatment of the slightly odd
   731   situation discussed in the introduction (see \secref{sec:unix-intro}): the
   732   file-system can easily reach a state where a user is unable to remove his
   733   very own directory, because it is still populated by items placed there by
   734   another user in an uncouth manner.
   735 \<close>
   736 
   737 
   738 subsection \<open>The general procedure \label{sec:unix-inv-procedure}\<close>
   739 
   740 text \<open>
   741   The following theorem expresses the general procedure we are following to
   742   achieve the main result.
   743 \<close>
   744 
   745 theorem general_procedure:
   746   assumes cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"
   747     and init_inv: "\<And>root. init users =bs\<Rightarrow> root \<Longrightarrow> Q root"
   748     and preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"
   749     and init_result: "init users =bs\<Rightarrow> root"
   750   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"
   751 proof -
   752   {
   753     fix xs
   754     assume Ps: "\<forall>x \<in> set xs. P x"
   755     assume can_exec: "can_exec root (xs @ [y])"
   756     then obtain root' root'' where
   757         xs: "root =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"
   758       by (blast dest: can_exec_snocD)
   759     from init_result have "Q root" by (rule init_inv)
   760     from preserve_inv xs this Ps have "Q root'"
   761       by (rule transitions_invariant)
   762     from this y have False by (rule cannot_y)
   763   }
   764   then show ?thesis by blast
   765 qed
   766 
   767 text \<open>
   768   Here @{prop "P x"} refers to the restriction on file-system operations that
   769   are admitted after having reached the critical configuration; according to
   770   the problem specification this will become @{prop "uid_of x = user\<^sub>1"} later
   771   on. Furthermore, @{term y} refers to the operations we claim to be
   772   impossible to perform afterwards, we will take @{term Rmdir} later. Moreover
   773   @{term Q} is a suitable (auxiliary) invariant over the file-system; choosing
   774   @{term Q} adequately is very important to make the proof work (see
   775   \secref{sec:unix-inv-lemmas}).
   776 \<close>
   777 
   778 
   779 subsection \<open>The particular situation\<close>
   780 
   781 text \<open>
   782   We introduce a few global declarations and axioms to describe our particular
   783   situation considered here. Thus we avoid excessive use of local parameters
   784   in the subsequent development.
   785 \<close>
   786 
   787 context
   788   fixes users :: "uid set"
   789     and user\<^sub>1 :: uid
   790     and user\<^sub>2 :: uid
   791     and name\<^sub>1 :: name
   792     and name\<^sub>2 :: name
   793     and name\<^sub>3 :: name
   794     and perms\<^sub>1 :: perms
   795     and perms\<^sub>2 :: perms
   796   assumes user\<^sub>1_known: "user\<^sub>1 \<in> users"
   797     and user\<^sub>1_not_root: "user\<^sub>1 \<noteq> 0"
   798     and users_neq: "user\<^sub>1 \<noteq> user\<^sub>2"
   799     and perms\<^sub>1_writable: "Writable \<in> perms\<^sub>1"
   800     and perms\<^sub>2_not_writable: "Writable \<notin> perms\<^sub>2"
   801   notes facts = user\<^sub>1_known user\<^sub>1_not_root users_neq
   802     perms\<^sub>1_writable perms\<^sub>2_not_writable
   803 begin
   804 
   805 definition
   806   "bogus =
   807      [Mkdir user\<^sub>1 perms\<^sub>1 [user\<^sub>1, name\<^sub>1],
   808       Mkdir user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2],
   809       Creat user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"
   810 
   811 definition "bogus_path = [user\<^sub>1, name\<^sub>1, name\<^sub>2]"
   812 
   813 text \<open>
   814   The @{term bogus} operations are the ones that lead into the uncouth
   815   situation; @{term bogus_path} is the key position within the file-system
   816   where things go awry.
   817 \<close>
   818 
   819 
   820 subsection \<open>Invariance lemmas \label{sec:unix-inv-lemmas}\<close>
   821 
   822 text \<open>
   823   The following invariant over the root file-system describes the bogus
   824   situation in an abstract manner: located at a certain @{term path} within
   825   the file-system is a non-empty directory that is neither owned nor writable
   826   by @{term user\<^sub>1}.
   827 \<close>
   828 
   829 definition
   830   "invariant root path =
   831     (\<exists>att dir.
   832       access root path user\<^sub>1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>
   833       user\<^sub>1 \<noteq> owner att \<and>
   834       access root path user\<^sub>1 {Writable} = None)"
   835 
   836 text \<open>
   837   Following the general procedure (see \secref{sec:unix-inv-procedure}) we
   838   will now establish the three key lemmas required to yield the final result.
   839 
   840     \<^enum> The invariant is sufficiently strong to entail the pathological case
   841     that @{term user\<^sub>1} is unable to remove the (owned) directory at @{term
   842     "[user\<^sub>1, name\<^sub>1]"}.
   843 
   844     \<^enum> The invariant does hold after having executed the @{term bogus} list of
   845     operations (starting with an initial file-system configuration).
   846 
   847     \<^enum> The invariant is preserved by any file-system operation performed by
   848     @{term user\<^sub>1} alone, without any help by other users.
   849 
   850   As the invariant appears both as assumptions and conclusions in the course
   851   of proof, its formulation is rather critical for the whole development to
   852   work out properly. In particular, the third step is very sensitive to the
   853   invariant being either too strong or too weak. Moreover the invariant has to
   854   be sufficiently abstract, lest the proof become cluttered by confusing
   855   detail.
   856 
   857   \<^medskip>
   858   The first two lemmas are technically straight forward --- we just have to
   859   inspect rather special cases.
   860 \<close>
   861 
   862 lemma cannot_rmdir:
   863   assumes inv: "invariant root bogus_path"
   864     and rmdir: "root \<midarrow>(Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1])\<rightarrow> root'"
   865   shows False
   866 proof -
   867   from inv obtain "file" where "access root bogus_path user\<^sub>1 {} = Some file"
   868     unfolding invariant_def by blast
   869   moreover
   870   from rmdir obtain att where
   871       "access root [user\<^sub>1, name\<^sub>1] user\<^sub>1 {} = Some (Env att empty)"
   872     by cases auto
   873   then have "access root ([user\<^sub>1, name\<^sub>1] @ [name\<^sub>2]) user\<^sub>1 {} = empty name\<^sub>2"
   874     by (simp only: access_empty_lookup lookup_append_some) simp
   875   ultimately show False by (simp add: bogus_path_def)
   876 qed
   877 
   878 lemma
   879   assumes init: "init users =bogus\<Rightarrow> root"
   880   shows init_invariant: "invariant root bogus_path"
   881   supply eval = facts access_def init_def
   882   using init
   883   apply (unfold bogus_def bogus_path_def)
   884   apply (drule transitions_consD, rule transition.intros,
   885       (force simp add: eval)+, (simp add: eval)?)+
   886     \<comment> "evaluate all operations"
   887   apply (drule transitions_nilD)
   888     \<comment> "reach final result"
   889   apply (simp add: invariant_def eval)
   890     \<comment> "check the invariant"
   891   done
   892 
   893 text \<open>
   894   \<^medskip>
   895   At last we are left with the main effort to show that the ``bogosity''
   896   invariant is preserved by any file-system operation performed by @{term
   897   user\<^sub>1} alone. Note that this holds for any @{term path} given, the
   898   particular @{term bogus_path} is not required here.
   899 \<close>
   900 
   901 lemma preserve_invariant:
   902   assumes tr: "root \<midarrow>x\<rightarrow> root'"
   903     and inv: "invariant root path"
   904     and uid: "uid_of x = user\<^sub>1"
   905   shows "invariant root' path"
   906 proof -
   907   from inv obtain att dir where
   908       inv1: "access root path user\<^sub>1 {} = Some (Env att dir)" and
   909       inv2: "dir \<noteq> empty" and
   910       inv3: "user\<^sub>1 \<noteq> owner att" and
   911       inv4: "access root path user\<^sub>1 {Writable} = None"
   912     by (auto simp add: invariant_def)
   913 
   914   from inv1 have lookup: "lookup root path = Some (Env att dir)"
   915     by (simp only: access_empty_lookup)
   916 
   917   from inv1 inv3 inv4 and user\<^sub>1_not_root
   918   have not_writable: "Writable \<notin> others att"
   919     by (auto simp add: access_def split: option.splits)
   920 
   921   show ?thesis
   922   proof cases
   923     assume "root' = root"
   924     with inv show "invariant root' path" by (simp only:)
   925   next
   926     assume changed: "root' \<noteq> root"
   927     with tr obtain opt where root': "root' = update (path_of x) opt root"
   928       by cases auto
   929     show ?thesis
   930     proof (rule prefixeq_cases)
   931       assume "path_of x \<parallel> path"
   932       with inv root'
   933       have "\<And>perms. access root' path user\<^sub>1 perms = access root path user\<^sub>1 perms"
   934         by (simp only: access_update_other)
   935       with inv show "invariant root' path"
   936         by (simp only: invariant_def)
   937     next
   938       assume "prefixeq (path_of x) path"
   939       then obtain ys where path: "path = path_of x @ ys" ..
   940 
   941       show ?thesis
   942       proof (cases ys)
   943         assume "ys = []"
   944         with tr uid inv2 inv3 lookup changed path and user\<^sub>1_not_root
   945         have False
   946           by cases (auto simp add: access_empty_lookup dest: access_some_lookup)
   947         then show ?thesis ..
   948       next
   949         fix z zs assume ys: "ys = z # zs"
   950         have "lookup root' path = lookup root path"
   951         proof -
   952           from inv2 lookup path ys
   953           have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"
   954             by (simp only:)
   955           then obtain att' dir' file' where
   956               look': "lookup root (path_of x) = Some (Env att' dir')" and
   957               dir': "dir' z = Some file'" and
   958               file': "lookup file' zs = Some (Env att dir)"
   959             by (blast dest: lookup_some_upper)
   960 
   961           from tr uid changed look' dir' obtain att'' where
   962               look'': "lookup root' (path_of x) = Some (Env att'' dir')"
   963             by cases (auto simp add: access_empty_lookup lookup_update_some
   964               dest: access_some_lookup)
   965           with dir' file' have "lookup root' (path_of x @ z # zs) =
   966               Some (Env att dir)"
   967             by (simp add: lookup_append_some)
   968           with look path ys show ?thesis
   969             by simp
   970         qed
   971         with inv show "invariant root' path"
   972           by (simp only: invariant_def access_def)
   973       qed
   974     next
   975       assume "prefix path (path_of x)"
   976       then obtain y ys where path: "path_of x = path @ y # ys" ..
   977 
   978       obtain dir' where
   979         lookup': "lookup root' path = Some (Env att dir')" and
   980         inv2': "dir' \<noteq> empty"
   981       proof (cases ys)
   982         assume "ys = []"
   983         with path have parent: "path_of x = path @ [y]" by simp
   984         with tr uid inv4 changed obtain "file" where
   985             "root' = update (path_of x) (Some file) root"
   986           by cases auto
   987         with lookup parent have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"
   988           by (simp only: update_append_some update_cons_nil_env)
   989         moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp
   990         ultimately show ?thesis ..
   991       next
   992         fix z zs assume ys: "ys = z # zs"
   993         with lookup root' path
   994         have "lookup root' path = Some (update (y # ys) opt (Env att dir))"
   995           by (simp only: update_append_some)
   996         also obtain file' where
   997           "update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"
   998         proof -
   999           have "dir y \<noteq> None"
  1000           proof -
  1001             have "dir y = lookup (Env att dir) [y]"
  1002               by (simp split: option.splits)
  1003             also from lookup have "\<dots> = lookup root (path @ [y])"
  1004               by (simp only: lookup_append_some)
  1005             also have "\<dots> \<noteq> None"
  1006             proof -
  1007               from ys obtain us u where rev_ys: "ys = us @ [u]"
  1008                 by (cases ys rule: rev_cases) simp
  1009               with tr path
  1010               have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>
  1011                   lookup root ((path @ [y]) @ us) \<noteq> None"
  1012                 by cases (auto dest: access_some_lookup)
  1013               then show ?thesis
  1014                 by (fastforce dest!: lookup_some_append)
  1015             qed
  1016             finally show ?thesis .
  1017           qed
  1018           with ys show ?thesis using that by auto
  1019         qed
  1020         also have "dir(y\<mapsto>file') \<noteq> empty" by simp
  1021         ultimately show ?thesis ..
  1022       qed
  1023 
  1024       from lookup' have inv1': "access root' path user\<^sub>1 {} = Some (Env att dir')"
  1025         by (simp only: access_empty_lookup)
  1026 
  1027       from inv3 lookup' and not_writable user\<^sub>1_not_root
  1028       have "access root' path user\<^sub>1 {Writable} = None"
  1029         by (simp add: access_def)
  1030       with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast
  1031     qed
  1032   qed
  1033 qed
  1034 
  1035 
  1036 subsection \<open>Putting it all together \label{sec:unix-main-result}\<close>
  1037 
  1038 text \<open>
  1039   The main result is now imminent, just by composing the three invariance
  1040   lemmas (see \secref{sec:unix-inv-lemmas}) according the the overall
  1041   procedure (see \secref{sec:unix-inv-procedure}).
  1042 \<close>
  1043 
  1044 corollary
  1045   assumes bogus: "init users =bogus\<Rightarrow> root"
  1046   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user\<^sub>1) \<and>
  1047     can_exec root (xs @ [Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1]]))"
  1048 proof -
  1049   from cannot_rmdir init_invariant preserve_invariant
  1050     and bogus show ?thesis by (rule general_procedure)
  1051 qed
  1052 
  1053 end
  1054 
  1055 end