src/HOL/Unix/Unix.thy
 author wenzelm Sat Nov 04 15:24:40 2017 +0100 (20 months ago) changeset 67003 49850a679c2c parent 66453 cc19f7ca2ed6 child 67443 3abf6a722518 permissions -rw-r--r--
more robust sorted_entries;
     1 (*  Title:      HOL/Unix/Unix.thy

     2     Author:     Markus Wenzel, TU Muenchen

     3 *)

     4

     5 section \<open>Unix file-systems \label{sec:unix-file-system}\<close>

     6

     7 theory Unix

     8   imports

     9     Nested_Environment

    10     "HOL-Library.Sublist"

    11 begin

    12

    13 text \<open>

    14   We give a simple mathematical model of the basic structures underlying the

    15   Unix file-system, together with a few fundamental operations that could be

    16   imagined to be performed internally by the Unix kernel. This forms the basis

    17   for the set of Unix system-calls to be introduced later (see

    18   \secref{sec:unix-trans}), which are the actual interface offered to

    19   processes running in user-space.

    20

    21   \<^medskip>

    22   Basically, any Unix file is either a \<^emph>\<open>plain file\<close> or a \<^emph>\<open>directory\<close>,

    23   consisting of some \<^emph>\<open>content\<close> plus \<^emph>\<open>attributes\<close>. The content of a plain

    24   file is plain text. The content of a directory is a mapping from names to

    25   further files.\<^footnote>\<open>In fact, this is the only way that names get associated with

    26   files. In Unix files do \<^emph>\<open>not\<close> have a name in itself. Even more, any number

    27   of names may be associated with the very same file due to \<^emph>\<open>hard links\<close>

    28   (although this is excluded from our model).\<close> Attributes include information

    29   to control various ways to access the file (read, write etc.).

    30

    31   Our model will be quite liberal in omitting excessive detail that is easily

    32   seen to be irrelevant'' for the aspects of Unix file-systems to be

    33   discussed here. First of all, we ignore character and block special files,

    34   pipes, sockets, hard links, symbolic links, and mount points.

    35 \<close>

    36

    37

    38 subsection \<open>Names\<close>

    39

    40 text \<open>

    41   User ids and file name components shall be represented by natural numbers

    42   (without loss of generality). We do not bother about encoding of actual

    43   names (e.g.\ strings), nor a mapping between user names and user ids as

    44   would be present in a reality.

    45 \<close>

    46

    47 type_synonym uid = nat

    48 type_synonym name = nat

    49 type_synonym path = "name list"

    50

    51

    52 subsection \<open>Attributes\<close>

    53

    54 text \<open>

    55   Unix file attributes mainly consist of \<^emph>\<open>owner\<close> information and a number of

    56   \<^emph>\<open>permission\<close> bits which control access for user'', group'', and

    57   others'' (see the Unix man pages \<open>chmod(2)\<close> and \<open>stat(2)\<close> for more

    58   details).

    59

    60   \<^medskip>

    61   Our model of file permissions only considers the others'' part. The

    62   user'' field may be omitted without loss of overall generality, since the

    63   owner is usually able to change it anyway by performing \<open>chmod\<close>.\<^footnote>\<open>The

    64   inclined Unix expert may try to figure out some exotic arrangements of a

    65   real-world Unix file-system such that the owner of a file is unable to apply

    66   the \<open>chmod\<close> system call.\<close> We omit group'' permissions as a genuine

    67   simplification as we just do not intend to discuss a model of multiple

    68   groups and group membership, but pretend that everyone is member of a single

    69   global group.\<^footnote>\<open>A general HOL model of user group structures and related

    70   issues is given in @{cite "Naraschewski:2001"}.\<close>

    71 \<close>

    72

    73 datatype perm =

    74     Readable

    75   | Writable

    76   | Executable    \<comment> "(ignored)"

    77

    78 type_synonym perms = "perm set"

    79

    80 record att =

    81   owner :: uid

    82   others :: perms

    83

    84 text \<open>

    85   For plain files @{term Readable} and @{term Writable} specify read and write

    86   access to the actual content, i.e.\ the string of text stored here. For

    87   directories @{term Readable} determines if the set of entry names may be

    88   accessed, and @{term Writable} controls the ability to create or delete any

    89   entries (both plain files or sub-directories).

    90

    91   As another simplification, we ignore the @{term Executable} permission

    92   altogether. In reality it would indicate executable plain files (also known

    93   as binaries''), or control actual lookup of directory entries (recall that

    94   mere directory browsing is controlled via @{term Readable}). Note that the

    95   latter means that in order to perform any file-system operation whatsoever,

    96   all directories encountered on the path would have to grant @{term

    97   Executable}. We ignore this detail and pretend that all directories give

    98   @{term Executable} permission to anybody.

    99 \<close>

   100

   101

   102 subsection \<open>Files\<close>

   103

   104 text \<open>

   105   In order to model the general tree structure of a Unix file-system we use

   106   the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"} from the

   107   standard library of Isabelle/HOL @{cite "Bauer-et-al:2002:HOL-Library"}.

   108   This type provides constructors @{term Val} and @{term Env} as follows:

   109

   110   \<^medskip>

   111   {\def\isastyleminor{\isastyle}

   112   \begin{tabular}{l}

   113   @{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\

   114   @{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\

   115   \end{tabular}}

   116   \<^medskip>

   117

   118   Here the parameter @{typ 'a} refers to plain values occurring at leaf

   119   positions, parameter @{typ 'b} to information kept with inner branch nodes,

   120   and parameter @{typ 'c} to the branching type of the tree structure. For our

   121   purpose we use the type instance with @{typ "att \<times> string"} (representing

   122   plain files), @{typ att} (for attributes of directory nodes), and @{typ

   123   name} (for the index type of directory nodes).

   124 \<close>

   125

   126 type_synonym "file" = "(att \<times> string, att, name) env"

   127

   128 text \<open>

   129   \<^medskip>

   130   The HOL library also provides @{term lookup} and @{term update} operations

   131   for general tree structures with the subsequent primitive recursive

   132   characterizations.

   133

   134   \<^medskip>

   135   {\def\isastyleminor{\isastyle}

   136   \begin{tabular}{l}

   137   @{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\

   138   @{term [source]

   139   "update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\

   140   \end{tabular}}

   141

   142   @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}

   143   @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}

   144

   145   Several further properties of these operations are proven in @{cite

   146   "Bauer-et-al:2002:HOL-Library"}. These will be routinely used later on

   147   without further notice.

   148

   149   \<^bigskip>

   150   Apparently, the elements of type @{typ "file"} contain an @{typ att}

   151   component in either case. We now define a few auxiliary operations to

   152   manipulate this field uniformly, following the conventions for record types

   153   in Isabelle/HOL @{cite "Nipkow-et-al:2000:HOL"}.

   154 \<close>

   155

   156 definition

   157   "attributes file =

   158     (case file of

   159       Val (att, text) \<Rightarrow> att

   160     | Env att dir \<Rightarrow> att)"

   161

   162 definition

   163   "map_attributes f file =

   164     (case file of

   165       Val (att, text) \<Rightarrow> Val (f att, text)

   166     | Env att dir \<Rightarrow> Env (f att) dir)"

   167

   168 lemma [simp]: "attributes (Val (att, text)) = att"

   169   by (simp add: attributes_def)

   170

   171 lemma [simp]: "attributes (Env att dir) = att"

   172   by (simp add: attributes_def)

   173

   174 lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"

   175   by (cases "file") (simp_all add: attributes_def map_attributes_def

   176     split_tupled_all)

   177

   178 lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"

   179   by (simp add: map_attributes_def)

   180

   181 lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"

   182   by (simp add: map_attributes_def)

   183

   184

   185 subsection \<open>Initial file-systems\<close>

   186

   187 text \<open>

   188   Given a set of \<^emph>\<open>known users\<close> a file-system shall be initialized by

   189   providing an empty home directory for each user, with read-only access for

   190   everyone else. (Note that we may directly use the user id as home directory

   191   name, since both types have been identified.) Certainly, the very root

   192   directory is owned by the super user (who has user id 0).

   193 \<close>

   194

   195 definition

   196   "init users =

   197     Env \<lparr>owner = 0, others = {Readable}\<rparr>

   198       (\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)

   199         else None)"

   200

   201

   202 subsection \<open>Accessing file-systems\<close>

   203

   204 text \<open>

   205   The main internal file-system operation is access of a file by a user,

   206   requesting a certain set of permissions. The resulting @{typ "file option"}

   207   indicates if a file had been present at the corresponding @{typ path} and if

   208   access was granted according to the permissions recorded within the

   209   file-system.

   210

   211   Note that by the rules of Unix file-system security (e.g.\ @{cite

   212   "Tanenbaum:1992"}) both the super-user and owner may always access a file

   213   unconditionally (in our simplified model).

   214 \<close>

   215

   216 definition

   217   "access root path uid perms =

   218     (case lookup root path of

   219       None \<Rightarrow> None

   220     | Some file \<Rightarrow>

   221         if uid = 0

   222           \<or> uid = owner (attributes file)

   223           \<or> perms \<subseteq> others (attributes file)

   224         then Some file

   225         else None)"

   226

   227 text \<open>

   228   \<^medskip>

   229   Successful access to a certain file is the main prerequisite for

   230   system-calls to be applicable (cf.\ \secref{sec:unix-trans}). Any

   231   modification of the file-system is then performed using the basic @{term

   232   update} operation.

   233

   234   \<^medskip>

   235   We see that @{term access} is just a wrapper for the basic @{term lookup}

   236   function, with additional checking of attributes. Subsequently we establish

   237   a few auxiliary facts that stem from the primitive @{term lookup} used

   238   within @{term access}.

   239 \<close>

   240

   241 lemma access_empty_lookup: "access root path uid {} = lookup root path"

   242   by (simp add: access_def split: option.splits)

   243

   244 lemma access_some_lookup:

   245   "access root path uid perms = Some file \<Longrightarrow>

   246     lookup root path = Some file"

   247   by (simp add: access_def split: option.splits if_splits)

   248

   249 lemma access_update_other:

   250   assumes parallel: "path' \<parallel> path"

   251   shows "access (update path' opt root) path uid perms = access root path uid perms"

   252 proof -

   253   from parallel obtain y z xs ys zs where

   254       "y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"

   255     by (blast dest: parallel_decomp)

   256   then have "lookup (update path' opt root) path = lookup root path"

   257     by (blast intro: lookup_update_other)

   258   then show ?thesis by (simp only: access_def)

   259 qed

   260

   261

   262 section \<open>File-system transitions \label{sec:unix-trans}\<close>

   263

   264 subsection \<open>Unix system calls \label{sec:unix-syscall}\<close>

   265

   266 text \<open>

   267   According to established operating system design (cf.\ @{cite

   268   "Tanenbaum:1992"}) user space processes may only initiate system operations

   269   by a fixed set of \<^emph>\<open>system-calls\<close>. This enables the kernel to enforce

   270   certain security policies in the first place.\<^footnote>\<open>Incidently, this is the very

   271   same principle employed by any LCF-style'' theorem proving system

   272   according to Milner's principle of correctness by construction'', such as

   273   Isabelle/HOL itself.\<close>

   274

   275   \<^medskip>

   276   In our model of Unix we give a fixed datatype \<open>operation\<close> for the syntax of

   277   system-calls, together with an inductive definition of file-system state

   278   transitions of the form \<open>root \<midarrow>x\<rightarrow> root'\<close> for the operational semantics.

   279 \<close>

   280

   281 datatype operation =

   282     Read uid string path

   283   | Write uid string path

   284   | Chmod uid perms path

   285   | Creat uid perms path

   286   | Unlink uid path

   287   | Mkdir uid perms path

   288   | Rmdir uid path

   289   | Readdir uid "name set" path

   290

   291 text \<open>

   292   The @{typ uid} field of an operation corresponds to the \<^emph>\<open>effective user id\<close>

   293   of the underlying process, although our model never mentions processes

   294   explicitly. The other parameters are provided as arguments by the caller;

   295   the @{term path} one is common to all kinds of system-calls.

   296 \<close>

   297

   298 primrec uid_of :: "operation \<Rightarrow> uid"

   299   where

   300     "uid_of (Read uid text path) = uid"

   301   | "uid_of (Write uid text path) = uid"

   302   | "uid_of (Chmod uid perms path) = uid"

   303   | "uid_of (Creat uid perms path) = uid"

   304   | "uid_of (Unlink uid path) = uid"

   305   | "uid_of (Mkdir uid path perms) = uid"

   306   | "uid_of (Rmdir uid path) = uid"

   307   | "uid_of (Readdir uid names path) = uid"

   308

   309 primrec path_of :: "operation \<Rightarrow> path"

   310   where

   311     "path_of (Read uid text path) = path"

   312   | "path_of (Write uid text path) = path"

   313   | "path_of (Chmod uid perms path) = path"

   314   | "path_of (Creat uid perms path) = path"

   315   | "path_of (Unlink uid path) = path"

   316   | "path_of (Mkdir uid perms path) = path"

   317   | "path_of (Rmdir uid path) = path"

   318   | "path_of (Readdir uid names path) = path"

   319

   320 text \<open>

   321   \<^medskip>

   322   Note that we have omitted explicit \<open>Open\<close> and \<open>Close\<close> operations, pretending

   323   that @{term Read} and @{term Write} would already take care of this behind

   324   the scenes. Thus we have basically treated actual sequences of real

   325   system-calls \<open>open\<close>--\<open>read\<close>/\<open>write\<close>--\<open>close\<close> as atomic.

   326

   327   In principle, this could make big a difference in a model with explicit

   328   concurrent processes. On the other hand, even on a real Unix system the

   329   exact scheduling of concurrent \<open>open\<close> and \<open>close\<close> operations does \<^emph>\<open>not\<close>

   330   directly affect the success of corresponding \<open>read\<close> or \<open>write\<close>. Unix allows

   331   several processes to have files opened at the same time, even for writing!

   332   Certainly, the result from reading the contents later may be hard to

   333   predict, but the system-calls involved here will succeed in any case.

   334

   335   \<^bigskip>

   336   The operational semantics of system calls is now specified via transitions

   337   of the file-system configuration. This is expressed as an inductive relation

   338   (although there is no actual recursion involved here).

   339 \<close>

   340

   341 inductive transition :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"

   342   ("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)

   343   where

   344     read:

   345       "root \<midarrow>(Read uid text path)\<rightarrow> root"

   346       if "access root path uid {Readable} = Some (Val (att, text))"

   347   | "write":

   348       "root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root"

   349       if "access root path uid {Writable} = Some (Val (att, text'))"

   350   | chmod:

   351       "root \<midarrow>(Chmod uid perms path)\<rightarrow>

   352         update path (Some (map_attributes (others_update (\<lambda>_. perms)) file)) root"

   353       if "access root path uid {} = Some file" and "uid = 0 \<or> uid = owner (attributes file)"

   354   | creat:

   355       "root \<midarrow>(Creat uid perms path)\<rightarrow>

   356         update path (Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root"

   357       if "path = parent_path @ [name]"

   358       and "access root parent_path uid {Writable} = Some (Env att parent)"

   359       and "access root path uid {} = None"

   360   | unlink:

   361       "root \<midarrow>(Unlink uid path)\<rightarrow> update path None root"

   362       if "path = parent_path @ [name]"

   363       and "access root parent_path uid {Writable} = Some (Env att parent)"

   364       and "access root path uid {} = Some (Val plain)"

   365   | mkdir:

   366       "root \<midarrow>(Mkdir uid perms path)\<rightarrow>

   367         update path (Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root"

   368       if "path = parent_path @ [name]"

   369       and "access root parent_path uid {Writable} = Some (Env att parent)"

   370       and "access root path uid {} = None"

   371   | rmdir:

   372       "root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root"

   373       if "path = parent_path @ [name]"

   374       and "access root parent_path uid {Writable} = Some (Env att parent)"

   375       and "access root path uid {} = Some (Env att' empty)"

   376   | readdir:

   377       "root \<midarrow>(Readdir uid names path)\<rightarrow> root"

   378       if "access root path uid {Readable} = Some (Env att dir)"

   379       and "names = dom dir"

   380

   381 text \<open>

   382   \<^medskip>

   383   Certainly, the above specification is central to the whole formal

   384   development. Any of the results to be established later on are only

   385   meaningful to the outside world if this transition system provides an

   386   adequate model of real Unix systems. This kind of reality-check'' of a

   387   formal model is the well-known problem of \<^emph>\<open>validation\<close>.

   388

   389   If in doubt, one may consider to compare our definition with the informal

   390   specifications given the corresponding Unix man pages, or even peek at an

   391   actual implementation such as @{cite "Torvalds-et-al:Linux"}. Another common

   392   way to gain confidence into the formal model is to run simple simulations

   393   (see \secref{sec:unix-examples}), and check the results with that of

   394   experiments performed on a real Unix system.

   395 \<close>

   396

   397

   398 subsection \<open>Basic properties of single transitions \label{sec:unix-single-trans}\<close>

   399

   400 text \<open>

   401   The transition system \<open>root \<midarrow>x\<rightarrow> root'\<close> defined above determines a unique

   402   result @{term root'} from given @{term root} and @{term x} (this is holds

   403   rather trivially, since there is even only one clause for each operation).

   404   This uniqueness statement will simplify our subsequent development to some

   405   extent, since we only have to reason about a partial function rather than a

   406   general relation.

   407 \<close>

   408

   409 theorem transition_uniq:

   410   assumes root': "root \<midarrow>x\<rightarrow> root'"

   411     and root'': "root \<midarrow>x\<rightarrow> root''"

   412   shows "root' = root''"

   413   using root''

   414 proof cases

   415   case read

   416   with root' show ?thesis by cases auto

   417 next

   418   case "write"

   419   with root' show ?thesis by cases auto

   420 next

   421   case chmod

   422   with root' show ?thesis by cases auto

   423 next

   424   case creat

   425   with root' show ?thesis by cases auto

   426 next

   427   case unlink

   428   with root' show ?thesis by cases auto

   429 next

   430   case mkdir

   431   with root' show ?thesis by cases auto

   432 next

   433   case rmdir

   434   with root' show ?thesis by cases auto

   435 next

   436   case readdir

   437   with root' show ?thesis by cases blast+

   438 qed

   439

   440 text \<open>

   441   \<^medskip>

   442   Apparently, file-system transitions are \<^emph>\<open>type-safe\<close> in the sense that the

   443   result of transforming an actual directory yields again a directory.

   444 \<close>

   445

   446 theorem transition_type_safe:

   447   assumes tr: "root \<midarrow>x\<rightarrow> root'"

   448     and inv: "\<exists>att dir. root = Env att dir"

   449   shows "\<exists>att dir. root' = Env att dir"

   450 proof (cases "path_of x")

   451   case Nil

   452   with tr inv show ?thesis

   453     by cases (auto simp add: access_def split: if_splits)

   454 next

   455   case Cons

   456   from tr obtain opt where

   457       "root' = root \<or> root' = update (path_of x) opt root"

   458     by cases auto

   459   with inv Cons show ?thesis

   460     by (auto simp add: update_eq split: list.splits)

   461 qed

   462

   463 text \<open>

   464   The previous result may be seen as the most basic invariant on the

   465   file-system state that is enforced by any proper kernel implementation. So

   466   user processes --- being bound to the system-call interface --- may never

   467   mess up a file-system such that the root becomes a plain file instead of a

   468   directory, which would be a strange situation indeed.

   469 \<close>

   470

   471

   472 subsection \<open>Iterated transitions\<close>

   473

   474 text \<open>

   475   Iterated system transitions via finite sequences of system operations are

   476   modeled inductively as follows. In a sense, this relation describes the

   477   cumulative effect of the sequence of system-calls issued by a number of

   478   running processes over a finite amount of time.

   479 \<close>

   480

   481 inductive transitions :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool"  ("_ \<Midarrow>_\<Rightarrow> _" [90, 1000, 90] 100)

   482   where

   483     nil: "root \<Midarrow>[]\<Rightarrow> root"

   484   | cons: "root \<Midarrow>(x # xs)\<Rightarrow> root''" if "root \<midarrow>x\<rightarrow> root'" and "root' \<Midarrow>xs\<Rightarrow> root''"

   485

   486 text \<open>

   487   \<^medskip>

   488   We establish a few basic facts relating iterated transitions with single

   489   ones, according to the recursive structure of lists.

   490 \<close>

   491

   492 lemma transitions_nil_eq: "root \<Midarrow>[]\<Rightarrow> root' \<longleftrightarrow> root = root'"

   493 proof

   494   assume "root \<Midarrow>[]\<Rightarrow> root'"

   495   then show "root = root'" by cases simp_all

   496 next

   497   assume "root = root'"

   498   then show "root \<Midarrow>[]\<Rightarrow> root'" by (simp only: transitions.nil)

   499 qed

   500

   501 lemma transitions_cons_eq:

   502   "root \<Midarrow>(x # xs)\<Rightarrow> root'' \<longleftrightarrow> (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root'')"

   503 proof

   504   assume "root \<Midarrow>(x # xs)\<Rightarrow> root''"

   505   then show "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"

   506     by cases auto

   507 next

   508   assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' \<Midarrow>xs\<Rightarrow> root''"

   509   then show "root \<Midarrow>(x # xs)\<Rightarrow> root''"

   510     by (blast intro: transitions.cons)

   511 qed

   512

   513 text \<open>

   514   The next two rules show how to destruct'' known transition sequences. Note

   515   that the second one actually relies on the uniqueness property of the basic

   516   transition system (see \secref{sec:unix-single-trans}).

   517 \<close>

   518

   519 lemma transitions_nilD: "root \<Midarrow>[]\<Rightarrow> root' \<Longrightarrow> root' = root"

   520   by (simp add: transitions_nil_eq)

   521

   522 lemma transitions_consD:

   523   assumes list: "root \<Midarrow>(x # xs)\<Rightarrow> root''"

   524     and hd: "root \<midarrow>x\<rightarrow> root'"

   525   shows "root' \<Midarrow>xs\<Rightarrow> root''"

   526 proof -

   527   from list obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' \<Midarrow>xs\<Rightarrow> root''"

   528     by cases simp_all

   529   from r' hd have "r' = root'" by (rule transition_uniq)

   530   with root'' show "root' \<Midarrow>xs\<Rightarrow> root''" by simp

   531 qed

   532

   533 text \<open>

   534   \<^medskip>

   535   The following fact shows how an invariant @{term Q} of single transitions

   536   with property @{term P} may be transferred to iterated transitions. The

   537   proof is rather obvious by rule induction over the definition of @{term

   538   "root \<Midarrow>xs\<Rightarrow> root'"}.

   539 \<close>

   540

   541 lemma transitions_invariant:

   542   assumes r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

   543     and trans: "root \<Midarrow>xs\<Rightarrow> root'"

   544   shows "Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"

   545   using trans

   546 proof induct

   547   case nil

   548   show ?case by (rule nil.prems)

   549 next

   550   case (cons root x root' xs root'')

   551   note P = \<open>\<forall>x \<in> set (x # xs). P x\<close>

   552   then have "P x" by simp

   553   with \<open>root \<midarrow>x\<rightarrow> root'\<close> and \<open>Q root\<close> have Q': "Q root'" by (rule r)

   554   from P have "\<forall>x \<in> set xs. P x" by simp

   555   with Q' show "Q root''" by (rule cons.hyps)

   556 qed

   557

   558 text \<open>

   559   As an example of applying the previous result, we transfer the basic

   560   type-safety property (see \secref{sec:unix-single-trans}) from single

   561   transitions to iterated ones, which is a rather obvious result indeed.

   562 \<close>

   563

   564 theorem transitions_type_safe:

   565   assumes "root \<Midarrow>xs\<Rightarrow> root'"

   566     and "\<exists>att dir. root = Env att dir"

   567   shows "\<exists>att dir. root' = Env att dir"

   568   using transition_type_safe and assms

   569 proof (rule transitions_invariant)

   570   show "\<forall>x \<in> set xs. True" by blast

   571 qed

   572

   573

   574 section \<open>Executable sequences\<close>

   575

   576 text \<open>

   577   An inductively defined relation such as the one of \<open>root \<midarrow>x\<rightarrow> root'\<close> (see

   578   \secref{sec:unix-syscall}) has two main aspects. First of all, the resulting

   579   system admits a certain set of transition rules (introductions) as given in

   580   the specification. Furthermore, there is an explicit least-fixed-point

   581   construction involved, which results in induction (and case-analysis) rules

   582   to eliminate known transitions in an exhaustive manner.

   583

   584   \<^medskip>

   585   Subsequently, we explore our transition system in an experimental style,

   586   mainly using the introduction rules with basic algebraic properties of the

   587   underlying structures. The technique closely resembles that of Prolog

   588   combined with functional evaluation in a very simple manner.

   589

   590   Just as the closed-world assumption'' is left implicit in Prolog, we do

   591   not refer to induction over the whole transition system here. So this is

   592   still purely positive reasoning about possible executions; exhaustive

   593   reasoning will be employed only later on (see \secref{sec:unix-bogosity}),

   594   when we shall demonstrate that certain behavior is \<^emph>\<open>not\<close> possible.

   595 \<close>

   596

   597

   598 subsection \<open>Possible transitions\<close>

   599

   600 text \<open>

   601   Rather obviously, a list of system operations can be executed within a

   602   certain state if there is a result state reached by an iterated transition.

   603 \<close>

   604

   605 definition "can_exec root xs \<longleftrightarrow> (\<exists>root'. root \<Midarrow>xs\<Rightarrow> root')"

   606

   607 lemma can_exec_nil: "can_exec root []"

   608   unfolding can_exec_def by (blast intro: transitions.intros)

   609

   610 lemma can_exec_cons:

   611     "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"

   612   unfolding can_exec_def by (blast intro: transitions.intros)

   613

   614 text \<open>

   615   \<^medskip>

   616   In case that we already know that a certain sequence can be executed we may

   617   destruct it backwardly into individual transitions.

   618 \<close>

   619

   620 lemma can_exec_snocD: "can_exec root (xs @ [y])

   621     \<Longrightarrow> \<exists>root' root''. root \<Midarrow>xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"

   622 proof (induct xs arbitrary: root)

   623   case Nil

   624   then show ?case

   625     by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

   626 next

   627   case (Cons x xs)

   628   from \<open>can_exec root ((x # xs) @ [y])\<close> obtain r root'' where

   629       x: "root \<midarrow>x\<rightarrow> r" and

   630       xs_y: "r \<Midarrow>(xs @ [y])\<Rightarrow> root''"

   631     by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)

   632   from xs_y Cons.hyps obtain root' r'

   633     where xs: "r \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"

   634     unfolding can_exec_def by blast

   635   from x xs have "root \<Midarrow>(x # xs)\<Rightarrow> root'"

   636     by (rule transitions.cons)

   637   with y show ?case by blast

   638 qed

   639

   640

   641 subsection \<open>Example executions \label{sec:unix-examples}\<close>

   642

   643 text \<open>

   644   We are now ready to perform a few experiments within our formal model of

   645   Unix system-calls. The common technique is to alternate introduction rules

   646   of the transition system (see \secref{sec:unix-trans}), and steps to solve

   647   any emerging side conditions using algebraic properties of the underlying

   648   file-system structures (see \secref{sec:unix-file-system}).

   649 \<close>

   650

   651 lemmas eval = access_def init_def

   652

   653 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

   654     [Mkdir u perms [u, name]]"

   655   apply (rule can_exec_cons)

   656     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Cons})\<close>

   657   apply (rule mkdir)

   658     \<comment> \<open>back-chain @{term Mkdir}\<close>

   659   apply (force simp add: eval)+

   660     \<comment> \<open>solve preconditions of @{term Mkdir}\<close>

   661   apply (simp add: eval)

   662     \<comment> \<open>peek at resulting dir (optional)\<close>

   663   txt \<open>@{subgoals [display]}\<close>

   664   apply (rule can_exec_nil)

   665     \<comment> \<open>back-chain @{term can_exec} (of @{term [source] Nil})\<close>

   666   done

   667

   668 text \<open>

   669   By inspecting the result shown just before the final step above, we may gain

   670   confidence that our specification of Unix system-calls actually makes sense.

   671   Further common errors are usually exhibited when preconditions of transition

   672   rules fail unexpectedly.

   673

   674   \<^medskip>

   675   Here are a few further experiments, using the same techniques as before.

   676 \<close>

   677

   678 theorem "u \<in> users \<Longrightarrow> can_exec (init users)

   679    [Creat u perms [u, name],

   680     Unlink u [u, name]]"

   681   apply (rule can_exec_cons)

   682   apply (rule creat)

   683   apply (force simp add: eval)+

   684   apply (simp add: eval)

   685   apply (rule can_exec_cons)

   686   apply (rule unlink)

   687   apply (force simp add: eval)+

   688   apply (simp add: eval)

   689   txt \<open>peek at result: @{subgoals [display]}\<close>

   690   apply (rule can_exec_nil)

   691   done

   692

   693 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow>

   694   Readable \<in> perms\<^sub>2 \<Longrightarrow> name\<^sub>3 \<noteq> name\<^sub>4 \<Longrightarrow>

   695   can_exec (init users)

   696    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

   697     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

   698     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   699     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>4],

   700     Readdir u {name\<^sub>3, name\<^sub>4} [u, name\<^sub>1, name\<^sub>2]]"

   701   apply (rule can_exec_cons, rule transition.intros,

   702     (force simp add: eval)+, (simp add: eval)?)+

   703   txt \<open>peek at result: @{subgoals [display]}\<close>

   704   apply (rule can_exec_nil)

   705   done

   706

   707 theorem "u \<in> users \<Longrightarrow> Writable \<in> perms\<^sub>1 \<Longrightarrow> Readable \<in> perms\<^sub>3 \<Longrightarrow>

   708   can_exec (init users)

   709    [Mkdir u perms\<^sub>1 [u, name\<^sub>1],

   710     Mkdir u' perms\<^sub>2 [u, name\<^sub>1, name\<^sub>2],

   711     Creat u' perms\<^sub>3 [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   712     Write u' ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3],

   713     Read u ''foo'' [u, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

   714   apply (rule can_exec_cons, rule transition.intros,

   715     (force simp add: eval)+, (simp add: eval)?)+

   716   txt \<open>peek at result: @{subgoals [display]}\<close>

   717   apply (rule can_exec_nil)

   718   done

   719

   720

   721 section \<open>Odd effects --- treated formally \label{sec:unix-bogosity}\<close>

   722

   723 text \<open>

   724   We are now ready to give a completely formal treatment of the slightly odd

   725   situation discussed in the introduction (see \secref{sec:unix-intro}): the

   726   file-system can easily reach a state where a user is unable to remove his

   727   very own directory, because it is still populated by items placed there by

   728   another user in an uncouth manner.

   729 \<close>

   730

   731

   732 subsection \<open>The general procedure \label{sec:unix-inv-procedure}\<close>

   733

   734 text \<open>

   735   The following theorem expresses the general procedure we are following to

   736   achieve the main result.

   737 \<close>

   738

   739 theorem general_procedure:

   740   assumes cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"

   741     and init_inv: "\<And>root. init users \<Midarrow>bs\<Rightarrow> root \<Longrightarrow> Q root"

   742     and preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"

   743     and init_result: "init users \<Midarrow>bs\<Rightarrow> root"

   744   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"

   745 proof -

   746   {

   747     fix xs

   748     assume Ps: "\<forall>x \<in> set xs. P x"

   749     assume can_exec: "can_exec root (xs @ [y])"

   750     then obtain root' root'' where xs: "root \<Midarrow>xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"

   751       by (blast dest: can_exec_snocD)

   752     from init_result have "Q root" by (rule init_inv)

   753     from preserve_inv xs this Ps have "Q root'"

   754       by (rule transitions_invariant)

   755     from this y have False by (rule cannot_y)

   756   }

   757   then show ?thesis by blast

   758 qed

   759

   760 text \<open>

   761   Here @{prop "P x"} refers to the restriction on file-system operations that

   762   are admitted after having reached the critical configuration; according to

   763   the problem specification this will become @{prop "uid_of x = user\<^sub>1"} later

   764   on. Furthermore, @{term y} refers to the operations we claim to be

   765   impossible to perform afterwards, we will take @{term Rmdir} later. Moreover

   766   @{term Q} is a suitable (auxiliary) invariant over the file-system; choosing

   767   @{term Q} adequately is very important to make the proof work (see

   768   \secref{sec:unix-inv-lemmas}).

   769 \<close>

   770

   771

   772 subsection \<open>The particular situation\<close>

   773

   774 text \<open>

   775   We introduce a few global declarations and axioms to describe our particular

   776   situation considered here. Thus we avoid excessive use of local parameters

   777   in the subsequent development.

   778 \<close>

   779

   780 context

   781   fixes users :: "uid set"

   782     and user\<^sub>1 :: uid

   783     and user\<^sub>2 :: uid

   784     and name\<^sub>1 :: name

   785     and name\<^sub>2 :: name

   786     and name\<^sub>3 :: name

   787     and perms\<^sub>1 :: perms

   788     and perms\<^sub>2 :: perms

   789   assumes user\<^sub>1_known: "user\<^sub>1 \<in> users"

   790     and user\<^sub>1_not_root: "user\<^sub>1 \<noteq> 0"

   791     and users_neq: "user\<^sub>1 \<noteq> user\<^sub>2"

   792     and perms\<^sub>1_writable: "Writable \<in> perms\<^sub>1"

   793     and perms\<^sub>2_not_writable: "Writable \<notin> perms\<^sub>2"

   794   notes facts = user\<^sub>1_known user\<^sub>1_not_root users_neq

   795     perms\<^sub>1_writable perms\<^sub>2_not_writable

   796 begin

   797

   798 definition

   799   "bogus =

   800      [Mkdir user\<^sub>1 perms\<^sub>1 [user\<^sub>1, name\<^sub>1],

   801       Mkdir user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2],

   802       Creat user\<^sub>2 perms\<^sub>2 [user\<^sub>1, name\<^sub>1, name\<^sub>2, name\<^sub>3]]"

   803

   804 definition "bogus_path = [user\<^sub>1, name\<^sub>1, name\<^sub>2]"

   805

   806 text \<open>

   807   The @{term bogus} operations are the ones that lead into the uncouth

   808   situation; @{term bogus_path} is the key position within the file-system

   809   where things go awry.

   810 \<close>

   811

   812

   813 subsection \<open>Invariance lemmas \label{sec:unix-inv-lemmas}\<close>

   814

   815 text \<open>

   816   The following invariant over the root file-system describes the bogus

   817   situation in an abstract manner: located at a certain @{term path} within

   818   the file-system is a non-empty directory that is neither owned nor writable

   819   by @{term user\<^sub>1}.

   820 \<close>

   821

   822 definition

   823   "invariant root path \<longleftrightarrow>

   824     (\<exists>att dir.

   825       access root path user\<^sub>1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>

   826       user\<^sub>1 \<noteq> owner att \<and>

   827       access root path user\<^sub>1 {Writable} = None)"

   828

   829 text \<open>

   830   Following the general procedure (see \secref{sec:unix-inv-procedure}) we

   831   will now establish the three key lemmas required to yield the final result.

   832

   833     \<^enum> The invariant is sufficiently strong to entail the pathological case

   834     that @{term user\<^sub>1} is unable to remove the (owned) directory at @{term

   835     "[user\<^sub>1, name\<^sub>1]"}.

   836

   837     \<^enum> The invariant does hold after having executed the @{term bogus} list of

   838     operations (starting with an initial file-system configuration).

   839

   840     \<^enum> The invariant is preserved by any file-system operation performed by

   841     @{term user\<^sub>1} alone, without any help by other users.

   842

   843   As the invariant appears both as assumptions and conclusions in the course

   844   of proof, its formulation is rather critical for the whole development to

   845   work out properly. In particular, the third step is very sensitive to the

   846   invariant being either too strong or too weak. Moreover the invariant has to

   847   be sufficiently abstract, lest the proof become cluttered by confusing

   848   detail.

   849

   850   \<^medskip>

   851   The first two lemmas are technically straight forward --- we just have to

   852   inspect rather special cases.

   853 \<close>

   854

   855 lemma cannot_rmdir:

   856   assumes inv: "invariant root bogus_path"

   857     and rmdir: "root \<midarrow>(Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1])\<rightarrow> root'"

   858   shows False

   859 proof -

   860   from inv obtain "file" where "access root bogus_path user\<^sub>1 {} = Some file"

   861     unfolding invariant_def by blast

   862   moreover

   863   from rmdir obtain att where "access root [user\<^sub>1, name\<^sub>1] user\<^sub>1 {} = Some (Env att empty)"

   864     by cases auto

   865   then have "access root ([user\<^sub>1, name\<^sub>1] @ [name\<^sub>2]) user\<^sub>1 {} = empty name\<^sub>2"

   866     by (simp only: access_empty_lookup lookup_append_some) simp

   867   ultimately show False by (simp add: bogus_path_def)

   868 qed

   869

   870 lemma

   871   assumes init: "init users \<Midarrow>bogus\<Rightarrow> root"

   872   shows init_invariant: "invariant root bogus_path"

   873   supply eval = facts access_def init_def

   874   using init

   875   apply (unfold bogus_def bogus_path_def)

   876   apply (drule transitions_consD, rule transition.intros,

   877       (force simp add: eval)+, (simp add: eval)?)+

   878     \<comment> "evaluate all operations"

   879   apply (drule transitions_nilD)

   880     \<comment> "reach final result"

   881   apply (simp add: invariant_def eval)

   882     \<comment> "check the invariant"

   883   done

   884

   885 text \<open>

   886   \<^medskip>

   887   At last we are left with the main effort to show that the bogosity''

   888   invariant is preserved by any file-system operation performed by @{term

   889   user\<^sub>1} alone. Note that this holds for any @{term path} given, the

   890   particular @{term bogus_path} is not required here.

   891 \<close>

   892

   893 lemma preserve_invariant:

   894   assumes tr: "root \<midarrow>x\<rightarrow> root'"

   895     and inv: "invariant root path"

   896     and uid: "uid_of x = user\<^sub>1"

   897   shows "invariant root' path"

   898 proof -

   899   from inv obtain att dir where

   900       inv1: "access root path user\<^sub>1 {} = Some (Env att dir)" and

   901       inv2: "dir \<noteq> empty" and

   902       inv3: "user\<^sub>1 \<noteq> owner att" and

   903       inv4: "access root path user\<^sub>1 {Writable} = None"

   904     by (auto simp add: invariant_def)

   905

   906   from inv1 have lookup: "lookup root path = Some (Env att dir)"

   907     by (simp only: access_empty_lookup)

   908

   909   from inv1 inv3 inv4 and user\<^sub>1_not_root

   910   have not_writable: "Writable \<notin> others att"

   911     by (auto simp add: access_def split: option.splits)

   912

   913   show ?thesis

   914   proof cases

   915     assume "root' = root"

   916     with inv show "invariant root' path" by (simp only:)

   917   next

   918     assume changed: "root' \<noteq> root"

   919     with tr obtain opt where root': "root' = update (path_of x) opt root"

   920       by cases auto

   921     show ?thesis

   922     proof (rule prefix_cases)

   923       assume "path_of x \<parallel> path"

   924       with inv root'

   925       have "\<And>perms. access root' path user\<^sub>1 perms = access root path user\<^sub>1 perms"

   926         by (simp only: access_update_other)

   927       with inv show "invariant root' path"

   928         by (simp only: invariant_def)

   929     next

   930       assume "prefix (path_of x) path"

   931       then obtain ys where path: "path = path_of x @ ys" ..

   932

   933       show ?thesis

   934       proof (cases ys)

   935         assume "ys = []"

   936         with tr uid inv2 inv3 lookup changed path and user\<^sub>1_not_root

   937         have False

   938           by cases (auto simp add: access_empty_lookup dest: access_some_lookup)

   939         then show ?thesis ..

   940       next

   941         fix z zs assume ys: "ys = z # zs"

   942         have "lookup root' path = lookup root path"

   943         proof -

   944           from inv2 lookup path ys

   945           have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"

   946             by (simp only:)

   947           then obtain att' dir' file' where

   948               look': "lookup root (path_of x) = Some (Env att' dir')" and

   949               dir': "dir' z = Some file'" and

   950               file': "lookup file' zs = Some (Env att dir)"

   951             by (blast dest: lookup_some_upper)

   952

   953           from tr uid changed look' dir' obtain att'' where

   954               look'': "lookup root' (path_of x) = Some (Env att'' dir')"

   955             by cases (auto simp add: access_empty_lookup lookup_update_some

   956               dest: access_some_lookup)

   957           with dir' file' have "lookup root' (path_of x @ z # zs) =

   958               Some (Env att dir)"

   959             by (simp add: lookup_append_some)

   960           with look path ys show ?thesis

   961             by simp

   962         qed

   963         with inv show "invariant root' path"

   964           by (simp only: invariant_def access_def)

   965       qed

   966     next

   967       assume "strict_prefix path (path_of x)"

   968       then obtain y ys where path: "path_of x = path @ y # ys" ..

   969

   970       obtain dir' where

   971         lookup': "lookup root' path = Some (Env att dir')" and

   972         inv2': "dir' \<noteq> empty"

   973       proof (cases ys)

   974         assume "ys = []"

   975         with path have parent: "path_of x = path @ [y]" by simp

   976         with tr uid inv4 changed obtain "file" where

   977             "root' = update (path_of x) (Some file) root"

   978           by cases auto

   979         with lookup parent have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"

   980           by (simp only: update_append_some update_cons_nil_env)

   981         moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp

   982         ultimately show ?thesis ..

   983       next

   984         fix z zs assume ys: "ys = z # zs"

   985         with lookup root' path

   986         have "lookup root' path = Some (update (y # ys) opt (Env att dir))"

   987           by (simp only: update_append_some)

   988         also obtain file' where

   989           "update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"

   990         proof -

   991           have "dir y \<noteq> None"

   992           proof -

   993             have "dir y = lookup (Env att dir) [y]"

   994               by (simp split: option.splits)

   995             also from lookup have "\<dots> = lookup root (path @ [y])"

   996               by (simp only: lookup_append_some)

   997             also have "\<dots> \<noteq> None"

   998             proof -

   999               from ys obtain us u where rev_ys: "ys = us @ [u]"

  1000                 by (cases ys rule: rev_cases) simp

  1001               with tr path

  1002               have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>

  1003                   lookup root ((path @ [y]) @ us) \<noteq> None"

  1004                 by cases (auto dest: access_some_lookup)

  1005               then show ?thesis

  1006                 by (fastforce dest!: lookup_some_append)

  1007             qed

  1008             finally show ?thesis .

  1009           qed

  1010           with ys show ?thesis using that by auto

  1011         qed

  1012         also have "dir(y\<mapsto>file') \<noteq> empty" by simp

  1013         ultimately show ?thesis ..

  1014       qed

  1015

  1016       from lookup' have inv1': "access root' path user\<^sub>1 {} = Some (Env att dir')"

  1017         by (simp only: access_empty_lookup)

  1018

  1019       from inv3 lookup' and not_writable user\<^sub>1_not_root

  1020       have "access root' path user\<^sub>1 {Writable} = None"

  1021         by (simp add: access_def)

  1022       with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast

  1023     qed

  1024   qed

  1025 qed

  1026

  1027

  1028 subsection \<open>Putting it all together \label{sec:unix-main-result}\<close>

  1029

  1030 text \<open>

  1031   The main result is now imminent, just by composing the three invariance

  1032   lemmas (see \secref{sec:unix-inv-lemmas}) according the the overall

  1033   procedure (see \secref{sec:unix-inv-procedure}).

  1034 \<close>

  1035

  1036 corollary

  1037   assumes bogus: "init users \<Midarrow>bogus\<Rightarrow> root"

  1038   shows "\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user\<^sub>1) \<and>

  1039     can_exec root (xs @ [Rmdir user\<^sub>1 [user\<^sub>1, name\<^sub>1]]))"

  1040 proof -

  1041   from cannot_rmdir init_invariant preserve_invariant

  1042     and bogus show ?thesis by (rule general_procedure)

  1043 qed

  1044

  1045 end

  1046

  1047 end