doc-src/IsarImplementation/Thy/ML_old.thy
changeset 39867 a8363532cd4d
parent 39859 381e16bb5f46
child 39874 bbac63bbcffe
  int_of_string}, not @{verbatim intOfString}).\footnote{Some
  recent tools for Emacs include special precautions to cope with
  bumpy names in @{text "camelCase"}, e.g.\ for improved on-screen
  readability.  It is easier to abstain from using such names in the
  first place.}

  \end{description}
*}


section {* Thread-safe programming *}

text {*
       
  Recent versions of Poly/ML (5.2.1 or later) support robust
  multithreaded execution, based on native operating system threads of
  the underlying platform.  Thus threads will actually be executed in
  parallel on multi-core systems.  A speedup factor of approximately
  1.5--3 can be expected on a regular 4-core machine.\footnote{There
  is an inherent limitation of the speedup factor due to garbage
  collection, which is still sequential.  It helps to provide initial
  heap space generously, using the \texttt{-H} option of Poly/ML.}
  Threads also help to organize advanced operations of the system,
  with explicit communication between sub-components, real-time
  conditions, time-outs etc.
       

  Threads lack the memory protection of separate processes, and
  operate concurrently on shared heap memory.  This has the advantage
  that results of independent computations are immediately available
  to other threads, without requiring untyped character streams,
  awkward serialization etc.
       

  On the other hand, some programming guidelines need to be observed
  in order to make unprotected parallelism work out smoothly.  While
  the ML system implementation is responsible for maintaining basic
  integrity of the representation of ML values in memory, the
  application programmer needs to ensure that multithreaded execution
  does not break the intended semantics.
       

  \medskip \paragraph{Critical shared resources.} Actually only those
  parts outside the purely functional world of ML are critical.  In
  particular, this covers

  \begin{itemize}

  \item global references (or arrays), i.e.\ those that persist over
  several invocations of associated operations,\footnote{This is
  independent of the visibility of such mutable values in the toplevel
  scope.}

  \item direct I/O on shared channels, notably @{text "stdin"}, @{text
  "stdout"}, @{text "stderr"}.

  \end{itemize}
       

  The majority of tools implemented within the Isabelle/Isar framework
  will not require any of these critical elements: nothing special
  needs to be observed when staying in the purely functional fragment
  of ML.  Note that output via the official Isabelle channels does not
  count as direct I/O, so the operations @{ML "writeln"}, @{ML
  "warning"}, @{ML "tracing"} etc.\ are safe.
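
  For instance, a tool that reports progress or diagnostic messages
  should use these channels rather than raw @{verbatim TextIO}
  operations on @{text "stdout"}.  A minimal sketch, with hypothetical
  function names:

  \begin{verbatim}
  (* output via official Isabelle channels: safe from any thread *)
  fun report_progress i =
    writeln ("processed " ^ string_of_int i ^ " items");

  (* diagnostic output that the system may redirect or suppress *)
  fun debug_msg msg = tracing ("my_tool: " ^ msg);
  \end{verbatim}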
       

  Moreover, ML bindings within the toplevel environment (@{verbatim
  "type"}, @{verbatim val}, @{verbatim "structure"} etc.) that result
  from run-time invocation of the compiler are also safe, because
  Isabelle/Isar manages this as part of the theory or proof context.
       

  \paragraph{Multithreading in Isabelle/Isar.}  The theory loader
  automatically exploits the overall parallelism of independent nodes
  in the development graph, as well as the inherent irrelevance of
  proofs for goals being fully specified in advance.  This means that
  checking of individual Isar proofs is parallelized by default.
  Beyond that, very sophisticated proof tools may use local
  parallelism internally, via the general programming model of
  ``future values'' (see also @{"file"
  "~~/src/Pure/Concurrent/future.ML"}).
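
  For example, two independent computations may be forked as future
  values and joined later.  The following is only a minimal sketch,
  with ad-hoc computations standing in for real work of a proof tool:

  \begin{verbatim}
  (* fork two independent evaluations as future values *)
  val x = Future.fork (fn () => fold (curry (op +)) (1 upto 100000) 0);
  val y = Future.fork (fn () => length (1 upto 100000));

  (* join blocks until the corresponding result is available *)
  val z = Future.join x + Future.join y;
  \end{verbatim}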
       

  Any ML code that works relative to the present background theory is
  already safe.  Contextual data may be easily stored within the
  theory or proof context, thanks to the generic data concept of
  Isabelle/Isar (see \secref{sec:context-data}).  This greatly
  diminishes the demand for global state information in the first
  place.
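
  A minimal sketch of such context data, using the @{ML_functor
  Theory_Data} functor for a hypothetical table of names:

  \begin{verbatim}
  structure My_Data = Theory_Data
  (
    type T = string list;
    val empty = [];
    val extend = I;
    fun merge (xs, ys) = Library.merge (op =) (xs, ys);
  );

  (* purely functional update of the theory context *)
  fun declare name = My_Data.map (insert (op =) name);
  fun declared thy = My_Data.get thy;
  \end{verbatim}

  Since the data is part of the theory value itself, updates follow
  the usual functional discipline and require no locking.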
       

  \medskip In rare situations where actual mutable content needs to be
  manipulated, Isabelle provides a single \emph{critical section} that
  may be entered while preventing any other thread from doing the
  same.  Entering the critical section without contention is very
  fast, and several basic system operations do so frequently.  This
  also means that each thread should leave the critical section
  quickly; otherwise parallel execution performance may degrade
  significantly.
       

  Despite this potential bottleneck, centralized locking is
  convenient, because it prevents deadlocks without demanding special
  precautions.  Explicit communication demands other means, though.
  The high-level abstraction of synchronized variables @{"file"
  "~~/src/Pure/Concurrent/synchronized.ML"} enables parallel
  components to communicate via shared state; see also @{"file"
  "~~/src/Pure/Concurrent/mailbox.ML"} as a canonical example.
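
  For illustration, a shared counter could be phrased with
  synchronized variables roughly as follows.  This is a minimal
  sketch; the string given to @{ML Synchronized.var} merely serves
  diagnostic purposes:

  \begin{verbatim}
  (* shared state, protected by the synchronized variable *)
  val counter = Synchronized.var "counter" 0;

  (* atomically increment and return the new value *)
  fun next () =
    Synchronized.change_result counter (fn i => (i + 1, i + 1));
  \end{verbatim}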
       

  \paragraph{Good conduct of impure programs.} The following
  guidelines enable non-functional programs to participate in
  multithreading.
       

  \begin{itemize}

  \item Minimize global state information.  Using proper theory and
  proof context data will actually return to functional update of
  values, without any special precautions for multithreading.  Apart
  from the fully general functors for theory and proof data (see
  \secref{sec:context-data}) there are drop-in replacements that
  emulate primitive references for common cases of \emph{configuration
  options} of type @{ML_type "bool"}/@{ML_type "int"}/@{ML_type
  "string"} (see structure @{ML_struct Config} and @{ML
  Attrib.config_bool} etc.), and lists of theorems (see functor
  @{ML_functor Named_Thms}).
       

  \item Keep components with local state information
  \emph{re-entrant}.  Instead of poking initial values into (private)
  global references, create a new state record on each invocation, and
  pass that through any auxiliary functions of the component.  The
  state record may well contain mutable references, without requiring
  any special synchronization, as long as each invocation sees its own
  copy (see the sketch after this list).  Occasionally, one might even
  return to plain functional updates on non-mutable record values
  here.
       

  \item Isolate process configuration flags.  The main legitimate
  application of global references is to configure the whole process
  in a certain way, essentially affecting all threads.  A typical
  example is the @{ML show_types} flag, which tells the pretty printer
  to output explicit type information for terms.  Such flags usually
  do not affect the functionality of the core system, but only the
  view being presented to the user.
       

  Occasionally, such global process flags are treated like implicit
  arguments to certain operations, by using the @{ML setmp_CRITICAL}
  combinator for safe temporary assignment.  Its traditional purpose
  was to ensure proper recovery of the original value when exceptions
  are raised in the body; now the functionality is extended to enter
  the \emph{critical section} (with its usual potential of degrading
  parallelism).
       

  Note that recovery of plain value passing semantics via @{ML
  setmp_CRITICAL}~@{text "ref value"} assumes that this @{text "ref"} is
  exclusively manipulated within the critical section.  In particular,
  any persistent global assignment of @{text "ref := value"} needs to
  be marked critical as well, to prevent intruding on another thread's
  local view and losing updates in the global scope.

  \end{itemize}
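
  The re-entrant style recommended above might look like this; it is a
  minimal sketch with hypothetical names, where each invocation
  creates a fresh state record instead of relying on a global
  reference:

  \begin{verbatim}
  (* private state record, created per invocation *)
  type state = {count: int Unsynchronized.ref, trace: bool};

  fun init_state trace : state =
    {count = Unsynchronized.ref 0, trace = trace};

  fun step ({count, trace}: state) x =
    (count := ! count + 1;
     if trace then tracing ("step " ^ string_of_int (! count)) else ();
     x + ! count);

  fun run trace xs =
    let val st = init_state trace  (* each call sees its own copy *)
    in map (step st) xs end;
  \end{verbatim}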
       

  Recall that in an open ``LCF-style'' system like Isabelle/Isar, the
  user participates in constructing the overall environment.  This
  means that state-based facilities offered by one component will
  require special caution later on.  So minimizing critical elements,
  by staying within the plain value-oriented view relative to theory
  or proof contexts most of the time, will also reduce the chance of
  mishaps occurring to end-users.
*}
       

text %mlref {*
  \begin{mldecls}
  @{index_ML NAMED_CRITICAL: "string -> (unit -> 'a) -> 'a"} \\
  @{index_ML CRITICAL: "(unit -> 'a) -> 'a"} \\
  @{index_ML setmp_CRITICAL: "'a Unsynchronized.ref -> 'a -> ('b -> 'c) -> 'b -> 'c"} \\
  \end{mldecls}
       

  \begin{description}

  \item @{ML NAMED_CRITICAL}~@{text "name f"} evaluates @{text "f ()"}
  while staying within the critical section of Isabelle/Isar.  No
  other thread may do so at the same time, but non-critical parallel
  execution will continue.  The @{text "name"} argument serves for
  diagnostic purposes and might help to spot sources of congestion.

  \item @{ML CRITICAL} is the same as @{ML NAMED_CRITICAL} with an
  empty name argument.
       

  \item @{ML setmp_CRITICAL}~@{text "ref value f x"} evaluates @{text "f x"}
  while staying within the critical section and having @{text "ref :=
  value"} assigned temporarily.  This recovers a value-passing
  semantics involving global references, regardless of exceptions or
  concurrency (see the sketch after this list).

  \end{description}
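
  A minimal sketch of typical usage, assuming a hypothetical global
  flag @{verbatim verbose} of some legacy component:

  \begin{verbatim}
  val verbose = Unsynchronized.ref false;

  (* enter the critical section under a diagnostic name *)
  fun toggle_verbose () =
    NAMED_CRITICAL "my_component" (fn () =>
      verbose := not (! verbose));

  (* evaluate f x with verbose temporarily set to true; the original
     value is restored even if f raises an exception *)
  fun with_verbose f x = setmp_CRITICAL verbose true f x;
  \end{verbatim}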
*}