src/Doc/IsarImplementation/Prelim.thy
changeset 48985 5386df44a037
parent 48938 d468d72a458f
child 48992 0518bf89c777
equal deleted inserted replaced
48984:f51d4a302962 48985:5386df44a037
       
     1 theory Prelim
       
     2 imports Base
       
     3 begin
       
     4 
       
     5 chapter {* Preliminaries *}
       
     6 
       
     7 section {* Contexts \label{sec:context} *}
       
     8 
       
     9 text {*
       
    10   A logical context represents the background that is required for
       
    11   formulating statements and composing proofs.  It acts as a medium to
       
    12   produce formal content, depending on earlier material (declarations,
       
    13   results etc.).
       
    14 
       
    15   For example, derivations within the Isabelle/Pure logic can be
       
    16   described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
       
    17   proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
       
    18   within the theory @{text "\<Theta>"}.  There are logical reasons for
       
    19   keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
       
    20   liberal about supporting type constructors and schematic
       
    21   polymorphism of constants and axioms, while the inner calculus of
       
    22   @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
       
    23   fixed type variables in the assumptions).
       
    24 
       
    25   \medskip Contexts and derivations are linked by the following key
       
    26   principles:
       
    27 
       
    28   \begin{itemize}
       
    29 
       
    30   \item Transfer: monotonicity of derivations admits results to be
       
    31   transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
       
    32   \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
       
    33   \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.
       
    34 
       
    35   \item Export: discharge of hypotheses admits results to be exported
       
    36   into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
       
    37   implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
       
    38   @{text "\<Delta> = \<Gamma>' - \<Gamma>"}.  Note that @{text "\<Theta>"} remains unchanged here,
       
    39   only the @{text "\<Gamma>"} part is affected.
       
    40 
       
    41   \end{itemize}
       
    42 
       
    43   \medskip By modeling the main characteristics of the primitive
       
    44   @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
       
    45   particular logical content, we arrive at the fundamental notions of
       
    46   \emph{theory context} and \emph{proof context} in Isabelle/Isar.
       
    47   These implement a certain policy to manage arbitrary \emph{context
       
    48   data}.  There is a strongly-typed mechanism to declare new kinds of
       
    49   data at compile time.
       
    50 
       
    51   The internal bootstrap process of Isabelle/Pure eventually reaches a
       
    52   stage where certain data slots provide the logical content of @{text
       
    53   "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
       
    54   Various additional data slots support all kinds of mechanisms that
       
    55   are not necessarily part of the core logic.
       
    56 
       
    57   For example, there would be data for canonical introduction and
       
    58   elimination rules for arbitrary operators (depending on the
       
    59   object-logic and application), which enables users to perform
       
    60   standard proof steps implicitly (cf.\ the @{text "rule"} method
       
    61   \cite{isabelle-isar-ref}).
       
    62 
       
    63   \medskip Thus Isabelle/Isar is able to bring forth more and more
       
    64   concepts successively.  In particular, an object-logic like
       
    65   Isabelle/HOL continues the Isabelle/Pure setup by adding specific
       
    66   components for automated reasoning (classical reasoner, tableau
       
    67   prover, structured induction etc.) and derived specification
       
    68   mechanisms (inductive predicates, recursive functions etc.).  All of
       
    69   this is ultimately based on the generic data management by theory
       
    70   and proof contexts introduced here.
       
    71 *}
       
    72 
       
    73 
       
    74 subsection {* Theory context \label{sec:context-theory} *}
       
    75 
       
    76 text {* A \emph{theory} is a data container with explicit name and
       
    77   unique identifier.  Theories are related by a (nominal) sub-theory
       
    78   relation, which corresponds to the dependency graph of the original
       
    79   construction; each theory is derived from a certain sub-graph of
       
    80   ancestor theories.  To this end, the system maintains a set of
       
    81   symbolic ``identification stamps'' within each theory.
       
    82 
       
    83   In order to avoid the full-scale overhead of explicit sub-theory
       
    84   identification of arbitrary intermediate stages, a theory is
       
    85   switched into @{text "draft"} mode under certain circumstances.  A
       
    86   draft theory acts like a linear type, where updates invalidate
       
    87   earlier versions.  An invalidated draft is called \emph{stale}.
       
    88 
       
    89   The @{text "checkpoint"} operation produces a safe stepping stone
       
    90   that will survive the next update without becoming stale: both the
       
    91   old and the new theory remain valid and are related by the
       
    92   sub-theory relation.  Checkpointing essentially recovers purely
       
    93   functional theory values, at the expense of some extra internal
       
    94   bookkeeping.
       
    95 
       
    96   The @{text "copy"} operation produces an auxiliary version that has
       
    97   the same data content, but is unrelated to the original: updates of
       
    98   the copy do not affect the original, neither does the sub-theory
       
    99   relation hold.
       
   100 
       
   101   The @{text "merge"} operation produces the least upper bound of two
       
   102   theories, which actually degenerates into absorption of one theory
       
   103   into the other (according to the nominal sub-theory relation).
       
   104 
       
   105   The @{text "begin"} operation starts a new theory by importing
       
   106   several parent theories and entering a special mode of nameless
       
   107   incremental updates, until the final @{text "end"} operation is
       
   108   performed.
       
   109 
       
   110   \medskip The example in \figref{fig:ex-theory} below shows a theory
       
   111   graph derived from @{text "Pure"}, with theory @{text "Length"}
       
   112   importing @{text "Nat"} and @{text "List"}.  The body of @{text
       
   113   "Length"} consists of a sequence of updates, working mostly on
       
   114   drafts internally, while transaction boundaries of Isar top-level
       
   115   commands (\secref{sec:isar-toplevel}) are guaranteed to be safe
       
   116   checkpoints.
       
   117 
       
   118   \begin{figure}[htb]
       
   119   \begin{center}
       
   120   \begin{tabular}{rcccl}
       
   121         &            & @{text "Pure"} \\
       
   122         &            & @{text "\<down>"} \\
       
   123         &            & @{text "FOL"} \\
       
   124         & $\swarrow$ &              & $\searrow$ & \\
       
   125   @{text "Nat"} &    &              &            & @{text "List"} \\
       
   126         & $\searrow$ &              & $\swarrow$ \\
       
   127         &            & @{text "Length"} \\
       
   128         &            & \multicolumn{3}{l}{~~@{keyword "imports"}} \\
       
   129         &            & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
       
   130         &            & $\vdots$~~ \\
       
   131         &            & @{text "\<bullet>"}~~ \\
       
   132         &            & $\vdots$~~ \\
       
   133         &            & @{text "\<bullet>"}~~ \\
       
   134         &            & $\vdots$~~ \\
       
   135         &            & \multicolumn{3}{l}{~~@{command "end"}} \\
       
   136   \end{tabular}
       
   137   \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
       
   138   \end{center}
       
   139   \end{figure}
       
   140 
       
   141   \medskip There is a separate notion of \emph{theory reference} for
       
   142   maintaining a live link to an evolving theory context: updates on
       
   143   drafts are propagated automatically.  Dynamic updating stops when
       
   144   the next @{text "checkpoint"} is reached.
       
   145 
       
   146   Derived entities may store a theory reference in order to indicate
       
   147   the formal context from which they are derived.  This implicitly
       
   148   assumes monotonic reasoning, because the referenced context may
       
   149   become larger without further notice.
       
   150 *}
       
   151 
       
   152 text %mlref {*
       
   153   \begin{mldecls}
       
   154   @{index_ML_type theory} \\
       
   155   @{index_ML Theory.eq_thy: "theory * theory -> bool"} \\
       
   156   @{index_ML Theory.subthy: "theory * theory -> bool"} \\
       
   157   @{index_ML Theory.checkpoint: "theory -> theory"} \\
       
   158   @{index_ML Theory.copy: "theory -> theory"} \\
       
   159   @{index_ML Theory.merge: "theory * theory -> theory"} \\
       
   160   @{index_ML Theory.begin_theory: "string * Position.T -> theory list -> theory"} \\
       
   161   @{index_ML Theory.parents_of: "theory -> theory list"} \\
       
   162   @{index_ML Theory.ancestors_of: "theory -> theory list"} \\
       
   163   \end{mldecls}
       
   164   \begin{mldecls}
       
   165   @{index_ML_type theory_ref} \\
       
   166   @{index_ML Theory.deref: "theory_ref -> theory"} \\
       
   167   @{index_ML Theory.check_thy: "theory -> theory_ref"} \\
       
   168   \end{mldecls}
       
   169 
       
   170   \begin{description}
       
   171 
       
   172   \item Type @{ML_type theory} represents theory contexts.  This is
       
   173   essentially a linear type, with explicit runtime checking.
       
   174   Primitive theory operations destroy the original version, which then
       
   175   becomes ``stale''.  This can be prevented by explicit checkpointing,
       
   176   which the system does at least at the boundary of toplevel command
       
   177   transactions \secref{sec:isar-toplevel}.
       
   178 
       
   179   \item @{ML "Theory.eq_thy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} check strict
       
   180   identity of two theories.
       
   181 
       
   182   \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} compares theories
       
   183   according to the intrinsic graph structure of the construction.
       
   184   This sub-theory relation is a nominal approximation of inclusion
       
   185   (@{text "\<subseteq>"}) of the corresponding content (according to the
       
   186   semantics of the ML modules that implement the data).
       
   187 
       
   188   \item @{ML "Theory.checkpoint"}~@{text "thy"} produces a safe
       
   189   stepping stone in the linear development of @{text "thy"}.  This
       
   190   changes the old theory, but the next update will result in two
       
   191   related, valid theories.
       
   192 
       
   193   \item @{ML "Theory.copy"}~@{text "thy"} produces a variant of @{text
       
   194   "thy"} with the same data.  The copy is not related to the original,
       
   195   but the original is unchanged.
       
   196 
       
   197   \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} absorbs one theory
       
   198   into the other, without changing @{text "thy\<^sub>1"} or @{text "thy\<^sub>2"}.
       
   199   This version of ad-hoc theory merge fails for unrelated theories!
       
   200 
       
   201   \item @{ML "Theory.begin_theory"}~@{text "name parents"} constructs
       
   202   a new theory based on the given parents.  This ML function is
       
   203   normally not invoked directly.
       
   204 
       
   205   \item @{ML "Theory.parents_of"}~@{text "thy"} returns the direct
       
   206   ancestors of @{text thy}.
       
   207 
       
   208   \item @{ML "Theory.ancestors_of"}~@{text "thy"} returns all
       
   209   ancestors of @{text thy} (not including @{text thy} itself).
       
   210 
       
   211   \item Type @{ML_type theory_ref} represents a sliding reference to
       
   212   an always valid theory; updates on the original are propagated
       
   213   automatically.
       
   214 
       
   215   \item @{ML "Theory.deref"}~@{text "thy_ref"} turns a @{ML_type
       
   216   "theory_ref"} into an @{ML_type "theory"} value.  As the referenced
       
   217   theory evolves monotonically over time, later invocations of @{ML
       
   218   "Theory.deref"} may refer to a larger context.
       
   219 
       
   220   \item @{ML "Theory.check_thy"}~@{text "thy"} produces a @{ML_type
       
   221   "theory_ref"} from a valid @{ML_type "theory"} value.
       
   222 
       
   223   \end{description}
       
   224 *}
       
   225 
       
   226 text %mlantiq {*
       
   227   \begin{matharray}{rcl}
       
   228   @{ML_antiquotation_def "theory"} & : & @{text ML_antiquotation} \\
       
   229   \end{matharray}
       
   230 
       
   231   @{rail "
       
   232   @@{ML_antiquotation theory} nameref?
       
   233   "}
       
   234 
       
   235   \begin{description}
       
   236 
       
   237   \item @{text "@{theory}"} refers to the background theory of the
       
   238   current context --- as abstract value.
       
   239 
       
   240   \item @{text "@{theory A}"} refers to an explicitly named ancestor
       
   241   theory @{text "A"} of the background theory of the current context
       
   242   --- as abstract value.
       
   243 
       
   244   \end{description}
       
   245 *}
       
   246 
       
   247 
       
   248 subsection {* Proof context \label{sec:context-proof} *}
       
   249 
       
   250 text {* A proof context is a container for pure data with a
       
   251   back-reference to the theory from which it is derived.  The @{text
       
   252   "init"} operation creates a proof context from a given theory.
       
   253   Modifications to draft theories are propagated to the proof context
       
   254   as usual, but there is also an explicit @{text "transfer"} operation
       
   255   to force resynchronization with more substantial updates to the
       
   256   underlying theory.
       
   257 
       
   258   Entities derived in a proof context need to record logical
       
   259   requirements explicitly, since there is no separate context
       
   260   identification or symbolic inclusion as for theories.  For example,
       
   261   hypotheses used in primitive derivations (cf.\ \secref{sec:thms})
       
   262   are recorded separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to
       
   263   make double sure.  Results could still leak into an alien proof
       
   264   context due to programming errors, but Isabelle/Isar includes some
       
   265   extra validity checks in critical positions, notably at the end of a
       
   266   sub-proof.
       
   267 
       
   268   Proof contexts may be manipulated arbitrarily, although the common
       
   269   discipline is to follow block structure as a mental model: a given
       
   270   context is extended consecutively, and results are exported back
       
   271   into the original context.  Note that an Isar proof state models
       
   272   block-structured reasoning explicitly, using a stack of proof
       
   273   contexts internally.  For various technical reasons, the background
       
   274   theory of an Isar proof state must not be changed while the proof is
       
   275   still under construction!
       
   276 *}
       
   277 
       
   278 text %mlref {*
       
   279   \begin{mldecls}
       
   280   @{index_ML_type Proof.context} \\
       
   281   @{index_ML Proof_Context.init_global: "theory -> Proof.context"} \\
       
   282   @{index_ML Proof_Context.theory_of: "Proof.context -> theory"} \\
       
   283   @{index_ML Proof_Context.transfer: "theory -> Proof.context -> Proof.context"} \\
       
   284   \end{mldecls}
       
   285 
       
   286   \begin{description}
       
   287 
       
   288   \item Type @{ML_type Proof.context} represents proof contexts.
       
   289   Elements of this type are essentially pure values, with a sliding
       
   290   reference to the background theory.
       
   291 
       
   292   \item @{ML Proof_Context.init_global}~@{text "thy"} produces a proof context
       
   293   derived from @{text "thy"}, initializing all data.
       
   294 
       
   295   \item @{ML Proof_Context.theory_of}~@{text "ctxt"} selects the
       
   296   background theory from @{text "ctxt"}, dereferencing its internal
       
   297   @{ML_type theory_ref}.
       
   298 
       
   299   \item @{ML Proof_Context.transfer}~@{text "thy ctxt"} promotes the
       
   300   background theory of @{text "ctxt"} to the super theory @{text
       
   301   "thy"}.
       
   302 
       
   303   \end{description}
       
   304 *}
       
   305 
       
   306 text %mlantiq {*
       
   307   \begin{matharray}{rcl}
       
   308   @{ML_antiquotation_def "context"} & : & @{text ML_antiquotation} \\
       
   309   \end{matharray}
       
   310 
       
   311   \begin{description}
       
   312 
       
   313   \item @{text "@{context}"} refers to \emph{the} context at
       
   314   compile-time --- as abstract value.  Independently of (local) theory
       
   315   or proof mode, this always produces a meaningful result.
       
   316 
       
   317   This is probably the most common antiquotation in interactive
       
   318   experimentation with ML inside Isar.
       
   319 
       
   320   \end{description}
       
   321 *}
       
   322 
       
   323 
       
   324 subsection {* Generic contexts \label{sec:generic-context} *}
       
   325 
       
   326 text {*
       
   327   A generic context is the disjoint sum of either a theory or proof
       
   328   context.  Occasionally, this enables uniform treatment of generic
       
   329   context data, typically extra-logical information.  Operations on
       
   330   generic contexts include the usual injections, partial selections,
       
   331   and combinators for lifting operations on either component of the
       
   332   disjoint sum.
       
   333 
       
   334   Moreover, there are total operations @{text "theory_of"} and @{text
       
   335   "proof_of"} to convert a generic context into either kind: a theory
       
   336   can always be selected from the sum, while a proof context might
       
   337   have to be constructed by an ad-hoc @{text "init"} operation, which
       
   338   incurs a small runtime overhead.
       
   339 *}
       
   340 
       
   341 text %mlref {*
       
   342   \begin{mldecls}
       
   343   @{index_ML_type Context.generic} \\
       
   344   @{index_ML Context.theory_of: "Context.generic -> theory"} \\
       
   345   @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
       
   346   \end{mldecls}
       
   347 
       
   348   \begin{description}
       
   349 
       
   350   \item Type @{ML_type Context.generic} is the direct sum of @{ML_type
       
   351   "theory"} and @{ML_type "Proof.context"}, with the datatype
       
   352   constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.
       
   353 
       
   354   \item @{ML Context.theory_of}~@{text "context"} always produces a
       
   355   theory from the generic @{text "context"}, using @{ML
       
   356   "Proof_Context.theory_of"} as required.
       
   357 
       
   358   \item @{ML Context.proof_of}~@{text "context"} always produces a
       
   359   proof context from the generic @{text "context"}, using @{ML
       
   360   "Proof_Context.init_global"} as required (note that this re-initializes the
       
   361   context data with each invocation).
       
   362 
       
   363   \end{description}
       
   364 *}
       
   365 
       
   366 
       
   367 subsection {* Context data \label{sec:context-data} *}
       
   368 
       
   369 text {* The main purpose of theory and proof contexts is to manage
       
   370   arbitrary (pure) data.  New data types can be declared incrementally
       
   371   at compile time.  There are separate declaration mechanisms for any
       
   372   of the three kinds of contexts: theory, proof, generic.
       
   373 
       
   374   \paragraph{Theory data} declarations need to implement the following
       
   375   SML signature:
       
   376 
       
   377   \medskip
       
   378   \begin{tabular}{ll}
       
   379   @{text "\<type> T"} & representing type \\
       
   380   @{text "\<val> empty: T"} & empty default value \\
       
   381   @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
       
   382   @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
       
   383   \end{tabular}
       
   384   \medskip
       
   385 
       
   386   The @{text "empty"} value acts as initial default for \emph{any}
       
   387   theory that does not declare actual data content; @{text "extend"}
       
   388   is acts like a unitary version of @{text "merge"}.
       
   389 
       
   390   Implementing @{text "merge"} can be tricky.  The general idea is
       
   391   that @{text "merge (data\<^sub>1, data\<^sub>2)"} inserts those parts of @{text
       
   392   "data\<^sub>2"} into @{text "data\<^sub>1"} that are not yet present, while
       
   393   keeping the general order of things.  The @{ML Library.merge}
       
   394   function on plain lists may serve as canonical template.
       
   395 
       
   396   Particularly note that shared parts of the data must not be
       
   397   duplicated by naive concatenation, or a theory graph that is like a
       
   398   chain of diamonds would cause an exponential blowup!
       
   399 
       
   400   \paragraph{Proof context data} declarations need to implement the
       
   401   following SML signature:
       
   402 
       
   403   \medskip
       
   404   \begin{tabular}{ll}
       
   405   @{text "\<type> T"} & representing type \\
       
   406   @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
       
   407   \end{tabular}
       
   408   \medskip
       
   409 
       
   410   The @{text "init"} operation is supposed to produce a pure value
       
   411   from the given background theory and should be somehow
       
   412   ``immediate''.  Whenever a proof context is initialized, which
       
   413   happens frequently, the the system invokes the @{text "init"}
       
   414   operation of \emph{all} theory data slots ever declared.  This also
       
   415   means that one needs to be economic about the total number of proof
       
   416   data declarations in the system, i.e.\ each ML module should declare
       
   417   at most one, sometimes two data slots for its internal use.
       
   418   Repeated data declarations to simulate a record type should be
       
   419   avoided!
       
   420 
       
   421   \paragraph{Generic data} provides a hybrid interface for both theory
       
   422   and proof data.  The @{text "init"} operation for proof contexts is
       
   423   predefined to select the current data value from the background
       
   424   theory.
       
   425 
       
   426   \bigskip Any of the above data declarations over type @{text "T"}
       
   427   result in an ML structure with the following signature:
       
   428 
       
   429   \medskip
       
   430   \begin{tabular}{ll}
       
   431   @{text "get: context \<rightarrow> T"} \\
       
   432   @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
       
   433   @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
       
   434   \end{tabular}
       
   435   \medskip
       
   436 
       
   437   These other operations provide exclusive access for the particular
       
   438   kind of context (theory, proof, or generic context).  This interface
       
   439   observes the ML discipline for types and scopes: there is no other
       
   440   way to access the corresponding data slot of a context.  By keeping
       
   441   these operations private, an Isabelle/ML module may maintain
       
   442   abstract values authentically.  *}
       
   443 
       
   444 text %mlref {*
       
   445   \begin{mldecls}
       
   446   @{index_ML_functor Theory_Data} \\
       
   447   @{index_ML_functor Proof_Data} \\
       
   448   @{index_ML_functor Generic_Data} \\
       
   449   \end{mldecls}
       
   450 
       
   451   \begin{description}
       
   452 
       
   453   \item @{ML_functor Theory_Data}@{text "(spec)"} declares data for
       
   454   type @{ML_type theory} according to the specification provided as
       
   455   argument structure.  The resulting structure provides data init and
       
   456   access operations as described above.
       
   457 
       
   458   \item @{ML_functor Proof_Data}@{text "(spec)"} is analogous to
       
   459   @{ML_functor Theory_Data} for type @{ML_type Proof.context}.
       
   460 
       
   461   \item @{ML_functor Generic_Data}@{text "(spec)"} is analogous to
       
   462   @{ML_functor Theory_Data} for type @{ML_type Context.generic}.
       
   463 
       
   464   \end{description}
       
   465 *}
       
   466 
       
   467 text %mlex {*
       
   468   The following artificial example demonstrates theory
       
   469   data: we maintain a set of terms that are supposed to be wellformed
       
   470   wrt.\ the enclosing theory.  The public interface is as follows:
       
   471 *}
       
   472 
       
   473 ML {*
       
   474   signature WELLFORMED_TERMS =
       
   475   sig
       
   476     val get: theory -> term list
       
   477     val add: term -> theory -> theory
       
   478   end;
       
   479 *}
       
   480 
       
   481 text {* The implementation uses private theory data internally, and
       
   482   only exposes an operation that involves explicit argument checking
       
   483   wrt.\ the given theory. *}
       
   484 
       
   485 ML {*
       
   486   structure Wellformed_Terms: WELLFORMED_TERMS =
       
   487   struct
       
   488 
       
   489   structure Terms = Theory_Data
       
   490   (
       
   491     type T = term Ord_List.T;
       
   492     val empty = [];
       
   493     val extend = I;
       
   494     fun merge (ts1, ts2) =
       
   495       Ord_List.union Term_Ord.fast_term_ord ts1 ts2;
       
   496   );
       
   497 
       
   498   val get = Terms.get;
       
   499 
       
   500   fun add raw_t thy =
       
   501     let
       
   502       val t = Sign.cert_term thy raw_t;
       
   503     in
       
   504       Terms.map (Ord_List.insert Term_Ord.fast_term_ord t) thy
       
   505     end;
       
   506 
       
   507   end;
       
   508 *}
       
   509 
       
   510 text {* Type @{ML_type "term Ord_List.T"} is used for reasonably
       
   511   efficient representation of a set of terms: all operations are
       
   512   linear in the number of stored elements.  Here we assume that users
       
   513   of this module do not care about the declaration order, since that
       
   514   data structure forces its own arrangement of elements.
       
   515 
       
   516   Observe how the @{ML_text merge} operation joins the data slots of
       
   517   the two constituents: @{ML Ord_List.union} prevents duplication of
       
   518   common data from different branches, thus avoiding the danger of
       
   519   exponential blowup.  Plain list append etc.\ must never be used for
       
   520   theory data merges!
       
   521 
       
   522   \medskip Our intended invariant is achieved as follows:
       
   523   \begin{enumerate}
       
   524 
       
   525   \item @{ML Wellformed_Terms.add} only admits terms that have passed
       
   526   the @{ML Sign.cert_term} check of the given theory at that point.
       
   527 
       
   528   \item Wellformedness in the sense of @{ML Sign.cert_term} is
       
   529   monotonic wrt.\ the sub-theory relation.  So our data can move
       
   530   upwards in the hierarchy (via extension or merges), and maintain
       
   531   wellformedness without further checks.
       
   532 
       
   533   \end{enumerate}
       
   534 
       
   535   Note that all basic operations of the inference kernel (which
       
   536   includes @{ML Sign.cert_term}) observe this monotonicity principle,
       
   537   but other user-space tools don't.  For example, fully-featured
       
   538   type-inference via @{ML Syntax.check_term} (cf.\
       
   539   \secref{sec:term-check}) is not necessarily monotonic wrt.\ the
       
   540   background theory, since constraints of term constants can be
       
   541   modified by later declarations, for example.
       
   542 
       
   543   In most cases, user-space context data does not have to take such
       
   544   invariants too seriously.  The situation is different in the
       
   545   implementation of the inference kernel itself, which uses the very
       
   546   same data mechanisms for types, constants, axioms etc.
       
   547 *}
       
   548 
       
   549 
       
   550 subsection {* Configuration options \label{sec:config-options} *}
       
   551 
       
   552 text {* A \emph{configuration option} is a named optional value of
       
   553   some basic type (Boolean, integer, string) that is stored in the
       
   554   context.  It is a simple application of general context data
       
   555   (\secref{sec:context-data}) that is sufficiently common to justify
       
   556   customized setup, which includes some concrete declarations for
       
   557   end-users using existing notation for attributes (cf.\
       
   558   \secref{sec:attributes}).
       
   559 
       
   560   For example, the predefined configuration option @{attribute
       
   561   show_types} controls output of explicit type constraints for
       
   562   variables in printed terms (cf.\ \secref{sec:read-print}).  Its
       
   563   value can be modified within Isar text like this:
       
   564 *}
       
   565 
       
   566 declare [[show_types = false]]
       
   567   -- {* declaration within (local) theory context *}
       
   568 
       
   569 notepad
       
   570 begin
       
   571   note [[show_types = true]]
       
   572     -- {* declaration within proof (forward mode) *}
       
   573   term x
       
   574 
       
   575   have "x = x"
       
   576     using [[show_types = false]]
       
   577       -- {* declaration within proof (backward mode) *}
       
   578     ..
       
   579 end
       
   580 
       
   581 text {* Configuration options that are not set explicitly hold a
       
   582   default value that can depend on the application context.  This
       
   583   allows to retrieve the value from another slot within the context,
       
   584   or fall back on a global preference mechanism, for example.
       
   585 
       
   586   The operations to declare configuration options and get/map their
       
   587   values are modeled as direct replacements for historic global
       
   588   references, only that the context is made explicit.  This allows
       
   589   easy configuration of tools, without relying on the execution order
       
   590   as required for old-style mutable references.  *}
       
   591 
       
   592 text %mlref {*
       
   593   \begin{mldecls}
       
   594   @{index_ML Config.get: "Proof.context -> 'a Config.T -> 'a"} \\
       
   595   @{index_ML Config.map: "'a Config.T -> ('a -> 'a) -> Proof.context -> Proof.context"} \\
       
   596   @{index_ML Attrib.setup_config_bool: "binding -> (Context.generic -> bool) ->
       
   597   bool Config.T"} \\
       
   598   @{index_ML Attrib.setup_config_int: "binding -> (Context.generic -> int) ->
       
   599   int Config.T"} \\
       
   600   @{index_ML Attrib.setup_config_real: "binding -> (Context.generic -> real) ->
       
   601   real Config.T"} \\
       
   602   @{index_ML Attrib.setup_config_string: "binding -> (Context.generic -> string) ->
       
   603   string Config.T"} \\
       
   604   \end{mldecls}
       
   605 
       
   606   \begin{description}
       
   607 
       
   608   \item @{ML Config.get}~@{text "ctxt config"} gets the value of
       
   609   @{text "config"} in the given context.
       
   610 
       
   611   \item @{ML Config.map}~@{text "config f ctxt"} updates the context
       
   612   by updating the value of @{text "config"}.
       
   613 
       
   614   \item @{text "config ="}~@{ML Attrib.setup_config_bool}~@{text "name
       
   615   default"} creates a named configuration option of type @{ML_type
       
   616   bool}, with the given @{text "default"} depending on the application
       
   617   context.  The resulting @{text "config"} can be used to get/map its
       
   618   value in a given context.  There is an implicit update of the
       
   619   background theory that registers the option as attribute with some
       
   620   concrete syntax.
       
   621 
       
   622   \item @{ML Attrib.config_int}, @{ML Attrib.config_real}, and @{ML
       
   623   Attrib.config_string} work like @{ML Attrib.config_bool}, but for
       
   624   types @{ML_type int} and @{ML_type string}, respectively.
       
   625 
       
   626   \end{description}
       
   627 *}
       
   628 
       
   629 text %mlex {* The following example shows how to declare and use a
       
   630   Boolean configuration option called @{text "my_flag"} with constant
       
   631   default value @{ML false}.  *}
       
   632 
       
   633 ML {*
       
   634   val my_flag =
       
   635     Attrib.setup_config_bool @{binding my_flag} (K false)
       
   636 *}
       
   637 
       
   638 text {* Now the user can refer to @{attribute my_flag} in
       
   639   declarations, while ML tools can retrieve the current value from the
       
   640   context via @{ML Config.get}.  *}
       
   641 
       
   642 ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
       
   643 
       
   644 declare [[my_flag = true]]
       
   645 
       
   646 ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
       
   647 
       
   648 notepad
       
   649 begin
       
   650   {
       
   651     note [[my_flag = false]]
       
   652     ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
       
   653   }
       
   654   ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
       
   655 end
       
   656 
       
   657 text {* Here is another example involving ML type @{ML_type real}
       
   658   (floating-point numbers). *}
       
   659 
       
   660 ML {*
       
   661   val airspeed_velocity =
       
   662     Attrib.setup_config_real @{binding airspeed_velocity} (K 0.0)
       
   663 *}
       
   664 
       
   665 declare [[airspeed_velocity = 10]]
       
   666 declare [[airspeed_velocity = 9.9]]
       
   667 
       
   668 
       
   669 section {* Names \label{sec:names} *}
       
   670 
       
   671 text {* In principle, a name is just a string, but there are various
       
   672   conventions for representing additional structure.  For example,
       
   673   ``@{text "Foo.bar.baz"}'' is considered as a long name consisting of
       
   674   qualifier @{text "Foo.bar"} and base name @{text "baz"}.  The
       
   675   individual constituents of a name may have further substructure,
       
   676   e.g.\ the string ``\verb,\,\verb,<alpha>,'' encodes as a single
       
   677   symbol.
       
   678 
       
   679   \medskip Subsequently, we shall introduce specific categories of
       
   680   names.  Roughly speaking these correspond to logical entities as
       
   681   follows:
       
   682   \begin{itemize}
       
   683 
       
   684   \item Basic names (\secref{sec:basic-name}): free and bound
       
   685   variables.
       
   686 
       
   687   \item Indexed names (\secref{sec:indexname}): schematic variables.
       
   688 
       
   689   \item Long names (\secref{sec:long-name}): constants of any kind
       
   690   (type constructors, term constants, other concepts defined in user
       
   691   space).  Such entities are typically managed via name spaces
       
   692   (\secref{sec:name-space}).
       
   693 
       
   694   \end{itemize}
       
   695 *}
       
   696 
       
   697 
       
   698 subsection {* Strings of symbols \label{sec:symbols} *}
       
   699 
       
   700 text {* A \emph{symbol} constitutes the smallest textual unit in
       
   701   Isabelle --- raw ML characters are normally not encountered at all!
       
   702   Isabelle strings consist of a sequence of symbols, represented as a
       
   703   packed string or an exploded list of strings.  Each symbol is in
       
   704   itself a small string, which has either one of the following forms:
       
   705 
       
   706   \begin{enumerate}
       
   707 
       
   708   \item a single ASCII character ``@{text "c"}'', for example
       
   709   ``\verb,a,'',
       
   710 
       
   711   \item a codepoint according to UTF8 (non-ASCII byte sequence),
       
   712 
       
   713   \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'',
       
   714   for example ``\verb,\,\verb,<alpha>,'',
       
   715 
       
   716   \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'',
       
   717   for example ``\verb,\,\verb,<^bold>,'',
       
   718 
       
   719   \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,''
       
   720   where @{text text} consists of printable characters excluding
       
   721   ``\verb,.,'' and ``\verb,>,'', for example
       
   722   ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'',
       
   723 
       
   724   \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text
       
   725   n}\verb,>, where @{text n} consists of digits, for example
       
   726   ``\verb,\,\verb,<^raw42>,''.
       
   727 
       
   728   \end{enumerate}
       
   729 
       
   730   The @{text "ident"} syntax for symbol names is @{text "letter
       
   731   (letter | digit)\<^sup>*"}, where @{text "letter = A..Za..z"} and @{text
       
   732   "digit = 0..9"}.  There are infinitely many regular symbols and
       
   733   control symbols, but a fixed collection of standard symbols is
       
   734   treated specifically.  For example, ``\verb,\,\verb,<alpha>,'' is
       
   735   classified as a letter, which means it may occur within regular
       
   736   Isabelle identifiers.
       
   737 
       
   738   The character set underlying Isabelle symbols is 7-bit ASCII, but
       
   739   8-bit character sequences are passed-through unchanged.  Unicode/UCS
       
   740   data in UTF-8 encoding is processed in a non-strict fashion, such
       
   741   that well-formed code sequences are recognized
       
   742   accordingly.\footnote{Note that ISO-Latin-1 differs from UTF-8 only
       
   743   in some special punctuation characters that even have replacements
       
   744   within the standard collection of Isabelle symbols.  Text consisting
       
   745   of ASCII plus accented letters can be processed in either encoding.}
       
   746   Unicode provides its own collection of mathematical symbols, but
       
   747   within the core Isabelle/ML world there is no link to the standard
       
   748   collection of Isabelle regular symbols.
       
   749 
       
   750   \medskip Output of Isabelle symbols depends on the print mode
       
   751   (\cite{isabelle-isar-ref}).  For example, the standard {\LaTeX}
       
   752   setup of the Isabelle document preparation system would present
       
   753   ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and
       
   754   ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text "\<^bold>\<alpha>"}.
       
   755   On-screen rendering usually works by mapping a finite subset of
       
   756   Isabelle symbols to suitable Unicode characters.
       
   757 *}
       
   758 
       
   759 text %mlref {*
       
   760   \begin{mldecls}
       
   761   @{index_ML_type "Symbol.symbol": string} \\
       
   762   @{index_ML Symbol.explode: "string -> Symbol.symbol list"} \\
       
   763   @{index_ML Symbol.is_letter: "Symbol.symbol -> bool"} \\
       
   764   @{index_ML Symbol.is_digit: "Symbol.symbol -> bool"} \\
       
   765   @{index_ML Symbol.is_quasi: "Symbol.symbol -> bool"} \\
       
   766   @{index_ML Symbol.is_blank: "Symbol.symbol -> bool"} \\
       
   767   \end{mldecls}
       
   768   \begin{mldecls}
       
   769   @{index_ML_type "Symbol.sym"} \\
       
   770   @{index_ML Symbol.decode: "Symbol.symbol -> Symbol.sym"} \\
       
   771   \end{mldecls}
       
   772 
       
   773   \begin{description}
       
   774 
       
   775   \item Type @{ML_type "Symbol.symbol"} represents individual Isabelle
       
   776   symbols.
       
   777 
       
   778   \item @{ML "Symbol.explode"}~@{text "str"} produces a symbol list
       
   779   from the packed form.  This function supersedes @{ML
       
   780   "String.explode"} for virtually all purposes of manipulating text in
       
   781   Isabelle!\footnote{The runtime overhead for exploded strings is
       
   782   mainly that of the list structure: individual symbols that happen to
       
   783   be a singleton string do not require extra memory in Poly/ML.}
       
   784 
       
   785   \item @{ML "Symbol.is_letter"}, @{ML "Symbol.is_digit"}, @{ML
       
   786   "Symbol.is_quasi"}, @{ML "Symbol.is_blank"} classify standard
       
   787   symbols according to fixed syntactic conventions of Isabelle, cf.\
       
   788   \cite{isabelle-isar-ref}.
       
   789 
       
   790   \item Type @{ML_type "Symbol.sym"} is a concrete datatype that
       
   791   represents the different kinds of symbols explicitly, with
       
   792   constructors @{ML "Symbol.Char"}, @{ML "Symbol.Sym"}, @{ML
       
   793   "Symbol.UTF8"}, @{ML "Symbol.Ctrl"}, @{ML "Symbol.Raw"}.
       
   794 
       
   795   \item @{ML "Symbol.decode"} converts the string representation of a
       
   796   symbol into the datatype version.
       
   797 
       
   798   \end{description}
       
   799 
       
   800   \paragraph{Historical note.} In the original SML90 standard the
       
   801   primitive ML type @{ML_type char} did not exists, and the @{ML_text
       
   802   "explode: string -> string list"} operation would produce a list of
       
   803   singleton strings as does @{ML "raw_explode: string -> string list"}
       
   804   in Isabelle/ML today.  When SML97 came out, Isabelle did not adopt
       
   805   its slightly anachronistic 8-bit characters, but the idea of
       
   806   exploding a string into a list of small strings was extended to
       
   807   ``symbols'' as explained above.  Thus Isabelle sources can refer to
       
   808   an infinite store of user-defined symbols, without having to worry
       
   809   about the multitude of Unicode encodings.  *}
       
   810 
       
   811 
       
   812 subsection {* Basic names \label{sec:basic-name} *}
       
   813 
       
   814 text {*
       
   815   A \emph{basic name} essentially consists of a single Isabelle
       
   816   identifier.  There are conventions to mark separate classes of basic
       
   817   names, by attaching a suffix of underscores: one underscore means
       
   818   \emph{internal name}, two underscores means \emph{Skolem name},
       
   819   three underscores means \emph{internal Skolem name}.
       
   820 
       
   821   For example, the basic name @{text "foo"} has the internal version
       
   822   @{text "foo_"}, with Skolem versions @{text "foo__"} and @{text
       
   823   "foo___"}, respectively.
       
   824 
       
   825   These special versions provide copies of the basic name space, apart
       
   826   from anything that normally appears in the user text.  For example,
       
   827   system generated variables in Isar proof contexts are usually marked
       
   828   as internal, which prevents mysterious names like @{text "xaa"} to
       
   829   appear in human-readable text.
       
   830 
       
   831   \medskip Manipulating binding scopes often requires on-the-fly
       
   832   renamings.  A \emph{name context} contains a collection of already
       
   833   used names.  The @{text "declare"} operation adds names to the
       
   834   context.
       
   835 
       
   836   The @{text "invents"} operation derives a number of fresh names from
       
   837   a given starting point.  For example, the first three names derived
       
   838   from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.
       
   839 
       
   840   The @{text "variants"} operation produces fresh names by
       
   841   incrementing tentative names as base-26 numbers (with digits @{text
       
   842   "a..z"}) until all clashes are resolved.  For example, name @{text
       
   843   "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
       
   844   "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
       
   845   step picks the next unused variant from this sequence.
       
   846 *}
       
   847 
       
   848 text %mlref {*
       
   849   \begin{mldecls}
       
   850   @{index_ML Name.internal: "string -> string"} \\
       
   851   @{index_ML Name.skolem: "string -> string"} \\
       
   852   \end{mldecls}
       
   853   \begin{mldecls}
       
   854   @{index_ML_type Name.context} \\
       
   855   @{index_ML Name.context: Name.context} \\
       
   856   @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
       
   857   @{index_ML Name.invent: "Name.context -> string -> int -> string list"} \\
       
   858   @{index_ML Name.variant: "string -> Name.context -> string * Name.context"} \\
       
   859   \end{mldecls}
       
   860   \begin{mldecls}
       
   861   @{index_ML Variable.names_of: "Proof.context -> Name.context"} \\
       
   862   \end{mldecls}
       
   863 
       
   864   \begin{description}
       
   865 
       
   866   \item @{ML Name.internal}~@{text "name"} produces an internal name
       
   867   by adding one underscore.
       
   868 
       
   869   \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
       
   870   adding two underscores.
       
   871 
       
   872   \item Type @{ML_type Name.context} represents the context of already
       
   873   used names; the initial value is @{ML "Name.context"}.
       
   874 
       
   875   \item @{ML Name.declare}~@{text "name"} enters a used name into the
       
   876   context.
       
   877 
       
   878   \item @{ML Name.invent}~@{text "context name n"} produces @{text
       
   879   "n"} fresh names derived from @{text "name"}.
       
   880 
       
   881   \item @{ML Name.variant}~@{text "name context"} produces a fresh
       
   882   variant of @{text "name"}; the result is declared to the context.
       
   883 
       
   884   \item @{ML Variable.names_of}~@{text "ctxt"} retrieves the context
       
   885   of declared type and term variable names.  Projecting a proof
       
   886   context down to a primitive name context is occasionally useful when
       
   887   invoking lower-level operations.  Regular management of ``fresh
       
   888   variables'' is done by suitable operations of structure @{ML_struct
       
   889   Variable}, which is also able to provide an official status of
       
   890   ``locally fixed variable'' within the logical environment (cf.\
       
   891   \secref{sec:variables}).
       
   892 
       
   893   \end{description}
       
   894 *}
       
   895 
       
   896 text %mlex {* The following simple examples demonstrate how to produce
       
   897   fresh names from the initial @{ML Name.context}. *}
       
   898 
       
   899 ML {*
       
   900   val list1 = Name.invent Name.context "a" 5;
       
   901   @{assert} (list1 = ["a", "b", "c", "d", "e"]);
       
   902 
       
   903   val list2 =
       
   904     #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] Name.context);
       
   905   @{assert} (list2 = ["x", "xa", "a", "aa", "'a", "'aa"]);
       
   906 *}
       
   907 
       
   908 text {* \medskip The same works relatively to the formal context as
       
   909   follows. *}
       
   910 
       
   911 locale ex = fixes a b c :: 'a
       
   912 begin
       
   913 
       
   914 ML {*
       
   915   val names = Variable.names_of @{context};
       
   916 
       
   917   val list1 = Name.invent names "a" 5;
       
   918   @{assert} (list1 = ["d", "e", "f", "g", "h"]);
       
   919 
       
   920   val list2 =
       
   921     #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] names);
       
   922   @{assert} (list2 = ["x", "xa", "aa", "ab", "'aa", "'ab"]);
       
   923 *}
       
   924 
       
   925 end
       
   926 
       
   927 
       
   928 subsection {* Indexed names \label{sec:indexname} *}
       
   929 
       
   930 text {*
       
   931   An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
       
   932   name and a natural number.  This representation allows efficient
       
   933   renaming by incrementing the second component only.  The canonical
       
   934   way to rename two collections of indexnames apart from each other is
       
   935   this: determine the maximum index @{text "maxidx"} of the first
       
   936   collection, then increment all indexes of the second collection by
       
   937   @{text "maxidx + 1"}; the maximum index of an empty collection is
       
   938   @{text "-1"}.
       
   939 
       
   940   Occasionally, basic names are injected into the same pair type of
       
   941   indexed names: then @{text "(x, -1)"} is used to encode the basic
       
   942   name @{text "x"}.
       
   943 
       
   944   \medskip Isabelle syntax observes the following rules for
       
   945   representing an indexname @{text "(x, i)"} as a packed string:
       
   946 
       
   947   \begin{itemize}
       
   948 
       
   949   \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},
       
   950 
       
   951   \item @{text "?xi"} if @{text "x"} does not end with a digit,
       
   952 
       
   953   \item @{text "?x.i"} otherwise.
       
   954 
       
   955   \end{itemize}
       
   956 
       
   957   Indexnames may acquire large index numbers after several maxidx
       
   958   shifts have been applied.  Results are usually normalized towards
       
   959   @{text "0"} at certain checkpoints, notably at the end of a proof.
       
   960   This works by producing variants of the corresponding basic name
       
   961   components.  For example, the collection @{text "?x1, ?x7, ?x42"}
       
   962   becomes @{text "?x, ?xa, ?xb"}.
       
   963 *}
       
   964 
       
   965 text %mlref {*
       
   966   \begin{mldecls}
       
   967   @{index_ML_type indexname: "string * int"} \\
       
   968   \end{mldecls}
       
   969 
       
   970   \begin{description}
       
   971 
       
   972   \item Type @{ML_type indexname} represents indexed names.  This is
       
   973   an abbreviation for @{ML_type "string * int"}.  The second component
       
   974   is usually non-negative, except for situations where @{text "(x,
       
   975   -1)"} is used to inject basic names into this type.  Other negative
       
   976   indexes should not be used.
       
   977 
       
   978   \end{description}
       
   979 *}
       
   980 
       
   981 
       
   982 subsection {* Long names \label{sec:long-name} *}
       
   983 
       
   984 text {* A \emph{long name} consists of a sequence of non-empty name
       
   985   components.  The packed representation uses a dot as separator, as
       
   986   in ``@{text "A.b.c"}''.  The last component is called \emph{base
       
   987   name}, the remaining prefix is called \emph{qualifier} (which may be
       
   988   empty).  The qualifier can be understood as the access path to the
       
   989   named entity while passing through some nested block-structure,
       
   990   although our free-form long names do not really enforce any strict
       
   991   discipline.
       
   992 
       
   993   For example, an item named ``@{text "A.b.c"}'' may be understood as
       
   994   a local entity @{text "c"}, within a local structure @{text "b"},
       
   995   within a global structure @{text "A"}.  In practice, long names
       
   996   usually represent 1--3 levels of qualification.  User ML code should
       
   997   not make any assumptions about the particular structure of long
       
   998   names!
       
   999 
       
  1000   The empty name is commonly used as an indication of unnamed
       
  1001   entities, or entities that are not entered into the corresponding
       
  1002   name space, whenever this makes any sense.  The basic operations on
       
  1003   long names map empty names again to empty names.
       
  1004 *}
       
  1005 
       
  1006 text %mlref {*
       
  1007   \begin{mldecls}
       
  1008   @{index_ML Long_Name.base_name: "string -> string"} \\
       
  1009   @{index_ML Long_Name.qualifier: "string -> string"} \\
       
  1010   @{index_ML Long_Name.append: "string -> string -> string"} \\
       
  1011   @{index_ML Long_Name.implode: "string list -> string"} \\
       
  1012   @{index_ML Long_Name.explode: "string -> string list"} \\
       
  1013   \end{mldecls}
       
  1014 
       
  1015   \begin{description}
       
  1016 
       
  1017   \item @{ML Long_Name.base_name}~@{text "name"} returns the base name
       
  1018   of a long name.
       
  1019 
       
  1020   \item @{ML Long_Name.qualifier}~@{text "name"} returns the qualifier
       
  1021   of a long name.
       
  1022 
       
  1023   \item @{ML Long_Name.append}~@{text "name\<^isub>1 name\<^isub>2"} appends two long
       
  1024   names.
       
  1025 
       
  1026   \item @{ML Long_Name.implode}~@{text "names"} and @{ML
       
  1027   Long_Name.explode}~@{text "name"} convert between the packed string
       
  1028   representation and the explicit list form of long names.
       
  1029 
       
  1030   \end{description}
       
  1031 *}
       
  1032 
       
  1033 
       
  1034 subsection {* Name spaces \label{sec:name-space} *}
       
  1035 
       
  1036 text {* A @{text "name space"} manages a collection of long names,
       
  1037   together with a mapping between partially qualified external names
       
  1038   and fully qualified internal names (in both directions).  Note that
       
  1039   the corresponding @{text "intern"} and @{text "extern"} operations
       
  1040   are mostly used for parsing and printing only!  The @{text
       
  1041   "declare"} operation augments a name space according to the accesses
       
  1042   determined by a given binding, and a naming policy from the context.
       
  1043 
       
  1044   \medskip A @{text "binding"} specifies details about the prospective
       
  1045   long name of a newly introduced formal entity.  It consists of a
       
  1046   base name, prefixes for qualification (separate ones for system
       
  1047   infrastructure and user-space mechanisms), a slot for the original
       
  1048   source position, and some additional flags.
       
  1049 
       
  1050   \medskip A @{text "naming"} provides some additional details for
       
  1051   producing a long name from a binding.  Normally, the naming is
       
  1052   implicit in the theory or proof context.  The @{text "full"}
       
  1053   operation (and its variants for different context types) produces a
       
  1054   fully qualified internal name to be entered into a name space.  The
       
  1055   main equation of this ``chemical reaction'' when binding new
       
  1056   entities in a context is as follows:
       
  1057 
       
  1058   \medskip
       
  1059   \begin{tabular}{l}
       
  1060   @{text "binding + naming \<longrightarrow> long name + name space accesses"}
       
  1061   \end{tabular}
       
  1062 
       
  1063   \bigskip As a general principle, there is a separate name space for
       
  1064   each kind of formal entity, e.g.\ fact, logical constant, type
       
  1065   constructor, type class.  It is usually clear from the occurrence in
       
  1066   concrete syntax (or from the scope) which kind of entity a name
       
  1067   refers to.  For example, the very same name @{text "c"} may be used
       
  1068   uniformly for a constant, type constructor, and type class.
       
  1069 
       
  1070   There are common schemes to name derived entities systematically
       
  1071   according to the name of the main logical entity involved, e.g.\
       
  1072   fact @{text "c.intro"} for a canonical introduction rule related to
       
  1073   constant @{text "c"}.  This technique of mapping names from one
       
  1074   space into another requires some care in order to avoid conflicts.
       
  1075   In particular, theorem names derived from a type constructor or type
       
  1076   class should get an additional suffix in addition to the usual
       
  1077   qualification.  This leads to the following conventions for derived
       
  1078   names:
       
  1079 
       
  1080   \medskip
       
  1081   \begin{tabular}{ll}
       
  1082   logical entity & fact name \\\hline
       
  1083   constant @{text "c"} & @{text "c.intro"} \\
       
  1084   type @{text "c"} & @{text "c_type.intro"} \\
       
  1085   class @{text "c"} & @{text "c_class.intro"} \\
       
  1086   \end{tabular}
       
  1087 *}
       
  1088 
       
  1089 text %mlref {*
       
  1090   \begin{mldecls}
       
  1091   @{index_ML_type binding} \\
       
  1092   @{index_ML Binding.empty: binding} \\
       
  1093   @{index_ML Binding.name: "string -> binding"} \\
       
  1094   @{index_ML Binding.qualify: "bool -> string -> binding -> binding"} \\
       
  1095   @{index_ML Binding.prefix: "bool -> string -> binding -> binding"} \\
       
  1096   @{index_ML Binding.conceal: "binding -> binding"} \\
       
  1097   @{index_ML Binding.print: "binding -> string"} \\
       
  1098   \end{mldecls}
       
  1099   \begin{mldecls}
       
  1100   @{index_ML_type Name_Space.naming} \\
       
  1101   @{index_ML Name_Space.default_naming: Name_Space.naming} \\
       
  1102   @{index_ML Name_Space.add_path: "string -> Name_Space.naming -> Name_Space.naming"} \\
       
  1103   @{index_ML Name_Space.full_name: "Name_Space.naming -> binding -> string"} \\
       
  1104   \end{mldecls}
       
  1105   \begin{mldecls}
       
  1106   @{index_ML_type Name_Space.T} \\
       
  1107   @{index_ML Name_Space.empty: "string -> Name_Space.T"} \\
       
  1108   @{index_ML Name_Space.merge: "Name_Space.T * Name_Space.T -> Name_Space.T"} \\
       
  1109   @{index_ML Name_Space.declare: "Context.generic -> bool ->
       
  1110   binding -> Name_Space.T -> string * Name_Space.T"} \\
       
  1111   @{index_ML Name_Space.intern: "Name_Space.T -> string -> string"} \\
       
  1112   @{index_ML Name_Space.extern: "Proof.context -> Name_Space.T -> string -> string"} \\
       
  1113   @{index_ML Name_Space.is_concealed: "Name_Space.T -> string -> bool"}
       
  1114   \end{mldecls}
       
  1115 
       
  1116   \begin{description}
       
  1117 
       
  1118   \item Type @{ML_type binding} represents the abstract concept of
       
  1119   name bindings.
       
  1120 
       
  1121   \item @{ML Binding.empty} is the empty binding.
       
  1122 
       
  1123   \item @{ML Binding.name}~@{text "name"} produces a binding with base
       
  1124   name @{text "name"}.  Note that this lacks proper source position
       
  1125   information; see also the ML antiquotation @{ML_antiquotation
       
  1126   binding}.
       
  1127 
       
  1128   \item @{ML Binding.qualify}~@{text "mandatory name binding"}
       
  1129   prefixes qualifier @{text "name"} to @{text "binding"}.  The @{text
       
  1130   "mandatory"} flag tells if this name component always needs to be
       
  1131   given in name space accesses --- this is mostly @{text "false"} in
       
  1132   practice.  Note that this part of qualification is typically used in
       
  1133   derived specification mechanisms.
       
  1134 
       
  1135   \item @{ML Binding.prefix} is similar to @{ML Binding.qualify}, but
       
  1136   affects the system prefix.  This part of extra qualification is
       
  1137   typically used in the infrastructure for modular specifications,
       
  1138   notably ``local theory targets'' (see also \chref{ch:local-theory}).
       
  1139 
       
  1140   \item @{ML Binding.conceal}~@{text "binding"} indicates that the
       
  1141   binding shall refer to an entity that serves foundational purposes
       
  1142   only.  This flag helps to mark implementation details of
       
  1143   specification mechanism etc.  Other tools should not depend on the
       
  1144   particulars of concealed entities (cf.\ @{ML
       
  1145   Name_Space.is_concealed}).
       
  1146 
       
  1147   \item @{ML Binding.print}~@{text "binding"} produces a string
       
  1148   representation for human-readable output, together with some formal
       
  1149   markup that might get used in GUI front-ends, for example.
       
  1150 
       
  1151   \item Type @{ML_type Name_Space.naming} represents the abstract
       
  1152   concept of a naming policy.
       
  1153 
       
  1154   \item @{ML Name_Space.default_naming} is the default naming policy.
       
  1155   In a theory context, this is usually augmented by a path prefix
       
  1156   consisting of the theory name.
       
  1157 
       
  1158   \item @{ML Name_Space.add_path}~@{text "path naming"} augments the
       
  1159   naming policy by extending its path component.
       
  1160 
       
  1161   \item @{ML Name_Space.full_name}~@{text "naming binding"} turns a
       
  1162   name binding (usually a basic name) into the fully qualified
       
  1163   internal name, according to the given naming policy.
       
  1164 
       
  1165   \item Type @{ML_type Name_Space.T} represents name spaces.
       
  1166 
       
  1167   \item @{ML Name_Space.empty}~@{text "kind"} and @{ML Name_Space.merge}~@{text
       
  1168   "(space\<^isub>1, space\<^isub>2)"} are the canonical operations for
       
  1169   maintaining name spaces according to theory data management
       
  1170   (\secref{sec:context-data}); @{text "kind"} is a formal comment
       
  1171   to characterize the purpose of a name space.
       
  1172 
       
  1173   \item @{ML Name_Space.declare}~@{text "context strict binding
       
  1174   space"} enters a name binding as fully qualified internal name into
       
  1175   the name space, using the naming of the context.
       
  1176 
       
  1177   \item @{ML Name_Space.intern}~@{text "space name"} internalizes a
       
  1178   (partially qualified) external name.
       
  1179 
       
  1180   This operation is mostly for parsing!  Note that fully qualified
       
  1181   names stemming from declarations are produced via @{ML
       
  1182   "Name_Space.full_name"} and @{ML "Name_Space.declare"}
       
  1183   (or their derivatives for @{ML_type theory} and
       
  1184   @{ML_type Proof.context}).
       
  1185 
       
  1186   \item @{ML Name_Space.extern}~@{text "ctxt space name"} externalizes a
       
  1187   (fully qualified) internal name.
       
  1188 
       
  1189   This operation is mostly for printing!  User code should not rely on
       
  1190   the precise result too much.
       
  1191 
       
  1192   \item @{ML Name_Space.is_concealed}~@{text "space name"} indicates
       
  1193   whether @{text "name"} refers to a strictly private entity that
       
  1194   other tools are supposed to ignore!
       
  1195 
       
  1196   \end{description}
       
  1197 *}
       
  1198 
       
  1199 text %mlantiq {*
       
  1200   \begin{matharray}{rcl}
       
  1201   @{ML_antiquotation_def "binding"} & : & @{text ML_antiquotation} \\
       
  1202   \end{matharray}
       
  1203 
       
  1204   @{rail "
       
  1205   @@{ML_antiquotation binding} name
       
  1206   "}
       
  1207 
       
  1208   \begin{description}
       
  1209 
       
  1210   \item @{text "@{binding name}"} produces a binding with base name
       
  1211   @{text "name"} and the source position taken from the concrete
       
  1212   syntax of this antiquotation.  In many situations this is more
       
  1213   appropriate than the more basic @{ML Binding.name} function.
       
  1214 
       
  1215   \end{description}
       
  1216 *}
       
  1217 
       
  1218 text %mlex {* The following example yields the source position of some
       
  1219   concrete binding inlined into the text:
       
  1220 *}
       
  1221 
       
  1222 ML {* Binding.pos_of @{binding here} *}
       
  1223 
       
  1224 text {* \medskip That position can be also printed in a message as
       
  1225   follows: *}
       
  1226 
       
  1227 ML_command {*
       
  1228   writeln
       
  1229     ("Look here" ^ Position.str_of (Binding.pos_of @{binding here}))
       
  1230 *}
       
  1231 
       
  1232 text {* This illustrates a key virtue of formalized bindings as
       
  1233   opposed to raw specifications of base names: the system can use this
       
  1234   additional information for feedback given to the user (error
       
  1235   messages etc.). *}
       
  1236 
       
  1237 end