src/Doc/IsarImplementation/Prelim.thy
author wenzelm
Wed Mar 12 22:44:55 2014 +0100 (2014-03-12)
changeset 56071 2ffdedb0c044
parent 55837 154855d9a564
permissions -rw-r--r--
added ML antiquotation @{here};
     1 theory Prelim
     2 imports Base
     3 begin
     4 
     5 chapter {* Preliminaries *}
     6 
     7 section {* Contexts \label{sec:context} *}
     8 
     9 text {*
    10   A logical context represents the background that is required for
    11   formulating statements and composing proofs.  It acts as a medium to
    12   produce formal content, depending on earlier material (declarations,
    13   results etc.).
    14 
    15   For example, derivations within the Isabelle/Pure logic can be
    16   described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
    17   proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
    18   within the theory @{text "\<Theta>"}.  There are logical reasons for
    19   keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
    20   liberal about supporting type constructors and schematic
    21   polymorphism of constants and axioms, while the inner calculus of
    22   @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
    23   fixed type variables in the assumptions).
    24 
    25   \medskip Contexts and derivations are linked by the following key
    26   principles:
    27 
    28   \begin{itemize}
    29 
    30   \item Transfer: monotonicity of derivations admits results to be
    31   transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
    32   \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
    33   \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.
    34 
    35   \item Export: discharge of hypotheses admits results to be exported
    36   into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
    37   implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
    38   @{text "\<Delta> = \<Gamma>' - \<Gamma>"}.  Note that @{text "\<Theta>"} remains unchanged here,
    39   only the @{text "\<Gamma>"} part is affected.
    40 
    41   \end{itemize}
    42 
    43   \medskip By modeling the main characteristics of the primitive
    44   @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
    45   particular logical content, we arrive at the fundamental notions of
    46   \emph{theory context} and \emph{proof context} in Isabelle/Isar.
    47   These implement a certain policy to manage arbitrary \emph{context
    48   data}.  There is a strongly-typed mechanism to declare new kinds of
    49   data at compile time.
    50 
    51   The internal bootstrap process of Isabelle/Pure eventually reaches a
    52   stage where certain data slots provide the logical content of @{text
    53   "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
    54   Various additional data slots support all kinds of mechanisms that
    55   are not necessarily part of the core logic.
    56 
    57   For example, there would be data for canonical introduction and
    58   elimination rules for arbitrary operators (depending on the
    59   object-logic and application), which enables users to perform
    60   standard proof steps implicitly (cf.\ the @{text "rule"} method
    61   \cite{isabelle-isar-ref}).
    62 
    63   \medskip Thus Isabelle/Isar is able to bring forth more and more
    64   concepts successively.  In particular, an object-logic like
    65   Isabelle/HOL continues the Isabelle/Pure setup by adding specific
    66   components for automated reasoning (classical reasoner, tableau
    67   prover, structured induction etc.) and derived specification
    68   mechanisms (inductive predicates, recursive functions etc.).  All of
    69   this is ultimately based on the generic data management by theory
    70   and proof contexts introduced here.
    71 *}
    72 
    73 
    74 subsection {* Theory context \label{sec:context-theory} *}
    75 
    76 text {* A \emph{theory} is a data container with explicit name and
    77   unique identifier.  Theories are related by a (nominal) sub-theory
    78   relation, which corresponds to the dependency graph of the original
    79   construction; each theory is derived from a certain sub-graph of
    80   ancestor theories.  To this end, the system maintains a set of
    81   symbolic ``identification stamps'' within each theory.
    82 
    83   The @{text "merge"} operation produces the least upper bound of two
    84   theories, which actually degenerates into absorption of one theory
    85   into the other (according to the nominal sub-theory relation).
    86 
    87   The @{text "begin"} operation starts a new theory by importing
    88   several parent theories and entering a special mode of nameless
    89   incremental updates, until the final @{text "end"} operation is
    90   performed.
    91 
    92   \medskip The example in \figref{fig:ex-theory} below shows a theory
    93   graph derived from @{text "Pure"}, with theory @{text "Length"}
    94   importing @{text "Nat"} and @{text "List"}.  The body of @{text
    95   "Length"} consists of a sequence of updates, resulting in locally a
    96   linear sub-theory relation for each intermediate step.
    97 
    98   \begin{figure}[htb]
    99   \begin{center}
   100   \begin{tabular}{rcccl}
   101         &            & @{text "Pure"} \\
   102         &            & @{text "\<down>"} \\
   103         &            & @{text "FOL"} \\
   104         & $\swarrow$ &              & $\searrow$ & \\
   105   @{text "Nat"} &    &              &            & @{text "List"} \\
   106         & $\searrow$ &              & $\swarrow$ \\
   107         &            & @{text "Length"} \\
   108         &            & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
   109         &            & $\vdots$~~ \\
   110         &            & \multicolumn{3}{l}{~~@{command "end"}} \\
   111   \end{tabular}
   112   \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
   113   \end{center}
   114   \end{figure}
   115 
   116   \medskip Derived formal entities may retain a reference to the
   117   background theory in order to indicate the formal context from which
   118   they were produced.  This provides an immutable certificate of the
   119   background theory.  *}
   120 
   121 text %mlref {*
   122   \begin{mldecls}
   123   @{index_ML_type theory} \\
   124   @{index_ML Theory.eq_thy: "theory * theory -> bool"} \\
   125   @{index_ML Theory.subthy: "theory * theory -> bool"} \\
   126   @{index_ML Theory.merge: "theory * theory -> theory"} \\
   127   @{index_ML Theory.begin_theory: "string * Position.T -> theory list -> theory"} \\
   128   @{index_ML Theory.parents_of: "theory -> theory list"} \\
   129   @{index_ML Theory.ancestors_of: "theory -> theory list"} \\
   130   \end{mldecls}
   131 
   132   \begin{description}
   133 
   134   \item Type @{ML_type theory} represents theory contexts.
   135 
   136   \item @{ML "Theory.eq_thy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} check strict
   137   identity of two theories.
   138 
   139   \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} compares theories
   140   according to the intrinsic graph structure of the construction.
   141   This sub-theory relation is a nominal approximation of inclusion
   142   (@{text "\<subseteq>"}) of the corresponding content (according to the
   143   semantics of the ML modules that implement the data).
   144 
   145   \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} absorbs one theory
   146   into the other.  This version of ad-hoc theory merge fails for
   147   unrelated theories!
   148 
   149   \item @{ML "Theory.begin_theory"}~@{text "name parents"} constructs
   150   a new theory based on the given parents.  This ML function is
   151   normally not invoked directly.
   152 
   153   \item @{ML "Theory.parents_of"}~@{text "thy"} returns the direct
   154   ancestors of @{text thy}.
   155 
   156   \item @{ML "Theory.ancestors_of"}~@{text "thy"} returns all
   157   ancestors of @{text thy} (not including @{text thy} itself).
   158 
   159   \end{description}
   160 *}
   161 
   162 text %mlantiq {*
   163   \begin{matharray}{rcl}
   164   @{ML_antiquotation_def "theory"} & : & @{text ML_antiquotation} \\
   165   @{ML_antiquotation_def "theory_context"} & : & @{text ML_antiquotation} \\
   166   \end{matharray}
   167 
   168   @{rail \<open>
   169   @@{ML_antiquotation theory} nameref?
   170   ;
   171   @@{ML_antiquotation theory_context} nameref
   172   \<close>}
   173 
   174   \begin{description}
   175 
   176   \item @{text "@{theory}"} refers to the background theory of the
   177   current context --- as abstract value.
   178 
   179   \item @{text "@{theory A}"} refers to an explicitly named ancestor
   180   theory @{text "A"} of the background theory of the current context
   181   --- as abstract value.
   182 
   183   \item @{text "@{theory_context A}"} is similar to @{text "@{theory
   184   A}"}, but presents the result as initial @{ML_type Proof.context}
   185   (see also @{ML Proof_Context.init_global}).
   186 
   187   \end{description}
   188 *}
   189 
   190 
   191 subsection {* Proof context \label{sec:context-proof} *}
   192 
   193 text {* A proof context is a container for pure data that refers to
   194   the theory from which it is derived. The @{text "init"} operation
   195   creates a proof context from a given theory. There is an explicit
   196   @{text "transfer"} operation to force resynchronization with updates
   197   to the background theory -- this is rarely required in practice.
   198 
   199   Entities derived in a proof context need to record logical
   200   requirements explicitly, since there is no separate context
   201   identification or symbolic inclusion as for theories.  For example,
   202   hypotheses used in primitive derivations (cf.\ \secref{sec:thms})
   203   are recorded separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to
   204   make double sure.  Results could still leak into an alien proof
   205   context due to programming errors, but Isabelle/Isar includes some
   206   extra validity checks in critical positions, notably at the end of a
   207   sub-proof.
   208 
   209   Proof contexts may be manipulated arbitrarily, although the common
   210   discipline is to follow block structure as a mental model: a given
   211   context is extended consecutively, and results are exported back
   212   into the original context.  Note that an Isar proof state models
   213   block-structured reasoning explicitly, using a stack of proof
   214   contexts internally.  For various technical reasons, the background
   215   theory of an Isar proof state must not be changed while the proof is
   216   still under construction!
   217 *}
   218 
   219 text %mlref {*
   220   \begin{mldecls}
   221   @{index_ML_type Proof.context} \\
   222   @{index_ML Proof_Context.init_global: "theory -> Proof.context"} \\
   223   @{index_ML Proof_Context.theory_of: "Proof.context -> theory"} \\
   224   @{index_ML Proof_Context.transfer: "theory -> Proof.context -> Proof.context"} \\
   225   \end{mldecls}
   226 
   227   \begin{description}
   228 
   229   \item Type @{ML_type Proof.context} represents proof contexts.
   230 
   231   \item @{ML Proof_Context.init_global}~@{text "thy"} produces a proof
   232   context derived from @{text "thy"}, initializing all data.
   233 
   234   \item @{ML Proof_Context.theory_of}~@{text "ctxt"} selects the
   235   background theory from @{text "ctxt"}.
   236 
   237   \item @{ML Proof_Context.transfer}~@{text "thy ctxt"} promotes the
   238   background theory of @{text "ctxt"} to the super theory @{text
   239   "thy"}.
   240 
   241   \end{description}
   242 *}
   243 
   244 text %mlantiq {*
   245   \begin{matharray}{rcl}
   246   @{ML_antiquotation_def "context"} & : & @{text ML_antiquotation} \\
   247   \end{matharray}
   248 
   249   \begin{description}
   250 
   251   \item @{text "@{context}"} refers to \emph{the} context at
   252   compile-time --- as abstract value.  Independently of (local) theory
   253   or proof mode, this always produces a meaningful result.
   254 
   255   This is probably the most common antiquotation in interactive
   256   experimentation with ML inside Isar.
   257 
   258   \end{description}
   259 *}
   260 
   261 
   262 subsection {* Generic contexts \label{sec:generic-context} *}
   263 
   264 text {*
   265   A generic context is the disjoint sum of either a theory or proof
   266   context.  Occasionally, this enables uniform treatment of generic
   267   context data, typically extra-logical information.  Operations on
   268   generic contexts include the usual injections, partial selections,
   269   and combinators for lifting operations on either component of the
   270   disjoint sum.
   271 
   272   Moreover, there are total operations @{text "theory_of"} and @{text
   273   "proof_of"} to convert a generic context into either kind: a theory
   274   can always be selected from the sum, while a proof context might
   275   have to be constructed by an ad-hoc @{text "init"} operation, which
   276   incurs a small runtime overhead.
   277 *}
   278 
   279 text %mlref {*
   280   \begin{mldecls}
   281   @{index_ML_type Context.generic} \\
   282   @{index_ML Context.theory_of: "Context.generic -> theory"} \\
   283   @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
   284   \end{mldecls}
   285 
   286   \begin{description}
   287 
   288   \item Type @{ML_type Context.generic} is the direct sum of @{ML_type
   289   "theory"} and @{ML_type "Proof.context"}, with the datatype
   290   constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.
   291 
   292   \item @{ML Context.theory_of}~@{text "context"} always produces a
   293   theory from the generic @{text "context"}, using @{ML
   294   "Proof_Context.theory_of"} as required.
   295 
   296   \item @{ML Context.proof_of}~@{text "context"} always produces a
   297   proof context from the generic @{text "context"}, using @{ML
   298   "Proof_Context.init_global"} as required (note that this re-initializes the
   299   context data with each invocation).
   300 
   301   \end{description}
   302 *}
   303 
   304 
   305 subsection {* Context data \label{sec:context-data} *}
   306 
   307 text {* The main purpose of theory and proof contexts is to manage
   308   arbitrary (pure) data.  New data types can be declared incrementally
   309   at compile time.  There are separate declaration mechanisms for any
   310   of the three kinds of contexts: theory, proof, generic.
   311 
   312   \paragraph{Theory data} declarations need to implement the following
   313   SML signature:
   314 
   315   \medskip
   316   \begin{tabular}{ll}
   317   @{text "\<type> T"} & representing type \\
   318   @{text "\<val> empty: T"} & empty default value \\
   319   @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
   320   @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
   321   \end{tabular}
   322   \medskip
   323 
   324   The @{text "empty"} value acts as initial default for \emph{any}
   325   theory that does not declare actual data content; @{text "extend"}
   326   is acts like a unitary version of @{text "merge"}.
   327 
   328   Implementing @{text "merge"} can be tricky.  The general idea is
   329   that @{text "merge (data\<^sub>1, data\<^sub>2)"} inserts those parts of @{text
   330   "data\<^sub>2"} into @{text "data\<^sub>1"} that are not yet present, while
   331   keeping the general order of things.  The @{ML Library.merge}
   332   function on plain lists may serve as canonical template.
   333 
   334   Particularly note that shared parts of the data must not be
   335   duplicated by naive concatenation, or a theory graph that is like a
   336   chain of diamonds would cause an exponential blowup!
   337 
   338   \paragraph{Proof context data} declarations need to implement the
   339   following SML signature:
   340 
   341   \medskip
   342   \begin{tabular}{ll}
   343   @{text "\<type> T"} & representing type \\
   344   @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
   345   \end{tabular}
   346   \medskip
   347 
   348   The @{text "init"} operation is supposed to produce a pure value
   349   from the given background theory and should be somehow
   350   ``immediate''.  Whenever a proof context is initialized, which
   351   happens frequently, the the system invokes the @{text "init"}
   352   operation of \emph{all} theory data slots ever declared.  This also
   353   means that one needs to be economic about the total number of proof
   354   data declarations in the system, i.e.\ each ML module should declare
   355   at most one, sometimes two data slots for its internal use.
   356   Repeated data declarations to simulate a record type should be
   357   avoided!
   358 
   359   \paragraph{Generic data} provides a hybrid interface for both theory
   360   and proof data.  The @{text "init"} operation for proof contexts is
   361   predefined to select the current data value from the background
   362   theory.
   363 
   364   \bigskip Any of the above data declarations over type @{text "T"}
   365   result in an ML structure with the following signature:
   366 
   367   \medskip
   368   \begin{tabular}{ll}
   369   @{text "get: context \<rightarrow> T"} \\
   370   @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
   371   @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
   372   \end{tabular}
   373   \medskip
   374 
   375   These other operations provide exclusive access for the particular
   376   kind of context (theory, proof, or generic context).  This interface
   377   observes the ML discipline for types and scopes: there is no other
   378   way to access the corresponding data slot of a context.  By keeping
   379   these operations private, an Isabelle/ML module may maintain
   380   abstract values authentically.  *}
   381 
   382 text %mlref {*
   383   \begin{mldecls}
   384   @{index_ML_functor Theory_Data} \\
   385   @{index_ML_functor Proof_Data} \\
   386   @{index_ML_functor Generic_Data} \\
   387   \end{mldecls}
   388 
   389   \begin{description}
   390 
   391   \item @{ML_functor Theory_Data}@{text "(spec)"} declares data for
   392   type @{ML_type theory} according to the specification provided as
   393   argument structure.  The resulting structure provides data init and
   394   access operations as described above.
   395 
   396   \item @{ML_functor Proof_Data}@{text "(spec)"} is analogous to
   397   @{ML_functor Theory_Data} for type @{ML_type Proof.context}.
   398 
   399   \item @{ML_functor Generic_Data}@{text "(spec)"} is analogous to
   400   @{ML_functor Theory_Data} for type @{ML_type Context.generic}.
   401 
   402   \end{description}
   403 *}
   404 
   405 text %mlex {*
   406   The following artificial example demonstrates theory
   407   data: we maintain a set of terms that are supposed to be wellformed
   408   wrt.\ the enclosing theory.  The public interface is as follows:
   409 *}
   410 
   411 ML {*
   412   signature WELLFORMED_TERMS =
   413   sig
   414     val get: theory -> term list
   415     val add: term -> theory -> theory
   416   end;
   417 *}
   418 
   419 text {* The implementation uses private theory data internally, and
   420   only exposes an operation that involves explicit argument checking
   421   wrt.\ the given theory. *}
   422 
   423 ML {*
   424   structure Wellformed_Terms: WELLFORMED_TERMS =
   425   struct
   426 
   427   structure Terms = Theory_Data
   428   (
   429     type T = term Ord_List.T;
   430     val empty = [];
   431     val extend = I;
   432     fun merge (ts1, ts2) =
   433       Ord_List.union Term_Ord.fast_term_ord ts1 ts2;
   434   );
   435 
   436   val get = Terms.get;
   437 
   438   fun add raw_t thy =
   439     let
   440       val t = Sign.cert_term thy raw_t;
   441     in
   442       Terms.map (Ord_List.insert Term_Ord.fast_term_ord t) thy
   443     end;
   444 
   445   end;
   446 *}
   447 
   448 text {* Type @{ML_type "term Ord_List.T"} is used for reasonably
   449   efficient representation of a set of terms: all operations are
   450   linear in the number of stored elements.  Here we assume that users
   451   of this module do not care about the declaration order, since that
   452   data structure forces its own arrangement of elements.
   453 
   454   Observe how the @{ML_text merge} operation joins the data slots of
   455   the two constituents: @{ML Ord_List.union} prevents duplication of
   456   common data from different branches, thus avoiding the danger of
   457   exponential blowup.  Plain list append etc.\ must never be used for
   458   theory data merges!
   459 
   460   \medskip Our intended invariant is achieved as follows:
   461   \begin{enumerate}
   462 
   463   \item @{ML Wellformed_Terms.add} only admits terms that have passed
   464   the @{ML Sign.cert_term} check of the given theory at that point.
   465 
   466   \item Wellformedness in the sense of @{ML Sign.cert_term} is
   467   monotonic wrt.\ the sub-theory relation.  So our data can move
   468   upwards in the hierarchy (via extension or merges), and maintain
   469   wellformedness without further checks.
   470 
   471   \end{enumerate}
   472 
   473   Note that all basic operations of the inference kernel (which
   474   includes @{ML Sign.cert_term}) observe this monotonicity principle,
   475   but other user-space tools don't.  For example, fully-featured
   476   type-inference via @{ML Syntax.check_term} (cf.\
   477   \secref{sec:term-check}) is not necessarily monotonic wrt.\ the
   478   background theory, since constraints of term constants can be
   479   modified by later declarations, for example.
   480 
   481   In most cases, user-space context data does not have to take such
   482   invariants too seriously.  The situation is different in the
   483   implementation of the inference kernel itself, which uses the very
   484   same data mechanisms for types, constants, axioms etc.
   485 *}
   486 
   487 
   488 subsection {* Configuration options \label{sec:config-options} *}
   489 
   490 text {* A \emph{configuration option} is a named optional value of
   491   some basic type (Boolean, integer, string) that is stored in the
   492   context.  It is a simple application of general context data
   493   (\secref{sec:context-data}) that is sufficiently common to justify
   494   customized setup, which includes some concrete declarations for
   495   end-users using existing notation for attributes (cf.\
   496   \secref{sec:attributes}).
   497 
   498   For example, the predefined configuration option @{attribute
   499   show_types} controls output of explicit type constraints for
   500   variables in printed terms (cf.\ \secref{sec:read-print}).  Its
   501   value can be modified within Isar text like this:
   502 *}
   503 
   504 declare [[show_types = false]]
   505   -- {* declaration within (local) theory context *}
   506 
   507 notepad
   508 begin
   509   note [[show_types = true]]
   510     -- {* declaration within proof (forward mode) *}
   511   term x
   512 
   513   have "x = x"
   514     using [[show_types = false]]
   515       -- {* declaration within proof (backward mode) *}
   516     ..
   517 end
   518 
   519 text {* Configuration options that are not set explicitly hold a
   520   default value that can depend on the application context.  This
   521   allows to retrieve the value from another slot within the context,
   522   or fall back on a global preference mechanism, for example.
   523 
   524   The operations to declare configuration options and get/map their
   525   values are modeled as direct replacements for historic global
   526   references, only that the context is made explicit.  This allows
   527   easy configuration of tools, without relying on the execution order
   528   as required for old-style mutable references.  *}
   529 
   530 text %mlref {*
   531   \begin{mldecls}
   532   @{index_ML Config.get: "Proof.context -> 'a Config.T -> 'a"} \\
   533   @{index_ML Config.map: "'a Config.T -> ('a -> 'a) -> Proof.context -> Proof.context"} \\
   534   @{index_ML Attrib.setup_config_bool: "binding -> (Context.generic -> bool) ->
   535   bool Config.T"} \\
   536   @{index_ML Attrib.setup_config_int: "binding -> (Context.generic -> int) ->
   537   int Config.T"} \\
   538   @{index_ML Attrib.setup_config_real: "binding -> (Context.generic -> real) ->
   539   real Config.T"} \\
   540   @{index_ML Attrib.setup_config_string: "binding -> (Context.generic -> string) ->
   541   string Config.T"} \\
   542   \end{mldecls}
   543 
   544   \begin{description}
   545 
   546   \item @{ML Config.get}~@{text "ctxt config"} gets the value of
   547   @{text "config"} in the given context.
   548 
   549   \item @{ML Config.map}~@{text "config f ctxt"} updates the context
   550   by updating the value of @{text "config"}.
   551 
   552   \item @{text "config ="}~@{ML Attrib.setup_config_bool}~@{text "name
   553   default"} creates a named configuration option of type @{ML_type
   554   bool}, with the given @{text "default"} depending on the application
   555   context.  The resulting @{text "config"} can be used to get/map its
   556   value in a given context.  There is an implicit update of the
   557   background theory that registers the option as attribute with some
   558   concrete syntax.
   559 
   560   \item @{ML Attrib.config_int}, @{ML Attrib.config_real}, and @{ML
   561   Attrib.config_string} work like @{ML Attrib.config_bool}, but for
   562   types @{ML_type int} and @{ML_type string}, respectively.
   563 
   564   \end{description}
   565 *}
   566 
   567 text %mlex {* The following example shows how to declare and use a
   568   Boolean configuration option called @{text "my_flag"} with constant
   569   default value @{ML false}.  *}
   570 
   571 ML {*
   572   val my_flag =
   573     Attrib.setup_config_bool @{binding my_flag} (K false)
   574 *}
   575 
   576 text {* Now the user can refer to @{attribute my_flag} in
   577   declarations, while ML tools can retrieve the current value from the
   578   context via @{ML Config.get}.  *}
   579 
   580 ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
   581 
   582 declare [[my_flag = true]]
   583 
   584 ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
   585 
   586 notepad
   587 begin
   588   {
   589     note [[my_flag = false]]
   590     ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
   591   }
   592   ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
   593 end
   594 
   595 text {* Here is another example involving ML type @{ML_type real}
   596   (floating-point numbers). *}
   597 
   598 ML {*
   599   val airspeed_velocity =
   600     Attrib.setup_config_real @{binding airspeed_velocity} (K 0.0)
   601 *}
   602 
   603 declare [[airspeed_velocity = 10]]
   604 declare [[airspeed_velocity = 9.9]]
   605 
   606 
   607 section {* Names \label{sec:names} *}
   608 
   609 text {* In principle, a name is just a string, but there are various
   610   conventions for representing additional structure.  For example,
   611   ``@{text "Foo.bar.baz"}'' is considered as a long name consisting of
   612   qualifier @{text "Foo.bar"} and base name @{text "baz"}.  The
   613   individual constituents of a name may have further substructure,
   614   e.g.\ the string ``\verb,\,\verb,<alpha>,'' encodes as a single
   615   symbol (\secref{sec:symbols}).
   616 
   617   \medskip Subsequently, we shall introduce specific categories of
   618   names.  Roughly speaking these correspond to logical entities as
   619   follows:
   620   \begin{itemize}
   621 
   622   \item Basic names (\secref{sec:basic-name}): free and bound
   623   variables.
   624 
   625   \item Indexed names (\secref{sec:indexname}): schematic variables.
   626 
   627   \item Long names (\secref{sec:long-name}): constants of any kind
   628   (type constructors, term constants, other concepts defined in user
   629   space).  Such entities are typically managed via name spaces
   630   (\secref{sec:name-space}).
   631 
   632   \end{itemize}
   633 *}
   634 
   635 
   636 subsection {* Basic names \label{sec:basic-name} *}
   637 
   638 text {*
   639   A \emph{basic name} essentially consists of a single Isabelle
   640   identifier.  There are conventions to mark separate classes of basic
   641   names, by attaching a suffix of underscores: one underscore means
   642   \emph{internal name}, two underscores means \emph{Skolem name},
   643   three underscores means \emph{internal Skolem name}.
   644 
   645   For example, the basic name @{text "foo"} has the internal version
   646   @{text "foo_"}, with Skolem versions @{text "foo__"} and @{text
   647   "foo___"}, respectively.
   648 
   649   These special versions provide copies of the basic name space, apart
   650   from anything that normally appears in the user text.  For example,
   651   system generated variables in Isar proof contexts are usually marked
   652   as internal, which prevents mysterious names like @{text "xaa"} to
   653   appear in human-readable text.
   654 
   655   \medskip Manipulating binding scopes often requires on-the-fly
   656   renamings.  A \emph{name context} contains a collection of already
   657   used names.  The @{text "declare"} operation adds names to the
   658   context.
   659 
   660   The @{text "invents"} operation derives a number of fresh names from
   661   a given starting point.  For example, the first three names derived
   662   from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.
   663 
   664   The @{text "variants"} operation produces fresh names by
   665   incrementing tentative names as base-26 numbers (with digits @{text
   666   "a..z"}) until all clashes are resolved.  For example, name @{text
   667   "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
   668   "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
   669   step picks the next unused variant from this sequence.
   670 *}
   671 
   672 text %mlref {*
   673   \begin{mldecls}
   674   @{index_ML Name.internal: "string -> string"} \\
   675   @{index_ML Name.skolem: "string -> string"} \\
   676   \end{mldecls}
   677   \begin{mldecls}
   678   @{index_ML_type Name.context} \\
   679   @{index_ML Name.context: Name.context} \\
   680   @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
   681   @{index_ML Name.invent: "Name.context -> string -> int -> string list"} \\
   682   @{index_ML Name.variant: "string -> Name.context -> string * Name.context"} \\
   683   \end{mldecls}
   684   \begin{mldecls}
   685   @{index_ML Variable.names_of: "Proof.context -> Name.context"} \\
   686   \end{mldecls}
   687 
   688   \begin{description}
   689 
   690   \item @{ML Name.internal}~@{text "name"} produces an internal name
   691   by adding one underscore.
   692 
   693   \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
   694   adding two underscores.
   695 
   696   \item Type @{ML_type Name.context} represents the context of already
   697   used names; the initial value is @{ML "Name.context"}.
   698 
   699   \item @{ML Name.declare}~@{text "name"} enters a used name into the
   700   context.
   701 
   702   \item @{ML Name.invent}~@{text "context name n"} produces @{text
   703   "n"} fresh names derived from @{text "name"}.
   704 
   705   \item @{ML Name.variant}~@{text "name context"} produces a fresh
   706   variant of @{text "name"}; the result is declared to the context.
   707 
   708   \item @{ML Variable.names_of}~@{text "ctxt"} retrieves the context
   709   of declared type and term variable names.  Projecting a proof
   710   context down to a primitive name context is occasionally useful when
   711   invoking lower-level operations.  Regular management of ``fresh
   712   variables'' is done by suitable operations of structure @{ML_structure
   713   Variable}, which is also able to provide an official status of
   714   ``locally fixed variable'' within the logical environment (cf.\
   715   \secref{sec:variables}).
   716 
   717   \end{description}
   718 *}
   719 
   720 text %mlex {* The following simple examples demonstrate how to produce
   721   fresh names from the initial @{ML Name.context}. *}
   722 
   723 ML {*
   724   val list1 = Name.invent Name.context "a" 5;
   725   @{assert} (list1 = ["a", "b", "c", "d", "e"]);
   726 
   727   val list2 =
   728     #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] Name.context);
   729   @{assert} (list2 = ["x", "xa", "a", "aa", "'a", "'aa"]);
   730 *}
   731 
   732 text {* \medskip The same works relatively to the formal context as
   733   follows. *}
   734 
   735 locale ex = fixes a b c :: 'a
   736 begin
   737 
   738 ML {*
   739   val names = Variable.names_of @{context};
   740 
   741   val list1 = Name.invent names "a" 5;
   742   @{assert} (list1 = ["d", "e", "f", "g", "h"]);
   743 
   744   val list2 =
   745     #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] names);
   746   @{assert} (list2 = ["x", "xa", "aa", "ab", "'aa", "'ab"]);
   747 *}
   748 
   749 end
   750 
   751 
   752 subsection {* Indexed names \label{sec:indexname} *}
   753 
   754 text {*
   755   An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
   756   name and a natural number.  This representation allows efficient
   757   renaming by incrementing the second component only.  The canonical
   758   way to rename two collections of indexnames apart from each other is
   759   this: determine the maximum index @{text "maxidx"} of the first
   760   collection, then increment all indexes of the second collection by
   761   @{text "maxidx + 1"}; the maximum index of an empty collection is
   762   @{text "-1"}.
   763 
   764   Occasionally, basic names are injected into the same pair type of
   765   indexed names: then @{text "(x, -1)"} is used to encode the basic
   766   name @{text "x"}.
   767 
   768   \medskip Isabelle syntax observes the following rules for
   769   representing an indexname @{text "(x, i)"} as a packed string:
   770 
   771   \begin{itemize}
   772 
   773   \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},
   774 
   775   \item @{text "?xi"} if @{text "x"} does not end with a digit,
   776 
   777   \item @{text "?x.i"} otherwise.
   778 
   779   \end{itemize}
   780 
   781   Indexnames may acquire large index numbers after several maxidx
   782   shifts have been applied.  Results are usually normalized towards
   783   @{text "0"} at certain checkpoints, notably at the end of a proof.
   784   This works by producing variants of the corresponding basic name
   785   components.  For example, the collection @{text "?x1, ?x7, ?x42"}
   786   becomes @{text "?x, ?xa, ?xb"}.
   787 *}
   788 
   789 text %mlref {*
   790   \begin{mldecls}
   791   @{index_ML_type indexname: "string * int"} \\
   792   \end{mldecls}
   793 
   794   \begin{description}
   795 
   796   \item Type @{ML_type indexname} represents indexed names.  This is
   797   an abbreviation for @{ML_type "string * int"}.  The second component
   798   is usually non-negative, except for situations where @{text "(x,
   799   -1)"} is used to inject basic names into this type.  Other negative
   800   indexes should not be used.
   801 
   802   \end{description}
   803 *}
   804 
   805 
   806 subsection {* Long names \label{sec:long-name} *}
   807 
   808 text {* A \emph{long name} consists of a sequence of non-empty name
   809   components.  The packed representation uses a dot as separator, as
   810   in ``@{text "A.b.c"}''.  The last component is called \emph{base
   811   name}, the remaining prefix is called \emph{qualifier} (which may be
   812   empty).  The qualifier can be understood as the access path to the
   813   named entity while passing through some nested block-structure,
   814   although our free-form long names do not really enforce any strict
   815   discipline.
   816 
   817   For example, an item named ``@{text "A.b.c"}'' may be understood as
   818   a local entity @{text "c"}, within a local structure @{text "b"},
   819   within a global structure @{text "A"}.  In practice, long names
   820   usually represent 1--3 levels of qualification.  User ML code should
   821   not make any assumptions about the particular structure of long
   822   names!
   823 
   824   The empty name is commonly used as an indication of unnamed
   825   entities, or entities that are not entered into the corresponding
   826   name space, whenever this makes any sense.  The basic operations on
   827   long names map empty names again to empty names.
   828 *}
   829 
   830 text %mlref {*
   831   \begin{mldecls}
   832   @{index_ML Long_Name.base_name: "string -> string"} \\
   833   @{index_ML Long_Name.qualifier: "string -> string"} \\
   834   @{index_ML Long_Name.append: "string -> string -> string"} \\
   835   @{index_ML Long_Name.implode: "string list -> string"} \\
   836   @{index_ML Long_Name.explode: "string -> string list"} \\
   837   \end{mldecls}
   838 
   839   \begin{description}
   840 
   841   \item @{ML Long_Name.base_name}~@{text "name"} returns the base name
   842   of a long name.
   843 
   844   \item @{ML Long_Name.qualifier}~@{text "name"} returns the qualifier
   845   of a long name.
   846 
   847   \item @{ML Long_Name.append}~@{text "name\<^sub>1 name\<^sub>2"} appends two long
   848   names.
   849 
   850   \item @{ML Long_Name.implode}~@{text "names"} and @{ML
   851   Long_Name.explode}~@{text "name"} convert between the packed string
   852   representation and the explicit list form of long names.
   853 
   854   \end{description}
   855 *}
   856 
   857 
   858 subsection {* Name spaces \label{sec:name-space} *}
   859 
   860 text {* A @{text "name space"} manages a collection of long names,
   861   together with a mapping between partially qualified external names
   862   and fully qualified internal names (in both directions).  Note that
   863   the corresponding @{text "intern"} and @{text "extern"} operations
   864   are mostly used for parsing and printing only!  The @{text
   865   "declare"} operation augments a name space according to the accesses
   866   determined by a given binding, and a naming policy from the context.
   867 
   868   \medskip A @{text "binding"} specifies details about the prospective
   869   long name of a newly introduced formal entity.  It consists of a
   870   base name, prefixes for qualification (separate ones for system
   871   infrastructure and user-space mechanisms), a slot for the original
   872   source position, and some additional flags.
   873 
   874   \medskip A @{text "naming"} provides some additional details for
   875   producing a long name from a binding.  Normally, the naming is
   876   implicit in the theory or proof context.  The @{text "full"}
   877   operation (and its variants for different context types) produces a
   878   fully qualified internal name to be entered into a name space.  The
   879   main equation of this ``chemical reaction'' when binding new
   880   entities in a context is as follows:
   881 
   882   \medskip
   883   \begin{tabular}{l}
   884   @{text "binding + naming \<longrightarrow> long name + name space accesses"}
   885   \end{tabular}
   886 
   887   \bigskip As a general principle, there is a separate name space for
   888   each kind of formal entity, e.g.\ fact, logical constant, type
   889   constructor, type class.  It is usually clear from the occurrence in
   890   concrete syntax (or from the scope) which kind of entity a name
   891   refers to.  For example, the very same name @{text "c"} may be used
   892   uniformly for a constant, type constructor, and type class.
   893 
   894   There are common schemes to name derived entities systematically
   895   according to the name of the main logical entity involved, e.g.\
   896   fact @{text "c.intro"} for a canonical introduction rule related to
   897   constant @{text "c"}.  This technique of mapping names from one
   898   space into another requires some care in order to avoid conflicts.
   899   In particular, theorem names derived from a type constructor or type
   900   class should get an additional suffix in addition to the usual
   901   qualification.  This leads to the following conventions for derived
   902   names:
   903 
   904   \medskip
   905   \begin{tabular}{ll}
   906   logical entity & fact name \\\hline
   907   constant @{text "c"} & @{text "c.intro"} \\
   908   type @{text "c"} & @{text "c_type.intro"} \\
   909   class @{text "c"} & @{text "c_class.intro"} \\
   910   \end{tabular}
   911 *}
   912 
   913 text %mlref {*
   914   \begin{mldecls}
   915   @{index_ML_type binding} \\
   916   @{index_ML Binding.empty: binding} \\
   917   @{index_ML Binding.name: "string -> binding"} \\
   918   @{index_ML Binding.qualify: "bool -> string -> binding -> binding"} \\
   919   @{index_ML Binding.prefix: "bool -> string -> binding -> binding"} \\
   920   @{index_ML Binding.conceal: "binding -> binding"} \\
   921   @{index_ML Binding.print: "binding -> string"} \\
   922   \end{mldecls}
   923   \begin{mldecls}
   924   @{index_ML_type Name_Space.naming} \\
   925   @{index_ML Name_Space.default_naming: Name_Space.naming} \\
   926   @{index_ML Name_Space.add_path: "string -> Name_Space.naming -> Name_Space.naming"} \\
   927   @{index_ML Name_Space.full_name: "Name_Space.naming -> binding -> string"} \\
   928   \end{mldecls}
   929   \begin{mldecls}
   930   @{index_ML_type Name_Space.T} \\
   931   @{index_ML Name_Space.empty: "string -> Name_Space.T"} \\
   932   @{index_ML Name_Space.merge: "Name_Space.T * Name_Space.T -> Name_Space.T"} \\
   933   @{index_ML Name_Space.declare: "Context.generic -> bool ->
   934   binding -> Name_Space.T -> string * Name_Space.T"} \\
   935   @{index_ML Name_Space.intern: "Name_Space.T -> string -> string"} \\
   936   @{index_ML Name_Space.extern: "Proof.context -> Name_Space.T -> string -> string"} \\
   937   @{index_ML Name_Space.is_concealed: "Name_Space.T -> string -> bool"}
   938   \end{mldecls}
   939 
   940   \begin{description}
   941 
   942   \item Type @{ML_type binding} represents the abstract concept of
   943   name bindings.
   944 
   945   \item @{ML Binding.empty} is the empty binding.
   946 
   947   \item @{ML Binding.name}~@{text "name"} produces a binding with base
   948   name @{text "name"}.  Note that this lacks proper source position
   949   information; see also the ML antiquotation @{ML_antiquotation
   950   binding}.
   951 
   952   \item @{ML Binding.qualify}~@{text "mandatory name binding"}
   953   prefixes qualifier @{text "name"} to @{text "binding"}.  The @{text
   954   "mandatory"} flag tells if this name component always needs to be
   955   given in name space accesses --- this is mostly @{text "false"} in
   956   practice.  Note that this part of qualification is typically used in
   957   derived specification mechanisms.
   958 
   959   \item @{ML Binding.prefix} is similar to @{ML Binding.qualify}, but
   960   affects the system prefix.  This part of extra qualification is
   961   typically used in the infrastructure for modular specifications,
   962   notably ``local theory targets'' (see also \chref{ch:local-theory}).
   963 
   964   \item @{ML Binding.conceal}~@{text "binding"} indicates that the
   965   binding shall refer to an entity that serves foundational purposes
   966   only.  This flag helps to mark implementation details of
   967   specification mechanism etc.  Other tools should not depend on the
   968   particulars of concealed entities (cf.\ @{ML
   969   Name_Space.is_concealed}).
   970 
   971   \item @{ML Binding.print}~@{text "binding"} produces a string
   972   representation for human-readable output, together with some formal
   973   markup that might get used in GUI front-ends, for example.
   974 
   975   \item Type @{ML_type Name_Space.naming} represents the abstract
   976   concept of a naming policy.
   977 
   978   \item @{ML Name_Space.default_naming} is the default naming policy.
   979   In a theory context, this is usually augmented by a path prefix
   980   consisting of the theory name.
   981 
   982   \item @{ML Name_Space.add_path}~@{text "path naming"} augments the
   983   naming policy by extending its path component.
   984 
   985   \item @{ML Name_Space.full_name}~@{text "naming binding"} turns a
   986   name binding (usually a basic name) into the fully qualified
   987   internal name, according to the given naming policy.
   988 
   989   \item Type @{ML_type Name_Space.T} represents name spaces.
   990 
   991   \item @{ML Name_Space.empty}~@{text "kind"} and @{ML Name_Space.merge}~@{text
   992   "(space\<^sub>1, space\<^sub>2)"} are the canonical operations for
   993   maintaining name spaces according to theory data management
   994   (\secref{sec:context-data}); @{text "kind"} is a formal comment
   995   to characterize the purpose of a name space.
   996 
   997   \item @{ML Name_Space.declare}~@{text "context strict binding
   998   space"} enters a name binding as fully qualified internal name into
   999   the name space, using the naming of the context.
  1000 
  1001   \item @{ML Name_Space.intern}~@{text "space name"} internalizes a
  1002   (partially qualified) external name.
  1003 
  1004   This operation is mostly for parsing!  Note that fully qualified
  1005   names stemming from declarations are produced via @{ML
  1006   "Name_Space.full_name"} and @{ML "Name_Space.declare"}
  1007   (or their derivatives for @{ML_type theory} and
  1008   @{ML_type Proof.context}).
  1009 
  1010   \item @{ML Name_Space.extern}~@{text "ctxt space name"} externalizes a
  1011   (fully qualified) internal name.
  1012 
  1013   This operation is mostly for printing!  User code should not rely on
  1014   the precise result too much.
  1015 
  1016   \item @{ML Name_Space.is_concealed}~@{text "space name"} indicates
  1017   whether @{text "name"} refers to a strictly private entity that
  1018   other tools are supposed to ignore!
  1019 
  1020   \end{description}
  1021 *}
  1022 
  1023 text %mlantiq {*
  1024   \begin{matharray}{rcl}
  1025   @{ML_antiquotation_def "binding"} & : & @{text ML_antiquotation} \\
  1026   \end{matharray}
  1027 
  1028   @{rail \<open>
  1029   @@{ML_antiquotation binding} name
  1030   \<close>}
  1031 
  1032   \begin{description}
  1033 
  1034   \item @{text "@{binding name}"} produces a binding with base name
  1035   @{text "name"} and the source position taken from the concrete
  1036   syntax of this antiquotation.  In many situations this is more
  1037   appropriate than the more basic @{ML Binding.name} function.
  1038 
  1039   \end{description}
  1040 *}
  1041 
  1042 text %mlex {* The following example yields the source position of some
  1043   concrete binding inlined into the text:
  1044 *}
  1045 
  1046 ML {* Binding.pos_of @{binding here} *}
  1047 
  1048 text {* \medskip That position can be also printed in a message as
  1049   follows: *}
  1050 
  1051 ML_command {*
  1052   writeln
  1053     ("Look here" ^ Position.here (Binding.pos_of @{binding here}))
  1054 *}
  1055 
  1056 text {* This illustrates a key virtue of formalized bindings as
  1057   opposed to raw specifications of base names: the system can use this
  1058   additional information for feedback given to the user (error
  1059   messages etc.).
  1060 
  1061   \medskip The following example refers to its source position
  1062   directly, which is occasionally useful for experimentation and
  1063   diagnostic purposes: *}
  1064 
  1065 ML_command {*
  1066   warning ("Look here" ^ Position.here @{here})
  1067 *}
  1068 
  1069 end