doc-src/IsarImplementation/Thy/prelim.thy
changeset 30152 0ddd8028f98c
parent 30151 629f3a92863e
parent 30148 5d04b67a866e
child 30153 051d3825a15d
equal deleted inserted replaced
30151:629f3a92863e 30152:0ddd8028f98c
     1 
       
     2 (* $Id$ *)
       
     3 
       
     4 theory prelim imports base begin
       
     5 
       
     6 chapter {* Preliminaries *}
       
     7 
       
     8 section {* Contexts \label{sec:context} *}
       
     9 
       
    10 text {*
       
    11   A logical context represents the background that is required for
       
    12   formulating statements and composing proofs.  It acts as a medium to
       
    13   produce formal content, depending on earlier material (declarations,
       
    14   results etc.).
       
    15 
       
    16   For example, derivations within the Isabelle/Pure logic can be
       
    17   described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
       
    18   proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
       
    19   within the theory @{text "\<Theta>"}.  There are logical reasons for
       
    20   keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
       
    21   liberal about supporting type constructors and schematic
       
    22   polymorphism of constants and axioms, while the inner calculus of
       
    23   @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
       
    24   fixed type variables in the assumptions).
       
    25 
       
    26   \medskip Contexts and derivations are linked by the following key
       
    27   principles:
       
    28 
       
    29   \begin{itemize}
       
    30 
       
    31   \item Transfer: monotonicity of derivations admits results to be
       
    32   transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
       
    33   \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
       
    34   \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.
       
    35 
       
    36   \item Export: discharge of hypotheses admits results to be exported
       
    37   into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
       
    38   implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
       
    39   @{text "\<Delta> = \<Gamma>' - \<Gamma>"}.  Note that @{text "\<Theta>"} remains unchanged here,
       
    40   only the @{text "\<Gamma>"} part is affected.
       
    41 
       
    42   \end{itemize}
       
    43 
       
    44   \medskip By modeling the main characteristics of the primitive
       
    45   @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
       
    46   particular logical content, we arrive at the fundamental notions of
       
    47   \emph{theory context} and \emph{proof context} in Isabelle/Isar.
       
    48   These implement a certain policy to manage arbitrary \emph{context
       
    49   data}.  There is a strongly-typed mechanism to declare new kinds of
       
    50   data at compile time.
       
    51 
       
    52   The internal bootstrap process of Isabelle/Pure eventually reaches a
       
    53   stage where certain data slots provide the logical content of @{text
       
    54   "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
       
    55   Various additional data slots support all kinds of mechanisms that
       
    56   are not necessarily part of the core logic.
       
    57 
       
    58   For example, there would be data for canonical introduction and
       
    59   elimination rules for arbitrary operators (depending on the
       
    60   object-logic and application), which enables users to perform
       
    61   standard proof steps implicitly (cf.\ the @{text "rule"} method
       
    62   \cite{isabelle-isar-ref}).
       
    63 
       
    64   \medskip Thus Isabelle/Isar is able to bring forth more and more
       
    65   concepts successively.  In particular, an object-logic like
       
    66   Isabelle/HOL continues the Isabelle/Pure setup by adding specific
       
    67   components for automated reasoning (classical reasoner, tableau
       
    68   prover, structured induction etc.) and derived specification
       
    69   mechanisms (inductive predicates, recursive functions etc.).  All of
       
    70   this is ultimately based on the generic data management by theory
       
    71   and proof contexts introduced here.
       
    72 *}
       
    73 
       
    74 
       
    75 subsection {* Theory context \label{sec:context-theory} *}
       
    76 
       
    77 text {*
       
    78   \glossary{Theory}{FIXME}
       
    79 
       
    80   A \emph{theory} is a data container with explicit named and unique
       
    81   identifier.  Theories are related by a (nominal) sub-theory
       
    82   relation, which corresponds to the dependency graph of the original
       
    83   construction; each theory is derived from a certain sub-graph of
       
    84   ancestor theories.
       
    85 
       
    86   The @{text "merge"} operation produces the least upper bound of two
       
    87   theories, which actually degenerates into absorption of one theory
       
    88   into the other (due to the nominal sub-theory relation).
       
    89 
       
    90   The @{text "begin"} operation starts a new theory by importing
       
    91   several parent theories and entering a special @{text "draft"} mode,
       
    92   which is sustained until the final @{text "end"} operation.  A draft
       
    93   theory acts like a linear type, where updates invalidate earlier
       
    94   versions.  An invalidated draft is called ``stale''.
       
    95 
       
    96   The @{text "checkpoint"} operation produces an intermediate stepping
       
    97   stone that will survive the next update: both the original and the
       
    98   changed theory remain valid and are related by the sub-theory
       
    99   relation.  Checkpointing essentially recovers purely functional
       
   100   theory values, at the expense of some extra internal bookkeeping.
       
   101 
       
   102   The @{text "copy"} operation produces an auxiliary version that has
       
   103   the same data content, but is unrelated to the original: updates of
       
   104   the copy do not affect the original, neither does the sub-theory
       
   105   relation hold.
       
   106 
       
   107   \medskip The example in \figref{fig:ex-theory} below shows a theory
       
   108   graph derived from @{text "Pure"}, with theory @{text "Length"}
       
   109   importing @{text "Nat"} and @{text "List"}.  The body of @{text
       
   110   "Length"} consists of a sequence of updates, working mostly on
       
   111   drafts.  Intermediate checkpoints may occur as well, due to the
       
   112   history mechanism provided by the Isar top-level, cf.\
       
   113   \secref{sec:isar-toplevel}.
       
   114 
       
   115   \begin{figure}[htb]
       
   116   \begin{center}
       
   117   \begin{tabular}{rcccl}
       
   118         &            & @{text "Pure"} \\
       
   119         &            & @{text "\<down>"} \\
       
   120         &            & @{text "FOL"} \\
       
   121         & $\swarrow$ &              & $\searrow$ & \\
       
   122   @{text "Nat"} &    &              &            & @{text "List"} \\
       
   123         & $\searrow$ &              & $\swarrow$ \\
       
   124         &            & @{text "Length"} \\
       
   125         &            & \multicolumn{3}{l}{~~@{keyword "imports"}} \\
       
   126         &            & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
       
   127         &            & $\vdots$~~ \\
       
   128         &            & @{text "\<bullet>"}~~ \\
       
   129         &            & $\vdots$~~ \\
       
   130         &            & @{text "\<bullet>"}~~ \\
       
   131         &            & $\vdots$~~ \\
       
   132         &            & \multicolumn{3}{l}{~~@{command "end"}} \\
       
   133   \end{tabular}
       
   134   \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
       
   135   \end{center}
       
   136   \end{figure}
       
   137 
       
   138   \medskip There is a separate notion of \emph{theory reference} for
       
   139   maintaining a live link to an evolving theory context: updates on
       
   140   drafts are propagated automatically.  Dynamic updating stops after
       
   141   an explicit @{text "end"} only.
       
   142 
       
   143   Derived entities may store a theory reference in order to indicate
       
   144   the context they belong to.  This implicitly assumes monotonic
       
   145   reasoning, because the referenced context may become larger without
       
   146   further notice.
       
   147 *}
       
   148 
       
   149 text %mlref {*
       
   150   \begin{mldecls}
       
   151   @{index_ML_type theory} \\
       
   152   @{index_ML Theory.subthy: "theory * theory -> bool"} \\
       
   153   @{index_ML Theory.merge: "theory * theory -> theory"} \\
       
   154   @{index_ML Theory.checkpoint: "theory -> theory"} \\
       
   155   @{index_ML Theory.copy: "theory -> theory"} \\
       
   156   \end{mldecls}
       
   157   \begin{mldecls}
       
   158   @{index_ML_type theory_ref} \\
       
   159   @{index_ML Theory.deref: "theory_ref -> theory"} \\
       
   160   @{index_ML Theory.check_thy: "theory -> theory_ref"} \\
       
   161   \end{mldecls}
       
   162 
       
   163   \begin{description}
       
   164 
       
   165   \item @{ML_type theory} represents theory contexts.  This is
       
   166   essentially a linear type!  Most operations destroy the original
       
   167   version, which then becomes ``stale''.
       
   168 
       
   169   \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"}
       
   170   compares theories according to the inherent graph structure of the
       
   171   construction.  This sub-theory relation is a nominal approximation
       
   172   of inclusion (@{text "\<subseteq>"}) of the corresponding content.
       
   173 
       
   174   \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"}
       
   175   absorbs one theory into the other.  This fails for unrelated
       
   176   theories!
       
   177 
       
   178   \item @{ML "Theory.checkpoint"}~@{text "thy"} produces a safe
       
   179   stepping stone in the linear development of @{text "thy"}.  The next
       
   180   update will result in two related, valid theories.
       
   181 
       
   182   \item @{ML "Theory.copy"}~@{text "thy"} produces a variant of @{text
       
   183   "thy"} that holds a copy of the same data.  The result is not
       
   184   related to the original; the original is unchanched.
       
   185 
       
   186   \item @{ML_type theory_ref} represents a sliding reference to an
       
   187   always valid theory; updates on the original are propagated
       
   188   automatically.
       
   189 
       
   190   \item @{ML "Theory.deref"}~@{text "thy_ref"} turns a @{ML_type
       
   191   "theory_ref"} into an @{ML_type "theory"} value.  As the referenced
       
   192   theory evolves monotonically over time, later invocations of @{ML
       
   193   "Theory.deref"} may refer to a larger context.
       
   194 
       
   195   \item @{ML "Theory.check_thy"}~@{text "thy"} produces a @{ML_type
       
   196   "theory_ref"} from a valid @{ML_type "theory"} value.
       
   197 
       
   198   \end{description}
       
   199 *}
       
   200 
       
   201 
       
   202 subsection {* Proof context \label{sec:context-proof} *}
       
   203 
       
   204 text {*
       
   205   \glossary{Proof context}{The static context of a structured proof,
       
   206   acts like a local ``theory'' of the current portion of Isar proof
       
   207   text, generalizes the idea of local hypotheses @{text "\<Gamma>"} in
       
   208   judgments @{text "\<Gamma> \<turnstile> \<phi>"} of natural deduction calculi.  There is a
       
   209   generic notion of introducing and discharging hypotheses.
       
   210   Arbritrary auxiliary context data may be adjoined.}
       
   211 
       
   212   A proof context is a container for pure data with a back-reference
       
   213   to the theory it belongs to.  The @{text "init"} operation creates a
       
   214   proof context from a given theory.  Modifications to draft theories
       
   215   are propagated to the proof context as usual, but there is also an
       
   216   explicit @{text "transfer"} operation to force resynchronization
       
   217   with more substantial updates to the underlying theory.  The actual
       
   218   context data does not require any special bookkeeping, thanks to the
       
   219   lack of destructive features.
       
   220 
       
   221   Entities derived in a proof context need to record inherent logical
       
   222   requirements explicitly, since there is no separate context
       
   223   identification as for theories.  For example, hypotheses used in
       
   224   primitive derivations (cf.\ \secref{sec:thms}) are recorded
       
   225   separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to make double
       
   226   sure.  Results could still leak into an alien proof context do to
       
   227   programming errors, but Isabelle/Isar includes some extra validity
       
   228   checks in critical positions, notably at the end of a sub-proof.
       
   229 
       
   230   Proof contexts may be manipulated arbitrarily, although the common
       
   231   discipline is to follow block structure as a mental model: a given
       
   232   context is extended consecutively, and results are exported back
       
   233   into the original context.  Note that the Isar proof states model
       
   234   block-structured reasoning explicitly, using a stack of proof
       
   235   contexts internally, cf.\ \secref{sec:isar-proof-state}.
       
   236 *}
       
   237 
       
   238 text %mlref {*
       
   239   \begin{mldecls}
       
   240   @{index_ML_type Proof.context} \\
       
   241   @{index_ML ProofContext.init: "theory -> Proof.context"} \\
       
   242   @{index_ML ProofContext.theory_of: "Proof.context -> theory"} \\
       
   243   @{index_ML ProofContext.transfer: "theory -> Proof.context -> Proof.context"} \\
       
   244   \end{mldecls}
       
   245 
       
   246   \begin{description}
       
   247 
       
   248   \item @{ML_type Proof.context} represents proof contexts.  Elements
       
   249   of this type are essentially pure values, with a sliding reference
       
   250   to the background theory.
       
   251 
       
   252   \item @{ML ProofContext.init}~@{text "thy"} produces a proof context
       
   253   derived from @{text "thy"}, initializing all data.
       
   254 
       
   255   \item @{ML ProofContext.theory_of}~@{text "ctxt"} selects the
       
   256   background theory from @{text "ctxt"}, dereferencing its internal
       
   257   @{ML_type theory_ref}.
       
   258 
       
   259   \item @{ML ProofContext.transfer}~@{text "thy ctxt"} promotes the
       
   260   background theory of @{text "ctxt"} to the super theory @{text
       
   261   "thy"}.
       
   262 
       
   263   \end{description}
       
   264 *}
       
   265 
       
   266 
       
   267 subsection {* Generic contexts \label{sec:generic-context} *}
       
   268 
       
   269 text {*
       
   270   A generic context is the disjoint sum of either a theory or proof
       
   271   context.  Occasionally, this enables uniform treatment of generic
       
   272   context data, typically extra-logical information.  Operations on
       
   273   generic contexts include the usual injections, partial selections,
       
   274   and combinators for lifting operations on either component of the
       
   275   disjoint sum.
       
   276 
       
   277   Moreover, there are total operations @{text "theory_of"} and @{text
       
   278   "proof_of"} to convert a generic context into either kind: a theory
       
   279   can always be selected from the sum, while a proof context might
       
   280   have to be constructed by an ad-hoc @{text "init"} operation.
       
   281 *}
       
   282 
       
   283 text %mlref {*
       
   284   \begin{mldecls}
       
   285   @{index_ML_type Context.generic} \\
       
   286   @{index_ML Context.theory_of: "Context.generic -> theory"} \\
       
   287   @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
       
   288   \end{mldecls}
       
   289 
       
   290   \begin{description}
       
   291 
       
   292   \item @{ML_type Context.generic} is the direct sum of @{ML_type
       
   293   "theory"} and @{ML_type "Proof.context"}, with the datatype
       
   294   constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.
       
   295 
       
   296   \item @{ML Context.theory_of}~@{text "context"} always produces a
       
   297   theory from the generic @{text "context"}, using @{ML
       
   298   "ProofContext.theory_of"} as required.
       
   299 
       
   300   \item @{ML Context.proof_of}~@{text "context"} always produces a
       
   301   proof context from the generic @{text "context"}, using @{ML
       
   302   "ProofContext.init"} as required (note that this re-initializes the
       
   303   context data with each invocation).
       
   304 
       
   305   \end{description}
       
   306 *}
       
   307 
       
   308 
       
   309 subsection {* Context data \label{sec:context-data} *}
       
   310 
       
   311 text {*
       
   312   The main purpose of theory and proof contexts is to manage arbitrary
       
   313   data.  New data types can be declared incrementally at compile time.
       
   314   There are separate declaration mechanisms for any of the three kinds
       
   315   of contexts: theory, proof, generic.
       
   316 
       
   317   \paragraph{Theory data} may refer to destructive entities, which are
       
   318   maintained in direct correspondence to the linear evolution of
       
   319   theory values, including explicit copies.\footnote{Most existing
       
   320   instances of destructive theory data are merely historical relics
       
   321   (e.g.\ the destructive theorem storage, and destructive hints for
       
   322   the Simplifier and Classical rules).}  A theory data declaration
       
   323   needs to implement the following SML signature:
       
   324 
       
   325   \medskip
       
   326   \begin{tabular}{ll}
       
   327   @{text "\<type> T"} & representing type \\
       
   328   @{text "\<val> empty: T"} & empty default value \\
       
   329   @{text "\<val> copy: T \<rightarrow> T"} & refresh impure data \\
       
   330   @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
       
   331   @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
       
   332   \end{tabular}
       
   333   \medskip
       
   334 
       
   335   \noindent The @{text "empty"} value acts as initial default for
       
   336   \emph{any} theory that does not declare actual data content; @{text
       
   337   "copy"} maintains persistent integrity for impure data, it is just
       
   338   the identity for pure values; @{text "extend"} is acts like a
       
   339   unitary version of @{text "merge"}, both operations should also
       
   340   include the functionality of @{text "copy"} for impure data.
       
   341 
       
   342   \paragraph{Proof context data} is purely functional.  A declaration
       
   343   needs to implement the following SML signature:
       
   344 
       
   345   \medskip
       
   346   \begin{tabular}{ll}
       
   347   @{text "\<type> T"} & representing type \\
       
   348   @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
       
   349   \end{tabular}
       
   350   \medskip
       
   351 
       
   352   \noindent The @{text "init"} operation is supposed to produce a pure
       
   353   value from the given background theory.
       
   354 
       
   355   \paragraph{Generic data} provides a hybrid interface for both theory
       
   356   and proof data.  The declaration is essentially the same as for
       
   357   (pure) theory data, without @{text "copy"}.  The @{text "init"}
       
   358   operation for proof contexts merely selects the current data value
       
   359   from the background theory.
       
   360 
       
   361   \bigskip A data declaration of type @{text "T"} results in the
       
   362   following interface:
       
   363 
       
   364   \medskip
       
   365   \begin{tabular}{ll}
       
   366   @{text "init: theory \<rightarrow> theory"} \\
       
   367   @{text "get: context \<rightarrow> T"} \\
       
   368   @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
       
   369   @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
       
   370   \end{tabular}
       
   371   \medskip
       
   372 
       
   373   \noindent Here @{text "init"} is only applicable to impure theory
       
   374   data to install a fresh copy persistently (destructive update on
       
   375   uninitialized has no permanent effect).  The other operations provide
       
   376   access for the particular kind of context (theory, proof, or generic
       
   377   context).  Note that this is a safe interface: there is no other way
       
   378   to access the corresponding data slot of a context.  By keeping
       
   379   these operations private, a component may maintain abstract values
       
   380   authentically, without other components interfering.
       
   381 *}
       
   382 
       
   383 text %mlref {*
       
   384   \begin{mldecls}
       
   385   @{index_ML_functor TheoryDataFun} \\
       
   386   @{index_ML_functor ProofDataFun} \\
       
   387   @{index_ML_functor GenericDataFun} \\
       
   388   \end{mldecls}
       
   389 
       
   390   \begin{description}
       
   391 
       
   392   \item @{ML_functor TheoryDataFun}@{text "(spec)"} declares data for
       
   393   type @{ML_type theory} according to the specification provided as
       
   394   argument structure.  The resulting structure provides data init and
       
   395   access operations as described above.
       
   396 
       
   397   \item @{ML_functor ProofDataFun}@{text "(spec)"} is analogous to
       
   398   @{ML_functor TheoryDataFun} for type @{ML_type Proof.context}.
       
   399 
       
   400   \item @{ML_functor GenericDataFun}@{text "(spec)"} is analogous to
       
   401   @{ML_functor TheoryDataFun} for type @{ML_type Context.generic}.
       
   402 
       
   403   \end{description}
       
   404 *}
       
   405 
       
   406 
       
   407 section {* Names \label{sec:names} *}
       
   408 
       
   409 text {*
       
   410   In principle, a name is just a string, but there are various
       
   411   convention for encoding additional structure.  For example, ``@{text
       
   412   "Foo.bar.baz"}'' is considered as a qualified name consisting of
       
   413   three basic name components.  The individual constituents of a name
       
   414   may have further substructure, e.g.\ the string
       
   415   ``\verb,\,\verb,<alpha>,'' encodes as a single symbol.
       
   416 *}
       
   417 
       
   418 
       
   419 subsection {* Strings of symbols *}
       
   420 
       
   421 text {*
       
   422   \glossary{Symbol}{The smallest unit of text in Isabelle, subsumes
       
   423   plain ASCII characters as well as an infinite collection of named
       
   424   symbols (for greek, math etc.).}
       
   425 
       
   426   A \emph{symbol} constitutes the smallest textual unit in Isabelle
       
   427   --- raw characters are normally not encountered at all.  Isabelle
       
   428   strings consist of a sequence of symbols, represented as a packed
       
   429   string or a list of strings.  Each symbol is in itself a small
       
   430   string, which has either one of the following forms:
       
   431 
       
   432   \begin{enumerate}
       
   433 
       
   434   \item a single ASCII character ``@{text "c"}'', for example
       
   435   ``\verb,a,'',
       
   436 
       
   437   \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'',
       
   438   for example ``\verb,\,\verb,<alpha>,'',
       
   439 
       
   440   \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'',
       
   441   for example ``\verb,\,\verb,<^bold>,'',
       
   442 
       
   443   \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,''
       
   444   where @{text text} constists of printable characters excluding
       
   445   ``\verb,.,'' and ``\verb,>,'', for example
       
   446   ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'',
       
   447 
       
   448   \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text
       
   449   n}\verb,>, where @{text n} consists of digits, for example
       
   450   ``\verb,\,\verb,<^raw42>,''.
       
   451 
       
   452   \end{enumerate}
       
   453 
       
   454   \noindent The @{text "ident"} syntax for symbol names is @{text
       
   455   "letter (letter | digit)\<^sup>*"}, where @{text "letter =
       
   456   A..Za..z"} and @{text "digit = 0..9"}.  There are infinitely many
       
   457   regular symbols and control symbols, but a fixed collection of
       
   458   standard symbols is treated specifically.  For example,
       
   459   ``\verb,\,\verb,<alpha>,'' is classified as a letter, which means it
       
   460   may occur within regular Isabelle identifiers.
       
   461 
       
   462   Since the character set underlying Isabelle symbols is 7-bit ASCII
       
   463   and 8-bit characters are passed through transparently, Isabelle may
       
   464   also process Unicode/UCS data in UTF-8 encoding.  Unicode provides
       
   465   its own collection of mathematical symbols, but there is no built-in
       
   466   link to the standard collection of Isabelle.
       
   467 
       
   468   \medskip Output of Isabelle symbols depends on the print mode
       
   469   (\secref{FIXME}).  For example, the standard {\LaTeX} setup of the
       
   470   Isabelle document preparation system would present
       
   471   ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and
       
   472   ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text
       
   473   "\<^bold>\<alpha>"}.
       
   474 *}
       
   475 
       
   476 text %mlref {*
       
   477   \begin{mldecls}
       
   478   @{index_ML_type "Symbol.symbol"} \\
       
   479   @{index_ML Symbol.explode: "string -> Symbol.symbol list"} \\
       
   480   @{index_ML Symbol.is_letter: "Symbol.symbol -> bool"} \\
       
   481   @{index_ML Symbol.is_digit: "Symbol.symbol -> bool"} \\
       
   482   @{index_ML Symbol.is_quasi: "Symbol.symbol -> bool"} \\
       
   483   @{index_ML Symbol.is_blank: "Symbol.symbol -> bool"} \\
       
   484   \end{mldecls}
       
   485   \begin{mldecls}
       
   486   @{index_ML_type "Symbol.sym"} \\
       
   487   @{index_ML Symbol.decode: "Symbol.symbol -> Symbol.sym"} \\
       
   488   \end{mldecls}
       
   489 
       
   490   \begin{description}
       
   491 
       
   492   \item @{ML_type "Symbol.symbol"} represents individual Isabelle
       
   493   symbols; this is an alias for @{ML_type "string"}.
       
   494 
       
   495   \item @{ML "Symbol.explode"}~@{text "str"} produces a symbol list
       
   496   from the packed form.  This function supercedes @{ML
       
   497   "String.explode"} for virtually all purposes of manipulating text in
       
   498   Isabelle!
       
   499 
       
   500   \item @{ML "Symbol.is_letter"}, @{ML "Symbol.is_digit"}, @{ML
       
   501   "Symbol.is_quasi"}, @{ML "Symbol.is_blank"} classify standard
       
   502   symbols according to fixed syntactic conventions of Isabelle, cf.\
       
   503   \cite{isabelle-isar-ref}.
       
   504 
       
   505   \item @{ML_type "Symbol.sym"} is a concrete datatype that represents
       
   506   the different kinds of symbols explicitly, with constructors @{ML
       
   507   "Symbol.Char"}, @{ML "Symbol.Sym"}, @{ML "Symbol.Ctrl"}, @{ML
       
   508   "Symbol.Raw"}.
       
   509 
       
   510   \item @{ML "Symbol.decode"} converts the string representation of a
       
   511   symbol into the datatype version.
       
   512 
       
   513   \end{description}
       
   514 *}
       
   515 
       
   516 
       
   517 subsection {* Basic names \label{sec:basic-names} *}
       
   518 
       
   519 text {*
       
   520   A \emph{basic name} essentially consists of a single Isabelle
       
   521   identifier.  There are conventions to mark separate classes of basic
       
   522   names, by attaching a suffix of underscores (@{text "_"}): one
       
   523   underscore means \emph{internal name}, two underscores means
       
   524   \emph{Skolem name}, three underscores means \emph{internal Skolem
       
   525   name}.
       
   526 
       
   527   For example, the basic name @{text "foo"} has the internal version
       
   528   @{text "foo_"}, with Skolem versions @{text "foo__"} and @{text
       
   529   "foo___"}, respectively.
       
   530 
       
   531   These special versions provide copies of the basic name space, apart
       
   532   from anything that normally appears in the user text.  For example,
       
   533   system generated variables in Isar proof contexts are usually marked
       
   534   as internal, which prevents mysterious name references like @{text
       
   535   "xaa"} to appear in the text.
       
   536 
       
   537   \medskip Manipulating binding scopes often requires on-the-fly
       
   538   renamings.  A \emph{name context} contains a collection of already
       
   539   used names.  The @{text "declare"} operation adds names to the
       
   540   context.
       
   541 
       
   542   The @{text "invents"} operation derives a number of fresh names from
       
   543   a given starting point.  For example, the first three names derived
       
   544   from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.
       
   545 
       
   546   The @{text "variants"} operation produces fresh names by
       
   547   incrementing tentative names as base-26 numbers (with digits @{text
       
   548   "a..z"}) until all clashes are resolved.  For example, name @{text
       
   549   "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
       
   550   "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
       
   551   step picks the next unused variant from this sequence.
       
   552 *}
       
   553 
       
   554 text %mlref {*
       
   555   \begin{mldecls}
       
   556   @{index_ML Name.internal: "string -> string"} \\
       
   557   @{index_ML Name.skolem: "string -> string"} \\
       
   558   \end{mldecls}
       
   559   \begin{mldecls}
       
   560   @{index_ML_type Name.context} \\
       
   561   @{index_ML Name.context: Name.context} \\
       
   562   @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
       
   563   @{index_ML Name.invents: "Name.context -> string -> int -> string list"} \\
       
   564   @{index_ML Name.variants: "string list -> Name.context -> string list * Name.context"} \\
       
   565   \end{mldecls}
       
   566 
       
   567   \begin{description}
       
   568 
       
   569   \item @{ML Name.internal}~@{text "name"} produces an internal name
       
   570   by adding one underscore.
       
   571 
       
   572   \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
       
   573   adding two underscores.
       
   574 
       
   575   \item @{ML_type Name.context} represents the context of already used
       
   576   names; the initial value is @{ML "Name.context"}.
       
   577 
       
   578   \item @{ML Name.declare}~@{text "name"} enters a used name into the
       
   579   context.
       
   580 
       
   581   \item @{ML Name.invents}~@{text "context name n"} produces @{text
       
   582   "n"} fresh names derived from @{text "name"}.
       
   583 
       
   584   \item @{ML Name.variants}~@{text "names context"} produces fresh
       
   585   varians of @{text "names"}; the result is entered into the context.
       
   586 
       
   587   \end{description}
       
   588 *}
       
   589 
       
   590 
       
   591 subsection {* Indexed names *}
       
   592 
       
   593 text {*
       
   594   An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
       
   595   name and a natural number.  This representation allows efficient
       
   596   renaming by incrementing the second component only.  The canonical
       
   597   way to rename two collections of indexnames apart from each other is
       
   598   this: determine the maximum index @{text "maxidx"} of the first
       
   599   collection, then increment all indexes of the second collection by
       
   600   @{text "maxidx + 1"}; the maximum index of an empty collection is
       
   601   @{text "-1"}.
       
   602 
       
   603   Occasionally, basic names and indexed names are injected into the
       
   604   same pair type: the (improper) indexname @{text "(x, -1)"} is used
       
   605   to encode basic names.
       
   606 
       
   607   \medskip Isabelle syntax observes the following rules for
       
   608   representing an indexname @{text "(x, i)"} as a packed string:
       
   609 
       
   610   \begin{itemize}
       
   611 
       
   612   \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},
       
   613 
       
   614   \item @{text "?xi"} if @{text "x"} does not end with a digit,
       
   615 
       
   616   \item @{text "?x.i"} otherwise.
       
   617 
       
   618   \end{itemize}
       
   619 
       
   620   Indexnames may acquire large index numbers over time.  Results are
       
   621   normalized towards @{text "0"} at certain checkpoints, notably at
       
   622   the end of a proof.  This works by producing variants of the
       
   623   corresponding basic name components.  For example, the collection
       
   624   @{text "?x1, ?x7, ?x42"} becomes @{text "?x, ?xa, ?xb"}.
       
   625 *}
       
   626 
       
   627 text %mlref {*
       
   628   \begin{mldecls}
       
   629   @{index_ML_type indexname} \\
       
   630   \end{mldecls}
       
   631 
       
   632   \begin{description}
       
   633 
       
   634   \item @{ML_type indexname} represents indexed names.  This is an
       
   635   abbreviation for @{ML_type "string * int"}.  The second component is
       
   636   usually non-negative, except for situations where @{text "(x, -1)"}
       
   637   is used to embed basic names into this type.
       
   638 
       
   639   \end{description}
       
   640 *}
       
   641 
       
   642 
       
   643 subsection {* Qualified names and name spaces *}
       
   644 
       
   645 text {*
       
   646   A \emph{qualified name} consists of a non-empty sequence of basic
       
   647   name components.  The packed representation uses a dot as separator,
       
   648   as in ``@{text "A.b.c"}''.  The last component is called \emph{base}
       
   649   name, the remaining prefix \emph{qualifier} (which may be empty).
       
   650   The idea of qualified names is to encode nested structures by
       
   651   recording the access paths as qualifiers.  For example, an item
       
   652   named ``@{text "A.b.c"}'' may be understood as a local entity @{text
       
   653   "c"}, within a local structure @{text "b"}, within a global
       
   654   structure @{text "A"}.  Typically, name space hierarchies consist of
       
   655   1--2 levels of qualification, but this need not be always so.
       
   656 
       
   657   The empty name is commonly used as an indication of unnamed
       
   658   entities, whenever this makes any sense.  The basic operations on
       
   659   qualified names are smart enough to pass through such improper names
       
   660   unchanged.
       
   661 
       
   662   \medskip A @{text "naming"} policy tells how to turn a name
       
   663   specification into a fully qualified internal name (by the @{text
       
   664   "full"} operation), and how fully qualified names may be accessed
       
   665   externally.  For example, the default naming policy is to prefix an
       
   666   implicit path: @{text "full x"} produces @{text "path.x"}, and the
       
   667   standard accesses for @{text "path.x"} include both @{text "x"} and
       
   668   @{text "path.x"}.  Normally, the naming is implicit in the theory or
       
   669   proof context; there are separate versions of the corresponding.
       
   670 
       
   671   \medskip A @{text "name space"} manages a collection of fully
       
   672   internalized names, together with a mapping between external names
       
   673   and internal names (in both directions).  The corresponding @{text
       
   674   "intern"} and @{text "extern"} operations are mostly used for
       
   675   parsing and printing only!  The @{text "declare"} operation augments
       
   676   a name space according to the accesses determined by the naming
       
   677   policy.
       
   678 
       
   679   \medskip As a general principle, there is a separate name space for
       
   680   each kind of formal entity, e.g.\ logical constant, type
       
   681   constructor, type class, theorem.  It is usually clear from the
       
   682   occurrence in concrete syntax (or from the scope) which kind of
       
   683   entity a name refers to.  For example, the very same name @{text
       
   684   "c"} may be used uniformly for a constant, type constructor, and
       
   685   type class.
       
   686 
       
   687   There are common schemes to name theorems systematically, according
       
   688   to the name of the main logical entity involved, e.g.\ @{text
       
   689   "c.intro"} for a canonical theorem related to constant @{text "c"}.
       
   690   This technique of mapping names from one space into another requires
       
   691   some care in order to avoid conflicts.  In particular, theorem names
       
   692   derived from a type constructor or type class are better suffixed in
       
   693   addition to the usual qualification, e.g.\ @{text "c_type.intro"}
       
   694   and @{text "c_class.intro"} for theorems related to type @{text "c"}
       
   695   and class @{text "c"}, respectively.
       
   696 *}
       
   697 
       
   698 text %mlref {*
       
   699   \begin{mldecls}
       
   700   @{index_ML NameSpace.base: "string -> string"} \\
       
   701   @{index_ML NameSpace.qualifier: "string -> string"} \\
       
   702   @{index_ML NameSpace.append: "string -> string -> string"} \\
       
   703   @{index_ML NameSpace.implode: "string list -> string"} \\
       
   704   @{index_ML NameSpace.explode: "string -> string list"} \\
       
   705   \end{mldecls}
       
   706   \begin{mldecls}
       
   707   @{index_ML_type NameSpace.naming} \\
       
   708   @{index_ML NameSpace.default_naming: NameSpace.naming} \\
       
   709   @{index_ML NameSpace.add_path: "string -> NameSpace.naming -> NameSpace.naming"} \\
       
   710   @{index_ML NameSpace.full_name: "NameSpace.naming -> binding -> string"} \\
       
   711   \end{mldecls}
       
   712   \begin{mldecls}
       
   713   @{index_ML_type NameSpace.T} \\
       
   714   @{index_ML NameSpace.empty: NameSpace.T} \\
       
   715   @{index_ML NameSpace.merge: "NameSpace.T * NameSpace.T -> NameSpace.T"} \\
       
   716   @{index_ML NameSpace.declare: "NameSpace.naming -> binding -> NameSpace.T -> string * NameSpace.T"} \\
       
   717   @{index_ML NameSpace.intern: "NameSpace.T -> string -> string"} \\
       
   718   @{index_ML NameSpace.extern: "NameSpace.T -> string -> string"} \\
       
   719   \end{mldecls}
       
   720 
       
   721   \begin{description}
       
   722 
       
   723   \item @{ML NameSpace.base}~@{text "name"} returns the base name of a
       
   724   qualified name.
       
   725 
       
   726   \item @{ML NameSpace.qualifier}~@{text "name"} returns the qualifier
       
   727   of a qualified name.
       
   728 
       
   729   \item @{ML NameSpace.append}~@{text "name\<^isub>1 name\<^isub>2"}
       
   730   appends two qualified names.
       
   731 
       
   732   \item @{ML NameSpace.implode}~@{text "name"} and @{ML
       
   733   NameSpace.explode}~@{text "names"} convert between the packed string
       
   734   representation and the explicit list form of qualified names.
       
   735 
       
   736   \item @{ML_type NameSpace.naming} represents the abstract concept of
       
   737   a naming policy.
       
   738 
       
   739   \item @{ML NameSpace.default_naming} is the default naming policy.
       
   740   In a theory context, this is usually augmented by a path prefix
       
   741   consisting of the theory name.
       
   742 
       
   743   \item @{ML NameSpace.add_path}~@{text "path naming"} augments the
       
   744   naming policy by extending its path component.
       
   745 
       
   746   \item @{ML NameSpace.full_name}@{text "naming binding"} turns a name
       
   747   binding (usually a basic name) into the fully qualified
       
   748   internal name, according to the given naming policy.
       
   749 
       
   750   \item @{ML_type NameSpace.T} represents name spaces.
       
   751 
       
   752   \item @{ML NameSpace.empty} and @{ML NameSpace.merge}~@{text
       
   753   "(space\<^isub>1, space\<^isub>2)"} are the canonical operations for
       
   754   maintaining name spaces according to theory data management
       
   755   (\secref{sec:context-data}).
       
   756 
       
   757   \item @{ML NameSpace.declare}~@{text "naming bindings space"} enters a
       
   758   name binding as fully qualified internal name into the name space,
       
   759   with external accesses determined by the naming policy.
       
   760 
       
   761   \item @{ML NameSpace.intern}~@{text "space name"} internalizes a
       
   762   (partially qualified) external name.
       
   763 
       
   764   This operation is mostly for parsing!  Note that fully qualified
       
   765   names stemming from declarations are produced via @{ML
       
   766   "NameSpace.full_name"} and @{ML "NameSpace.declare"}
       
   767   (or their derivatives for @{ML_type theory} and
       
   768   @{ML_type Proof.context}).
       
   769 
       
   770   \item @{ML NameSpace.extern}~@{text "space name"} externalizes a
       
   771   (fully qualified) internal name.
       
   772 
       
   773   This operation is mostly for printing!  Note unqualified names are
       
   774   produced via @{ML NameSpace.base}.
       
   775 
       
   776   \end{description}
       
   777 *}
       
   778 
       
   779 end