doc-src/IsarImplementation/Thy/Prelim.thy
author wenzelm
Sun Mar 08 17:37:18 2009 +0100 (2009-03-08)
changeset 30365 790129514c2d
parent 30281 9ad15d8ed311
child 33174 1f2051f41335
permissions -rw-r--r--
adapted to structure Long_Name;
     1 theory Prelim
     2 imports Base
     3 begin
     4 
     5 chapter {* Preliminaries *}
     6 
     7 section {* Contexts \label{sec:context} *}
     8 
     9 text {*
    10   A logical context represents the background that is required for
    11   formulating statements and composing proofs.  It acts as a medium to
    12   produce formal content, depending on earlier material (declarations,
    13   results etc.).
    14 
    15   For example, derivations within the Isabelle/Pure logic can be
    16   described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
    17   proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
    18   within the theory @{text "\<Theta>"}.  There are logical reasons for
    19   keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
    20   liberal about supporting type constructors and schematic
    21   polymorphism of constants and axioms, while the inner calculus of
    22   @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
    23   fixed type variables in the assumptions).
    24 
    25   \medskip Contexts and derivations are linked by the following key
    26   principles:
    27 
    28   \begin{itemize}
    29 
    30   \item Transfer: monotonicity of derivations admits results to be
    31   transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
    32   \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
    33   \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.
    34 
    35   \item Export: discharge of hypotheses admits results to be exported
    36   into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
    37   implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
    38   @{text "\<Delta> = \<Gamma>' - \<Gamma>"}.  Note that @{text "\<Theta>"} remains unchanged here,
    39   only the @{text "\<Gamma>"} part is affected.
    40 
    41   \end{itemize}
    42 
    43   \medskip By modeling the main characteristics of the primitive
    44   @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
    45   particular logical content, we arrive at the fundamental notions of
    46   \emph{theory context} and \emph{proof context} in Isabelle/Isar.
    47   These implement a certain policy to manage arbitrary \emph{context
    48   data}.  There is a strongly-typed mechanism to declare new kinds of
    49   data at compile time.
    50 
    51   The internal bootstrap process of Isabelle/Pure eventually reaches a
    52   stage where certain data slots provide the logical content of @{text
    53   "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
    54   Various additional data slots support all kinds of mechanisms that
    55   are not necessarily part of the core logic.
    56 
    57   For example, there would be data for canonical introduction and
    58   elimination rules for arbitrary operators (depending on the
    59   object-logic and application), which enables users to perform
    60   standard proof steps implicitly (cf.\ the @{text "rule"} method
    61   \cite{isabelle-isar-ref}).
    62 
    63   \medskip Thus Isabelle/Isar is able to bring forth more and more
    64   concepts successively.  In particular, an object-logic like
    65   Isabelle/HOL continues the Isabelle/Pure setup by adding specific
    66   components for automated reasoning (classical reasoner, tableau
    67   prover, structured induction etc.) and derived specification
    68   mechanisms (inductive predicates, recursive functions etc.).  All of
    69   this is ultimately based on the generic data management by theory
    70   and proof contexts introduced here.
    71 *}
    72 
    73 
    74 subsection {* Theory context \label{sec:context-theory} *}
    75 
    76 text {*
    77   A \emph{theory} is a data container with explicit name and unique
    78   identifier.  Theories are related by a (nominal) sub-theory
    79   relation, which corresponds to the dependency graph of the original
    80   construction; each theory is derived from a certain sub-graph of
    81   ancestor theories.
    82 
    83   The @{text "merge"} operation produces the least upper bound of two
    84   theories, which actually degenerates into absorption of one theory
    85   into the other (due to the nominal sub-theory relation).
    86 
    87   The @{text "begin"} operation starts a new theory by importing
    88   several parent theories and entering a special @{text "draft"} mode,
    89   which is sustained until the final @{text "end"} operation.  A draft
    90   theory acts like a linear type, where updates invalidate earlier
    91   versions.  An invalidated draft is called ``stale''.
    92 
    93   The @{text "checkpoint"} operation produces an intermediate stepping
    94   stone that will survive the next update: both the original and the
    95   changed theory remain valid and are related by the sub-theory
    96   relation.  Checkpointing essentially recovers purely functional
    97   theory values, at the expense of some extra internal bookkeeping.
    98 
    99   The @{text "copy"} operation produces an auxiliary version that has
   100   the same data content, but is unrelated to the original: updates of
   101   the copy do not affect the original, neither does the sub-theory
   102   relation hold.
   103 
   104   \medskip The example in \figref{fig:ex-theory} below shows a theory
   105   graph derived from @{text "Pure"}, with theory @{text "Length"}
   106   importing @{text "Nat"} and @{text "List"}.  The body of @{text
   107   "Length"} consists of a sequence of updates, working mostly on
   108   drafts.  Intermediate checkpoints may occur as well, due to the
   109   history mechanism provided by the Isar top-level, cf.\
   110   \secref{sec:isar-toplevel}.
   111 
   112   \begin{figure}[htb]
   113   \begin{center}
   114   \begin{tabular}{rcccl}
   115         &            & @{text "Pure"} \\
   116         &            & @{text "\<down>"} \\
   117         &            & @{text "FOL"} \\
   118         & $\swarrow$ &              & $\searrow$ & \\
   119   @{text "Nat"} &    &              &            & @{text "List"} \\
   120         & $\searrow$ &              & $\swarrow$ \\
   121         &            & @{text "Length"} \\
   122         &            & \multicolumn{3}{l}{~~@{keyword "imports"}} \\
   123         &            & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
   124         &            & $\vdots$~~ \\
   125         &            & @{text "\<bullet>"}~~ \\
   126         &            & $\vdots$~~ \\
   127         &            & @{text "\<bullet>"}~~ \\
   128         &            & $\vdots$~~ \\
   129         &            & \multicolumn{3}{l}{~~@{command "end"}} \\
   130   \end{tabular}
   131   \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
   132   \end{center}
   133   \end{figure}
   134 
   135   \medskip There is a separate notion of \emph{theory reference} for
   136   maintaining a live link to an evolving theory context: updates on
   137   drafts are propagated automatically.  Dynamic updating stops after
   138   an explicit @{text "end"} only.
   139 
   140   Derived entities may store a theory reference in order to indicate
   141   the context they belong to.  This implicitly assumes monotonic
   142   reasoning, because the referenced context may become larger without
   143   further notice.
   144 *}
   145 
   146 text %mlref {*
   147   \begin{mldecls}
   148   @{index_ML_type theory} \\
   149   @{index_ML Theory.subthy: "theory * theory -> bool"} \\
   150   @{index_ML Theory.merge: "theory * theory -> theory"} \\
   151   @{index_ML Theory.checkpoint: "theory -> theory"} \\
   152   @{index_ML Theory.copy: "theory -> theory"} \\
   153   \end{mldecls}
   154   \begin{mldecls}
   155   @{index_ML_type theory_ref} \\
   156   @{index_ML Theory.deref: "theory_ref -> theory"} \\
   157   @{index_ML Theory.check_thy: "theory -> theory_ref"} \\
   158   \end{mldecls}
   159 
   160   \begin{description}
   161 
   162   \item @{ML_type theory} represents theory contexts.  This is
   163   essentially a linear type!  Most operations destroy the original
   164   version, which then becomes ``stale''.
   165 
   166   \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"}
   167   compares theories according to the inherent graph structure of the
   168   construction.  This sub-theory relation is a nominal approximation
   169   of inclusion (@{text "\<subseteq>"}) of the corresponding content.
   170 
   171   \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"}
   172   absorbs one theory into the other.  This fails for unrelated
   173   theories!
   174 
   175   \item @{ML "Theory.checkpoint"}~@{text "thy"} produces a safe
   176   stepping stone in the linear development of @{text "thy"}.  The next
   177   update will result in two related, valid theories.
   178 
   179   \item @{ML "Theory.copy"}~@{text "thy"} produces a variant of @{text
   180   "thy"} that holds a copy of the same data.  The result is not
   181   related to the original; the original is unchanged.
   182 
   183   \item @{ML_type theory_ref} represents a sliding reference to an
   184   always valid theory; updates on the original are propagated
   185   automatically.
   186 
   187   \item @{ML "Theory.deref"}~@{text "thy_ref"} turns a @{ML_type
   188   "theory_ref"} into an @{ML_type "theory"} value.  As the referenced
   189   theory evolves monotonically over time, later invocations of @{ML
   190   "Theory.deref"} may refer to a larger context.
   191 
   192   \item @{ML "Theory.check_thy"}~@{text "thy"} produces a @{ML_type
   193   "theory_ref"} from a valid @{ML_type "theory"} value.
   194 
   195   \end{description}
   196 *}
   197 
   198 
   199 subsection {* Proof context \label{sec:context-proof} *}
   200 
   201 text {*
   202   A proof context is a container for pure data with a back-reference
   203   to the theory it belongs to.  The @{text "init"} operation creates a
   204   proof context from a given theory.  Modifications to draft theories
   205   are propagated to the proof context as usual, but there is also an
   206   explicit @{text "transfer"} operation to force resynchronization
   207   with more substantial updates to the underlying theory.  The actual
   208   context data does not require any special bookkeeping, thanks to the
   209   lack of destructive features.
   210 
   211   Entities derived in a proof context need to record inherent logical
   212   requirements explicitly, since there is no separate context
   213   identification as for theories.  For example, hypotheses used in
   214   primitive derivations (cf.\ \secref{sec:thms}) are recorded
   215   separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to make double
   216   sure.  Results could still leak into an alien proof context due to
   217   programming errors, but Isabelle/Isar includes some extra validity
   218   checks in critical positions, notably at the end of a sub-proof.
   219 
   220   Proof contexts may be manipulated arbitrarily, although the common
   221   discipline is to follow block structure as a mental model: a given
   222   context is extended consecutively, and results are exported back
   223   into the original context.  Note that the Isar proof states model
   224   block-structured reasoning explicitly, using a stack of proof
   225   contexts internally.
   226 *}
   227 
   228 text %mlref {*
   229   \begin{mldecls}
   230   @{index_ML_type Proof.context} \\
   231   @{index_ML ProofContext.init: "theory -> Proof.context"} \\
   232   @{index_ML ProofContext.theory_of: "Proof.context -> theory"} \\
   233   @{index_ML ProofContext.transfer: "theory -> Proof.context -> Proof.context"} \\
   234   \end{mldecls}
   235 
   236   \begin{description}
   237 
   238   \item @{ML_type Proof.context} represents proof contexts.  Elements
   239   of this type are essentially pure values, with a sliding reference
   240   to the background theory.
   241 
   242   \item @{ML ProofContext.init}~@{text "thy"} produces a proof context
   243   derived from @{text "thy"}, initializing all data.
   244 
   245   \item @{ML ProofContext.theory_of}~@{text "ctxt"} selects the
   246   background theory from @{text "ctxt"}, dereferencing its internal
   247   @{ML_type theory_ref}.
   248 
   249   \item @{ML ProofContext.transfer}~@{text "thy ctxt"} promotes the
   250   background theory of @{text "ctxt"} to the super theory @{text
   251   "thy"}.
   252 
   253   \end{description}
   254 *}
   255 
   256 
   257 subsection {* Generic contexts \label{sec:generic-context} *}
   258 
   259 text {*
   260   A generic context is the disjoint sum of either a theory or proof
   261   context.  Occasionally, this enables uniform treatment of generic
   262   context data, typically extra-logical information.  Operations on
   263   generic contexts include the usual injections, partial selections,
   264   and combinators for lifting operations on either component of the
   265   disjoint sum.
   266 
   267   Moreover, there are total operations @{text "theory_of"} and @{text
   268   "proof_of"} to convert a generic context into either kind: a theory
   269   can always be selected from the sum, while a proof context might
   270   have to be constructed by an ad-hoc @{text "init"} operation.
   271 *}
   272 
   273 text %mlref {*
   274   \begin{mldecls}
   275   @{index_ML_type Context.generic} \\
   276   @{index_ML Context.theory_of: "Context.generic -> theory"} \\
   277   @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
   278   \end{mldecls}
   279 
   280   \begin{description}
   281 
   282   \item @{ML_type Context.generic} is the direct sum of @{ML_type
   283   "theory"} and @{ML_type "Proof.context"}, with the datatype
   284   constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.
   285 
   286   \item @{ML Context.theory_of}~@{text "context"} always produces a
   287   theory from the generic @{text "context"}, using @{ML
   288   "ProofContext.theory_of"} as required.
   289 
   290   \item @{ML Context.proof_of}~@{text "context"} always produces a
   291   proof context from the generic @{text "context"}, using @{ML
   292   "ProofContext.init"} as required (note that this re-initializes the
   293   context data with each invocation).
   294 
   295   \end{description}
   296 *}
   297 
   298 
   299 subsection {* Context data \label{sec:context-data} *}
   300 
   301 text {*
   302   The main purpose of theory and proof contexts is to manage arbitrary
   303   data.  New data types can be declared incrementally at compile time.
   304   There are separate declaration mechanisms for any of the three kinds
   305   of contexts: theory, proof, generic.
   306 
   307   \paragraph{Theory data} may refer to destructive entities, which are
   308   maintained in direct correspondence to the linear evolution of
   309   theory values, including explicit copies.\footnote{Most existing
   310   instances of destructive theory data are merely historical relics
   311   (e.g.\ the destructive theorem storage, and destructive hints for
   312   the Simplifier and Classical rules).}  A theory data declaration
   313   needs to implement the following SML signature:
   314 
   315   \medskip
   316   \begin{tabular}{ll}
   317   @{text "\<type> T"} & representing type \\
   318   @{text "\<val> empty: T"} & empty default value \\
   319   @{text "\<val> copy: T \<rightarrow> T"} & refresh impure data \\
   320   @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
   321   @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
   322   \end{tabular}
   323   \medskip
   324 
   325   \noindent The @{text "empty"} value acts as initial default for
   326   \emph{any} theory that does not declare actual data content; @{text
   327   "copy"} maintains persistent integrity for impure data, it is just
   328   the identity for pure values; @{text "extend"} is acts like a
   329   unitary version of @{text "merge"}, both operations should also
   330   include the functionality of @{text "copy"} for impure data.
   331 
   332   \paragraph{Proof context data} is purely functional.  A declaration
   333   needs to implement the following SML signature:
   334 
   335   \medskip
   336   \begin{tabular}{ll}
   337   @{text "\<type> T"} & representing type \\
   338   @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
   339   \end{tabular}
   340   \medskip
   341 
   342   \noindent The @{text "init"} operation is supposed to produce a pure
   343   value from the given background theory.
   344 
   345   \paragraph{Generic data} provides a hybrid interface for both theory
   346   and proof data.  The declaration is essentially the same as for
   347   (pure) theory data, without @{text "copy"}.  The @{text "init"}
   348   operation for proof contexts merely selects the current data value
   349   from the background theory.
   350 
   351   \bigskip A data declaration of type @{text "T"} results in the
   352   following interface:
   353 
   354   \medskip
   355   \begin{tabular}{ll}
   356   @{text "init: theory \<rightarrow> T"} \\
   357   @{text "get: context \<rightarrow> T"} \\
   358   @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
   359   @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
   360   \end{tabular}
   361   \medskip
   362 
   363   \noindent Here @{text "init"} is only applicable to impure theory
   364   data to install a fresh copy persistently (destructive update on
   365   uninitialized has no permanent effect).  The other operations provide
   366   access for the particular kind of context (theory, proof, or generic
   367   context).  Note that this is a safe interface: there is no other way
   368   to access the corresponding data slot of a context.  By keeping
   369   these operations private, a component may maintain abstract values
   370   authentically, without other components interfering.
   371 *}
   372 
   373 text %mlref {*
   374   \begin{mldecls}
   375   @{index_ML_functor TheoryDataFun} \\
   376   @{index_ML_functor ProofDataFun} \\
   377   @{index_ML_functor GenericDataFun} \\
   378   \end{mldecls}
   379 
   380   \begin{description}
   381 
   382   \item @{ML_functor TheoryDataFun}@{text "(spec)"} declares data for
   383   type @{ML_type theory} according to the specification provided as
   384   argument structure.  The resulting structure provides data init and
   385   access operations as described above.
   386 
   387   \item @{ML_functor ProofDataFun}@{text "(spec)"} is analogous to
   388   @{ML_functor TheoryDataFun} for type @{ML_type Proof.context}.
   389 
   390   \item @{ML_functor GenericDataFun}@{text "(spec)"} is analogous to
   391   @{ML_functor TheoryDataFun} for type @{ML_type Context.generic}.
   392 
   393   \end{description}
   394 *}
   395 
   396 
   397 section {* Names \label{sec:names} *}
   398 
   399 text {*
   400   In principle, a name is just a string, but there are various
   401   convention for encoding additional structure.  For example, ``@{text
   402   "Foo.bar.baz"}'' is considered as a qualified name consisting of
   403   three basic name components.  The individual constituents of a name
   404   may have further substructure, e.g.\ the string
   405   ``\verb,\,\verb,<alpha>,'' encodes as a single symbol.
   406 *}
   407 
   408 
   409 subsection {* Strings of symbols *}
   410 
   411 text {*
   412   A \emph{symbol} constitutes the smallest textual unit in Isabelle
   413   --- raw characters are normally not encountered at all.  Isabelle
   414   strings consist of a sequence of symbols, represented as a packed
   415   string or a list of strings.  Each symbol is in itself a small
   416   string, which has either one of the following forms:
   417 
   418   \begin{enumerate}
   419 
   420   \item a single ASCII character ``@{text "c"}'', for example
   421   ``\verb,a,'',
   422 
   423   \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'',
   424   for example ``\verb,\,\verb,<alpha>,'',
   425 
   426   \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'',
   427   for example ``\verb,\,\verb,<^bold>,'',
   428 
   429   \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,''
   430   where @{text text} constists of printable characters excluding
   431   ``\verb,.,'' and ``\verb,>,'', for example
   432   ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'',
   433 
   434   \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text
   435   n}\verb,>, where @{text n} consists of digits, for example
   436   ``\verb,\,\verb,<^raw42>,''.
   437 
   438   \end{enumerate}
   439 
   440   \noindent The @{text "ident"} syntax for symbol names is @{text
   441   "letter (letter | digit)\<^sup>*"}, where @{text "letter =
   442   A..Za..z"} and @{text "digit = 0..9"}.  There are infinitely many
   443   regular symbols and control symbols, but a fixed collection of
   444   standard symbols is treated specifically.  For example,
   445   ``\verb,\,\verb,<alpha>,'' is classified as a letter, which means it
   446   may occur within regular Isabelle identifiers.
   447 
   448   Since the character set underlying Isabelle symbols is 7-bit ASCII
   449   and 8-bit characters are passed through transparently, Isabelle may
   450   also process Unicode/UCS data in UTF-8 encoding.  Unicode provides
   451   its own collection of mathematical symbols, but there is no built-in
   452   link to the standard collection of Isabelle.
   453 
   454   \medskip Output of Isabelle symbols depends on the print mode
   455   (\secref{print-mode}).  For example, the standard {\LaTeX} setup of
   456   the Isabelle document preparation system would present
   457   ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and
   458   ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text
   459   "\<^bold>\<alpha>"}.
   460 *}
   461 
   462 text %mlref {*
   463   \begin{mldecls}
   464   @{index_ML_type "Symbol.symbol"} \\
   465   @{index_ML Symbol.explode: "string -> Symbol.symbol list"} \\
   466   @{index_ML Symbol.is_letter: "Symbol.symbol -> bool"} \\
   467   @{index_ML Symbol.is_digit: "Symbol.symbol -> bool"} \\
   468   @{index_ML Symbol.is_quasi: "Symbol.symbol -> bool"} \\
   469   @{index_ML Symbol.is_blank: "Symbol.symbol -> bool"} \\
   470   \end{mldecls}
   471   \begin{mldecls}
   472   @{index_ML_type "Symbol.sym"} \\
   473   @{index_ML Symbol.decode: "Symbol.symbol -> Symbol.sym"} \\
   474   \end{mldecls}
   475 
   476   \begin{description}
   477 
   478   \item @{ML_type "Symbol.symbol"} represents individual Isabelle
   479   symbols; this is an alias for @{ML_type "string"}.
   480 
   481   \item @{ML "Symbol.explode"}~@{text "str"} produces a symbol list
   482   from the packed form.  This function supercedes @{ML
   483   "String.explode"} for virtually all purposes of manipulating text in
   484   Isabelle!
   485 
   486   \item @{ML "Symbol.is_letter"}, @{ML "Symbol.is_digit"}, @{ML
   487   "Symbol.is_quasi"}, @{ML "Symbol.is_blank"} classify standard
   488   symbols according to fixed syntactic conventions of Isabelle, cf.\
   489   \cite{isabelle-isar-ref}.
   490 
   491   \item @{ML_type "Symbol.sym"} is a concrete datatype that represents
   492   the different kinds of symbols explicitly, with constructors @{ML
   493   "Symbol.Char"}, @{ML "Symbol.Sym"}, @{ML "Symbol.Ctrl"}, @{ML
   494   "Symbol.Raw"}.
   495 
   496   \item @{ML "Symbol.decode"} converts the string representation of a
   497   symbol into the datatype version.
   498 
   499   \end{description}
   500 *}
   501 
   502 
   503 subsection {* Basic names \label{sec:basic-names} *}
   504 
   505 text {*
   506   A \emph{basic name} essentially consists of a single Isabelle
   507   identifier.  There are conventions to mark separate classes of basic
   508   names, by attaching a suffix of underscores: one underscore means
   509   \emph{internal name}, two underscores means \emph{Skolem name},
   510   three underscores means \emph{internal Skolem name}.
   511 
   512   For example, the basic name @{text "foo"} has the internal version
   513   @{text "foo_"}, with Skolem versions @{text "foo__"} and @{text
   514   "foo___"}, respectively.
   515 
   516   These special versions provide copies of the basic name space, apart
   517   from anything that normally appears in the user text.  For example,
   518   system generated variables in Isar proof contexts are usually marked
   519   as internal, which prevents mysterious name references like @{text
   520   "xaa"} to appear in the text.
   521 
   522   \medskip Manipulating binding scopes often requires on-the-fly
   523   renamings.  A \emph{name context} contains a collection of already
   524   used names.  The @{text "declare"} operation adds names to the
   525   context.
   526 
   527   The @{text "invents"} operation derives a number of fresh names from
   528   a given starting point.  For example, the first three names derived
   529   from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.
   530 
   531   The @{text "variants"} operation produces fresh names by
   532   incrementing tentative names as base-26 numbers (with digits @{text
   533   "a..z"}) until all clashes are resolved.  For example, name @{text
   534   "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
   535   "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
   536   step picks the next unused variant from this sequence.
   537 *}
   538 
   539 text %mlref {*
   540   \begin{mldecls}
   541   @{index_ML Name.internal: "string -> string"} \\
   542   @{index_ML Name.skolem: "string -> string"} \\
   543   \end{mldecls}
   544   \begin{mldecls}
   545   @{index_ML_type Name.context} \\
   546   @{index_ML Name.context: Name.context} \\
   547   @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
   548   @{index_ML Name.invents: "Name.context -> string -> int -> string list"} \\
   549   @{index_ML Name.variants: "string list -> Name.context -> string list * Name.context"} \\
   550   \end{mldecls}
   551 
   552   \begin{description}
   553 
   554   \item @{ML Name.internal}~@{text "name"} produces an internal name
   555   by adding one underscore.
   556 
   557   \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
   558   adding two underscores.
   559 
   560   \item @{ML_type Name.context} represents the context of already used
   561   names; the initial value is @{ML "Name.context"}.
   562 
   563   \item @{ML Name.declare}~@{text "name"} enters a used name into the
   564   context.
   565 
   566   \item @{ML Name.invents}~@{text "context name n"} produces @{text
   567   "n"} fresh names derived from @{text "name"}.
   568 
   569   \item @{ML Name.variants}~@{text "names context"} produces fresh
   570   variants of @{text "names"}; the result is entered into the context.
   571 
   572   \end{description}
   573 *}
   574 
   575 
   576 subsection {* Indexed names *}
   577 
   578 text {*
   579   An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
   580   name and a natural number.  This representation allows efficient
   581   renaming by incrementing the second component only.  The canonical
   582   way to rename two collections of indexnames apart from each other is
   583   this: determine the maximum index @{text "maxidx"} of the first
   584   collection, then increment all indexes of the second collection by
   585   @{text "maxidx + 1"}; the maximum index of an empty collection is
   586   @{text "-1"}.
   587 
   588   Occasionally, basic names and indexed names are injected into the
   589   same pair type: the (improper) indexname @{text "(x, -1)"} is used
   590   to encode basic names.
   591 
   592   \medskip Isabelle syntax observes the following rules for
   593   representing an indexname @{text "(x, i)"} as a packed string:
   594 
   595   \begin{itemize}
   596 
   597   \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},
   598 
   599   \item @{text "?xi"} if @{text "x"} does not end with a digit,
   600 
   601   \item @{text "?x.i"} otherwise.
   602 
   603   \end{itemize}
   604 
   605   Indexnames may acquire large index numbers over time.  Results are
   606   normalized towards @{text "0"} at certain checkpoints, notably at
   607   the end of a proof.  This works by producing variants of the
   608   corresponding basic name components.  For example, the collection
   609   @{text "?x1, ?x7, ?x42"} becomes @{text "?x, ?xa, ?xb"}.
   610 *}
   611 
   612 text %mlref {*
   613   \begin{mldecls}
   614   @{index_ML_type indexname} \\
   615   \end{mldecls}
   616 
   617   \begin{description}
   618 
   619   \item @{ML_type indexname} represents indexed names.  This is an
   620   abbreviation for @{ML_type "string * int"}.  The second component is
   621   usually non-negative, except for situations where @{text "(x, -1)"}
   622   is used to embed basic names into this type.
   623 
   624   \end{description}
   625 *}
   626 
   627 
   628 subsection {* Qualified names and name spaces *}
   629 
   630 text {*
   631   A \emph{qualified name} consists of a non-empty sequence of basic
   632   name components.  The packed representation uses a dot as separator,
   633   as in ``@{text "A.b.c"}''.  The last component is called \emph{base}
   634   name, the remaining prefix \emph{qualifier} (which may be empty).
   635   The idea of qualified names is to encode nested structures by
   636   recording the access paths as qualifiers.  For example, an item
   637   named ``@{text "A.b.c"}'' may be understood as a local entity @{text
   638   "c"}, within a local structure @{text "b"}, within a global
   639   structure @{text "A"}.  Typically, name space hierarchies consist of
   640   1--2 levels of qualification, but this need not be always so.
   641 
   642   The empty name is commonly used as an indication of unnamed
   643   entities, whenever this makes any sense.  The basic operations on
   644   qualified names are smart enough to pass through such improper names
   645   unchanged.
   646 
   647   \medskip A @{text "naming"} policy tells how to turn a name
   648   specification into a fully qualified internal name (by the @{text
   649   "full"} operation), and how fully qualified names may be accessed
   650   externally.  For example, the default naming policy is to prefix an
   651   implicit path: @{text "full x"} produces @{text "path.x"}, and the
   652   standard accesses for @{text "path.x"} include both @{text "x"} and
   653   @{text "path.x"}.  Normally, the naming is implicit in the theory or
   654   proof context; there are separate versions of the corresponding.
   655 
   656   \medskip A @{text "name space"} manages a collection of fully
   657   internalized names, together with a mapping between external names
   658   and internal names (in both directions).  The corresponding @{text
   659   "intern"} and @{text "extern"} operations are mostly used for
   660   parsing and printing only!  The @{text "declare"} operation augments
   661   a name space according to the accesses determined by the naming
   662   policy.
   663 
   664   \medskip As a general principle, there is a separate name space for
   665   each kind of formal entity, e.g.\ logical constant, type
   666   constructor, type class, theorem.  It is usually clear from the
   667   occurrence in concrete syntax (or from the scope) which kind of
   668   entity a name refers to.  For example, the very same name @{text
   669   "c"} may be used uniformly for a constant, type constructor, and
   670   type class.
   671 
   672   There are common schemes to name theorems systematically, according
   673   to the name of the main logical entity involved, e.g.\ @{text
   674   "c.intro"} for a canonical theorem related to constant @{text "c"}.
   675   This technique of mapping names from one space into another requires
   676   some care in order to avoid conflicts.  In particular, theorem names
   677   derived from a type constructor or type class are better suffixed in
   678   addition to the usual qualification, e.g.\ @{text "c_type.intro"}
   679   and @{text "c_class.intro"} for theorems related to type @{text "c"}
   680   and class @{text "c"}, respectively.
   681 *}
   682 
   683 text %mlref {*
   684   \begin{mldecls}
   685   @{index_ML Long_Name.base_name: "string -> string"} \\
   686   @{index_ML Long_Name.qualifier: "string -> string"} \\
   687   @{index_ML Long_Name.append: "string -> string -> string"} \\
   688   @{index_ML Long_Name.implode: "string list -> string"} \\
   689   @{index_ML Long_Name.explode: "string -> string list"} \\
   690   \end{mldecls}
   691   \begin{mldecls}
   692   @{index_ML_type NameSpace.naming} \\
   693   @{index_ML NameSpace.default_naming: NameSpace.naming} \\
   694   @{index_ML NameSpace.add_path: "string -> NameSpace.naming -> NameSpace.naming"} \\
   695   @{index_ML NameSpace.full_name: "NameSpace.naming -> binding -> string"} \\
   696   \end{mldecls}
   697   \begin{mldecls}
   698   @{index_ML_type NameSpace.T} \\
   699   @{index_ML NameSpace.empty: NameSpace.T} \\
   700   @{index_ML NameSpace.merge: "NameSpace.T * NameSpace.T -> NameSpace.T"} \\
   701   @{index_ML NameSpace.declare: "NameSpace.naming -> binding -> NameSpace.T ->
   702   string * NameSpace.T"} \\
   703   @{index_ML NameSpace.intern: "NameSpace.T -> string -> string"} \\
   704   @{index_ML NameSpace.extern: "NameSpace.T -> string -> string"} \\
   705   \end{mldecls}
   706 
   707   \begin{description}
   708 
   709   \item @{ML Long_Name.base_name}~@{text "name"} returns the base name of a
   710   qualified name.
   711 
   712   \item @{ML Long_Name.qualifier}~@{text "name"} returns the qualifier
   713   of a qualified name.
   714 
   715   \item @{ML Long_Name.append}~@{text "name\<^isub>1 name\<^isub>2"}
   716   appends two qualified names.
   717 
   718   \item @{ML Long_Name.implode}~@{text "names"} and @{ML
   719   Long_Name.explode}~@{text "name"} convert between the packed string
   720   representation and the explicit list form of qualified names.
   721 
   722   \item @{ML_type NameSpace.naming} represents the abstract concept of
   723   a naming policy.
   724 
   725   \item @{ML NameSpace.default_naming} is the default naming policy.
   726   In a theory context, this is usually augmented by a path prefix
   727   consisting of the theory name.
   728 
   729   \item @{ML NameSpace.add_path}~@{text "path naming"} augments the
   730   naming policy by extending its path component.
   731 
   732   \item @{ML NameSpace.full_name}~@{text "naming binding"} turns a
   733   name binding (usually a basic name) into the fully qualified
   734   internal name, according to the given naming policy.
   735 
   736   \item @{ML_type NameSpace.T} represents name spaces.
   737 
   738   \item @{ML NameSpace.empty} and @{ML NameSpace.merge}~@{text
   739   "(space\<^isub>1, space\<^isub>2)"} are the canonical operations for
   740   maintaining name spaces according to theory data management
   741   (\secref{sec:context-data}).
   742 
   743   \item @{ML NameSpace.declare}~@{text "naming bindings space"} enters a
   744   name binding as fully qualified internal name into the name space,
   745   with external accesses determined by the naming policy.
   746 
   747   \item @{ML NameSpace.intern}~@{text "space name"} internalizes a
   748   (partially qualified) external name.
   749 
   750   This operation is mostly for parsing!  Note that fully qualified
   751   names stemming from declarations are produced via @{ML
   752   "NameSpace.full_name"} and @{ML "NameSpace.declare"}
   753   (or their derivatives for @{ML_type theory} and
   754   @{ML_type Proof.context}).
   755 
   756   \item @{ML NameSpace.extern}~@{text "space name"} externalizes a
   757   (fully qualified) internal name.
   758 
   759   This operation is mostly for printing!  User code should not rely on
   760   the precise result too much.
   761 
   762   \end{description}
   763 *}
   764 
   765 end