(*  File:       doc-src/IsarImplementation/Thy/Prelim.thy
    Author:     wenzelm
    Changeset:  34925 38a44d813a3c, Sun Jan 31 21:40:44 2010 +0100
                (parent 34924 520727474bbe, child 34926 19294b07e445)
    Summary:    more details on Isabelle symbols
*)

theory Prelim
imports Base
begin

chapter {* Preliminaries *}

section {* Contexts \label{sec:context} *}

text {*
  A logical context represents the background that is required for
  formulating statements and composing proofs.  It acts as a medium to
  produce formal content, depending on earlier material (declarations,
  results etc.).

  For example, derivations within the Isabelle/Pure logic can be
  described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
  proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
  within the theory @{text "\<Theta>"}.  There are logical reasons for
  keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
  liberal about supporting type constructors and schematic
  polymorphism of constants and axioms, while the inner calculus of
  @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
  fixed type variables in the assumptions).

  \medskip Contexts and derivations are linked by the following key
  principles:

  \begin{itemize}

  \item Transfer: monotonicity of derivations allows results to be
  transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
  \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
  \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.

  \item Export: discharge of hypotheses allows results to be exported
  into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
  implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
  @{text "\<Delta> = \<Gamma>' - \<Gamma>"}.  Note that @{text "\<Theta>"} remains unchanged here,
  only the @{text "\<Gamma>"} part is affected.

  \end{itemize}

  \medskip By modeling the main characteristics of the primitive
  @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
  particular logical content, we arrive at the fundamental notions of
  \emph{theory context} and \emph{proof context} in Isabelle/Isar.
  These implement a certain policy to manage arbitrary \emph{context
  data}.  There is a strongly-typed mechanism to declare new kinds of
  data at compile time.

  The internal bootstrap process of Isabelle/Pure eventually reaches a
  stage where certain data slots provide the logical content of @{text
  "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
  Various additional data slots support all kinds of mechanisms that
  are not necessarily part of the core logic.

  For example, there would be data for canonical introduction and
  elimination rules for arbitrary operators (depending on the
  object-logic and application), which enables users to perform
  standard proof steps implicitly (cf.\ the @{text "rule"} method
  \cite{isabelle-isar-ref}).

  \medskip Thus Isabelle/Isar is able to bring forth more and more
  concepts successively.  In particular, an object-logic like
  Isabelle/HOL continues the Isabelle/Pure setup by adding specific
  components for automated reasoning (classical reasoner, tableau
  prover, structured induction etc.) and derived specification
  mechanisms (inductive predicates, recursive functions etc.).  All of
  this is ultimately based on the generic data management by theory
  and proof contexts introduced here.
*}
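
text {*
  \medskip As a small worked instance of the transfer and export
  principles stated above (an added illustration; the propositions
  @{text "A"}, @{text "B"}, @{text "C"} are arbitrary placeholders and
  not part of any particular theory), consider a derivation @{text
  "\<Gamma>, A \<turnstile>\<^sub>\<Theta> B"}, i.e.\ the larger hypothesis context is @{text
  "\<Gamma>' = \<Gamma>, A"}.

  \begin{itemize}

  \item Export with @{text "\<Delta> = \<Gamma>' - \<Gamma>"}, which consists of @{text
  "A"} only, discharges the extra hypothesis and yields @{text "\<Gamma>
  \<turnstile>\<^sub>\<Theta> A \<Longrightarrow> B"}.

  \item Transfer of this result into a larger context yields @{text
  "\<Gamma>, C \<turnstile>\<^sub>\<Theta>\<^sub>' A \<Longrightarrow> B"} for any @{text "\<Theta>' \<supseteq> \<Theta>"}; the
  additional hypothesis @{text "C"} is simply not used.

  \end{itemize}
*}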


subsection {* Theory context \label{sec:context-theory} *}

text {* A \emph{theory} is a data container with explicit name and
  unique identifier.  Theories are related by a (nominal) sub-theory
  relation, which corresponds to the dependency graph of the original
  construction; each theory is derived from a certain sub-graph of
  ancestor theories.  To this end, the system maintains a set of
  symbolic ``identification stamps'' within each theory.

  In order to avoid the full-scale overhead of explicit sub-theory
  identification of arbitrary intermediate stages, a theory is
  switched into @{text "draft"} mode under certain circumstances.  A
  draft theory acts like a linear type, where updates invalidate
  earlier versions.  An invalidated draft is called \emph{stale}.

  The @{text "checkpoint"} operation produces a safe stepping stone
  that will survive the next update without becoming stale: both the
  old and the new theory remain valid and are related by the
  sub-theory relation.  Checkpointing essentially recovers purely
  functional theory values, at the expense of some extra internal
  bookkeeping.

  The @{text "copy"} operation produces an auxiliary version that has
  the same data content, but is unrelated to the original: updates of
  the copy do not affect the original, neither does the sub-theory
  relation hold.

  The @{text "merge"} operation produces the least upper bound of two
  theories, which actually degenerates into absorption of one theory
  into the other (according to the nominal sub-theory relation).

  The @{text "begin"} operation starts a new theory by importing
  several parent theories and entering a special mode of nameless
  incremental updates, until the final @{text "end"} operation is
  performed.

  \medskip The example in \figref{fig:ex-theory} below shows a theory
  graph derived from @{text "Pure"}, with theory @{text "Length"}
  importing @{text "Nat"} and @{text "List"}.  The body of @{text
  "Length"} consists of a sequence of updates, working mostly on
  drafts internally, while transaction boundaries of Isar top-level
  commands (\secref{sec:isar-toplevel}) are guaranteed to be safe
  checkpoints.

  \begin{figure}[htb]
  \begin{center}
  \begin{tabular}{rcccl}
        &            & @{text "Pure"} \\
        &            & @{text "\<down>"} \\
        &            & @{text "FOL"} \\
        & $\swarrow$ &              & $\searrow$ & \\
  @{text "Nat"} &    &              &            & @{text "List"} \\
        & $\searrow$ &              & $\swarrow$ \\
        &            & @{text "Length"} \\
        &            & \multicolumn{3}{l}{~~@{keyword "imports"}} \\
        &            & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
        &            & $\vdots$~~ \\
        &            & @{text "\<bullet>"}~~ \\
        &            & $\vdots$~~ \\
        &            & @{text "\<bullet>"}~~ \\
        &            & $\vdots$~~ \\
        &            & \multicolumn{3}{l}{~~@{command "end"}} \\
  \end{tabular}
  \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
  \end{center}
  \end{figure}

  \medskip There is a separate notion of \emph{theory reference} for
  maintaining a live link to an evolving theory context: updates on
  drafts are propagated automatically.  Dynamic updating stops only
  after an explicit @{text "end"}.

  Derived entities may store a theory reference in order to indicate
  the context they belong to.  This implicitly assumes monotonic
  reasoning, because the referenced context may become larger without
  further notice.
*}

text %mlref {*
  \begin{mldecls}
  @{index_ML_type theory} \\
  @{index_ML Theory.subthy: "theory * theory -> bool"} \\
  @{index_ML Theory.checkpoint: "theory -> theory"} \\
  @{index_ML Theory.copy: "theory -> theory"} \\
  @{index_ML Theory.merge: "theory * theory -> theory"} \\
  @{index_ML Theory.begin_theory: "string -> theory list -> theory"} \\
  \end{mldecls}
  \begin{mldecls}
  @{index_ML_type theory_ref} \\
  @{index_ML Theory.deref: "theory_ref -> theory"} \\
  @{index_ML Theory.check_thy: "theory -> theory_ref"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_type theory} represents theory contexts.  This is
  essentially a linear type, with explicit runtime checking!  Most
  internal theory operations destroy the original version, which then
  becomes ``stale''.

  \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} compares theories
  according to the intrinsic graph structure of the construction.
  This sub-theory relation is a nominal approximation of inclusion
  (@{text "\<subseteq>"}) of the corresponding content (according to the
  semantics of the ML modules that implement the data).

  \item @{ML "Theory.checkpoint"}~@{text "thy"} produces a safe
  stepping stone in the linear development of @{text "thy"}.  This
  changes the old theory, but the next update will result in two
  related, valid theories.

  \item @{ML "Theory.copy"}~@{text "thy"} produces a variant of @{text
  "thy"} with the same data.  The copy is not related to the original,
  but the original is unchanged.

  \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} absorbs one theory
  into the other, without changing @{text "thy\<^sub>1"} or @{text "thy\<^sub>2"}.
  This version of ad-hoc theory merge fails for unrelated theories!

  \item @{ML "Theory.begin_theory"}~@{text "name parents"} constructs
  a new theory based on the given parents.  This {\ML} function is
  normally not invoked directly.

  \item @{ML_type theory_ref} represents a sliding reference to an
  always valid theory; updates on the original are propagated
  automatically.

  \item @{ML "Theory.deref"}~@{text "thy_ref"} turns a @{ML_type
  "theory_ref"} into a @{ML_type "theory"} value.  As the referenced
  theory evolves monotonically over time, later invocations of @{ML
  "Theory.deref"} may refer to a larger context.

  \item @{ML "Theory.check_thy"}~@{text "thy"} produces a @{ML_type
  "theory_ref"} from a valid @{ML_type "theory"} value.

  \end{description}
*}
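
text %mlex {* The following minimal sketch exercises some of the
  operations listed above on the theory under construction.  It is an
  illustration only, not part of the original interface description:
  all value names are arbitrary, and it assumes that the ML
  antiquotation for theories accepts the name of the ancestor theory
  @{text "Pure"} as argument.  According to the descriptions above,
  @{text "pure_below"} and @{text "same"} are expected to hold, while
  @{text "above_pure"} is not. *}

ML {*
  (* the theory at this point of the document, and its ancestor Pure *)
  val thy = @{theory};
  val pure_thy = @{theory Pure};

  (* nominal sub-theory relation: ancestors are below their descendants *)
  val pure_below = Theory.subthy (pure_thy, thy);
  val above_pure = Theory.subthy (thy, pure_thy);

  (* a copy carries the same data, but is a separate entity *)
  val thy_copy = Theory.copy thy;

  (* a sliding reference to thy, and the theory it currently denotes *)
  val thy_ref = Theory.check_thy thy;
  val thy' = Theory.deref thy_ref;
  val same = Theory.eq_thy (thy, thy');
*}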

text %mlex {*
  The following artificial example demonstrates theory
  data: we maintain a set of terms that are supposed to be wellformed
  wrt.\ the enclosing theory.  The public interface is as follows:
*}

ML {*
signature WELLFORMED_TERMS =
sig
  val get: theory -> term list
  val add: term -> theory -> theory
end;
*}

text {* \noindent The implementation uses private theory data
  internally, and only exposes an operation that involves explicit
  argument checking wrt.\ the given theory. *}

ML {*
structure Wellformed_Terms: WELLFORMED_TERMS =
struct

(* private data slot: ordered list of certified terms *)
structure Terms = Theory_Data
(
  type T = term OrdList.T;
  val empty = [];
  val extend = I;
  fun merge (ts1, ts2) =
    OrdList.union TermOrd.fast_term_ord ts1 ts2;
)

val get = Terms.get;

(* certify the raw term wrt. the given theory before storing it *)
fun add raw_t thy =
  let val t = Sign.cert_term thy raw_t
  in Terms.map (OrdList.insert TermOrd.fast_term_ord t) thy end;

end;
*}

text {* We use @{ML_type "term OrdList.T"} for reasonably efficient
  representation of a set of terms: all operations are linear in the
  number of stored elements.  Here we assume that our users do not
  care about the declaration order, since that data structure forces
  its own arrangement of elements.

  Observe how the @{verbatim merge} operation joins the data slots of
  the two constituents: @{ML OrdList.union} prevents duplication of
  common data from different branches, thus avoiding the danger of
  exponential blowup.  (Plain list append etc.\ must never be used for
  theory data merges.)

  \medskip Our intended invariant is achieved as follows:
  \begin{enumerate}

  \item @{ML Wellformed_Terms.add} only admits terms that have passed
  the @{ML Sign.cert_term} check of the given theory at that point.

  \item Wellformedness in the sense of @{ML Sign.cert_term} is
  monotonic wrt.\ the sub-theory relation.  So our data can move
  upwards in the hierarchy (via extension or merges), and maintain
  wellformedness without further checks.

  \end{enumerate}

  Note that all basic operations of the inference kernel (which
  includes @{ML Sign.cert_term}) observe this monotonicity principle,
  but other user-space tools do not.  For example, fully-featured
  type-inference via @{ML Syntax.check_term} (cf.\
  \secref{sec:term-check}) is not necessarily monotonic wrt.\ the
  background theory, since constraints of term constants can be
  strengthened by later declarations.

  In most cases, user-space context data does not have to take such
  invariants too seriously.  The situation is different in the
  implementation of the inference kernel itself, which uses the very
  same data mechanisms for types, constants, axioms etc.
*}
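
text {* \noindent A possible use of this interface is sketched below.
  This is an added illustration under stated assumptions: the term
  @{text "x == y"} uses Pure syntax only, the constant name @{text
  "Prelim.undeclared"} is hypothetical and assumed not to be declared
  anywhere, and @{ML Sign.cert_term} is assumed to signal ill-formed
  terms via the ML exception @{text "TYPE"}. *}

ML {*
  val thy = @{theory};

  (* a term over Pure syntax only: certification should succeed *)
  val t = @{term "x == y"};
  val thy' = Wellformed_Terms.add t thy;
  val ts = Wellformed_Terms.get thy';

  (* an undeclared constant (hypothetical name): expected to be rejected
     by the check inside add, before any data is updated *)
  val bad = Const ("Prelim.undeclared", @{typ prop});
  val rejected =
    (Wellformed_Terms.add bad thy'; false) handle TYPE _ => true;
*}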


subsection {* Proof context \label{sec:context-proof} *}

text {* A proof context is a container for pure data with a
  back-reference to the theory it belongs to.  The @{text "init"}
  operation creates a proof context from a given theory.
  Modifications to draft theories are propagated to the proof context
  as usual, but there is also an explicit @{text "transfer"} operation
  to force resynchronization with more substantial updates to the
  underlying theory.

  Entities derived in a proof context need to record logical
  requirements explicitly, since there is no separate context
  identification or symbolic inclusion as for theories.  For example,
  hypotheses used in primitive derivations (cf.\ \secref{sec:thms})
  are recorded separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to
  make double sure.  Results could still leak into an alien proof
  context due to programming errors, but Isabelle/Isar includes some
  extra validity checks in critical positions, notably at the end of a
  sub-proof.

  Proof contexts may be manipulated arbitrarily, although the common
  discipline is to follow block structure as a mental model: a given
  context is extended consecutively, and results are exported back
  into the original context.  Note that an Isar proof state models
  block-structured reasoning explicitly, using a stack of proof
  contexts internally.  For various technical reasons, the background
  theory of an Isar proof state must not be changed while the proof is
  still under construction!
*}

text %mlref {*
  \begin{mldecls}
  @{index_ML_type Proof.context} \\
  @{index_ML ProofContext.init: "theory -> Proof.context"} \\
  @{index_ML ProofContext.theory_of: "Proof.context -> theory"} \\
  @{index_ML ProofContext.transfer: "theory -> Proof.context -> Proof.context"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_type Proof.context} represents proof contexts.  Elements
  of this type are essentially pure values, with a sliding reference
  to the background theory.

  \item @{ML ProofContext.init}~@{text "thy"} produces a proof context
  derived from @{text "thy"}, initializing all data.

  \item @{ML ProofContext.theory_of}~@{text "ctxt"} selects the
  background theory from @{text "ctxt"}, dereferencing its internal
  @{ML_type theory_ref}.

  \item @{ML ProofContext.transfer}~@{text "thy ctxt"} promotes the
  background theory of @{text "ctxt"} to the super theory @{text
  "thy"}.

  \end{description}
*}
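
text %mlex {* The round trip between a theory and a proof context
  may be sketched as follows.  This is an added illustration with
  arbitrary value names; the transfer at the end is deliberately
  trivial, since the target is the very theory the context was
  initialized from.  The value @{text "same"} is expected to hold. *}

ML {*
  val thy = @{theory};

  (* initialize a proof context and recover its background theory *)
  val ctxt = ProofContext.init thy;
  val thy' = ProofContext.theory_of ctxt;
  val same = Theory.eq_thy (thy, thy');

  (* promote the background theory; here the same theory is used again *)
  val ctxt' = ProofContext.transfer thy ctxt;
*}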


subsection {* Generic contexts \label{sec:generic-context} *}

text {*
  A generic context is the disjoint sum of theory and proof
  contexts.  Occasionally, this enables uniform treatment of generic
  context data, typically extra-logical information.  Operations on
  generic contexts include the usual injections, partial selections,
  and combinators for lifting operations on either component of the
  disjoint sum.

  Moreover, there are total operations @{text "theory_of"} and @{text
  "proof_of"} to convert a generic context into either kind: a theory
  can always be selected from the sum, while a proof context might
  have to be constructed by an ad-hoc @{text "init"} operation, which
  incurs a small runtime overhead.
*}

text %mlref {*
  \begin{mldecls}
  @{index_ML_type Context.generic} \\
  @{index_ML Context.theory_of: "Context.generic -> theory"} \\
  @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_type Context.generic} is the disjoint sum of @{ML_type
  "theory"} and @{ML_type "Proof.context"}, with the datatype
  constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.

  \item @{ML Context.theory_of}~@{text "context"} always produces a
  theory from the generic @{text "context"}, using @{ML
  "ProofContext.theory_of"} as required.

  \item @{ML Context.proof_of}~@{text "context"} always produces a
  proof context from the generic @{text "context"}, using @{ML
  "ProofContext.init"} as required (note that this re-initializes the
  context data with each invocation).

  \end{description}
*}
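
text %mlex {* The injections and total conversions can be tried out
  as sketched below.  This is an added illustration: the function
  @{text "kind_of"} and all value names are hypothetical helpers, not
  part of the library. *}

ML {*
  (* case analysis over the two constructors of the disjoint sum *)
  fun kind_of (Context.Theory _) = "theory"
    | kind_of (Context.Proof _) = "proof";

  val thy = @{theory};

  (* injections into the generic context *)
  val context1 = Context.Theory thy;
  val context2 = Context.Proof (ProofContext.init thy);
  val kinds = map kind_of [context1, context2];

  (* total conversions: selection of the background theory, and
     ad-hoc initialization of a proof context *)
  val thy2 = Context.theory_of context2;
  val ctxt1 = Context.proof_of context1;
*}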


subsection {* Context data \label{sec:context-data} *}

text {* The main purpose of theory and proof contexts is to manage
  arbitrary (pure) data.  New data types can be declared incrementally
  at compile time.  There are separate declaration mechanisms for any
  of the three kinds of contexts: theory, proof, generic.

  \paragraph{Theory data} declarations need to implement the following
  SML signature:

  \medskip
  \begin{tabular}{ll}
  @{text "\<type> T"} & representing type \\
  @{text "\<val> empty: T"} & empty default value \\
  @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
  @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
  \end{tabular}
  \medskip

  \noindent The @{text "empty"} value acts as initial default for
  \emph{any} theory that does not declare actual data content; @{text
  "extend"} acts like a unitary version of @{text "merge"}.

  Implementing @{text "merge"} can be tricky.  The general idea is
  that @{text "merge (data\<^sub>1, data\<^sub>2)"} inserts those parts of @{text
  "data\<^sub>2"} into @{text "data\<^sub>1"} that are not yet present, while
  keeping the general order of things.  The @{ML Library.merge}
  function on plain lists may serve as canonical template.

  Particularly note that shared parts of the data must not be
  duplicated by naive concatenation, or a theory graph that is like a
  chain of diamonds would cause an exponential blowup!

  \paragraph{Proof context data} declarations need to implement the
  following SML signature:

  \medskip
  \begin{tabular}{ll}
  @{text "\<type> T"} & representing type \\
  @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
  \end{tabular}
  \medskip

  \noindent The @{text "init"} operation is supposed to produce a pure
  value from the given background theory and should be somehow
  ``immediate''.  Whenever a proof context is initialized, which
  happens frequently, the system invokes the @{text "init"}
  operation of \emph{all} theory data slots ever declared.

  \paragraph{Generic data} provides a hybrid interface for both theory
  and proof data.  The @{text "init"} operation for proof contexts is
  predefined to select the current data value from the background
  theory.

  \bigskip Any of these data declarations over type @{text "T"}
  results in an ML structure with the following signature:

  \medskip
  \begin{tabular}{ll}
  @{text "get: context \<rightarrow> T"} \\
  @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
  @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
  \end{tabular}
  \medskip

  \noindent These operations provide exclusive access for the
  particular kind of context (theory, proof, or generic context).
  This interface fully observes the ML discipline for types and
  scopes: there is no other way to access the corresponding data slot
  of a context.  By keeping these operations private, an Isabelle/ML
  module may maintain abstract values authentically.
*}
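
text %mlex {* The declaration schemes for proof data and generic
  data may be sketched as follows.  This is an added illustration,
  not a definitive template: the structures @{text "Counter"} and
  @{text "Tags"} are hypothetical, and the generic data specification
  is assumed to take the same form as the theory data signature given
  above.  The merge follows the advice to use @{ML Library.merge}
  instead of plain concatenation.  With the @{text "init"} operation
  given here, @{text "n0"} is @{text "0"} and @{text "n1"} is @{text
  "1"}. *}

ML {*
  (* hypothetical proof data: a counter that starts at 0 in every
     freshly initialized proof context *)
  structure Counter = Proof_Data
  (
    type T = int;
    fun init _ = 0;
  );

  val ctxt = ProofContext.init @{theory};
  val ctxt' = Counter.map (fn n => n + 1) ctxt;
  val n0 = Counter.get ctxt;
  val n1 = Counter.get ctxt';

  (* hypothetical generic data: a list of tags, merged without
     duplicating shared entries *)
  structure Tags = Generic_Data
  (
    type T = string list;
    val empty = [];
    val extend = I;
    fun merge (xs, ys) = Library.merge (op =) (xs, ys);
  );

  val context = Tags.map (cons "a") (Context.Theory @{theory});
  val tags = Tags.get context;
*}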

text %mlref {*
  \begin{mldecls}
  @{index_ML_functor Theory_Data} \\
  @{index_ML_functor Proof_Data} \\
  @{index_ML_functor Generic_Data} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_functor Theory_Data}@{text "(spec)"} declares data for
  type @{ML_type theory} according to the specification provided as
  argument structure.  The resulting structure provides data init and
  access operations as described above.

  \item @{ML_functor Proof_Data}@{text "(spec)"} is analogous to
  @{ML_functor Theory_Data} for type @{ML_type Proof.context}.

  \item @{ML_functor Generic_Data}@{text "(spec)"} is analogous to
  @{ML_functor Theory_Data} for type @{ML_type Context.generic}.

  \end{description}
*}


section {* Names \label{sec:names} *}

text {* In principle, a name is just a string, but there are various
  conventions for representing additional structure.  For example,
  ``@{text "Foo.bar.baz"}'' is considered as a qualified name
  consisting of three basic name components.  The individual
  constituents of a name may have further substructure, e.g.\ the
  string ``\verb,\,\verb,<alpha>,'' is encoded as a single symbol.
*}


subsection {* Strings of symbols *}

text {* A \emph{symbol} constitutes the smallest textual unit in
  Isabelle --- raw ML characters are normally not encountered at all!
  Isabelle strings consist of a sequence of symbols, represented as a
  packed string or an exploded list of strings.  Each symbol is in
  itself a small string, which has one of the following forms:

  \begin{enumerate}

  \item a single ASCII character ``@{text "c"}'' or raw byte in the
  range of 128\dots 255, for example ``\verb,a,'',

  \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'',
  for example ``\verb,\,\verb,<alpha>,'',

  \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'',
  for example ``\verb,\,\verb,<^bold>,'',

  \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,''
  where @{text text} consists of printable characters excluding
  ``\verb,.,'' and ``\verb,>,'', for example
  ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'',

  \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text
  n}\verb,>,'' where @{text n} consists of digits, for example
  ``\verb,\,\verb,<^raw42>,''.

  \end{enumerate}

  \noindent The @{text "ident"} syntax for symbol names is @{text
  "letter (letter | digit)\<^sup>*"}, where @{text "letter =
  A..Za..z"} and @{text "digit = 0..9"}.  There are infinitely many
  regular symbols and control symbols, but a fixed collection of
  standard symbols is treated specifically.  For example,
  ``\verb,\,\verb,<alpha>,'' is classified as a letter, which means it
  may occur within regular Isabelle identifiers.

  Since the character set underlying Isabelle symbols is 7-bit ASCII
  and 8-bit characters are passed through transparently, Isabelle can
  also process Unicode/UCS data in UTF-8 encoding.\footnote{When
  counting precise source positions internally, bytes in the range of
  128\dots 191 are ignored.  In UTF-8 encoding, this interval covers
  the additional trailer bytes, so Isabelle happens to count Unicode
  characters here, not bytes in memory.  In ISO-Latin encoding, the
  ignored range merely includes some extra punctuation characters that
  even have replacements within the standard collection of Isabelle
  symbols; the accented letters range is counted properly.} Unicode
  provides its own collection of mathematical symbols, but within the
  core Isabelle/ML world there is no link to the standard collection
  of Isabelle regular symbols.

  \medskip Output of Isabelle symbols depends on the print mode
  (\secref{print-mode}).  For example, the standard {\LaTeX} setup of
  the Isabelle document preparation system would present
  ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and
  ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text
  "\<^bold>\<alpha>"}.  On-screen rendering usually works by mapping a finite
  subset of Isabelle symbols to suitable Unicode characters.
*}

text %mlref {*
  \begin{mldecls}
  @{index_ML_type "Symbol.symbol": string} \\
  @{index_ML Symbol.explode: "string -> Symbol.symbol list"} \\
  @{index_ML Symbol.is_letter: "Symbol.symbol -> bool"} \\
  @{index_ML Symbol.is_digit: "Symbol.symbol -> bool"} \\
  @{index_ML Symbol.is_quasi: "Symbol.symbol -> bool"} \\
  @{index_ML Symbol.is_blank: "Symbol.symbol -> bool"} \\
  \end{mldecls}
  \begin{mldecls}
  @{index_ML_type "Symbol.sym"} \\
  @{index_ML Symbol.decode: "Symbol.symbol -> Symbol.sym"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_type "Symbol.symbol"} represents individual Isabelle
  symbols.

  \item @{ML "Symbol.explode"}~@{text "str"} produces a symbol list
  from the packed form.  This function supersedes @{ML
  "String.explode"} for virtually all purposes of manipulating text in
  Isabelle!\footnote{The runtime overhead for exploded strings is
  mainly that of the list structure: individual symbols that happen to
  be a singleton string --- which is the most common case --- do not
  require extra memory in Poly/ML.}

  \item @{ML "Symbol.is_letter"}, @{ML "Symbol.is_digit"}, @{ML
  "Symbol.is_quasi"}, @{ML "Symbol.is_blank"} classify standard
  symbols according to fixed syntactic conventions of Isabelle, cf.\
  \cite{isabelle-isar-ref}.

  \item @{ML_type "Symbol.sym"} is a concrete datatype that represents
  the different kinds of symbols explicitly, with constructors @{ML
  "Symbol.Char"}, @{ML "Symbol.Sym"}, @{ML "Symbol.Ctrl"}, @{ML
  "Symbol.Raw"}.

  \item @{ML "Symbol.decode"} converts the string representation of a
  symbol into the datatype version.

  \end{description}

  \paragraph{Historical note.} In the original SML90 standard the
  primitive ML type @{ML_type char} did not exist, and the basic @{ML
  "explode: string -> string list"} operation would produce a list of
  singleton strings as in Isabelle/ML today.  When SML97 came out,
  Isabelle ignored its slightly anachronistic 8-bit characters, but
  the idea of exploding a string into a list of small strings was
  extended to ``symbols'' as explained above.  Thus Isabelle sources
  can refer to an infinite store of user-defined symbols, without
  having to worry about the multitude of Unicode encodings.
*}
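
text {*
  The following lines are a minimal sketch of how the symbol
  operations listed above might be used together.  The example string
  and all value names are assumptions made for illustration only; the
  backslash is escaped according to plain SML string syntax.
*}

```sml
ML {*
  (* hypothetical example string: letter a, symbol alpha, letter b *)
  val packed = "a\\<alpha>b";

  (* three Isabelle symbols: "a", "\\<alpha>", "b" *)
  val symbols = Symbol.explode packed;

  (* ten raw ML characters, with the symbol torn apart *)
  val chars = String.explode packed;

  (* classification of symbols, and the explicit datatype view *)
  val letters = filter Symbol.is_letter symbols;
  val decoded = map Symbol.decode symbols;
*}
```

text {*
  The contrast between @{text "symbols"} and @{text "chars"} shows why
  text manipulation in Isabelle is better done on exploded symbol
  lists than on raw characters.
*}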


subsection {* Basic names \label{sec:basic-names} *}

text {*
  A \emph{basic name} essentially consists of a single Isabelle
  identifier.  There are conventions to mark separate classes of basic
  names, by attaching a suffix of underscores: one underscore means
  \emph{internal name}, two underscores mean \emph{Skolem name},
  three underscores mean \emph{internal Skolem name}.

  For example, the basic name @{text "foo"} has the internal version
  @{text "foo_"}, the Skolem version @{text "foo__"}, and the internal
  Skolem version @{text "foo___"}.

  These special versions provide copies of the basic name space, apart
  from anything that normally appears in the user text.  For example,
  system-generated variables in Isar proof contexts are usually marked
  as internal, which prevents mysterious name references like @{text
  "xaa"} from appearing in the text.

  \medskip Manipulating binding scopes often requires on-the-fly
  renamings.  A \emph{name context} contains a collection of already
  used names.  The @{text "declare"} operation adds names to the
  context.

  The @{text "invents"} operation derives a number of fresh names from
  a given starting point.  For example, the first three names derived
  from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.

  The @{text "variants"} operation produces fresh names by
  incrementing tentative names as base-26 numbers (with digits @{text
  "a..z"}) until all clashes are resolved.  For example, name @{text
  "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
  "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
  step picks the next unused variant from this sequence.
*}

text %mlref {*
  \begin{mldecls}
  @{index_ML Name.internal: "string -> string"} \\
  @{index_ML Name.skolem: "string -> string"} \\
  \end{mldecls}
  \begin{mldecls}
  @{index_ML_type Name.context} \\
  @{index_ML Name.context: Name.context} \\
  @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
  @{index_ML Name.invents: "Name.context -> string -> int -> string list"} \\
  @{index_ML Name.variants: "string list -> Name.context -> string list * Name.context"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML Name.internal}~@{text "name"} produces an internal name
  by adding one underscore.

  \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
  adding two underscores.

  \item @{ML_type Name.context} represents the context of already used
  names; the initial value is @{ML "Name.context"}.

  \item @{ML Name.declare}~@{text "name"} enters a used name into the
  context.

  \item @{ML Name.invents}~@{text "context name n"} produces @{text
  "n"} fresh names derived from @{text "name"}.

  \item @{ML Name.variants}~@{text "names context"} produces fresh
  variants of @{text "names"}; the result is entered into the context.

  \end{description}
*}
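
text {*
  As a small sketch of the operations on basic names, the following
  lines mark a name as internal or Skolem and produce fresh names from
  a name context.  The names @{text "foo"} and @{text "a"} and all
  value names are merely illustrative assumptions.
*}

```sml
ML {*
  (* marking a basic name by a suffix of underscores *)
  val internal = Name.internal "foo";   (* "foo_" *)
  val skolem = Name.skolem "foo";       (* "foo__" *)

  (* three fresh names derived from "a" in the initial context *)
  val invented = Name.invents Name.context "a" 3;   (* ["a", "b", "c"] *)

  (* "foo" is already used here, so both requests need to be renamed;
     the fresh variants are taken from the sequence described above *)
  val ctxt = Name.declare "foo" Name.context;
  val (variants, ctxt') = Name.variants ["foo", "foo"] ctxt;
*}
```

text {*
  Note that the resulting context @{text "ctxt'"} already contains the
  fresh variants, so it can be passed on to subsequent renaming steps.
*}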


subsection {* Indexed names *}

text {*
  An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
  name and a natural number.  This representation allows efficient
  renaming by incrementing the second component only.  The canonical
  way to rename two collections of indexnames apart from each other is
  this: determine the maximum index @{text "maxidx"} of the first
  collection, then increment all indexes of the second collection by
  @{text "maxidx + 1"}; the maximum index of an empty collection is
  @{text "-1"}.

  Occasionally, basic names and indexed names are injected into the
  same pair type: the (improper) indexname @{text "(x, -1)"} is used
  to encode basic names.

  \medskip Isabelle syntax observes the following rules for
  representing an indexname @{text "(x, i)"} as a packed string:

  \begin{itemize}

  \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},

  \item @{text "?xi"} if @{text "x"} does not end with a digit,

  \item @{text "?x.i"} otherwise.

  \end{itemize}

  Indexnames may acquire large index numbers over time.  Results are
  normalized towards @{text "0"} at certain checkpoints, notably at
  the end of a proof.  This works by producing variants of the
  corresponding basic name components.  For example, the collection
  @{text "?x1, ?x7, ?x42"} becomes @{text "?x, ?xa, ?xb"}.
*}
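
text {*
  The three rules for packed indexnames can be sketched as a small ML
  function.  The helper @{text "pack_indexname"} is hypothetical and
  not part of the Isabelle library; it only considers ASCII digits and
  proper (non-negative) indexes.
*}

```sml
ML {*
  (* hypothetical helper, written here for illustration only *)
  fun pack_indexname (x: string, i: int) =
    let
      val ends_with_digit =
        size x > 0 andalso Char.isDigit (String.sub (x, size x - 1));
    in
      if ends_with_digit then "?" ^ x ^ "." ^ Int.toString i
      else if i = 0 then "?" ^ x
      else "?" ^ x ^ Int.toString i
    end;

  val s1 = pack_indexname ("x", 0);    (* "?x" *)
  val s2 = pack_indexname ("x", 7);    (* "?x7" *)
  val s3 = pack_indexname ("x1", 0);   (* "?x1.0" *)
*}
```

text {*
  The dot in the last case keeps the digits of the basic name apart
  from the digits of the index, so the packed form stays unambiguous.
*}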

text %mlref {*
  \begin{mldecls}
  @{index_ML_type indexname} \\
  \end{mldecls}

  \begin{description}

  \item @{ML_type indexname} represents indexed names.  This is an
  abbreviation for @{ML_type "string * int"}.  The second component is
  usually non-negative, except for situations where @{text "(x, -1)"}
  is used to embed basic names into this type.

  \end{description}
*}
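
text {*
  The canonical way of renaming two collections of indexnames apart
  can be sketched as follows.  The helpers @{text "max_index"} and
  @{text "shift_indexes"} as well as the example collections are
  assumptions made for illustration; they are not taken from the
  Isabelle library.
*}

```sml
ML {*
  (* hypothetical helper: maximum index, ~1 for the empty collection *)
  fun max_index (names: indexname list) =
    fold (fn (_, i) => fn m => Int.max (i, m)) names ~1;

  (* hypothetical helper: increment all indexes by k *)
  fun shift_indexes k (names: indexname list) =
    map (fn (x, i) => (x, i + k)) names;

  val first = [("x", 0), ("y", 2)];
  val second = [("x", 0), ("z", 1)];

  (* the maximum index of first is 2, so second is shifted by 3 *)
  val second' = shift_indexes (max_index first + 1) second;
  (* [("x", 3), ("z", 4)] *)
*}
```

text {*
  After the shift, no indexname of the second collection coincides
  with one of the first, although both use the basic name @{text "x"}.
*}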


subsection {* Qualified names and name spaces *}

text {*
  A \emph{qualified name} consists of a non-empty sequence of basic
  name components.  The packed representation uses a dot as separator,
  as in ``@{text "A.b.c"}''.  The last component is called \emph{base}
  name, the remaining prefix \emph{qualifier} (which may be empty).
  The idea of qualified names is to encode nested structures by
  recording the access paths as qualifiers.  For example, an item
  named ``@{text "A.b.c"}'' may be understood as a local entity @{text
  "c"}, within a local structure @{text "b"}, within a global
  structure @{text "A"}.  Typically, name space hierarchies consist of
  1--2 levels of qualification, but this need not always be so.

  The empty name is commonly used as an indication of unnamed
  entities, whenever this makes any sense.  The basic operations on
  qualified names are smart enough to pass through such improper names
  unchanged.

  \medskip A @{text "naming"} policy tells how to turn a name
  specification into a fully qualified internal name (by the @{text
  "full"} operation), and how fully qualified names may be accessed
  externally.  For example, the default naming policy is to prefix an
  implicit path: @{text "full x"} produces @{text "path.x"}, and the
  standard accesses for @{text "path.x"} include both @{text "x"} and
  @{text "path.x"}.  Normally, the naming is implicit in the theory or
  proof context; there are separate versions of the corresponding
  operations.

  \medskip A @{text "name space"} manages a collection of fully
  internalized names, together with a mapping between external names
  and internal names (in both directions).  The corresponding @{text
  "intern"} and @{text "extern"} operations are mostly used for
  parsing and printing only!  The @{text "declare"} operation augments
  a name space according to the accesses determined by the naming
  policy.

  \medskip As a general principle, there is a separate name space for
  each kind of formal entity, e.g.\ logical constant, type
  constructor, type class, theorem.  It is usually clear from the
  occurrence in concrete syntax (or from the scope) which kind of
  entity a name refers to.  For example, the very same name @{text
  "c"} may be used uniformly for a constant, type constructor, and
  type class.

  There are common schemes to name theorems systematically, according
  to the name of the main logical entity involved, e.g.\ @{text
  "c.intro"} for a canonical theorem related to constant @{text "c"}.
  This technique of mapping names from one space into another requires
  some care in order to avoid conflicts.  In particular, theorem names
  derived from a type constructor or type class are better suffixed in
  addition to the usual qualification, e.g.\ @{text "c_type.intro"}
  and @{text "c_class.intro"} for theorems related to type @{text "c"}
  and class @{text "c"}, respectively.
*}
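
text {*
  The decomposition of a qualified name into qualifier and base name
  can be sketched with the operations of structure @{text
  "Long_Name"}.  The example name and the value names are illustrative
  assumptions.
*}

```sml
ML {*
  val name = "A.b.c";

  val base = Long_Name.base_name name;       (* "c" *)
  val qual = Long_Name.qualifier name;       (* "A.b" *)
  val comps = Long_Name.explode name;        (* ["A", "b", "c"] *)
  val again = Long_Name.implode comps;       (* "A.b.c" *)
  val joined = Long_Name.append qual base;   (* "A.b.c" *)
*}
```

text {*
  Working on the packed string via these operations avoids ad-hoc
  splitting at dots in user code.
*}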

text %mlref {*
  \begin{mldecls}
  @{index_ML Long_Name.base_name: "string -> string"} \\
  @{index_ML Long_Name.qualifier: "string -> string"} \\
  @{index_ML Long_Name.append: "string -> string -> string"} \\
  @{index_ML Long_Name.implode: "string list -> string"} \\
  @{index_ML Long_Name.explode: "string -> string list"} \\
  \end{mldecls}
  \begin{mldecls}
  @{index_ML_type Name_Space.naming} \\
  @{index_ML Name_Space.default_naming: Name_Space.naming} \\
  @{index_ML Name_Space.add_path: "string -> Name_Space.naming -> Name_Space.naming"} \\
  @{index_ML Name_Space.full_name: "Name_Space.naming -> binding -> string"} \\
  \end{mldecls}
  \begin{mldecls}
  @{index_ML_type Name_Space.T} \\
  @{index_ML Name_Space.empty: "string -> Name_Space.T"} \\
  @{index_ML Name_Space.merge: "Name_Space.T * Name_Space.T -> Name_Space.T"} \\
  @{index_ML Name_Space.declare: "bool -> Name_Space.naming -> binding -> Name_Space.T ->
  string * Name_Space.T"} \\
  @{index_ML Name_Space.intern: "Name_Space.T -> string -> string"} \\
  @{index_ML Name_Space.extern: "Name_Space.T -> string -> string"} \\
  \end{mldecls}

  \begin{description}

  \item @{ML Long_Name.base_name}~@{text "name"} returns the base name of a
  qualified name.

  \item @{ML Long_Name.qualifier}~@{text "name"} returns the qualifier
  of a qualified name.

  \item @{ML Long_Name.append}~@{text "name\<^isub>1 name\<^isub>2"}
  appends two qualified names.

  \item @{ML Long_Name.implode}~@{text "names"} and @{ML
  Long_Name.explode}~@{text "name"} convert between the packed string
  representation and the explicit list form of qualified names.

  \item @{ML_type Name_Space.naming} represents the abstract concept of
  a naming policy.

  \item @{ML Name_Space.default_naming} is the default naming policy.
  In a theory context, this is usually augmented by a path prefix
  consisting of the theory name.

  \item @{ML Name_Space.add_path}~@{text "path naming"} augments the
  naming policy by extending its path component.

  \item @{ML Name_Space.full_name}~@{text "naming binding"} turns a
  name binding (usually a basic name) into the fully qualified
  internal name, according to the given naming policy.

  \item @{ML_type Name_Space.T} represents name spaces.

  \item @{ML Name_Space.empty}~@{text "kind"} and @{ML Name_Space.merge}~@{text
  "(space\<^isub>1, space\<^isub>2)"} are the canonical operations for
  maintaining name spaces according to theory data management
  (\secref{sec:context-data}); @{text "kind"} is a formal comment
  to characterize the purpose of a name space.

  \item @{ML Name_Space.declare}~@{text "strict naming binding
  space"} enters a name binding as fully qualified internal name into
  the name space, with external accesses determined by the naming
  policy.

  \item @{ML Name_Space.intern}~@{text "space name"} internalizes a
  (partially qualified) external name.

  This operation is mostly for parsing!  Note that fully qualified
  names stemming from declarations are produced via @{ML
  "Name_Space.full_name"} and @{ML "Name_Space.declare"}
  (or their derivatives for @{ML_type theory} and
  @{ML_type Proof.context}).

  \item @{ML Name_Space.extern}~@{text "space name"} externalizes a
  (fully qualified) internal name.

  This operation is mostly for printing!  User code should not rely on
  the precise result too much.

  \end{description}
*}
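
text {*
  The interplay of naming policies and name spaces may be sketched as
  follows, using the signatures given above.  The path @{text "A"},
  the kind @{text "example"}, the name @{text "x"}, and the choice of
  @{text "true"} for the strictness flag are assumptions made for
  illustration.  Bindings are produced by @{text "Binding.name"},
  which is not described in this section.
*}

```sml
ML {*
  (* naming policy with the hypothetical path prefix "A" *)
  val naming = Name_Space.add_path "A" Name_Space.default_naming;

  (* fully qualified internal name for the basic name "x" *)
  val full = Name_Space.full_name naming (Binding.name "x");

  (* declare the same binding in an initially empty name space *)
  val (full', space) =
    Name_Space.declare true naming (Binding.name "x")
      (Name_Space.empty "example");

  (* from external to internal names, and back *)
  val internal = Name_Space.intern space "x";
  val external = Name_Space.extern space full';
*}
```

text {*
  In line with the remarks above, the result of the @{text "extern"}
  step is meant for printing and should not be relied upon in detail.
*}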

end