more standard document preparation within session context;
more uniform document build;
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Base.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,8 @@
+theory Base
+imports Main
+begin
+
+ML_file "../antiquote_setup.ML"
+setup {* Antiquote_Setup.setup *}
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Eq.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,85 @@
+theory Eq
+imports Base
+begin
+
+chapter {* Equational reasoning *}
+
+text {* Equality is one of the most fundamental concepts of
+ mathematics. The Isabelle/Pure logic (\chref{ch:logic}) provides a
+ builtin relation @{text "\<equiv> :: \<alpha> \<Rightarrow> \<alpha> \<Rightarrow> prop"} that expresses equality
+ of arbitrary terms (or propositions) at the framework level, as
+ expressed by certain basic inference rules (\secref{sec:eq-rules}).
+
+ Equational reasoning means to replace equals by equals, using
+ reflexivity and transitivity to form chains of replacement steps,
+ and congruence rules to access sub-structures. Conversions
+ (\secref{sec:conv}) provide a convenient framework to compose basic
+ equational steps to build specific equational reasoning tools.
+
+ Higher-order matching is able to provide suitable instantiations for
+  given equality rules, which leads to the versatile concept of
+ @{text "\<lambda>"}-term rewriting (\secref{sec:rewriting}). Internally
+ this is based on the general-purpose Simplifier engine of Isabelle,
+ which is more specific and more efficient than plain conversions.
+
+ Object-logics usually introduce specific notions of equality or
+  equivalence, and relate them to the Pure equality.  This makes it
+  possible to re-use the Pure tools for equational reasoning for
+  particular object-logic connectives as well.
+*}
+
+
+section {* Basic equality rules \label{sec:eq-rules} *}
+
+text {* FIXME *}
+
+
+section {* Conversions \label{sec:conv} *}
+
+text {* FIXME *}
+
+
+section {* Rewriting \label{sec:rewriting} *}
+
+text {* Rewriting normalizes a given term (theorem or goal) by
+ replacing instances of given equalities @{text "t \<equiv> u"} in subterms.
+ Rewriting continues until no rewrites are applicable to any subterm.
+ This may be used to unfold simple definitions of the form @{text "f
+ x\<^sub>1 \<dots> x\<^sub>n \<equiv> u"}, but is slightly more general than that.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML rewrite_rule: "thm list -> thm -> thm"} \\
+ @{index_ML rewrite_goals_rule: "thm list -> thm -> thm"} \\
+ @{index_ML rewrite_goal_tac: "thm list -> int -> tactic"} \\
+ @{index_ML rewrite_goals_tac: "thm list -> tactic"} \\
+ @{index_ML fold_goals_tac: "thm list -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML rewrite_rule}~@{text "rules thm"} rewrites the whole
+ theorem by the given rules.
+
+ \item @{ML rewrite_goals_rule}~@{text "rules thm"} rewrites the
+  outer premises of the given theorem.  Interpreting the same theorem
+  as a goal state (\secref{sec:tactical-goals}), this means to rewrite
+  all subgoals (in the same manner as @{ML rewrite_goals_tac}).
+
+ \item @{ML rewrite_goal_tac}~@{text "rules i"} rewrites subgoal
+ @{text "i"} by the given rewrite rules.
+
+ \item @{ML rewrite_goals_tac}~@{text "rules"} rewrites all subgoals
+ by the given rewrite rules.
+
+ \item @{ML fold_goals_tac}~@{text "rules"} essentially uses @{ML
+ rewrite_goals_tac} with the symmetric form of each member of @{text
+ "rules"}, re-ordered to fold longer expression first. This supports
+ to idea to fold primitive definitions that appear in expended form
+ in the proof state.
+
+ \end{description}
+*}
+
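+text %mlex {* The following sketch (using ad-hoc lemmas purely for
+  illustration) normalizes a theorem with a meta-equation via @{ML
+  rewrite_rule}: *}
+
+ML_val {*
+  (*a meta-equation to be used as rewrite rule*)
+  val eq = @{lemma "x + 0 = (x::nat)" by simp} RS @{thm eq_reflection};
+  (*a theorem containing instances of its left-hand side*)
+  val th = @{lemma "(a + 0) + 0 = (a::nat)" by simp};
+  (*replace those instances throughout the whole theorem*)
+  val th' = rewrite_rule [eq] th;
+*}
+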
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Integration.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,307 @@
+theory Integration
+imports Base
+begin
+
+chapter {* System integration *}
+
+section {* Isar toplevel \label{sec:isar-toplevel} *}
+
+text {* The Isar toplevel may be considered the central hub of the
+ Isabelle/Isar system, where all key components and sub-systems are
+ integrated into a single read-eval-print loop of Isar commands,
+ which also incorporates the underlying ML compiler.
+
+ Isabelle/Isar departs from the original ``LCF system architecture''
+ where ML was really The Meta Language for defining theories and
+ conducting proofs. Instead, ML now only serves as the
+ implementation language for the system (and user extensions), while
+ the specific Isar toplevel supports the concepts of theory and proof
+ development natively. This includes the graph structure of theories
+ and the block structure of proofs, support for unlimited undo,
+ facilities for tracing, debugging, timing, profiling etc.
+
+ \medskip The toplevel maintains an implicit state, which is
+ transformed by a sequence of transitions -- either interactively or
+ in batch-mode. In interactive mode, Isar state transitions are
+ encapsulated as safe transactions, such that both failure and undo
+ are handled conveniently without destroying the underlying draft
+ theory (cf.~\secref{sec:context-theory}). In batch mode,
+ transitions operate in a linear (destructive) fashion, such that
+ error conditions abort the present attempt to construct a theory or
+ proof altogether.
+
+ The toplevel state is a disjoint sum of empty @{text toplevel}, or
+ @{text theory}, or @{text proof}. On entering the main Isar loop we
+ start with an empty toplevel. A theory is commenced by giving a
+ @{text \<THEORY>} header; within a theory we may issue theory
+ commands such as @{text \<DEFINITION>}, or state a @{text
+ \<THEOREM>} to be proven. Now we are within a proof state, with a
+ rich collection of Isar proof commands for structured proof
+ composition, or unstructured proof scripts. When the proof is
+ concluded we get back to the theory, which is then updated by
+ storing the resulting fact. Further theory declarations or theorem
+ statements with proofs may follow, until we eventually conclude the
+ theory development by issuing @{text \<END>}. The resulting theory
+ is then stored within the theory database and we are back to the
+ empty toplevel.
+
+ In addition to these proper state transformations, there are also
+ some diagnostic commands for peeking at the toplevel state without
+ modifying it (e.g.\ \isakeyword{thm}, \isakeyword{term},
+ \isakeyword{print-cases}).
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Toplevel.state} \\
+ @{index_ML Toplevel.UNDEF: "exn"} \\
+ @{index_ML Toplevel.is_toplevel: "Toplevel.state -> bool"} \\
+ @{index_ML Toplevel.theory_of: "Toplevel.state -> theory"} \\
+ @{index_ML Toplevel.proof_of: "Toplevel.state -> Proof.state"} \\
+ @{index_ML Toplevel.debug: "bool Unsynchronized.ref"} \\
+ @{index_ML Toplevel.timing: "bool Unsynchronized.ref"} \\
+ @{index_ML Toplevel.profiling: "int Unsynchronized.ref"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Toplevel.state} represents Isar toplevel
+ states, which are normally manipulated through the concept of
+ toplevel transitions only (\secref{sec:toplevel-transition}). Also
+ note that a raw toplevel state is subject to the same linearity
+ restrictions as a theory context (cf.~\secref{sec:context-theory}).
+
+ \item @{ML Toplevel.UNDEF} is raised for undefined toplevel
+ operations. Many operations work only partially for certain cases,
+ since @{ML_type Toplevel.state} is a sum type.
+
+ \item @{ML Toplevel.is_toplevel}~@{text "state"} checks for an empty
+ toplevel state.
+
+ \item @{ML Toplevel.theory_of}~@{text "state"} selects the
+ background theory of @{text "state"}, raises @{ML Toplevel.UNDEF}
+ for an empty toplevel state.
+
+ \item @{ML Toplevel.proof_of}~@{text "state"} selects the Isar proof
+ state if available, otherwise raises @{ML Toplevel.UNDEF}.
+
+ \item @{ML "Toplevel.debug := true"} makes the toplevel print further
+ details about internal error conditions, exceptions being raised
+ etc.
+
+ \item @{ML "Toplevel.timing := true"} makes the toplevel print timing
+ information for each Isar command being executed.
+
+ \item @{ML Toplevel.profiling}~@{ML_text ":="}~@{text "n"} controls
+ low-level profiling of the underlying ML runtime system. For
+ Poly/ML, @{text "n = 1"} means time and @{text "n = 2"} space
+ profiling.
+
+ \end{description}
+*}
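+
+text %mlex {* A minimal sketch of adjusting these flags from ML; the
+  particular values are for illustration only. *}
+
+ML_val {*
+  Toplevel.debug := true;      (*report internal error details*)
+  Toplevel.timing := true;     (*print timing of Isar commands*)
+  Toplevel.profiling := 0;     (*no ML profiling*)
+*}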
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "Isar.state"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ \begin{description}
+
+ \item @{text "@{Isar.state}"} refers to Isar toplevel state at that
+ point --- as abstract value.
+
+ This only works for diagnostic ML commands, such as @{command
+ ML_val} or @{command ML_command}.
+
+ \end{description}
+*}
+
+
+subsection {* Toplevel transitions \label{sec:toplevel-transition} *}
+
+text {*
+ An Isar toplevel transition consists of a partial function on the
+ toplevel state, with additional information for diagnostics and
+ error reporting: there are fields for command name, source position,
+ optional source text, as well as flags for interactive-only commands
+ (which issue a warning in batch-mode), printing of result state,
+ etc.
+
+ The operational part is represented as the sequential union of a
+ list of partial functions, which are tried in turn until the first
+ one succeeds. This acts like an outer case-expression for various
+ alternative state transitions. For example, \isakeyword{qed} works
+  differently for local proofs vs.\ the global ending of the main
+ proof.
+
+ Toplevel transitions are composed via transition transformers.
+ Internally, Isar commands are put together from an empty transition
+ extended by name and source position. It is then left to the
+ individual command parser to turn the given concrete syntax into a
+ suitable transition transformer that adjoins actual operations on a
+ theory or proof state etc.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Toplevel.print: "Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.no_timing: "Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.keep: "(Toplevel.state -> unit) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.theory: "(theory -> theory) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.theory_to_proof: "(theory -> Proof.state) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.proof: "(Proof.state -> Proof.state) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.proofs: "(Proof.state -> Proof.state Seq.seq) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ @{index_ML Toplevel.end_proof: "(bool -> Proof.state -> Proof.context) ->
+ Toplevel.transition -> Toplevel.transition"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Toplevel.print}~@{text "tr"} sets the print flag, which
+ causes the toplevel loop to echo the result state (in interactive
+ mode).
+
+ \item @{ML Toplevel.no_timing}~@{text "tr"} indicates that the
+ transition should never show timing information, e.g.\ because it is
+ a diagnostic command.
+
+ \item @{ML Toplevel.keep}~@{text "tr"} adjoins a diagnostic
+ function.
+
+ \item @{ML Toplevel.theory}~@{text "tr"} adjoins a theory
+ transformer.
+
+ \item @{ML Toplevel.theory_to_proof}~@{text "tr"} adjoins a global
+ goal function, which turns a theory into a proof state. The theory
+ may be changed before entering the proof; the generic Isar goal
+ setup includes an argument that specifies how to apply the proven
+ result to the theory, when the proof is finished.
+
+ \item @{ML Toplevel.proof}~@{text "tr"} adjoins a deterministic
+ proof command, with a singleton result.
+
+ \item @{ML Toplevel.proofs}~@{text "tr"} adjoins a general proof
+ command, with zero or more result states (represented as a lazy
+ list).
+
+ \item @{ML Toplevel.end_proof}~@{text "tr"} adjoins a concluding
+  proof command, which returns the resulting theory, after storing the
+ resulting facts in the context etc.
+
+ \end{description}
+*}
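+
+text %mlex {* A minimal sketch of a transition transformer: @{ML
+  Toplevel.keep} adjoins a diagnostic function that merely prints the
+  name of the background theory. *}
+
+ML {*
+  val print_theory_name =
+    Toplevel.keep (fn state =>
+      writeln (Context.theory_name (Toplevel.theory_of state)));
+*}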
+
+
+section {* Theory database \label{sec:theory-database} *}
+
+text {*
+ The theory database maintains a collection of theories, together
+ with some administrative information about their original sources,
+ which are held in an external store (i.e.\ some directory within the
+ regular file system).
+
+ The theory database is organized as a directed acyclic graph;
+ entries are referenced by theory name. Although some additional
+  interfaces allow including a directory specification as well, this
+ is only a hint to the underlying theory loader. The internal theory
+ name space is flat!
+
+ Theory @{text A} is associated with the main theory file @{text
+ A}\verb,.thy,, which needs to be accessible through the theory
+ loader path. Any number of additional ML source files may be
+ associated with each theory, by declaring these dependencies in the
+ theory header as @{text \<USES>}, and loading them consecutively
+ within the theory context. The system keeps track of incoming ML
+ sources and associates them with the current theory.
+
+ The basic internal actions of the theory database are @{text
+ "update"} and @{text "remove"}:
+
+ \begin{itemize}
+
+ \item @{text "update A"} introduces a link of @{text "A"} with a
+ @{text "theory"} value of the same name; it asserts that the theory
+ sources are now consistent with that value;
+
+ \item @{text "remove A"} deletes entry @{text "A"} from the theory
+ database.
+
+ \end{itemize}
+
+ These actions are propagated to sub- or super-graphs of a theory
+ entry as expected, in order to preserve global consistency of the
+ state of all loaded theories with the sources of the external store.
+ This implies certain causalities between actions: @{text "update"}
+ or @{text "remove"} of an entry will @{text "remove"} all
+ descendants.
+
+ \medskip There are separate user-level interfaces to operate on the
+ theory database directly or indirectly. The primitive actions then
+ just happen automatically while working with the system. In
+ particular, processing a theory header @{text "\<THEORY> A
+ \<IMPORTS> B\<^sub>1 \<dots> B\<^sub>n \<BEGIN>"} ensures that the
+ sub-graph of the collective imports @{text "B\<^sub>1 \<dots> B\<^sub>n"}
+ is up-to-date, too. Earlier theories are reloaded as required, with
+ @{text update} actions proceeding in topological order according to
+  theory dependencies.  There may also be a wave of implied @{text
+ remove} actions for derived theory nodes until a stable situation
+ is achieved eventually.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML use_thy: "string -> unit"} \\
+ @{index_ML use_thys: "string list -> unit"} \\
+ @{index_ML Thy_Info.get_theory: "string -> theory"} \\
+ @{index_ML Thy_Info.remove_thy: "string -> unit"} \\[1ex]
+ @{index_ML Thy_Info.register_thy: "theory -> unit"} \\[1ex]
+ @{ML_text "datatype action = Update | Remove"} \\
+ @{index_ML Thy_Info.add_hook: "(Thy_Info.action -> string -> unit) -> unit"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML use_thy}~@{text A} ensures that theory @{text A} is fully
+ up-to-date wrt.\ the external file store, reloading outdated
+ ancestors as required. In batch mode, the simultaneous @{ML
+ use_thys} should be used exclusively.
+
+ \item @{ML use_thys} is similar to @{ML use_thy}, but handles
+ several theories simultaneously. Thus it acts like processing the
+ import header of a theory, without performing the merge of the
+ result. By loading a whole sub-graph of theories like that, the
+  intrinsic parallelism can be exploited by the system, to speed up
+  loading.
+
+ \item @{ML Thy_Info.get_theory}~@{text A} retrieves the theory value
+ presently associated with name @{text A}. Note that the result
+ might be outdated.
+
+ \item @{ML Thy_Info.remove_thy}~@{text A} deletes theory @{text A} and all
+ descendants from the theory database.
+
+ \item @{ML Thy_Info.register_thy}~@{text "text thy"} registers an
+ existing theory value with the theory loader database and updates
+ source version information according to the current file-system
+ state.
+
+ \item @{ML "Thy_Info.add_hook"}~@{text f} registers function @{text
+ f} as a hook for theory database actions. The function will be
+ invoked with the action and theory name being involved; thus derived
+ actions may be performed in associated system components, e.g.\
+ maintaining the state of an editor for the theory sources.
+
+  The kind and order of actions occurring in practice depend both on
+ user interactions and the internal process of resolving theory
+ imports. Hooks should not rely on a particular policy here! Any
+ exceptions raised by the hook are ignored.
+
+ \end{description}
+*}
+
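+text %mlex {* A small sketch: retrieve the theory value of @{text
+  Main} from the loader database and check its name. *}
+
+ML_val {*
+  val thy = Thy_Info.get_theory "Main";
+  @{assert} (Context.theory_name thy = "Main");
+*}
+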
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Isar.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,584 @@
+theory Isar
+imports Base
+begin
+
+chapter {* Isar language elements *}
+
+text {* The Isar proof language (see also
+ \cite[\S2]{isabelle-isar-ref}) consists of three main categories of
+ language elements as follows.
+
+ \begin{enumerate}
+
+ \item Proof \emph{commands} define the primary language of
+ transactions of the underlying Isar/VM interpreter. Typical
+ examples are @{command "fix"}, @{command "assume"}, @{command
+ "show"}, @{command "proof"}, and @{command "qed"}.
+
+ Composing proof commands according to the rules of the Isar/VM leads
+ to expressions of structured proof text, such that both the machine
+ and the human reader can give it a meaning as formal reasoning.
+
+ \item Proof \emph{methods} define a secondary language of mixed
+ forward-backward refinement steps involving facts and goals.
+ Typical examples are @{method rule}, @{method unfold}, and @{method
+ simp}.
+
+ Methods can occur in certain well-defined parts of the Isar proof
+ language, say as arguments to @{command "proof"}, @{command "qed"},
+ or @{command "by"}.
+
+ \item \emph{Attributes} define a tertiary language of small
+ annotations to theorems being defined or referenced. Attributes can
+ modify both the context and the theorem.
+
+ Typical examples are @{attribute intro} (which affects the context),
+ and @{attribute symmetric} (which affects the theorem).
+
+ \end{enumerate}
+*}
+
+
+section {* Proof commands *}
+
+text {* A \emph{proof command} is a state transition of the Isar/VM
+ proof interpreter.
+
+ In principle, Isar proof commands could be defined in user-space as
+ well. The system is built like that in the first place: one part of
+ the commands are primitive, the other part is defined as derived
+ elements. Adding to the genuine structured proof language requires
+ profound understanding of the Isar/VM machinery, though, so this is
+ beyond the scope of this manual.
+
+ What can be done realistically is to define some diagnostic commands
+ that inspect the general state of the Isar/VM, and report some
+  feedback to the user.  Typically this involves checking the
+ linguistic \emph{mode} of a proof state, or peeking at the pending
+ goals (if available).
+
+ Another common application is to define a toplevel command that
+ poses a problem to the user as Isar proof state and processes the
+  final result relative to the context.  Thus a proof can be
+ incorporated into the context of some user-space tool, without
+ modifying the Isar proof language itself. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Proof.state} \\
+ @{index_ML Proof.assert_forward: "Proof.state -> Proof.state"} \\
+ @{index_ML Proof.assert_chain: "Proof.state -> Proof.state"} \\
+ @{index_ML Proof.assert_backward: "Proof.state -> Proof.state"} \\
+ @{index_ML Proof.simple_goal: "Proof.state -> {context: Proof.context, goal: thm}"} \\
+ @{index_ML Proof.goal: "Proof.state ->
+ {context: Proof.context, facts: thm list, goal: thm}"} \\
+ @{index_ML Proof.raw_goal: "Proof.state ->
+ {context: Proof.context, facts: thm list, goal: thm}"} \\
+ @{index_ML Proof.theorem: "Method.text option ->
+ (thm list list -> Proof.context -> Proof.context) ->
+ (term * term list) list list -> Proof.context -> Proof.state"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Proof.state} represents Isar proof states.
+ This is a block-structured configuration with proof context,
+ linguistic mode, and optional goal. The latter consists of goal
+ context, goal facts (``@{text "using"}''), and tactical goal state
+ (see \secref{sec:tactical-goals}).
+
+ The general idea is that the facts shall contribute to the
+ refinement of some parts of the tactical goal --- how exactly is
+ defined by the proof method that is applied in that situation.
+
+ \item @{ML Proof.assert_forward}, @{ML Proof.assert_chain}, @{ML
+ Proof.assert_backward} are partial identity functions that fail
+ unless a certain linguistic mode is active, namely ``@{text
+ "proof(state)"}'', ``@{text "proof(chain)"}'', ``@{text
+ "proof(prove)"}'', respectively (using the terminology of
+ \cite{isabelle-isar-ref}).
+
+  It is advisable to study the implementations of existing proof commands
+ for suitable modes to be asserted.
+
+ \item @{ML Proof.simple_goal}~@{text "state"} returns the structured
+ Isar goal (if available) in the form seen by ``simple'' methods
+ (like @{method simp} or @{method blast}). The Isar goal facts are
+ already inserted as premises into the subgoals, which are presented
+ individually as in @{ML Proof.goal}.
+
+ \item @{ML Proof.goal}~@{text "state"} returns the structured Isar
+ goal (if available) in the form seen by regular methods (like
+ @{method rule}). The auxiliary internal encoding of Pure
+ conjunctions is split into individual subgoals as usual.
+
+ \item @{ML Proof.raw_goal}~@{text "state"} returns the structured
+ Isar goal (if available) in the raw internal form seen by ``raw''
+ methods (like @{method induct}). This form is rarely appropriate
+  for diagnostic tools; @{ML Proof.simple_goal} or @{ML Proof.goal}
+ should be used in most situations.
+
+ \item @{ML Proof.theorem}~@{text "before_qed after_qed statement ctxt"}
+ initializes a toplevel Isar proof state within a given context.
+
+ The optional @{text "before_qed"} method is applied at the end of
+ the proof, just before extracting the result (this feature is rarely
+ used).
+
+ The @{text "after_qed"} continuation receives the extracted result
+ in order to apply it to the final context in a suitable way (e.g.\
+ storing named facts). Note that at this generic level the target
+ context is specified as @{ML_type Proof.context}, but the usual
+ wrapping of toplevel proofs into command transactions will provide a
+ @{ML_type local_theory} here (\chref{ch:local-theory}). This
+  affects the way results are stored.
+
+ The @{text "statement"} is given as a nested list of terms, each
+ associated with optional @{keyword "is"} patterns as usual in the
+ Isar source language. The original nested list structure over terms
+ is turned into one over theorems when @{text "after_qed"} is
+ invoked.
+
+ \end{description}
+*}
+
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "Isar.goal"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ \begin{description}
+
+ \item @{text "@{Isar.goal}"} refers to the regular goal state (if
+ available) of the current proof state managed by the Isar toplevel
+ --- as abstract value.
+
+ This only works for diagnostic ML commands, such as @{command
+ ML_val} or @{command ML_command}.
+
+ \end{description}
+*}
+
+text %mlex {* The following example peeks at a certain goal configuration. *}
+
+notepad
+begin
+ have A and B and C
+ ML_val {*
+ val n = Thm.nprems_of (#goal @{Isar.goal});
+ @{assert} (n = 3);
+ *}
+ oops
+
+text {* Here we see 3 individual subgoals in the same way as regular
+ proof methods would do. *}
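+
+text %mlex {* The next sketch initializes a toplevel proof state
+  directly via @{ML Proof.theorem} (with a trivial statement and a
+  vacuous @{text after_qed} continuation, purely for illustration)
+  and inspects its goal. *}
+
+ML_val {*
+  val st = Proof.theorem NONE (K I) [[(@{prop "x = x"}, [])]] @{context};
+  val _ = Proof.assert_backward st;   (*mode proof(prove)*)
+  val {goal, ...} = Proof.simple_goal st;
+  @{assert} (Thm.nprems_of goal = 1);
+*}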
+
+
+section {* Proof methods *}
+
+text {* A @{text "method"} is a function @{text "context \<rightarrow> thm\<^sup>* \<rightarrow> goal
+ \<rightarrow> (cases \<times> goal)\<^sup>*\<^sup>*"} that operates on the full Isar goal
+ configuration with context, goal facts, and tactical goal state and
+ enumerates possible follow-up goal states, with the potential
+ addition of named extensions of the proof context (\emph{cases}).
+ The latter feature is rarely used.
+
+ This means a proof method is like a structurally enhanced tactic
+ (cf.\ \secref{sec:tactics}). The well-formedness conditions for
+ tactics need to hold for methods accordingly, with the following
+ additions.
+
+ \begin{itemize}
+
+  \item Goal addressing is further limited to operate either
+ uniformly on \emph{all} subgoals, or specifically on the
+ \emph{first} subgoal.
+
+ Exception: old-style tactic emulations that are embedded into the
+ method space, e.g.\ @{method rule_tac}.
+
+ \item A non-trivial method always needs to make progress: an
+ identical follow-up goal state has to be avoided.\footnote{This
+ enables the user to write method expressions like @{text "meth\<^sup>+"}
+ without looping, while the trivial do-nothing case can be recovered
+ via @{text "meth\<^sup>?"}.}
+
+ Exception: trivial stuttering steps, such as ``@{method -}'' or
+ @{method succeed}.
+
+ \item Goal facts passed to the method must not be ignored. If there
+ is no sensible use of facts outside the goal state, facts should be
+ inserted into the subgoals that are addressed by the method.
+
+ \end{itemize}
+
+ \medskip Syntactically, the language of proof methods appears as
+ arguments to Isar commands like @{command "by"} or @{command apply}.
+ User-space additions are reasonably easy by plugging suitable
+ method-valued parser functions into the framework, using the
+ @{command method_setup} command, for example.
+
+ To get a better idea about the range of possibilities, consider the
+ following Isar proof schemes. This is the general form of
+ structured proof text:
+
+ \medskip
+ \begin{tabular}{l}
+ @{command from}~@{text "facts\<^sub>1"}~@{command have}~@{text "props"}~@{command using}~@{text "facts\<^sub>2"} \\
+ @{command proof}~@{text "(initial_method)"} \\
+ \quad@{text "body"} \\
+ @{command qed}~@{text "(terminal_method)"} \\
+ \end{tabular}
+ \medskip
+
+ The goal configuration consists of @{text "facts\<^sub>1"} and
+ @{text "facts\<^sub>2"} appended in that order, and various @{text
+ "props"} being claimed. The @{text "initial_method"} is invoked
+ with facts and goals together and refines the problem to something
+ that is handled recursively in the proof @{text "body"}. The @{text
+ "terminal_method"} has another chance to finish any remaining
+ subgoals, but it does not see the facts of the initial step.
+
+ \medskip This pattern illustrates unstructured proof scripts:
+
+ \medskip
+ \begin{tabular}{l}
+ @{command have}~@{text "props"} \\
+ \quad@{command using}~@{text "facts\<^sub>1"}~@{command apply}~@{text "method\<^sub>1"} \\
+ \quad@{command apply}~@{text "method\<^sub>2"} \\
+ \quad@{command using}~@{text "facts\<^sub>3"}~@{command apply}~@{text "method\<^sub>3"} \\
+ \quad@{command done} \\
+ \end{tabular}
+ \medskip
+
+ The @{text "method\<^sub>1"} operates on the original claim while
+ using @{text "facts\<^sub>1"}. Since the @{command apply} command
+ structurally resets the facts, the @{text "method\<^sub>2"} will
+ operate on the remaining goal state without facts. The @{text
+ "method\<^sub>3"} will see again a collection of @{text
+ "facts\<^sub>3"} that has been inserted into the script explicitly.
+
+ \medskip Empirically, any Isar proof method can be categorized as
+ follows.
+
+ \begin{enumerate}
+
+ \item \emph{Special method with cases} with named context additions
+ associated with the follow-up goal state.
+
+ Example: @{method "induct"}, which is also a ``raw'' method since it
+ operates on the internal representation of simultaneous claims as
+ Pure conjunction (\secref{sec:logic-aux}), instead of separate
+ subgoals (\secref{sec:tactical-goals}).
+
+ \item \emph{Structured method} with strong emphasis on facts outside
+ the goal state.
+
+ Example: @{method "rule"}, which captures the key ideas behind
+ structured reasoning in Isar in purest form.
+
+ \item \emph{Simple method} with weaker emphasis on facts, which are
+  inserted into subgoals to emulate old-style tactical
+  ``premises''.
+
+ Examples: @{method "simp"}, @{method "blast"}, @{method "auto"}.
+
+ \item \emph{Old-style tactic emulation} with detailed numeric goal
+ addressing and explicit references to entities of the internal goal
+ state (which are otherwise invisible from proper Isar proof text).
+ The naming convention @{text "foo_tac"} makes this special
+ non-standard status clear.
+
+ Example: @{method "rule_tac"}.
+
+ \end{enumerate}
+
+ When implementing proof methods, it is advisable to study existing
+ implementations carefully and imitate the typical ``boiler plate''
+  for context-sensitive parsing and further combinators to wrap up
+ tactic expressions as methods.\footnote{Aliases or abbreviations of
+ the standard method combinators should be avoided. Note that from
+ Isabelle99 until Isabelle2009 the system did provide various odd
+ combinations of method wrappers that made user applications more
+ complicated than necessary.}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Proof.method} \\
+ @{index_ML METHOD_CASES: "(thm list -> cases_tactic) -> Proof.method"} \\
+ @{index_ML METHOD: "(thm list -> tactic) -> Proof.method"} \\
+ @{index_ML SIMPLE_METHOD: "tactic -> Proof.method"} \\
+ @{index_ML SIMPLE_METHOD': "(int -> tactic) -> Proof.method"} \\
+ @{index_ML Method.insert_tac: "thm list -> int -> tactic"} \\
+ @{index_ML Method.setup: "binding -> (Proof.context -> Proof.method) context_parser ->
+ string -> theory -> theory"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Proof.method} represents proof methods as
+ abstract type.
+
+ \item @{ML METHOD_CASES}~@{text "(fn facts => cases_tactic)"} wraps
+ @{text cases_tactic} depending on goal facts as proof method with
+ cases; the goal context is passed via method syntax.
+
+ \item @{ML METHOD}~@{text "(fn facts => tactic)"} wraps @{text
+ tactic} depending on goal facts as regular proof method; the goal
+ context is passed via method syntax.
+
+ \item @{ML SIMPLE_METHOD}~@{text "tactic"} wraps a tactic that
+ addresses all subgoals uniformly as simple proof method. Goal facts
+ are already inserted into all subgoals before @{text "tactic"} is
+ applied.
+
+ \item @{ML SIMPLE_METHOD'}~@{text "tactic"} wraps a tactic that
+ addresses a specific subgoal as simple proof method that operates on
+  subgoal 1.  Goal facts are inserted into the subgoal before the
+  @{text "tactic"} is applied.
+
+ \item @{ML Method.insert_tac}~@{text "facts i"} inserts @{text
+ "facts"} into subgoal @{text "i"}. This is convenient to reproduce
+ part of the @{ML SIMPLE_METHOD} or @{ML SIMPLE_METHOD'} wrapping
+ within regular @{ML METHOD}, for example.
+
+ \item @{ML Method.setup}~@{text "name parser description"} provides
+ the functionality of the Isar command @{command method_setup} as ML
+ function.
+
+ \end{description}
+*}
+
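+text {* As a minimal sketch (the method name is invented for
+  illustration), the following registers a method directly via @{ML
+  Method.setup}; it simply proves a subgoal by assumption, after the
+  goal facts have been inserted by the @{ML SIMPLE_METHOD'} wrapper. *}
+
+setup {*
+  Method.setup @{binding my_assumption}
+    (Scan.succeed (fn _ => SIMPLE_METHOD' assume_tac))
+    "proof by assumption (illustration only)"
+*}
+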
+text %mlex {* See also @{command method_setup} in
+ \cite{isabelle-isar-ref} which includes some abstract examples.
+
+ \medskip The following toy examples illustrate how the goal facts
+ and state are passed to proof methods. The pre-defined proof method
+ called ``@{method tactic}'' wraps ML source of type @{ML_type
+ tactic} (abstracted over @{ML_text facts}). This allows immediate
+ experimentation without parsing of concrete syntax. *}
+
+notepad
+begin
+ assume a: A and b: B
+
+ have "A \<and> B"
+ apply (tactic {* rtac @{thm conjI} 1 *})
+ using a apply (tactic {* resolve_tac facts 1 *})
+ using b apply (tactic {* resolve_tac facts 1 *})
+ done
+
+ have "A \<and> B"
+ using a and b
+ ML_val "@{Isar.goal}"
+ apply (tactic {* Method.insert_tac facts 1 *})
+ apply (tactic {* (rtac @{thm conjI} THEN_ALL_NEW atac) 1 *})
+ done
+end
+
+text {* \medskip The next example implements a method that simplifies
+ the first subgoal by rewrite rules given as arguments. *}
+
+method_setup my_simp = {*
+ Attrib.thms >> (fn thms => fn ctxt =>
+ SIMPLE_METHOD' (fn i =>
+ CHANGED (asm_full_simp_tac
+ (HOL_basic_ss addsimps thms) i)))
+*} "rewrite subgoal by given rules"
+
+text {* The concrete syntax wrapping of @{command method_setup} always
+  passes through the proof context at the end of parsing, but it is
+ not used in this example.
+
+ The @{ML Attrib.thms} parser produces a list of theorems from the
+ usual Isar syntax involving attribute expressions etc.\ (syntax
+ category @{syntax thmrefs}) \cite{isabelle-isar-ref}. The resulting
+ @{ML_text thms} are added to @{ML HOL_basic_ss} which already
+ contains the basic Simplifier setup for HOL.
+
+ The tactic @{ML asm_full_simp_tac} is the one that is also used in
+ method @{method simp} by default. The extra wrapping by the @{ML
+ CHANGED} tactical ensures progress of simplification: identical goal
+ states are filtered out explicitly to make the raw tactic conform to
+ standard Isar method behaviour.
+
+ \medskip Method @{method my_simp} can be used in Isar proofs like
+ this:
+*}
+
+notepad
+begin
+ fix a b c
+ assume a: "a = b"
+ assume b: "b = c"
+ have "a = c" by (my_simp a b)
+end
+
+text {* Here is a similar method that operates on all subgoals,
+ instead of just the first one. *}
+
+method_setup my_simp_all = {*
+ Attrib.thms >> (fn thms => fn ctxt =>
+ SIMPLE_METHOD
+ (CHANGED
+ (ALLGOALS (asm_full_simp_tac
+ (HOL_basic_ss addsimps thms)))))
+*} "rewrite all subgoals by given rules"
+
+notepad
+begin
+ fix a b c
+ assume a: "a = b"
+ assume b: "b = c"
+ have "a = c" and "c = b" by (my_simp_all a b)
+end
+
+text {* \medskip Apart from explicit arguments, common proof methods
+ typically work with a default configuration provided by the context.
+ As a shortcut to rule management we use a cheap solution via functor
+ @{ML_functor Named_Thms} (see also @{file
+ "~~/src/Pure/Tools/named_thms.ML"}). *}
+
+ML {*
+ structure My_Simps =
+ Named_Thms
+ (val name = @{binding my_simp} val description = "my_simp rule")
+*}
+setup My_Simps.setup
+
+text {* This provides ML access to a list of theorems in canonical
+ declaration order via @{ML My_Simps.get}. The user can add or
+ delete rules via the attribute @{attribute my_simp}. The actual
+ proof method is now defined as before, but we append the explicit
+ arguments and the rules from the context. *}
+
+method_setup my_simp' = {*
+ Attrib.thms >> (fn thms => fn ctxt =>
+ SIMPLE_METHOD' (fn i =>
+ CHANGED (asm_full_simp_tac
+ (HOL_basic_ss addsimps (thms @ My_Simps.get ctxt)) i)))
+*} "rewrite subgoal by given rules and my_simp rules from the context"
+
+text {*
+ \medskip Method @{method my_simp'} can be used in Isar proofs
+ like this:
+*}
+
+notepad
+begin
+ fix a b c
+ assume [my_simp]: "a \<equiv> b"
+ assume [my_simp]: "b \<equiv> c"
+ have "a \<equiv> c" by my_simp'
+end
+
+text {* \medskip The @{method my_simp} variants defined above are
+ ``simple'' methods, i.e.\ the goal facts are merely inserted as goal
+ premises by the @{ML SIMPLE_METHOD'} or @{ML SIMPLE_METHOD} wrapper.
+ For proof methods that are similar to the standard collection of
+ @{method simp}, @{method blast}, @{method fast}, @{method auto}
+ there is little more that can be done.
+
+ Note that using the primary goal facts in the same manner as the
+ method arguments obtained via concrete syntax or the context does
+ not meet the requirement of ``strong emphasis on facts'' of regular
+ proof methods, because rewrite rules as used above can be easily
+ ignored. A proof text ``@{command using}~@{text "foo"}~@{command
+ "by"}~@{text "my_simp"}'' where @{text "foo"} is not used would
+ deceive the reader.
+
+ \medskip The technical treatment of rules from the context requires
+ further attention. Above we rebuild a fresh @{ML_type simpset} from
+ the arguments and \emph{all} rules retrieved from the context on
+ every invocation of the method. This does not scale to really large
+  collections of rules, which easily emerge in the context of a big
+ theory library, for example.
+
+ This is an inherent limitation of the simplistic rule management via
+ functor @{ML_functor Named_Thms}, because it lacks tool-specific
+ storage and retrieval. More realistic applications require
+ efficient index-structures that organize theorems in a customized
+ manner, such as a discrimination net that is indexed by the
+ left-hand sides of rewrite rules. For variations on the Simplifier,
+ re-use of the existing type @{ML_type simpset} is adequate, but
+  scalability would require it to be maintained statically within the
+ context data, not dynamically on each tool invocation. *}
+
+
+section {* Attributes \label{sec:attributes} *}
+
+text {* An \emph{attribute} is a function @{text "context \<times> thm \<rightarrow>
+ context \<times> thm"}, which means both a (generic) context and a theorem
+ can be modified simultaneously. In practice this mixed form is very
+ rare, instead attributes are presented either as \emph{declaration
+ attribute:} @{text "thm \<rightarrow> context \<rightarrow> context"} or \emph{rule
+ attribute:} @{text "context \<rightarrow> thm \<rightarrow> thm"}.
+
+ Attributes can have additional arguments via concrete syntax. There
+ is a collection of context-sensitive parsers for various logical
+ entities (types, terms, theorems). These already take care of
+ applying morphisms to the arguments when attribute expressions are
+ moved into a different context (see also \secref{sec:morphisms}).
+
+ When implementing declaration attributes, it is important to operate
+ exactly on the variant of the generic context that is provided by
+ the system, which is either global theory context or local proof
+ context. In particular, the background theory of a local context
+ must not be modified in this situation! *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type attribute} \\
+ @{index_ML Thm.rule_attribute: "(Context.generic -> thm -> thm) -> attribute"} \\
+ @{index_ML Thm.declaration_attribute: "
+ (thm -> Context.generic -> Context.generic) -> attribute"} \\
+ @{index_ML Attrib.setup: "binding -> attribute context_parser ->
+ string -> theory -> theory"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type attribute} represents attributes as concrete
+ type alias.
+
+ \item @{ML Thm.rule_attribute}~@{text "(fn context => rule)"} wraps
+ a context-dependent rule (mapping on @{ML_type thm}) as attribute.
+
+ \item @{ML Thm.declaration_attribute}~@{text "(fn thm => decl)"}
+ wraps a theorem-dependent declaration (mapping on @{ML_type
+ Context.generic}) as attribute.
+
+ \item @{ML Attrib.setup}~@{text "name parser description"} provides
+ the functionality of the Isar command @{command attribute_setup} as
+ ML function.
+
+ \end{description}
+*}
+
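+text {* As a minimal sketch (the attribute name is invented for
+  illustration), the following registers a rule attribute via @{ML
+  Attrib.setup} that leaves the theorem unchanged: *}
+
+setup {*
+  Attrib.setup @{binding my_id}
+    (Scan.succeed (Thm.rule_attribute (fn _ => fn th => th)))
+    "identity rule attribute (illustration only)"
+*}
+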
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def attributes} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation attributes} attributes
+ "}
+
+ \begin{description}
+
+ \item @{text "@{attributes [\<dots>]}"} embeds attribute source
+ representation into the ML text, which is particularly useful with
+ declarations like @{ML Local_Theory.note}. Attribute names are
+ internalized at compile time, but the source is unevaluated. This
+ means attributes with formal arguments (types, terms, theorems) may
+ be subject to odd effects of dynamic scoping!
+
+ \end{description}
+*}
+
+text %mlex {* See also @{command attribute_setup} in
+ \cite{isabelle-isar-ref} which includes some abstract examples. *}
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Local_Theory.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,167 @@
+theory Local_Theory
+imports Base
+begin
+
+chapter {* Local theory specifications \label{ch:local-theory} *}
+
+text {*
+ A \emph{local theory} combines aspects of both theory and proof
+ context (cf.\ \secref{sec:context}), such that definitional
+  specifications may be given relative to parameters and
+ assumptions. A local theory is represented as a regular proof
+ context, augmented by administrative data about the \emph{target
+ context}.
+
+ The target is usually derived from the background theory by adding
+ local @{text "\<FIX>"} and @{text "\<ASSUME>"} elements, plus
+ suitable modifications of non-logical context data (e.g.\ a special
+ type-checking discipline). Once initialized, the target is ready to
+ absorb definitional primitives: @{text "\<DEFINE>"} for terms and
+ @{text "\<NOTE>"} for theorems. Such definitions may get
+ transformed in a target-specific way, but the programming interface
+ hides such details.
+
+ Isabelle/Pure provides target mechanisms for locales, type-classes,
+ type-class instantiations, and general overloading. In principle,
+ users can implement new targets as well, but this rather arcane
+ discipline is beyond the scope of this manual. In contrast,
+ implementing derived definitional packages to be used within a local
+ theory context is quite easy: the interfaces are even simpler and
+ more abstract than the underlying primitives for raw theories.
+
+ Many definitional packages for local theories are available in
+ Isabelle. Although a few old packages only work for global
+ theories, the standard way of implementing definitional packages in
+ Isabelle is via the local theory interface.
+*}
+
+
+section {* Definitional elements *}
+
+text {*
+ There are separate elements @{text "\<DEFINE> c \<equiv> t"} for terms, and
+ @{text "\<NOTE> b = thm"} for theorems. Types are treated
+ implicitly, according to Hindley-Milner discipline (cf.\
+ \secref{sec:variables}). These definitional primitives essentially
+ act like @{text "let"}-bindings within a local context that may
+ already contain earlier @{text "let"}-bindings and some initial
+ @{text "\<lambda>"}-bindings. Thus we gain \emph{dependent definitions}
+ that are relative to an initial axiomatic context. The following
+ diagram illustrates this idea of axiomatic elements versus
+ definitional elements:
+
+ \begin{center}
+ \begin{tabular}{|l|l|l|}
+ \hline
+ & @{text "\<lambda>"}-binding & @{text "let"}-binding \\
+ \hline
+ types & fixed @{text "\<alpha>"} & arbitrary @{text "\<beta>"} \\
+ terms & @{text "\<FIX> x :: \<tau>"} & @{text "\<DEFINE> c \<equiv> t"} \\
+ theorems & @{text "\<ASSUME> a: A"} & @{text "\<NOTE> b = \<^BG>B\<^EN>"} \\
+ \hline
+ \end{tabular}
+ \end{center}
+
+ A user package merely needs to produce suitable @{text "\<DEFINE>"}
+ and @{text "\<NOTE>"} elements according to the application. For
+ example, a package for inductive definitions might first @{text
+ "\<DEFINE>"} a certain predicate as some fixed-point construction,
+ then @{text "\<NOTE>"} a proven result about monotonicity of the
+ functor involved here, and then produce further derived concepts via
+ additional @{text "\<DEFINE>"} and @{text "\<NOTE>"} elements.
+
+ The cumulative sequence of @{text "\<DEFINE>"} and @{text "\<NOTE>"}
+ produced at package runtime is managed by the local theory
+ infrastructure by means of an \emph{auxiliary context}. Thus the
+ system holds up the impression of working within a fully abstract
+ situation with hypothetical entities: @{text "\<DEFINE> c \<equiv> t"}
+ always results in a literal fact @{text "\<^BG>c \<equiv> t\<^EN>"}, where
+ @{text "c"} is a fixed variable @{text "c"}. The details about
+ global constants, name spaces etc. are handled internally.
+
+ So the general structure of a local theory is a sandwich of three
+ layers:
+
+ \begin{center}
+ \framebox{\quad auxiliary context \quad\framebox{\quad target context \quad\framebox{\quad background theory\quad}}}
+ \end{center}
+
+ When a definitional package is finished, the auxiliary context is
+ reset to the target context. The target now holds definitions for
+ terms and theorems that stem from the hypothetical @{text
+ "\<DEFINE>"} and @{text "\<NOTE>"} elements, transformed by the
+ particular target policy (see \cite[\S4--5]{Haftmann-Wenzel:2009}
+ for details). *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type local_theory: Proof.context} \\
+ @{index_ML Named_Target.init: "(local_theory -> local_theory) ->
+ string -> theory -> local_theory"} \\[1ex]
+ @{index_ML Local_Theory.define: "(binding * mixfix) * (Attrib.binding * term) ->
+ local_theory -> (term * (string * thm)) * local_theory"} \\
+ @{index_ML Local_Theory.note: "Attrib.binding * thm list ->
+ local_theory -> (string * thm list) * local_theory"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type local_theory} represents local theories.
+ Although this is merely an alias for @{ML_type Proof.context}, it is
+ semantically a subtype of the same: a @{ML_type local_theory} holds
+ target information as special context data. Subtyping means that
+  any value @{text "lthy:"}~@{ML_type local_theory} can also be used
+  with operations expecting a regular @{text "ctxt:"}~@{ML_type
+ Proof.context}.
+
+ \item @{ML Named_Target.init}~@{text "before_exit name thy"}
+ initializes a local theory derived from the given background theory.
+ An empty name refers to a \emph{global theory} context, and a
+ non-empty name refers to a @{command locale} or @{command class}
+ context (a fully-qualified internal name is expected here). This is
+ useful for experimentation --- normally the Isar toplevel already
+ takes care to initialize the local theory context. The given @{text
+ "before_exit"} function is invoked before leaving the context; in
+ most situations plain identity @{ML I} is sufficient.
+
+ \item @{ML Local_Theory.define}~@{text "((b, mx), (a, rhs))
+ lthy"} defines a local entity according to the specification that is
+  given relative to the current @{text "lthy"} context.  In
+ particular the term of the RHS may refer to earlier local entities
+ from the auxiliary context, or hypothetical parameters from the
+ target context. The result is the newly defined term (which is
+ always a fixed variable with exactly the same name as specified for
+ the LHS), together with an equational theorem that states the
+ definition as a hypothetical fact.
+
+ Unless an explicit name binding is given for the RHS, the resulting
+ fact will be called @{text "b_def"}. Any given attributes are
+ applied to that same fact --- immediately in the auxiliary context
+ \emph{and} in any transformed versions stemming from target-specific
+ policies or any later interpretations of results from the target
+ context (think of @{command locale} and @{command interpretation},
+  for example).  This means that attributes should usually be plain
+ declarations such as @{attribute simp}, while non-trivial rules like
+ @{attribute simplified} are better avoided.
+
+ \item @{ML Local_Theory.note}~@{text "(a, ths) lthy"} is
+ analogous to @{ML Local_Theory.define}, but defines facts instead of
+ terms. There is also a slightly more general variant @{ML
+ Local_Theory.notes} that defines several facts (with attribute
+ expressions) simultaneously.
+
+ This is essentially the internal version of the @{command lemmas}
+ command, or @{command declare} if an empty name binding is given.
+
+ \end{description}
+*}
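+
+text %mlex {* The following sketch enters the global target of the
+  current background theory via @{ML Named_Target.init} as a plain
+  value, issues one definition and one note, and then discards the
+  result (nothing is stored in the actual theory; the names are
+  invented for illustration). *}
+
+ML_val {*
+  val lthy = Named_Target.init I "" @{theory};
+  val ((t, (_, def_thm)), lthy') =
+    Local_Theory.define ((@{binding my_c}, NoSyn),
+      ((Binding.empty, []), @{term "\<lambda>x::nat. x + 1"})) lthy;
+  val (_, lthy'') =
+    Local_Theory.note ((@{binding my_c_eq}, []), [def_thm]) lthy';
+*}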
+
+
+section {* Morphisms and declarations \label{sec:morphisms} *}
+
+text {* FIXME
+
+ \medskip See also \cite{Chaieb-Wenzel:2007}.
+*}
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Logic.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,1137 @@
+theory Logic
+imports Base
+begin
+
+chapter {* Primitive logic \label{ch:logic} *}
+
+text {*
+  The logical foundations of Isabelle/Isar are those of the Pure logic,
+ which has been introduced as a Natural Deduction framework in
+ \cite{paulson700}. This is essentially the same logic as ``@{text
+ "\<lambda>HOL"}'' in the more abstract setting of Pure Type Systems (PTS)
+ \cite{Barendregt-Geuvers:2001}, although there are some key
+ differences in the specific treatment of simple types in
+ Isabelle/Pure.
+
+ Following type-theoretic parlance, the Pure logic consists of three
+ levels of @{text "\<lambda>"}-calculus with corresponding arrows, @{text
+ "\<Rightarrow>"} for syntactic function space (terms depending on terms), @{text
+ "\<And>"} for universal quantification (proofs depending on terms), and
+ @{text "\<Longrightarrow>"} for implication (proofs depending on proofs).
+
+ Derivations are relative to a logical theory, which declares type
+ constructors, constants, and axioms. Theory declarations support
+ schematic polymorphism, which is strictly speaking outside the
+  logic.\footnote{This is the deeper logical reason why the theory
+ context @{text "\<Theta>"} is separate from the proof context @{text "\<Gamma>"}
+ of the core calculus: type constructors, term constants, and facts
+ (proof constants) may involve arbitrary type schemes, but the type
+ of a locally fixed term parameter is also fixed!}
+*}
+
+
+section {* Types \label{sec:types} *}
+
+text {*
+ The language of types is an uninterpreted order-sorted first-order
+ algebra; types are qualified by ordered type classes.
+
+ \medskip A \emph{type class} is an abstract syntactic entity
+ declared in the theory context. The \emph{subclass relation} @{text
+ "c\<^isub>1 \<subseteq> c\<^isub>2"} is specified by stating an acyclic
+ generating relation; the transitive closure is maintained
+ internally. The resulting relation is an ordering: reflexive,
+ transitive, and antisymmetric.
+
+ A \emph{sort} is a list of type classes written as @{text "s = {c\<^isub>1,
+ \<dots>, c\<^isub>m}"}, it represents symbolic intersection. Notationally, the
+ curly braces are omitted for singleton intersections, i.e.\ any
+ class @{text "c"} may be read as a sort @{text "{c}"}. The ordering
+ on type classes is extended to sorts according to the meaning of
+ intersections: @{text "{c\<^isub>1, \<dots> c\<^isub>m} \<subseteq> {d\<^isub>1, \<dots>, d\<^isub>n}"} iff @{text
+ "\<forall>j. \<exists>i. c\<^isub>i \<subseteq> d\<^isub>j"}. The empty intersection @{text "{}"} refers to
+ the universal sort, which is the largest element wrt.\ the sort
+ order. Thus @{text "{}"} represents the ``full sort'', not the
+ empty one! The intersection of all (finitely many) classes declared
+ in the current theory is the least element wrt.\ the sort ordering.
+
+ \medskip A \emph{fixed type variable} is a pair of a basic name
+ (starting with a @{text "'"} character) and a sort constraint, e.g.\
+ @{text "('a, s)"} which is usually printed as @{text "\<alpha>\<^isub>s"}.
+ A \emph{schematic type variable} is a pair of an indexname and a
+ sort constraint, e.g.\ @{text "(('a, 0), s)"} which is usually
+ printed as @{text "?\<alpha>\<^isub>s"}.
+
+ Note that \emph{all} syntactic components contribute to the identity
+ of type variables: basic name, index, and sort constraint. The core
+ logic handles type variables with the same name but different sorts
+ as different, although the type-inference layer (which is outside
+ the core) rejects anything like that.
+
+ A \emph{type constructor} @{text "\<kappa>"} is a @{text "k"}-ary operator
+ on types declared in the theory. Type constructor application is
+ written postfix as @{text "(\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>k)\<kappa>"}. For
+ @{text "k = 0"} the argument tuple is omitted, e.g.\ @{text "prop"}
+ instead of @{text "()prop"}. For @{text "k = 1"} the parentheses
+ are omitted, e.g.\ @{text "\<alpha> list"} instead of @{text "(\<alpha>)list"}.
+ Further notation is provided for specific constructors, notably the
+ right-associative infix @{text "\<alpha> \<Rightarrow> \<beta>"} instead of @{text "(\<alpha>,
+ \<beta>)fun"}.
+
+ The logical category \emph{type} is defined inductively over type
+ variables and type constructors as follows: @{text "\<tau> = \<alpha>\<^isub>s | ?\<alpha>\<^isub>s |
+ (\<tau>\<^sub>1, \<dots>, \<tau>\<^sub>k)\<kappa>"}.
+
+ A \emph{type abbreviation} is a syntactic definition @{text
+ "(\<^vec>\<alpha>)\<kappa> = \<tau>"} of an arbitrary type expression @{text "\<tau>"} over
+ variables @{text "\<^vec>\<alpha>"}. Type abbreviations appear as type
+ constructors in the syntax, but are expanded before entering the
+ logical core.
+
+ A \emph{type arity} declares the image behavior of a type
+ constructor wrt.\ the algebra of sorts: @{text "\<kappa> :: (s\<^isub>1, \<dots>,
+ s\<^isub>k)s"} means that @{text "(\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>k)\<kappa>"} is
+ of sort @{text "s"} if every argument type @{text "\<tau>\<^isub>i"} is
+ of sort @{text "s\<^isub>i"}. Arity declarations are implicitly
+ completed, i.e.\ @{text "\<kappa> :: (\<^vec>s)c"} entails @{text "\<kappa> ::
+ (\<^vec>s)c'"} for any @{text "c' \<supseteq> c"}.
+
+ \medskip The sort algebra is always maintained as \emph{coregular},
+ which means that type arities are consistent with the subclass
+ relation: for any type constructor @{text "\<kappa>"}, and classes @{text
+ "c\<^isub>1 \<subseteq> c\<^isub>2"}, and arities @{text "\<kappa> ::
+ (\<^vec>s\<^isub>1)c\<^isub>1"} and @{text "\<kappa> ::
+ (\<^vec>s\<^isub>2)c\<^isub>2"} holds @{text "\<^vec>s\<^isub>1 \<subseteq>
+ \<^vec>s\<^isub>2"} component-wise.
+
+ The key property of a coregular order-sorted algebra is that sort
+ constraints can be solved in a most general fashion: for each type
+ constructor @{text "\<kappa>"} and sort @{text "s"} there is a most general
+ vector of argument sorts @{text "(s\<^isub>1, \<dots>, s\<^isub>k)"} such
+ that a type scheme @{text "(\<alpha>\<^bsub>s\<^isub>1\<^esub>, \<dots>,
+ \<alpha>\<^bsub>s\<^isub>k\<^esub>)\<kappa>"} is of sort @{text "s"}.
+ Consequently, type unification has most general solutions (modulo
+ equivalence of sorts), so type-inference produces primary types as
+ expected \cite{nipkow-prehofer}.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type class: string} \\
+ @{index_ML_type sort: "class list"} \\
+ @{index_ML_type arity: "string * sort list * sort"} \\
+ @{index_ML_type typ} \\
+ @{index_ML Term.map_atyps: "(typ -> typ) -> typ -> typ"} \\
+ @{index_ML Term.fold_atyps: "(typ -> 'a -> 'a) -> typ -> 'a -> 'a"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML Sign.subsort: "theory -> sort * sort -> bool"} \\
+ @{index_ML Sign.of_sort: "theory -> typ * sort -> bool"} \\
+ @{index_ML Sign.add_type: "Proof.context -> binding * int * mixfix -> theory -> theory"} \\
+ @{index_ML Sign.add_type_abbrev: "Proof.context ->
+ binding * string list * typ -> theory -> theory"} \\
+ @{index_ML Sign.primitive_class: "binding * class list -> theory -> theory"} \\
+ @{index_ML Sign.primitive_classrel: "class * class -> theory -> theory"} \\
+ @{index_ML Sign.primitive_arity: "arity -> theory -> theory"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type class} represents type classes.
+
+ \item Type @{ML_type sort} represents sorts, i.e.\ finite
+ intersections of classes. The empty list @{ML "[]: sort"} refers to
+ the empty class intersection, i.e.\ the ``full sort''.
+
+ \item Type @{ML_type arity} represents type arities. A triple
+ @{text "(\<kappa>, \<^vec>s, s) : arity"} represents @{text "\<kappa> ::
+ (\<^vec>s)s"} as described above.
+
+ \item Type @{ML_type typ} represents types; this is a datatype with
+ constructors @{ML TFree}, @{ML TVar}, @{ML Type}.
+
+ \item @{ML Term.map_atyps}~@{text "f \<tau>"} applies the mapping @{text
+ "f"} to all atomic types (@{ML TFree}, @{ML TVar}) occurring in
+ @{text "\<tau>"}.
+
+ \item @{ML Term.fold_atyps}~@{text "f \<tau>"} iterates the operation
+ @{text "f"} over all occurrences of atomic types (@{ML TFree}, @{ML
+ TVar}) in @{text "\<tau>"}; the type structure is traversed from left to
+ right.
+
+ \item @{ML Sign.subsort}~@{text "thy (s\<^isub>1, s\<^isub>2)"}
+ tests the subsort relation @{text "s\<^isub>1 \<subseteq> s\<^isub>2"}.
+
+ \item @{ML Sign.of_sort}~@{text "thy (\<tau>, s)"} tests whether type
+ @{text "\<tau>"} is of sort @{text "s"}.
+
+ \item @{ML Sign.add_type}~@{text "ctxt (\<kappa>, k, mx)"} declares a
+  new type constructor @{text "\<kappa>"} with @{text "k"} arguments and
+ optional mixfix syntax.
+
+ \item @{ML Sign.add_type_abbrev}~@{text "ctxt (\<kappa>, \<^vec>\<alpha>, \<tau>)"}
+ defines a new type abbreviation @{text "(\<^vec>\<alpha>)\<kappa> = \<tau>"}.
+
+ \item @{ML Sign.primitive_class}~@{text "(c, [c\<^isub>1, \<dots>,
+ c\<^isub>n])"} declares a new class @{text "c"}, together with class
+ relations @{text "c \<subseteq> c\<^isub>i"}, for @{text "i = 1, \<dots>, n"}.
+
+ \item @{ML Sign.primitive_classrel}~@{text "(c\<^isub>1,
+ c\<^isub>2)"} declares the class relation @{text "c\<^isub>1 \<subseteq>
+ c\<^isub>2"}.
+
+ \item @{ML Sign.primitive_arity}~@{text "(\<kappa>, \<^vec>s, s)"} declares
+ the arity @{text "\<kappa> :: (\<^vec>s)s"}.
+
+ \end{description}
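+
+ \medskip For illustration, here is a small sketch (not taken from
+ the Isabelle sources) that collects the names of all @{ML TFree}
+ variables occurring in a type, and tests a sort constraint wrt.\ the
+ background theory of a proof context:
+
+ \begin{verbatim}
+ (*left-to-right names of atomic TFree types (with duplicates)*)
+ fun tfree_names T =
+   rev (Term.fold_atyps (fn TFree (a, _) => cons a | _ => I) T []);
+
+ (*test the sort constraint "T :: S"*)
+ fun is_of_sort ctxt (T, S) =
+   Sign.of_sort (Proof_Context.theory_of ctxt) (T, S);
+ \end{verbatim}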
+*}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "class"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "sort"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "type_name"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "type_abbrev"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "nonterminal"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "typ"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation class} nameref
+ ;
+ @@{ML_antiquotation sort} sort
+ ;
+ (@@{ML_antiquotation type_name} |
+ @@{ML_antiquotation type_abbrev} |
+ @@{ML_antiquotation nonterminal}) nameref
+ ;
+ @@{ML_antiquotation typ} type
+ "}
+
+ \begin{description}
+
+ \item @{text "@{class c}"} inlines the internalized class @{text
+ "c"} --- as @{ML_type string} literal.
+
+ \item @{text "@{sort s}"} inlines the internalized sort @{text "s"}
+ --- as @{ML_type "string list"} literal.
+
+ \item @{text "@{type_name c}"} inlines the internalized type
+ constructor @{text "c"} --- as @{ML_type string} literal.
+
+ \item @{text "@{type_abbrev c}"} inlines the internalized type
+ abbreviation @{text "c"} --- as @{ML_type string} literal.
+
+ \item @{text "@{nonterminal c}"} inlines the internalized syntactic
+ type~/ grammar nonterminal @{text "c"} --- as @{ML_type string}
+ literal.
+
+ \item @{text "@{typ \<tau>}"} inlines the internalized type @{text "\<tau>"}
+ --- as constructor term for datatype @{ML_type typ}.
+
+ \end{description}
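+
+ \medskip For example, the following hypothetical ML fragment uses
+ some of these antiquotations; the corresponding values are inlined
+ at compile time:
+
+ \begin{verbatim}
+ val fun_name = @{type_name fun};    (*the string "fun"*)
+ val T = @{typ "prop => prop"};      (*the ML value Type ("fun", ...)*)
+ val c = @{class type};              (*fully qualified class name*)
+ \end{verbatim}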
+*}
+
+
+section {* Terms \label{sec:terms} *}
+
+text {*
+ The language of terms is that of simply-typed @{text "\<lambda>"}-calculus
+ with de-Bruijn indices for bound variables (cf.\ \cite{debruijn72}
+ or \cite{paulson-ml2}), with the types being determined by the
+ corresponding binders. In contrast, free variables and constants
+ have an explicit name and type in each occurrence.
+
+ \medskip A \emph{bound variable} is a natural number @{text "b"},
+ which accounts for the number of intermediate binders between the
+ variable occurrence in the body and its binding position. For
+ example, the de-Bruijn term @{text "\<lambda>\<^bsub>bool\<^esub>. \<lambda>\<^bsub>bool\<^esub>. 1 \<and> 0"} would
+ correspond to @{text "\<lambda>x\<^bsub>bool\<^esub>. \<lambda>y\<^bsub>bool\<^esub>. x \<and> y"} in a named
+ representation. Note that a bound variable may be represented by
+ different de-Bruijn indices at different occurrences, depending on
+ the nesting of abstractions.
+
+ A \emph{loose variable} is a bound variable that is outside the
+ scope of local binders. The types (and names) for loose variables
+ can be managed as a separate context, that is maintained as a stack
+ of hypothetical binders. The core logic operates on closed terms,
+ without any loose variables.
+
+ A \emph{fixed variable} is a pair of a basic name and a type, e.g.\
+ @{text "(x, \<tau>)"} which is usually printed @{text "x\<^isub>\<tau>"} here. A
+ \emph{schematic variable} is a pair of an indexname and a type,
+ e.g.\ @{text "((x, 0), \<tau>)"} which is likewise printed as @{text
+ "?x\<^isub>\<tau>"}.
+
+ \medskip A \emph{constant} is a pair of a basic name and a type,
+ e.g.\ @{text "(c, \<tau>)"} which is usually printed as @{text "c\<^isub>\<tau>"}
+ here. Constants are declared in the context as polymorphic families
+ @{text "c :: \<sigma>"}, meaning that all substitution instances @{text
+ "c\<^isub>\<tau>"} for @{text "\<tau> = \<sigma>\<vartheta>"} are valid.
+
+ The vector of \emph{type arguments} of constant @{text "c\<^isub>\<tau>"} wrt.\
+ the declaration @{text "c :: \<sigma>"} is defined as the codomain of the
+ matcher @{text "\<vartheta> = {?\<alpha>\<^isub>1 \<mapsto> \<tau>\<^isub>1, \<dots>, ?\<alpha>\<^isub>n \<mapsto> \<tau>\<^isub>n}"} presented in
+ canonical order @{text "(\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>n)"}, corresponding to the
+ left-to-right occurrences of the @{text "\<alpha>\<^isub>i"} in @{text "\<sigma>"}.
+ Within a given theory context, there is a one-to-one correspondence
+ between any constant @{text "c\<^isub>\<tau>"} and the application @{text "c(\<tau>\<^isub>1,
+ \<dots>, \<tau>\<^isub>n)"} of its type arguments. For example, with @{text "plus :: \<alpha>
+ \<Rightarrow> \<alpha> \<Rightarrow> \<alpha>"}, the instance @{text "plus\<^bsub>nat \<Rightarrow> nat \<Rightarrow> nat\<^esub>"} corresponds to
+ @{text "plus(nat)"}.
+
+ Constant declarations @{text "c :: \<sigma>"} may contain sort constraints
+ for type variables in @{text "\<sigma>"}. These are observed by
+ type-inference as expected, but \emph{ignored} by the core logic.
+ This means the primitive logic is able to reason with instances of
+ polymorphic constants that the user-level type-checker would reject
+ due to violation of type class restrictions.
+
+ \medskip An \emph{atomic term} is either a variable or constant.
+ The logical category \emph{term} is defined inductively over atomic
+ terms, with abstraction and application as follows: @{text "t = b |
+ x\<^isub>\<tau> | ?x\<^isub>\<tau> | c\<^isub>\<tau> | \<lambda>\<^isub>\<tau>. t | t\<^isub>1 t\<^isub>2"}. Parsing and printing take care of
+ converting between the internal de-Bruijn representation and an
+ external representation with named bound variables. Subsequently,
+ we shall use the latter notation instead of the internal one.
+
+ The inductive relation @{text "t :: \<tau>"} assigns a (unique) type to a
+ term according to the structure of atomic terms, abstractions, and
+ applications:
+ \[
+ \infer{@{text "a\<^isub>\<tau> :: \<tau>"}}{}
+ \qquad
+ \infer{@{text "(\<lambda>x\<^sub>\<tau>. t) :: \<tau> \<Rightarrow> \<sigma>"}}{@{text "t :: \<sigma>"}}
+ \qquad
+ \infer{@{text "t u :: \<sigma>"}}{@{text "t :: \<tau> \<Rightarrow> \<sigma>"} & @{text "u :: \<tau>"}}
+ \]
+ A \emph{well-typed term} is a term that can be typed according to these rules.
+
+ Typing information can be omitted: type-inference is able to
+ reconstruct the most general type of a raw term, while assigning
+ most general types to all of its variables and constants.
+ Type-inference depends on a context of type constraints for fixed
+ variables, and declarations for polymorphic constants.
+
+ The identity of atomic terms consists both of the name and the type
+ component. This means that different variables @{text
+ "x\<^bsub>\<tau>\<^isub>1\<^esub>"} and @{text "x\<^bsub>\<tau>\<^isub>2\<^esub>"} may become the same after
+ type instantiation. Type-inference rejects variables of the same
+ name, but different types. In contrast, mixed instances of
+ polymorphic constants occur routinely.
+
+ \medskip The \emph{hidden polymorphism} of a term @{text "t :: \<sigma>"}
+ is the set of type variables occurring in @{text "t"}, but not in
+ its type @{text "\<sigma>"}. This means that the term implicitly depends
+ on type arguments that are not accounted for in the result type, i.e.\
+ there are different type instances @{text "t\<vartheta> :: \<sigma>"} and
+ @{text "t\<vartheta>' :: \<sigma>"} with the same type. This slightly
+ pathological situation notoriously demands additional care.
+
+ \medskip A \emph{term abbreviation} is a syntactic definition @{text
+ "c\<^isub>\<sigma> \<equiv> t"} of a closed term @{text "t"} of type @{text "\<sigma>"},
+ without any hidden polymorphism. A term abbreviation looks like a
+ constant in the syntax, but is expanded before entering the logical
+ core. Abbreviations are usually reverted when printing terms, using
+ @{text "t \<rightarrow> c\<^isub>\<sigma>"} as rules for higher-order rewriting.
+
+ \medskip Canonical operations on @{text "\<lambda>"}-terms include @{text
+ "\<alpha>\<beta>\<eta>"}-conversion: @{text "\<alpha>"}-conversion refers to capture-free
+ renaming of bound variables; @{text "\<beta>"}-conversion contracts an
+ abstraction applied to an argument term, substituting the argument
+ in the body: @{text "(\<lambda>x. b)a"} becomes @{text "b[a/x]"}; @{text
+ "\<eta>"}-conversion contracts vacuous application-abstraction: @{text
+ "\<lambda>x. f x"} becomes @{text "f"}, provided that the bound variable
+ does not occur in @{text "f"}.
+
+ Terms are normally treated modulo @{text "\<alpha>"}-conversion, which is
+ implicit in the de-Bruijn representation. Names for bound variables
+ in abstractions are maintained separately as (meaningless) comments,
+ mostly for parsing and printing. Full @{text "\<alpha>\<beta>\<eta>"}-conversion is
+ commonplace in various standard operations (\secref{sec:obj-rules})
+ that are based on higher-order unification and matching.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type term} \\
+ @{index_ML_op "aconv": "term * term -> bool"} \\
+ @{index_ML Term.map_types: "(typ -> typ) -> term -> term"} \\
+ @{index_ML Term.fold_types: "(typ -> 'a -> 'a) -> term -> 'a -> 'a"} \\
+ @{index_ML Term.map_aterms: "(term -> term) -> term -> term"} \\
+ @{index_ML Term.fold_aterms: "(term -> 'a -> 'a) -> term -> 'a -> 'a"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML fastype_of: "term -> typ"} \\
+ @{index_ML lambda: "term -> term -> term"} \\
+ @{index_ML betapply: "term * term -> term"} \\
+ @{index_ML incr_boundvars: "int -> term -> term"} \\
+ @{index_ML Sign.declare_const: "Proof.context ->
+ (binding * typ) * mixfix -> theory -> term * theory"} \\
+ @{index_ML Sign.add_abbrev: "string -> binding * term ->
+ theory -> (term * term) * theory"} \\
+ @{index_ML Sign.const_typargs: "theory -> string * typ -> typ list"} \\
+ @{index_ML Sign.const_instance: "theory -> string * typ list -> typ"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type term} represents de-Bruijn terms, with comments
+ in abstractions, and explicitly named free variables and constants;
+ this is a datatype with constructors @{ML Bound}, @{ML Free}, @{ML
+ Var}, @{ML Const}, @{ML Abs}, @{ML_op "$"}.
+
+ \item @{text "t"}~@{ML_text aconv}~@{text "u"} checks @{text
+ "\<alpha>"}-equivalence of two terms. This is the basic equality relation
+ on type @{ML_type term}; raw datatype equality should only be used
+ for operations related to parsing or printing!
+
+ \item @{ML Term.map_types}~@{text "f t"} applies the mapping @{text
+ "f"} to all types occurring in @{text "t"}.
+
+ \item @{ML Term.fold_types}~@{text "f t"} iterates the operation
+ @{text "f"} over all occurrences of types in @{text "t"}; the term
+ structure is traversed from left to right.
+
+ \item @{ML Term.map_aterms}~@{text "f t"} applies the mapping @{text
+ "f"} to all atomic terms (@{ML Bound}, @{ML Free}, @{ML Var}, @{ML
+ Const}) occurring in @{text "t"}.
+
+ \item @{ML Term.fold_aterms}~@{text "f t"} iterates the operation
+ @{text "f"} over all occurrences of atomic terms (@{ML Bound}, @{ML
+ Free}, @{ML Var}, @{ML Const}) in @{text "t"}; the term structure is
+ traversed from left to right.
+
+ \item @{ML fastype_of}~@{text "t"} determines the type of a
+ well-typed term. This operation is relatively slow, despite the
+ omission of any sanity checks.
+
+ \item @{ML lambda}~@{text "a b"} produces an abstraction @{text
+ "\<lambda>a. b"}, where occurrences of the atomic term @{text "a"} in the
+ body @{text "b"} are replaced by bound variables.
+
+ \item @{ML betapply}~@{text "(t, u)"} produces an application @{text
+ "t u"}, with topmost @{text "\<beta>"}-conversion if @{text "t"} is an
+ abstraction.
+
+ \item @{ML incr_boundvars}~@{text "j"} increments a term's dangling
+ bound variables by the offset @{text "j"}. This is required when
+ moving a subterm into a context where it is enclosed by a different
+ number of abstractions. Bound variables with a matching abstraction
+ are unaffected.
+
+ \item @{ML Sign.declare_const}~@{text "ctxt ((c, \<sigma>), mx)"} declares
+ a new constant @{text "c :: \<sigma>"} with optional mixfix syntax.
+
+ \item @{ML Sign.add_abbrev}~@{text "print_mode (c, t)"}
+ introduces a new term abbreviation @{text "c \<equiv> t"}.
+
+ \item @{ML Sign.const_typargs}~@{text "thy (c, \<tau>)"} and @{ML
+ Sign.const_instance}~@{text "thy (c, [\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>n])"}
+ convert between two representations of polymorphic constants: full
+ type instance vs.\ compact type arguments form.
+
+ \end{description}
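+
+ \medskip The following minimal sketch (not part of the Isabelle
+ sources) illustrates some of these operations, starting from a free
+ variable of type @{text "prop"}:
+
+ \begin{verbatim}
+ val x = Free ("x", propT);
+ val e = lambda x x;          (*Abs ("x", propT, Bound 0)*)
+ val t = betapply (e, x);     (*beta-reduces back to x*)
+ val T = fastype_of e;        (*propT --> propT*)
+ \end{verbatim}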
+*}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "const_name"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "const_abbrev"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "const"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "term"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "prop"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ (@@{ML_antiquotation const_name} |
+ @@{ML_antiquotation const_abbrev}) nameref
+ ;
+ @@{ML_antiquotation const} ('(' (type + ',') ')')?
+ ;
+ @@{ML_antiquotation term} term
+ ;
+ @@{ML_antiquotation prop} prop
+ "}
+
+ \begin{description}
+
+ \item @{text "@{const_name c}"} inlines the internalized logical
+ constant name @{text "c"} --- as @{ML_type string} literal.
+
+ \item @{text "@{const_abbrev c}"} inlines the internalized
+ abbreviated constant name @{text "c"} --- as @{ML_type string}
+ literal.
+
+ \item @{text "@{const c(\<^vec>\<tau>)}"} inlines the internalized
+ constant @{text "c"} with precise type instantiation in the sense of
+ @{ML Sign.const_instance} --- as @{ML Const} constructor term for
+ datatype @{ML_type term}.
+
+ \item @{text "@{term t}"} inlines the internalized term @{text "t"}
+ --- as constructor term for datatype @{ML_type term}.
+
+ \item @{text "@{prop \<phi>}"} inlines the internalized proposition
+ @{text "\<phi>"} --- as constructor term for datatype @{ML_type term}.
+
+ \end{description}
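+
+ \medskip For example, the following hypothetical fragment inlines a
+ term and the name of Pure equality at ML compile time:
+
+ \begin{verbatim}
+ val t = @{term "\<lambda>x. x"};         (*Abs ("x", ...)*)
+ val c = @{const_name "=="};       (*the string "=="*)
+ \end{verbatim}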
+*}
+
+
+section {* Theorems \label{sec:thms} *}
+
+text {*
+ A \emph{proposition} is a well-typed term of type @{text "prop"}, a
+ \emph{theorem} is a proven proposition (depending on a context of
+ hypotheses and the background theory). Primitive inferences include
+ plain Natural Deduction rules for the primary connectives @{text
+ "\<And>"} and @{text "\<Longrightarrow>"} of the framework. There is also a builtin
+ notion of equality/equivalence @{text "\<equiv>"}.
+*}
+
+
+subsection {* Primitive connectives and rules \label{sec:prim-rules} *}
+
+text {*
+ The theory @{text "Pure"} contains constant declarations for the
+ primitive connectives @{text "\<And>"}, @{text "\<Longrightarrow>"}, and @{text "\<equiv>"} of
+ the logical framework, see \figref{fig:pure-connectives}. The
+ derivability judgment @{text "A\<^isub>1, \<dots>, A\<^isub>n \<turnstile> B"} is
+ defined inductively by the primitive inferences given in
+ \figref{fig:prim-rules}, with the global restriction that the
+ hypotheses must \emph{not} contain any schematic variables. The
+ builtin equality is conceptually axiomatized as shown in
+ \figref{fig:pure-equality}, although the implementation works
+ directly with derived inferences.
+
+ \begin{figure}[htb]
+ \begin{center}
+ \begin{tabular}{ll}
+ @{text "all :: (\<alpha> \<Rightarrow> prop) \<Rightarrow> prop"} & universal quantification (binder @{text "\<And>"}) \\
+ @{text "\<Longrightarrow> :: prop \<Rightarrow> prop \<Rightarrow> prop"} & implication (right associative infix) \\
+ @{text "\<equiv> :: \<alpha> \<Rightarrow> \<alpha> \<Rightarrow> prop"} & equality relation (infix) \\
+ \end{tabular}
+ \caption{Primitive connectives of Pure}\label{fig:pure-connectives}
+ \end{center}
+ \end{figure}
+
+ \begin{figure}[htb]
+ \begin{center}
+ \[
+ \infer[@{text "(axiom)"}]{@{text "\<turnstile> A"}}{@{text "A \<in> \<Theta>"}}
+ \qquad
+ \infer[@{text "(assume)"}]{@{text "A \<turnstile> A"}}{}
+ \]
+ \[
+ \infer[@{text "(\<And>\<hyphen>intro)"}]{@{text "\<Gamma> \<turnstile> \<And>x. b[x]"}}{@{text "\<Gamma> \<turnstile> b[x]"} & @{text "x \<notin> \<Gamma>"}}
+ \qquad
+ \infer[@{text "(\<And>\<hyphen>elim)"}]{@{text "\<Gamma> \<turnstile> b[a]"}}{@{text "\<Gamma> \<turnstile> \<And>x. b[x]"}}
+ \]
+ \[
+ \infer[@{text "(\<Longrightarrow>\<hyphen>intro)"}]{@{text "\<Gamma> - A \<turnstile> A \<Longrightarrow> B"}}{@{text "\<Gamma> \<turnstile> B"}}
+ \qquad
+ \infer[@{text "(\<Longrightarrow>\<hyphen>elim)"}]{@{text "\<Gamma>\<^sub>1 \<union> \<Gamma>\<^sub>2 \<turnstile> B"}}{@{text "\<Gamma>\<^sub>1 \<turnstile> A \<Longrightarrow> B"} & @{text "\<Gamma>\<^sub>2 \<turnstile> A"}}
+ \]
+ \caption{Primitive inferences of Pure}\label{fig:prim-rules}
+ \end{center}
+ \end{figure}
+
+ \begin{figure}[htb]
+ \begin{center}
+ \begin{tabular}{ll}
+ @{text "\<turnstile> (\<lambda>x. b[x]) a \<equiv> b[a]"} & @{text "\<beta>"}-conversion \\
+ @{text "\<turnstile> x \<equiv> x"} & reflexivity \\
+ @{text "\<turnstile> x \<equiv> y \<Longrightarrow> P x \<Longrightarrow> P y"} & substitution \\
+ @{text "\<turnstile> (\<And>x. f x \<equiv> g x) \<Longrightarrow> f \<equiv> g"} & extensionality \\
+ @{text "\<turnstile> (A \<Longrightarrow> B) \<Longrightarrow> (B \<Longrightarrow> A) \<Longrightarrow> A \<equiv> B"} & logical equivalence \\
+ \end{tabular}
+ \caption{Conceptual axiomatization of Pure equality}\label{fig:pure-equality}
+ \end{center}
+ \end{figure}
+
+ The introduction and elimination rules for @{text "\<And>"} and @{text
+ "\<Longrightarrow>"} are analogous to formation of dependently typed @{text
+ "\<lambda>"}-terms representing the underlying proof objects. Proof terms
+ are irrelevant in the Pure logic, though; they cannot occur within
+ propositions. The system provides a runtime option to record
+ explicit proof terms for primitive inferences. Thus all three
+ levels of @{text "\<lambda>"}-calculus become explicit: @{text "\<Rightarrow>"} for
+ terms, and @{text "\<And>/\<Longrightarrow>"} for proofs (cf.\
+ \cite{Berghofer-Nipkow:2000:TPHOL}).
+
+ Observe that locally fixed parameters (as in @{text
+ "\<And>\<hyphen>intro"}) need not be recorded in the hypotheses, because
+ the simple syntactic types of Pure are always inhabitable.
+ ``Assumptions'' @{text "x :: \<tau>"} for type-membership are only
+ present as long as some @{text "x\<^isub>\<tau>"} occurs in the statement
+ body.\footnote{This is the key difference to ``@{text "\<lambda>HOL"}'' in
+ the PTS framework \cite{Barendregt-Geuvers:2001}, where hypotheses
+ @{text "x : A"} are treated uniformly for propositions and types.}
+
+ \medskip The axiomatization of a theory is implicitly closed by
+ forming all instances of type and term variables: @{text "\<turnstile>
+ A\<vartheta>"} holds for any substitution instance of an axiom
+ @{text "\<turnstile> A"}. By pushing substitutions through derivations
+ inductively, we also get admissible @{text "generalize"} and @{text
+ "instantiate"} rules as shown in \figref{fig:subst-rules}.
+
+ \begin{figure}[htb]
+ \begin{center}
+ \[
+ \infer{@{text "\<Gamma> \<turnstile> B[?\<alpha>]"}}{@{text "\<Gamma> \<turnstile> B[\<alpha>]"} & @{text "\<alpha> \<notin> \<Gamma>"}}
+ \quad
+ \infer[\quad@{text "(generalize)"}]{@{text "\<Gamma> \<turnstile> B[?x]"}}{@{text "\<Gamma> \<turnstile> B[x]"} & @{text "x \<notin> \<Gamma>"}}
+ \]
+ \[
+ \infer{@{text "\<Gamma> \<turnstile> B[\<tau>]"}}{@{text "\<Gamma> \<turnstile> B[?\<alpha>]"}}
+ \quad
+ \infer[\quad@{text "(instantiate)"}]{@{text "\<Gamma> \<turnstile> B[t]"}}{@{text "\<Gamma> \<turnstile> B[?x]"}}
+ \]
+ \caption{Admissible substitution rules}\label{fig:subst-rules}
+ \end{center}
+ \end{figure}
+
+ Note that @{text "instantiate"} does not require an explicit
+ side-condition, because @{text "\<Gamma>"} may never contain schematic
+ variables.
+
+ In principle, variables could be substituted in hypotheses as well,
+ but this would disrupt the monotonicity of reasoning: deriving
+ @{text "\<Gamma>\<vartheta> \<turnstile> B\<vartheta>"} from @{text "\<Gamma> \<turnstile> B"} is
+ correct, but @{text "\<Gamma>\<vartheta> \<supseteq> \<Gamma>"} does not necessarily hold:
+ the result belongs to a different proof context.
+
+ \medskip An \emph{oracle} is a function that produces axioms on the
+ fly. Logically, this is an instance of the @{text "axiom"} rule
+ (\figref{fig:prim-rules}), but there is an operational difference.
+ The system always records oracle invocations within derivations of
+ theorems by a unique tag.
+
+ Axiomatizations should be limited to the bare minimum, typically as
+ part of the initial logical basis of an object-logic formalization.
+ Later on, theories are usually developed in a strictly definitional
+ fashion, by stating only certain equalities over new constants.
+
+ A \emph{simple definition} consists of a constant declaration @{text
+ "c :: \<sigma>"} together with an axiom @{text "\<turnstile> c \<equiv> t"}, where @{text "t
+ :: \<sigma>"} is a closed term without any hidden polymorphism. The RHS
+ may depend on further defined constants, but not @{text "c"} itself.
+ Definitions of functions may be presented as @{text "c \<^vec>x \<equiv>
+ t"} instead of the puristic @{text "c \<equiv> \<lambda>\<^vec>x. t"}.
+
+ An \emph{overloaded definition} consists of a collection of axioms
+ for the same constant, with zero or one equations @{text
+ "c((\<^vec>\<alpha>)\<kappa>) \<equiv> t"} for each type constructor @{text "\<kappa>"} (for
+ distinct variables @{text "\<^vec>\<alpha>"}). The RHS may mention
+ previously defined constants as above, or arbitrary constants @{text
+ "d(\<alpha>\<^isub>i)"} for some @{text "\<alpha>\<^isub>i"} projected from @{text
+ "\<^vec>\<alpha>"}. Thus overloaded definitions essentially work by
+ primitive recursion over the syntactic structure of a single type
+ argument. See also \cite[\S4.3]{Haftmann-Wenzel:2006:classes}.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Logic.all: "term -> term -> term"} \\
+ @{index_ML Logic.mk_implies: "term * term -> term"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type ctyp} \\
+ @{index_ML_type cterm} \\
+ @{index_ML Thm.ctyp_of: "theory -> typ -> ctyp"} \\
+ @{index_ML Thm.cterm_of: "theory -> term -> cterm"} \\
+ @{index_ML Thm.apply: "cterm -> cterm -> cterm"} \\
+ @{index_ML Thm.lambda: "cterm -> cterm -> cterm"} \\
+ @{index_ML Thm.all: "cterm -> cterm -> cterm"} \\
+ @{index_ML Drule.mk_implies: "cterm * cterm -> cterm"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type thm} \\
+ @{index_ML proofs: "int Unsynchronized.ref"} \\
+ @{index_ML Thm.transfer: "theory -> thm -> thm"} \\
+ @{index_ML Thm.assume: "cterm -> thm"} \\
+ @{index_ML Thm.forall_intr: "cterm -> thm -> thm"} \\
+ @{index_ML Thm.forall_elim: "cterm -> thm -> thm"} \\
+ @{index_ML Thm.implies_intr: "cterm -> thm -> thm"} \\
+ @{index_ML Thm.implies_elim: "thm -> thm -> thm"} \\
+ @{index_ML Thm.generalize: "string list * string list -> int -> thm -> thm"} \\
+ @{index_ML Thm.instantiate: "(ctyp * ctyp) list * (cterm * cterm) list -> thm -> thm"} \\
+ @{index_ML Thm.add_axiom: "Proof.context ->
+ binding * term -> theory -> (string * thm) * theory"} \\
+ @{index_ML Thm.add_oracle: "binding * ('a -> cterm) -> theory ->
+ (string * ('a -> thm)) * theory"} \\
+ @{index_ML Thm.add_def: "Proof.context -> bool -> bool ->
+ binding * term -> theory -> (string * thm) * theory"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML Theory.add_deps: "Proof.context -> string ->
+ string * typ -> (string * typ) list -> theory -> theory"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Logic.all}~@{text "a B"} produces a Pure quantification
+ @{text "\<And>a. B"}, where occurrences of the atomic term @{text "a"} in
+ the body proposition @{text "B"} are replaced by bound variables.
+ (See also @{ML lambda} on terms.)
+
+ \item @{ML Logic.mk_implies}~@{text "(A, B)"} produces a Pure
+ implication @{text "A \<Longrightarrow> B"}.
+
+ \item Types @{ML_type ctyp} and @{ML_type cterm} represent certified
+ types and terms, respectively. These are abstract datatypes that
+ guarantee that their values have passed the full well-formedness (and
+ well-typedness) checks, relative to the declarations of type
+ constructors, constants etc.\ in the background theory. The
+ abstract types @{ML_type ctyp} and @{ML_type cterm} are part of the
+ same inference kernel that is mainly responsible for @{ML_type thm}.
+ Thus syntactic operations on @{ML_type ctyp} and @{ML_type cterm}
+ are located in the @{ML_struct Thm} module, even though theorems are
+ not yet involved at that stage.
+
+ \item @{ML Thm.ctyp_of}~@{text "thy \<tau>"} and @{ML
+ Thm.cterm_of}~@{text "thy t"} explicitly check types and terms,
+ respectively. This also involves some basic normalizations, such as
+ expansion of type and term abbreviations from the theory context.
+ Full re-certification is relatively slow and should be avoided in
+ tight reasoning loops.
+
+ \item @{ML Thm.apply}, @{ML Thm.lambda}, @{ML Thm.all}, @{ML
+ Drule.mk_implies} etc.\ compose certified terms (or propositions)
+ incrementally. This is equivalent to @{ML Thm.cterm_of} after
+ unchecked @{ML_op "$"}, @{ML lambda}, @{ML Logic.all}, @{ML
+ Logic.mk_implies} etc., but there can be a big difference in
+ performance when large existing entities are composed by a few extra
+ constructions on top. There are separate operations to decompose
+ certified terms and theorems to produce certified terms again.
+
+ \item Type @{ML_type thm} represents proven propositions. This is
+ an abstract datatype that guarantees that its values have been
+ constructed by basic principles of the @{ML_struct Thm} module.
+ Every @{ML_type thm} value contains a sliding back-reference to the
+ enclosing theory, cf.\ \secref{sec:context-theory}.
+
+ \item @{ML proofs} specifies the detail of proof recording within
+ @{ML_type thm} values: @{ML 0} records only the names of oracles,
+ @{ML 1} records oracle names and propositions, @{ML 2} additionally
+ records full proof terms. Officially named theorems that contribute
+ to a result are recorded in any case.
+
+ \item @{ML Thm.transfer}~@{text "thy thm"} transfers the given
+ theorem to a \emph{larger} theory, see also \secref{sec:context}.
+ This formal adjustment of the background context has no logical
+ significance, but is occasionally required for formal reasons, e.g.\
+ when theorems that are imported from more basic theories are used in
+ the current situation.
+
+ \item @{ML Thm.assume}, @{ML Thm.forall_intr}, @{ML
+ Thm.forall_elim}, @{ML Thm.implies_intr}, and @{ML Thm.implies_elim}
+ correspond to the primitive inferences of \figref{fig:prim-rules}.
+
+ \item @{ML Thm.generalize}~@{text "(\<^vec>\<alpha>, \<^vec>x)"}
+ corresponds to the @{text "generalize"} rules of
+ \figref{fig:subst-rules}. Here collections of type and term
+ variables are generalized simultaneously, specified by the given
+ basic names.
+
+ \item @{ML Thm.instantiate}~@{text "(\<^vec>\<alpha>\<^isub>s,
+ \<^vec>x\<^isub>\<tau>)"} corresponds to the @{text "instantiate"} rules
+ of \figref{fig:subst-rules}. Type variables are substituted before
+ term variables. Note that the types in @{text "\<^vec>x\<^isub>\<tau>"}
+ refer to the instantiated versions.
+
+ \item @{ML Thm.add_axiom}~@{text "ctxt (name, A)"} declares an
+ arbitrary proposition as axiom, and retrieves it as a theorem from
+ the resulting theory, cf.\ @{text "axiom"} in
+ \figref{fig:prim-rules}. Note that the low-level representation in
+ the axiom table may differ slightly from the returned theorem.
+
+ \item @{ML Thm.add_oracle}~@{text "(binding, oracle)"} produces a named
+ oracle rule, essentially generating arbitrary axioms on the fly,
+ cf.\ @{text "axiom"} in \figref{fig:prim-rules}.
+
+ \item @{ML Thm.add_def}~@{text "ctxt unchecked overloaded (name, c
+ \<^vec>x \<equiv> t)"} states a definitional axiom for an existing constant
+ @{text "c"}. Dependencies are recorded via @{ML Theory.add_deps},
+ unless the @{text "unchecked"} option is set. Note that the
+ low-level representation in the axiom table may differ slightly from
+ the returned theorem.
+
+ \item @{ML Theory.add_deps}~@{text "ctxt name c\<^isub>\<tau> \<^vec>d\<^isub>\<sigma>"}
+ declares dependencies of a named specification for constant @{text
+ "c\<^isub>\<tau>"}, relative to existing specifications for constants @{text
+ "\<^vec>d\<^isub>\<sigma>"}.
+
+ \end{description}
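+
+ \medskip As a minimal sketch of these primitives (not a realistic
+ application), the trivial theorem @{text "A \<Longrightarrow> A"} for a fixed
+ proposition @{text "A"} can be derived as follows:
+
+ \begin{verbatim}
+ val thy = @{theory};
+ val A = Thm.cterm_of thy (Free ("A", propT));
+ val th = Thm.implies_intr A (Thm.assume A);   (*"A ==> A"*)
+ \end{verbatim}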
+*}
+
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "ctyp"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "cterm"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "cprop"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "thm"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "thms"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "lemma"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation ctyp} typ
+ ;
+ @@{ML_antiquotation cterm} term
+ ;
+ @@{ML_antiquotation cprop} prop
+ ;
+ @@{ML_antiquotation thm} thmref
+ ;
+ @@{ML_antiquotation thms} thmrefs
+ ;
+ @@{ML_antiquotation lemma} ('(' @'open' ')')? ((prop +) + @'and') \\
+ @'by' method method?
+ "}
+
+ \begin{description}
+
+ \item @{text "@{ctyp \<tau>}"} produces a certified type wrt.\ the
+ current background theory --- as abstract value of type @{ML_type
+ ctyp}.
+
+ \item @{text "@{cterm t}"} and @{text "@{cprop \<phi>}"} produce a
+ certified term wrt.\ the current background theory --- as abstract
+ value of type @{ML_type cterm}.
+
+ \item @{text "@{thm a}"} produces a singleton fact --- as abstract
+ value of type @{ML_type thm}.
+
+ \item @{text "@{thms a}"} produces a general fact --- as abstract
+ value of type @{ML_type "thm list"}.
+
+ \item @{text "@{lemma \<phi> by meth}"} produces a fact that is proven on
+ the spot according to the minimal proof, which imitates a terminal
+ Isar proof. The result is an abstract value of type @{ML_type thm}
+ or @{ML_type "thm list"}, depending on the number of propositions
+ given here.
+
+ The internal derivation object lacks a proper theorem name, but it
+ is formally closed, unless the @{text "(open)"} option is specified
+ (this may impact performance of applications with proof terms).
+
+ Since ML antiquotations are always evaluated at compile-time, there
+ is no run-time overhead even for non-trivial proofs. Nonetheless,
+ the justification is syntactically limited to a single @{command
+ "by"} step. More complex Isar proofs should be done in regular
+ theory source, before compiling the corresponding ML text that uses
+ the result.
+
+ \end{description}
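+
+ \medskip For example (a small sketch using standard Pure facts),
+ the following ML values are produced at compile time:
+
+ \begin{verbatim}
+ val refl = @{thm reflexive};                      (*"?x == ?x"*)
+ val triv = @{lemma "x == x" by (rule reflexive)};
+ \end{verbatim}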
+
+*}
+
+
+subsection {* Auxiliary connectives \label{sec:logic-aux} *}
+
+text {* Theory @{text "Pure"} provides a few auxiliary connectives
+ that are defined on top of the primitive ones, see
+ \figref{fig:pure-aux}. These special constants are useful in
+ certain internal encodings, and are normally not directly exposed to
+ the user.
+
+ \begin{figure}[htb]
+ \begin{center}
+ \begin{tabular}{ll}
+ @{text "conjunction :: prop \<Rightarrow> prop \<Rightarrow> prop"} & (infix @{text "&&&"}) \\
+ @{text "\<turnstile> A &&& B \<equiv> (\<And>C. (A \<Longrightarrow> B \<Longrightarrow> C) \<Longrightarrow> C)"} \\[1ex]
+ @{text "prop :: prop \<Rightarrow> prop"} & (prefix @{text "#"}, suppressed) \\
+ @{text "#A \<equiv> A"} \\[1ex]
+ @{text "term :: \<alpha> \<Rightarrow> prop"} & (prefix @{text "TERM"}) \\
+ @{text "term x \<equiv> (\<And>A. A \<Longrightarrow> A)"} \\[1ex]
+ @{text "TYPE :: \<alpha> itself"} & (prefix @{text "TYPE"}) \\
+ @{text "(unspecified)"} \\
+ \end{tabular}
+ \caption{Definitions of auxiliary connectives}\label{fig:pure-aux}
+ \end{center}
+ \end{figure}
+
+ The introduction @{text "A \<Longrightarrow> B \<Longrightarrow> A &&& B"}, and eliminations
+ (projections) @{text "A &&& B \<Longrightarrow> A"} and @{text "A &&& B \<Longrightarrow> B"} are
+ available as derived rules. Conjunction makes it possible to treat
+ simultaneous assumptions and conclusions uniformly, e.g.\ consider
+ @{text "A \<Longrightarrow> B \<Longrightarrow> C &&& D"}. In particular, the goal mechanism
+ represents multiple claims as explicit conjunction internally, but
+ this is refined (via backwards introduction) into separate sub-goals
+ before the user commences the proof; the final result is projected
+ into a list of theorems using eliminations (cf.\
+ \secref{sec:tactical-goals}).
+
+ The @{text "prop"} marker (@{text "#"}) makes arbitrarily complex
+ propositions appear as atomic, without changing the meaning: @{text
+ "\<Gamma> \<turnstile> A"} and @{text "\<Gamma> \<turnstile> #A"} are interchangeable. See
+ \secref{sec:tactical-goals} for specific operations.
+
+ The @{text "term"} marker turns any well-typed term into a derivable
+ proposition: @{text "\<turnstile> TERM t"} holds unconditionally. Although
+ this is logically vacuous, it makes it possible to treat terms and
+ proofs uniformly, similar to a type-theoretic framework.
+
+ The @{text "TYPE"} constructor is the canonical representative of
+ the unspecified type @{text "\<alpha> itself"}; it essentially injects the
+ language of types into that of terms. There is specific notation
+ @{text "TYPE(\<tau>)"} for @{text "TYPE\<^bsub>\<tau>
+ itself\<^esub>"}.
+ Although being devoid of any particular meaning, the term @{text
+ "TYPE(\<tau>)"} accounts for the type @{text "\<tau>"} within the term
+ language. In particular, @{text "TYPE(\<alpha>)"} may be used as formal
+ argument in primitive definitions, in order to circumvent hidden
+ polymorphism (cf.\ \secref{sec:terms}). For example, @{text "c
+ TYPE(\<alpha>) \<equiv> A[\<alpha>]"} defines @{text "c :: \<alpha> itself \<Rightarrow> prop"} in terms of
+ a proposition @{text "A"} that depends on an additional type
+ argument, which is essentially a predicate on types.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Conjunction.intr: "thm -> thm -> thm"} \\
+ @{index_ML Conjunction.elim: "thm -> thm * thm"} \\
+ @{index_ML Drule.mk_term: "cterm -> thm"} \\
+ @{index_ML Drule.dest_term: "thm -> cterm"} \\
+ @{index_ML Logic.mk_type: "typ -> term"} \\
+ @{index_ML Logic.dest_type: "term -> typ"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Conjunction.intr} derives @{text "A &&& B"} from @{text
+ "A"} and @{text "B"}.
+
+ \item @{ML Conjunction.elim} derives @{text "A"} and @{text "B"}
+ from @{text "A &&& B"}.
+
+ \item @{ML Drule.mk_term} derives @{text "TERM t"}.
+
+ \item @{ML Drule.dest_term} recovers term @{text "t"} from @{text
+ "TERM t"}.
+
+ \item @{ML Logic.mk_type}~@{text "\<tau>"} produces the term @{text
+ "TYPE(\<tau>)"}.
+
+ \item @{ML Logic.dest_type}~@{text "TYPE(\<tau>)"} recovers the type
+ @{text "\<tau>"}.
+
+ \end{description}
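+
+ \medskip For example, the following minimal sketch moves a type into
+ the term language and back:
+
+ \begin{verbatim}
+ val t = Logic.mk_type propT;    (*the term "TYPE(prop)"*)
+ val T = Logic.dest_type t;      (*the type "prop" again*)
+ \end{verbatim}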
+*}
+
+
+section {* Object-level rules \label{sec:obj-rules} *}
+
+text {*
+ The primitive inferences covered so far mostly serve foundational
+ purposes. User-level reasoning usually works via object-level rules
+ that are represented as theorems of Pure. Composition of rules
+ involves \emph{backchaining}, \emph{higher-order unification} modulo
+ @{text "\<alpha>\<beta>\<eta>"}-conversion of @{text "\<lambda>"}-terms, and so-called
+ \emph{lifting} of rules into a context of @{text "\<And>"} and @{text
+ "\<Longrightarrow>"} connectives. Thus the full power of higher-order Natural
+ Deduction in Isabelle/Pure becomes readily available.
+*}
+
+
+subsection {* Hereditary Harrop Formulae *}
+
+text {*
+ The idea of object-level rules is to model Natural Deduction
+ inferences in the style of Gentzen \cite{Gentzen:1935}, but we allow
+ arbitrary nesting similar to \cite{extensions91}. The most basic
+ rule format is that of a \emph{Horn Clause}:
+ \[
+ \infer{@{text "A"}}{@{text "A\<^sub>1"} & @{text "\<dots>"} & @{text "A\<^sub>n"}}
+ \]
+ where @{text "A, A\<^sub>1, \<dots>, A\<^sub>n"} are atomic propositions
+ of the framework, usually of the form @{text "Trueprop B"}, where
+ @{text "B"} is a (compound) object-level statement. This
+ object-level inference corresponds to an iterated implication in
+ Pure like this:
+ \[
+ @{text "A\<^sub>1 \<Longrightarrow> \<dots> A\<^sub>n \<Longrightarrow> A"}
+ \]
+ As an example consider conjunction introduction: @{text "A \<Longrightarrow> B \<Longrightarrow> A \<and>
+ B"}. Any parameters occurring in such rule statements are
+ conceptually treated as arbitrary:
+ \[
+ @{text "\<And>x\<^sub>1 \<dots> x\<^sub>m. A\<^sub>1 x\<^sub>1 \<dots> x\<^sub>m \<Longrightarrow> \<dots> A\<^sub>n x\<^sub>1 \<dots> x\<^sub>m \<Longrightarrow> A x\<^sub>1 \<dots> x\<^sub>m"}
+ \]
+
+ Nesting of rules means that the positions of @{text "A\<^sub>i"} may
+ again hold compound rules, not just atomic propositions.
+ Propositions of this format are called \emph{Hereditary Harrop
+ Formulae} in the literature \cite{Miller:1991}. Here we give an
+ inductive characterization as follows:
+
+ \medskip
+ \begin{tabular}{ll}
+ @{text "\<^bold>x"} & set of variables \\
+ @{text "\<^bold>A"} & set of atomic propositions \\
+ @{text "\<^bold>H = \<And>\<^bold>x\<^sup>*. \<^bold>H\<^sup>* \<Longrightarrow> \<^bold>A"} & set of Hereditary Harrop Formulas \\
+ \end{tabular}
+ \medskip
+
+ Thus we essentially impose nesting levels on propositions formed
+ from @{text "\<And>"} and @{text "\<Longrightarrow>"}. At each level there is a prefix
+ of parameters and compound premises, concluding an atomic
+ proposition. Typical examples are @{text "\<longrightarrow>"}-introduction @{text
+ "(A \<Longrightarrow> B) \<Longrightarrow> A \<longrightarrow> B"} or mathematical induction @{text "P 0 \<Longrightarrow> (\<And>n. P n
+ \<Longrightarrow> P (Suc n)) \<Longrightarrow> P n"}. Even deeper nesting occurs in well-founded
+ induction @{text "(\<And>x. (\<And>y. y \<prec> x \<Longrightarrow> P y) \<Longrightarrow> P x) \<Longrightarrow> P x"}, but this
+ already marks the limit of rule complexity that is usually seen in
+ practice.
+
+ \medskip Regular user-level inferences in Isabelle/Pure always
+ maintain the following canonical form of results:
+
+ \begin{itemize}
+
+ \item Normalization by @{text "(A \<Longrightarrow> (\<And>x. B x)) \<equiv> (\<And>x. A \<Longrightarrow> B x)"},
+ which is a theorem of Pure, means that quantifiers are pushed in
+ front of implication at each level of nesting. The normal form is a
+ Hereditary Harrop Formula.
+
+ \item The outermost prefix of parameters is represented via
+ schematic variables: instead of @{text "\<And>\<^vec>x. \<^vec>H \<^vec>x
+ \<Longrightarrow> A \<^vec>x"} we have @{text "\<^vec>H ?\<^vec>x \<Longrightarrow> A ?\<^vec>x"}.
+ Note that this representation loses information about the order of
+ parameters, and vacuous quantifiers vanish automatically.
+
+ \end{itemize}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Simplifier.norm_hhf: "thm -> thm"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Simplifier.norm_hhf}~@{text thm} normalizes the given
+ theorem according to the canonical form specified above. This is
+ occasionally helpful to repair some low-level tools that do not
+ handle Hereditary Harrop Formulae properly.
+
+ \end{description}
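+
+ \medskip For example, given a hypothetical rule @{ML_text rule} of
+ the form @{text "A \<Longrightarrow> (\<And>x. B x)"}, a sketch of its normalization
+ looks like this:
+
+ \begin{verbatim}
+ val rule' = Simplifier.norm_hhf rule;   (*"!!x. A ==> B x"*)
+ \end{verbatim}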
+*}
+
+
+subsection {* Rule composition *}
+
+text {*
+ The rule calculus of Isabelle/Pure provides two main inferences:
+ @{inference resolution} (i.e.\ back-chaining of rules) and
+ @{inference assumption} (i.e.\ closing a branch), both modulo
+ higher-order unification. There are also combined variants, notably
+ @{inference elim_resolution} and @{inference dest_resolution}.
+
+ To understand the all-important @{inference resolution} principle,
+ we first consider raw @{inference_def composition} (modulo
+ higher-order unification with substitution @{text "\<vartheta>"}):
+ \[
+ \infer[(@{inference_def composition})]{@{text "\<^vec>A\<vartheta> \<Longrightarrow> C\<vartheta>"}}
+ {@{text "\<^vec>A \<Longrightarrow> B"} & @{text "B' \<Longrightarrow> C"} & @{text "B\<vartheta> = B'\<vartheta>"}}
+ \]
+ Here the conclusion of the first rule is unified with the premise of
+ the second; the resulting rule instance inherits the premises of the
+ first and conclusion of the second. Note that @{text "C"} can again
+ consist of iterated implications. We can also permute the premises
+ of the second rule back-and-forth in order to compose with @{text
+ "B'"} in any position (subsequently we shall always refer to
+ position 1 w.l.o.g.).
+
+ In @{inference composition} the internal structure of the common
+ part @{text "B"} and @{text "B'"} is not taken into account. For
+ proper @{inference resolution} we require @{text "B"} to be atomic,
+ and explicitly observe the structure @{text "\<And>\<^vec>x. \<^vec>H
+ \<^vec>x \<Longrightarrow> B' \<^vec>x"} of the premise of the second rule. The
+ idea is to adapt the first rule by ``lifting'' it into this context,
+ by means of iterated application of the following inferences:
+ \[
+ \infer[(@{inference_def imp_lift})]{@{text "(\<^vec>H \<Longrightarrow> \<^vec>A) \<Longrightarrow> (\<^vec>H \<Longrightarrow> B)"}}{@{text "\<^vec>A \<Longrightarrow> B"}}
+ \]
+ \[
+ \infer[(@{inference_def all_lift})]{@{text "(\<And>\<^vec>x. \<^vec>A (?\<^vec>a \<^vec>x)) \<Longrightarrow> (\<And>\<^vec>x. B (?\<^vec>a \<^vec>x))"}}{@{text "\<^vec>A ?\<^vec>a \<Longrightarrow> B ?\<^vec>a"}}
+ \]
+ By combining raw composition with lifting, we get full @{inference
+ resolution} as follows:
+ \[
+ \infer[(@{inference_def resolution})]
+ {@{text "(\<And>\<^vec>x. \<^vec>H \<^vec>x \<Longrightarrow> \<^vec>A (?\<^vec>a \<^vec>x))\<vartheta> \<Longrightarrow> C\<vartheta>"}}
+ {\begin{tabular}{l}
+ @{text "\<^vec>A ?\<^vec>a \<Longrightarrow> B ?\<^vec>a"} \\
+ @{text "(\<And>\<^vec>x. \<^vec>H \<^vec>x \<Longrightarrow> B' \<^vec>x) \<Longrightarrow> C"} \\
+ @{text "(\<lambda>\<^vec>x. B (?\<^vec>a \<^vec>x))\<vartheta> = B'\<vartheta>"} \\
+ \end{tabular}}
+ \]
+
+ Continued resolution of rules makes it possible to back-chain a
+ problem towards more and more sub-problems. Branches are closed
+ either by resolving with
+ a rule of 0 premises, or by producing a ``short-circuit'' within a
+ solved situation (again modulo unification):
+ \[
+ \infer[(@{inference_def assumption})]{@{text "C\<vartheta>"}}
+ {@{text "(\<And>\<^vec>x. \<^vec>H \<^vec>x \<Longrightarrow> A \<^vec>x) \<Longrightarrow> C"} & @{text "A\<vartheta> = H\<^sub>i\<vartheta>"}~~\text{(for some~@{text i})}}
+ \]
+
+ FIXME @{inference_def elim_resolution}, @{inference_def dest_resolution}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_op "RSN": "thm * (int * thm) -> thm"} \\
+ @{index_ML_op "RS": "thm * thm -> thm"} \\
+
+ @{index_ML_op "RLN": "thm list * (int * thm list) -> thm list"} \\
+ @{index_ML_op "RL": "thm list * thm list -> thm list"} \\
+
+ @{index_ML_op "MRS": "thm list * thm -> thm"} \\
+ @{index_ML_op "OF": "thm * thm list -> thm"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{text "rule\<^sub>1 RSN (i, rule\<^sub>2)"} resolves the conclusion of
+ @{text "rule\<^sub>1"} with the @{text i}-th premise of @{text "rule\<^sub>2"},
+ according to the @{inference resolution} principle explained above.
+ Unless there is precisely one resolvent it raises exception @{ML
+ THM}.
+
+ This corresponds to the rule attribute @{attribute THEN} in Isar
+ source language.
+
+ \item @{text "rule\<^sub>1 RS rule\<^sub>2"} abbreviates @{text "rule\<^sub>1 RS (1,
+ rule\<^sub>2)"}.
+
+ \item @{text "rules\<^sub>1 RLN (i, rules\<^sub>2)"} joins lists of rules. For
+ every @{text "rule\<^sub>1"} in @{text "rules\<^sub>1"} and @{text "rule\<^sub>2"} in
+ @{text "rules\<^sub>2"}, it resolves the conclusion of @{text "rule\<^sub>1"} with
+ the @{text "i"}-th premise of @{text "rule\<^sub>2"}, accumulating multiple
+ results in one big list. Note that such strict enumerations of
+ higher-order unifications can be inefficient compared to the lazy
+ variant seen in elementary tactics like @{ML resolve_tac}.
+
+ \item @{text "rules\<^sub>1 RL rules\<^sub>2"} abbreviates @{text "rules\<^sub>1 RLN (1,
+ rules\<^sub>2)"}.
+
+ \item @{text "[rule\<^sub>1, \<dots>, rule\<^sub>n] MRS rule"} resolves @{text "rule\<^isub>i"}
+ against premise @{text "i"} of @{text "rule"}, for @{text "i = n, \<dots>,
+ 1"}. By working from right to left, newly emerging premises are
+ concatenated in the result, without interfering.
+
+ \item @{text "rule OF rules"} is an alternative notation for @{text
+ "rules MRS rule"}, which makes rule composition look more like
+ function application. Note that the argument @{text "rules"} need
+ not be atomic.
+
+ This corresponds to the rule attribute @{attribute OF} in Isar
+ source language.
+
+ \end{description}
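+
+ \medskip For example (a sketch with well-known Isabelle/HOL rules),
+ composing conjunction introduction into the first premise of
+ disjunction introduction can be written in either notation; the
+ results agree up to renaming of schematic variables:
+
+ \begin{verbatim}
+ val th1 = @{thm conjI} RS @{thm disjI1};
+   (*"?P ==> ?Q ==> (?P & ?Q) | ?R", up to renaming*)
+ val th2 = @{thm disjI1} OF [@{thm conjI}];
+ \end{verbatim}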
+*}
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/ML.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,1777 @@
+theory "ML"
+imports Base
+begin
+
+chapter {* Isabelle/ML *}
+
+text {* Isabelle/ML is best understood as a certain culture based on
+ Standard ML. Thus it is not a new programming language, but a
+ certain way to use SML at an advanced level within the Isabelle
+ environment. This covers a variety of aspects that are geared
+ towards an efficient and robust platform for applications of formal
+ logic with fully foundational proof construction --- according to
+ the well-known \emph{LCF principle}. There is specific
+ infrastructure with library modules to address the needs of this
+ difficult task. For example, the raw parallel programming model of
+ Poly/ML is presented as a considerably more abstract concept of
+ \emph{future values}, which is then used to augment the inference
+ kernel, proof interpreter, and theory loader accordingly.
+
+ The main aspects of Isabelle/ML are introduced below. These
+ first-hand explanations should help to understand how proper
+ Isabelle/ML is to be read and written, and to get access to the
+ wealth of experience that is expressed in the source text and its
+ history of changes.\footnote{See
+ \url{http://isabelle.in.tum.de/repos/isabelle} for the full
+ Mercurial history. There are symbolic tags to refer to official
+ Isabelle releases, as opposed to arbitrary \emph{tip} versions that
+ merely reflect snapshots that are never really up-to-date.} *}
+
+
+section {* Style and orthography *}
+
+text {* The sources of Isabelle/Isar are optimized for
+ \emph{readability} and \emph{maintainability}. The main purpose is
+ to tell an informed reader what is really going on and how things
+ really work. This is a non-trivial aim, but it is supported by a
+ certain style of writing Isabelle/ML that has emerged from long
+ years of system development.\footnote{See also the interesting style
+ guide for OCaml
+ \url{http://caml.inria.fr/resources/doc/guides/guidelines.en.html}
+ which shares many of our means and ends.}
+
+ The main principle behind any coding style is \emph{consistency}.
+ For a single author of a small program this merely means ``choose
+ your style and stick to it''. A complex project like Isabelle, with
+ long years of development and different contributors, requires more
+ standardization. A coding style that is changed every few years or
+ with every new contributor is no style at all, because consistency
+ is quickly lost. Global consistency is hard to achieve, though.
+ Nonetheless, one should always strive at least for local consistency
+ of modules and sub-systems, without deviating from some general
+ principles of how to write Isabelle/ML.
+
+ In a sense, good coding style is like an \emph{orthography} for the
+ sources: it helps to read quickly over the text and see through the
+ main points, without getting distracted by accidental presentation
+ of free-style code.
+*}
+
+
+subsection {* Header and sectioning *}
+
+text {* Isabelle source files have a certain standardized header
+ format (with precise spacing) that follows ancient traditions
+ reaching back to the earliest versions of the system by Larry
+ Paulson. See @{file "~~/src/Pure/thm.ML"}, for example.
+
+ The header includes at least @{verbatim Title} and @{verbatim
+ Author} entries, followed by a prose description of the purpose of
+ the module. The latter can range from a single line to several
+ paragraphs of explanations.
+
+ The rest of the file is divided into sections, subsections,
+ subsubsections, paragraphs etc.\ using a simple layout via ML
+ comments as follows.
+
+\begin{verbatim}
+(*** section ***)
+
+(** subsection **)
+
+(* subsubsection *)
+
+(*short paragraph*)
+
+(*
+ long paragraph,
+ with more text
+*)
+\end{verbatim}
+
+ As in regular typography, there is some extra space \emph{before}
+ section headings that are adjacent to plain text (not other headings
+ as in the example above).
+
+ \medskip The precise wording of the prose text given in these
+ headings is chosen carefully to introduce the main theme of the
+ subsequent formal ML text.
+*}
+
+
+subsection {* Naming conventions *}
+
+text {* Since ML is the primary medium to express the meaning of the
+ source text, naming of ML entities requires special care.
+
+ \paragraph{Notation.} A name consists of 1--3 \emph{words} (rarely
+ 4, but not more) that are separated by underscore. There are three
+ variants concerning upper or lower case letters, which are used for
+ certain ML categories as follows:
+
+ \medskip
+ \begin{tabular}{lll}
+ variant & example & ML categories \\\hline
+ lower-case & @{ML_text foo_bar} & values, types, record fields \\
+ capitalized & @{ML_text Foo_Bar} & datatype constructors, structures, functors \\
+ upper-case & @{ML_text FOO_BAR} & special values, exception constructors, signatures \\
+ \end{tabular}
+ \medskip
+
+ For historical reasons, many capitalized names omit underscores,
+ e.g.\ old-style @{ML_text FooBar} instead of @{ML_text Foo_Bar}.
+ Genuine mixed-case names are \emph{not} used, because clear division
+ of words is essential for readability.\footnote{Camel-case was
+ invented to work around the lack of underscore in some early
+ non-ASCII character sets. Later it became habitual in some language
+ communities that are now strong in numbers.}
+
+ A single (capital) character does not count as ``word'' in this
+ respect: some Isabelle/ML names are suffixed by extra markers like
+ this: @{ML_text foo_barT}.
+
+ Name variants are produced by adding 1--3 primes, e.g.\ @{ML_text
+ foo'}, @{ML_text foo''}, or @{ML_text foo'''}, but not @{ML_text
+ foo''''} or more. Decimal digits scale better to larger numbers,
+ e.g.\ @{ML_text foo0}, @{ML_text foo1}, @{ML_text foo42}.
+
+ \paragraph{Scopes.} Apart from very basic library modules, ML
+ structures are not ``opened'', but names are referenced with
+ explicit qualification, as in @{ML Syntax.string_of_term} for
+ example. When devising names for structures and their components it
+ is important to aim at eye-catching compositions of both parts, because
+ this is how they are seen in the sources and documentation. For the
+ same reasons, aliases of well-known library functions should be
+ avoided.
+
+ Local names of function abstraction or case/let bindings are
+ typically shorter, sometimes using only rudiments of ``words'',
+ while still avoiding cryptic shorthands. An auxiliary function
+ called @{ML_text helper}, @{ML_text aux}, or @{ML_text f} is
+ considered bad style.
+
+ Example:
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ fun print_foo ctxt foo =
+   let
+     fun print t = ... Syntax.string_of_term ctxt t ...
+   in ... end;
+
+
+ (* RIGHT *)
+
+ fun print_foo ctxt foo =
+   let
+     val string_of_term = Syntax.string_of_term ctxt;
+     fun print t = ... string_of_term t ...
+   in ... end;
+
+
+ (* WRONG *)
+
+ val string_of_term = Syntax.string_of_term;
+
+ fun print_foo ctxt foo =
+   let
+     fun aux t = ... string_of_term ctxt t ...
+   in ... end;
+
+ \end{verbatim}
+
+
+ \paragraph{Specific conventions.} Here are some specific name forms
+ that occur frequently in the sources.
+
+ \begin{itemize}
+
+ \item A function that maps @{ML_text foo} to @{ML_text bar} is
+ called @{ML_text foo_to_bar} or @{ML_text bar_of_foo} (never
+ @{ML_text foo2bar}, @{ML_text bar_from_foo}, @{ML_text
+ bar_for_foo}, or @{ML_text bar4foo}).
+
+ \item The name component @{ML_text legacy} means that the operation
+ is about to be discontinued soon.
+
+ \item The name component @{ML_text old} means that this is historic
+ material that might disappear at some later stage.
+
+ \item The name component @{ML_text global} means that this works
+ with the background theory instead of the regular local context
+ (\secref{sec:context}), sometimes for historical reasons, sometimes
+ due to a genuine lack of locality of the concept involved, sometimes as
+ a fall-back for the lack of a proper context in the application
+ code. Whenever there is a non-global variant available, the
+ application should be migrated to use it with a proper local
+ context.
+
+ \item Variables of the main context types of the Isabelle/Isar
+ framework (\secref{sec:context} and \chref{ch:local-theory}) have
+ firm naming conventions as follows:
+
+ \begin{itemize}
+
+ \item theories are called @{ML_text thy}, rarely @{ML_text theory}
+ (never @{ML_text thry})
+
+ \item proof contexts are called @{ML_text ctxt}, rarely @{ML_text
+ context} (never @{ML_text ctx})
+
+ \item generic contexts are called @{ML_text context}, rarely
+ @{ML_text ctxt}
+
+ \item local theories are called @{ML_text lthy}, except for local
+ theories that are treated as proof context (which is a semantic
+ super-type)
+
+ \end{itemize}
+
+ Variations with primed or decimal numbers are always possible, as
+ well as semantic prefixes like @{ML_text foo_thy} or @{ML_text
+ bar_ctxt}, but the base conventions above need to be preserved.
+ This makes it possible to visualize their data flow via plain
+ regular expressions in the editor.
+
+ \item The main logical entities (\secref{ch:logic}) have established
+ naming convention as follows:
+
+ \begin{itemize}
+
+ \item sorts are called @{ML_text S}
+
+ \item types are called @{ML_text T}, @{ML_text U}, or @{ML_text
+ ty} (never @{ML_text t})
+
+ \item terms are called @{ML_text t}, @{ML_text u}, or @{ML_text
+ tm} (never @{ML_text trm})
+
+ \item certified types are called @{ML_text cT}, rarely @{ML_text
+ T}, with variants as for types
+
+ \item certified terms are called @{ML_text ct}, rarely @{ML_text
+ t}, with variants as for terms
+
+ \item theorems are called @{ML_text th}, or @{ML_text thm}
+
+ \end{itemize}
+
+ Proper semantic names override these conventions completely. For
+ example, the left-hand side of an equation (as a term) can be called
+ @{ML_text lhs} (not @{ML_text lhs_tm}). Or a term that is known
+ to be a variable can be called @{ML_text v} or @{ML_text x}.
+
+ \item Tactics (\secref{sec:tactics}) are sufficiently important to
+ have specific naming conventions. The name of a basic tactic
+ definition always has a @{ML_text "_tac"} suffix, the subgoal index
+ (if applicable) is always called @{ML_text i}, and the goal state
+ (if made explicit) is usually called @{ML_text st} instead of the
+ somewhat misleading @{ML_text thm}. Any other arguments are given
+ before the latter two, and the general context is given first.
+ Example:
+
+ \begin{verbatim}
+ fun my_tac ctxt arg1 arg2 i st = ...
+ \end{verbatim}
+
+ Note that the goal state @{ML_text st} above is rarely made
+ explicit, if tactic combinators (tacticals) are used as usual.
+
+ \end{itemize}
+*}
+
+
+subsection {* General source layout *}
+
+text {* The general Isabelle/ML source layout imitates regular
+ type-setting to some extent, augmented by the requirements for
+ deeply nested expressions that are commonplace in functional
+ programming.
+
+ \paragraph{Line length} is 80 characters according to ancient
+ standards, but we allow as much as 100 characters (not
+ more).\footnote{Readability requires keeping the beginning of a line
+ in view while watching its end. Modern wide-screen displays do not
+ change the way the human brain works. Sources also need to be
+ printable on plain paper with reasonable font-size.} The extra 20
+ characters acknowledge the space requirements due to qualified
+ library references in Isabelle/ML.
+
+ \paragraph{White-space} is used to emphasize the structure of
+ expressions, following mostly standard conventions for mathematical
+ typesetting, as can be seen in plain {\TeX} or {\LaTeX}. This
+ defines positioning of spaces for parentheses, punctuation, and
+ infixes as illustrated here:
+
+ \begin{verbatim}
+ val x = y + z * (a + b);
+ val pair = (a, b);
+ val record = {foo = 1, bar = 2};
+ \end{verbatim}
+
+ Lines are normally broken \emph{after} an infix operator or
+ punctuation character. For example:
+
+ \begin{verbatim}
+ val x =
+   a +
+   b +
+   c;
+
+ val tuple =
+   (a,
+    b,
+    c);
+ \end{verbatim}
+
+ Some special infixes (e.g.\ @{ML_text "|>"}) work better at the
+ start of the line, but punctuation is always at the end.
+
+ Function application follows the tradition of @{text "\<lambda>"}-calculus,
+ not informal mathematics. For example: @{ML_text "f a b"} for a
+ curried function, or @{ML_text "g (a, b)"} for a tupled function.
+ Note that the space between @{ML_text g} and the pair @{ML_text
+ "(a, b)"} follows the important principle of
+ \emph{compositionality}: the layout of @{ML_text "g p"} does not
+ change when @{ML_text "p"} is refined to the concrete pair
+ @{ML_text "(a, b)"}.
+
+ \paragraph{Indentation} uses plain spaces, never hard
+ tabulators.\footnote{Tabulators were invented to move the carriage
+ of a type-writer to certain predefined positions. In software they
+ could be used as a primitive run-length compression of consecutive
+ spaces, but the precise result would depend on non-standardized
+ editor configuration.}
+
+ Each level of nesting is indented by 2 spaces, sometimes 1, very
+ rarely 4, never 8 or any other odd number.
+
+ Indentation follows a simple logical format that only depends on the
+ nesting depth, not the accidental length of the text that initiates
+ a level of nesting. Example:
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ if b then
+ expr1_part1
+ expr1_part2
+ else
+ expr2_part1
+ expr2_part2
+
+
+ (* WRONG *)
+
+ if b then expr1_part1
+ expr1_part2
+ else expr2_part1
+ expr2_part2
+ \end{verbatim}
+
+ The second form has many problems: it assumes a fixed-width font
+ when viewing the sources, it uses more space on the line and thus
+ makes it hard to observe its strict length limit (working against
+ \emph{readability}), it requires extra editing to adapt the layout
+ to changes of the initial text (working against
+ \emph{maintainability}) etc.
+
+ \medskip For similar reasons, any kind of two-dimensional or tabular
+ layouts, ASCII-art with lines or boxes of asterisks etc.\ should be
+ avoided.
+
+ \paragraph{Complex expressions} that consist of multi-clausal
+ function definitions, @{ML_text handle}, @{ML_text case},
+ @{ML_text let} (and combinations) require special attention. The
+ syntax of Standard ML is quite ambitious and admits a lot of
+ variance that can distort the meaning of the text.
+
+ Clauses of @{ML_text fun}, @{ML_text fn}, @{ML_text handle},
+ @{ML_text case} get extra indentation to indicate the nesting
+ clearly. Example:
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ fun foo p1 =
+ expr1
+ | foo p2 =
+ expr2
+
+
+ (* WRONG *)
+
+ fun foo p1 =
+ expr1
+ | foo p2 =
+ expr2
+ \end{verbatim}
+
+ Body expressions consisting of @{ML_text case} or @{ML_text let}
+ require care to maintain compositionality, to prevent loss of
+ logical indentation where it is especially important to see the
+ structure of the text. Example:
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ fun foo p1 =
+ (case e of
+ q1 => ...
+ | q2 => ...)
+ | foo p2 =
+ let
+ ...
+ in
+ ...
+ end
+
+
+ (* WRONG *)
+
+ fun foo p1 = case e of
+ q1 => ...
+ | q2 => ...
+ | foo p2 =
+ let
+ ...
+ in
+ ...
+ end
+ \end{verbatim}
+
+ Extra parentheses around @{ML_text case} expressions are optional,
+ but help to analyse the nesting based on character matching in the
+ editor.
+
+ \medskip There are two main exceptions to the overall principle of
+ compositionality in the layout of complex expressions.
+
+ \begin{enumerate}
+
+ \item @{ML_text "if"} expressions are iterated as if there would be
+ a multi-branch conditional in SML, e.g.
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ if b1 then e1
+ else if b2 then e2
+ else e3
+ \end{verbatim}
+
+  \item @{ML_text fn} abstractions are often laid out as if they
+  lacked any structure by themselves.  This traditional form is
+  motivated by the possibility to shift function arguments back and
+  forth wrt.\ additional combinators.  Example:
+
+ \begin{verbatim}
+ (* RIGHT *)
+
+ fun foo x y = fold (fn z =>
+ expr)
+ \end{verbatim}
+
+ Here the visual appearance is that of three arguments @{ML_text x},
+ @{ML_text y}, @{ML_text z}.
+
+ \end{enumerate}
+
+  Such weakly structured layout should be used with great care.  Here
+  are some counter-examples involving @{ML_text let} expressions:
+
+ \begin{verbatim}
+ (* WRONG *)
+
+ fun foo x = let
+ val y = ...
+ in ... end
+
+
+ (* WRONG *)
+
+ fun foo x = let
+ val y = ...
+ in ... end
+
+
+ (* WRONG *)
+
+ fun foo x =
+ let
+ val y = ...
+ in ... end
+ \end{verbatim}
+
+ \medskip In general the source layout is meant to emphasize the
+ structure of complex language expressions, not to pretend that SML
+ had a completely different syntax (say that of Haskell or Java).
+*}
+
+
+section {* SML embedded into Isabelle/Isar *}
+
+text {* ML and Isar are intertwined via an open-ended bootstrap
+ process that provides more and more programming facilities and
+ logical content in an alternating manner. Bootstrapping starts from
+ the raw environment of existing implementations of Standard ML
+ (mainly Poly/ML, but also SML/NJ).
+
+ Isabelle/Pure marks the point where the original ML toplevel is
+ superseded by the Isar toplevel that maintains a uniform context for
+ arbitrary ML values (see also \secref{sec:context}). This formal
+ environment holds ML compiler bindings, logical entities, and many
+ other things. Raw SML is never encountered again after the initial
+ bootstrap of Isabelle/Pure.
+
+ Object-logics like Isabelle/HOL are built within the
+ Isabelle/ML/Isar environment by introducing suitable theories with
+ associated ML modules, either inlined or as separate files. Thus
+ Isabelle/HOL is defined as a regular user-space application within
+ the Isabelle framework. Further add-on tools can be implemented in
+ ML within the Isar context in the same manner: ML is part of the
+ standard repertoire of Isabelle, and there is no distinction between
+ ``user'' and ``developer'' in this respect.
+*}
+
+
+subsection {* Isar ML commands *}
+
+text {* The primary Isar source language provides facilities to ``open
+ a window'' to the underlying ML compiler. Especially see the Isar
+ commands @{command_ref "use"} and @{command_ref "ML"}: both work the
+ same way, only the source text is provided via a file vs.\ inlined,
+ respectively. Apart from embedding ML into the main theory
+ definition like that, there are many more commands that refer to ML
+ source, such as @{command_ref setup} or @{command_ref declaration}.
+ Even more fine-grained embedding of ML into Isar is encountered in
+ the proof method @{method_ref tactic}, which refines the pending
+ goal state via a given expression of type @{ML_type tactic}.
+*}
+
+text %mlex {* The following artificial example demonstrates some ML
+ toplevel declarations within the implicit Isar theory context. This
+ is regular functional programming without referring to logical
+ entities yet.
+*}
+
+ML {*
+ fun factorial 0 = 1
+ | factorial n = n * factorial (n - 1)
+*}
+
+text {* Here the ML environment is already managed by Isabelle, i.e.\
+ the @{ML factorial} function is not yet accessible in the preceding
+ paragraph, nor in a different theory that is independent from the
+ current one in the import hierarchy.
+
+ Removing the above ML declaration from the source text will remove
+ any trace of this definition as expected. The Isabelle/ML toplevel
+ environment is managed in a \emph{stateless} way: unlike the raw ML
+ toplevel there are no global side-effects involved
+ here.\footnote{Such a stateless compilation environment is also a
+ prerequisite for robust parallel compilation within independent
+ nodes of the implicit theory development graph.}
+
+ \medskip The next example shows how to embed ML into Isar proofs, using
+ @{command_ref "ML_prf"} instead of Instead of @{command_ref "ML"}.
+ As illustrated below, the effect on the ML environment is local to
+ the whole proof body, ignoring the block structure.
+*}
+
+notepad
+begin
+ ML_prf %"ML" {* val a = 1 *}
+ {
+ ML_prf %"ML" {* val b = a + 1 *}
+ } -- {* Isar block structure ignored by ML environment *}
+ ML_prf %"ML" {* val c = b + 1 *}
+end
+
+text {* By side-stepping the normal scoping rules for Isar proof
+ blocks, embedded ML code can refer to the different contexts and
+ manipulate corresponding entities, e.g.\ export a fact from a block
+ context.
+
+ \medskip Two further ML commands are useful in certain situations:
+ @{command_ref ML_val} and @{command_ref ML_command} are
+ \emph{diagnostic} in the sense that there is no effect on the
+  underlying environment, and can thus be used anywhere (even outside a
+ theory). The examples below produce long strings of digits by
+ invoking @{ML factorial}: @{command ML_val} already takes care of
+ printing the ML toplevel result, but @{command ML_command} is silent
+ so we produce an explicit output message. *}
+
+ML_val {* factorial 100 *}
+ML_command {* writeln (string_of_int (factorial 100)) *}
+
+notepad
+begin
+ ML_val {* factorial 100 *} (* FIXME check/fix indentation *)
+ ML_command {* writeln (string_of_int (factorial 100)) *}
+end
+
+
+subsection {* Compile-time context *}
+
+text {* Whenever the ML compiler is invoked within Isabelle/Isar, the
+ formal context is passed as a thread-local reference variable. Thus
+ ML code may access the theory context during compilation, by reading
+ or writing the (local) theory under construction. Note that such
+ direct access to the compile-time context is rare. In practice it
+ is typically done via some derived ML functions instead.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML ML_Context.the_generic_context: "unit -> Context.generic"} \\
+ @{index_ML "Context.>>": "(Context.generic -> Context.generic) -> unit"} \\
+ @{index_ML bind_thms: "string * thm list -> unit"} \\
+ @{index_ML bind_thm: "string * thm -> unit"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML "ML_Context.the_generic_context ()"} refers to the theory
+ context of the ML toplevel --- at compile time. ML code needs to
+ take care to refer to @{ML "ML_Context.the_generic_context ()"}
+ correctly. Recall that evaluation of a function body is delayed
+ until actual run-time.
+
+ \item @{ML "Context.>>"}~@{text f} applies context transformation
+ @{text f} to the implicit context of the ML toplevel.
+
+ \item @{ML bind_thms}~@{text "(name, thms)"} stores a list of
+ theorems produced in ML both in the (global) theory context and the
+ ML toplevel, associating it with the provided name. Theorems are
+ put into a global ``standard'' format before being stored.
+
+ \item @{ML bind_thm} is similar to @{ML bind_thms} but refers to a
+ singleton fact.
+
+ \end{description}
+
+ It is important to note that the above functions are really
+ restricted to the compile time, even though the ML compiler is
+ invoked at run-time. The majority of ML code either uses static
+ antiquotations (\secref{sec:ML-antiq}) or refers to the theory or
+ proof context at run-time, by explicit functional abstraction.
+*}
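+
+text {* The following sketch indicates how such compile-time access
+  might look; the theorem value @{ML_text some_thm} is a made-up
+  placeholder for an arbitrary result produced in ML.
+
+  \begin{verbatim}
+  (* read the implicit context at compile time *)
+  val thy = Context.theory_of (ML_Context.the_generic_context ());
+
+  (* store a theorem under a name, both in the theory and in ML *)
+  val _ = bind_thm ("my_fact", some_thm);
+  \end{verbatim}
+*}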
+
+
+subsection {* Antiquotations \label{sec:ML-antiq} *}
+
+text {* A very important consequence of embedding SML into Isar is the
+ concept of \emph{ML antiquotation}. The standard token language of
+ ML is augmented by special syntactic entities of the following form:
+
+ @{rail "
+ @{syntax_def antiquote}: '@{' nameref args '}' | '\<lbrace>' | '\<rbrace>'
+ "}
+
+ Here @{syntax nameref} and @{syntax args} are regular outer syntax
+ categories \cite{isabelle-isar-ref}. Attributes and proof methods
+ use similar syntax.
+
+ \medskip A regular antiquotation @{text "@{name args}"} processes
+ its arguments by the usual means of the Isar source language, and
+ produces corresponding ML source text, either as literal
+ \emph{inline} text (e.g. @{text "@{term t}"}) or abstract
+ \emph{value} (e.g. @{text "@{thm th}"}). This pre-compilation
+ scheme allows to refer to formal entities in a robust manner, with
+ proper static scoping and with some degree of logical checking of
+ small portions of the code.
+
+ Special antiquotations like @{text "@{let \<dots>}"} or @{text "@{note
+ \<dots>}"} augment the compilation context without generating code. The
+ non-ASCII braces @{text "\<lbrace>"} and @{text "\<rbrace>"} allow to delimit the
+ effect by introducing local blocks within the pre-compilation
+ environment.
+
+ \medskip See also \cite{Wenzel-Chaieb:2007b} for a broader
+ perspective on Isabelle/ML antiquotations. *}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "let"} & : & @{text ML_antiquotation} \\
+ @{ML_antiquotation_def "note"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation let} ((term + @'and') '=' term + @'and')
+ ;
+ @@{ML_antiquotation note} (thmdef? thmrefs + @'and')
+ "}
+
+ \begin{description}
+
+ \item @{text "@{let p = t}"} binds schematic variables in the
+ pattern @{text "p"} by higher-order matching against the term @{text
+ "t"}. This is analogous to the regular @{command_ref let} command
+ in the Isar proof language. The pre-compilation environment is
+ augmented by auxiliary term bindings, without emitting ML source.
+
+ \item @{text "@{note a = b\<^sub>1 \<dots> b\<^sub>n}"} recalls existing facts @{text
+ "b\<^sub>1, \<dots>, b\<^sub>n"}, binding the result as @{text a}. This is analogous to
+ the regular @{command_ref note} command in the Isar proof language.
+ The pre-compilation environment is augmented by auxiliary fact
+ bindings, without emitting ML source.
+
+ \end{description}
+*}
+
+text %mlex {* The following artificial example gives some impression
+ about the antiquotation elements introduced so far, together with
+ the important @{text "@{thm}"} antiquotation defined later.
+*}
+
+ML {*
+ \<lbrace>
+ @{let ?t = my_term}
+ @{note my_refl = reflexive [of ?t]}
+ fun foo th = Thm.transitive th @{thm my_refl}
+ \<rbrace>
+*}
+
+text {* The extra block delimiters do not affect the compiled code
+ itself, i.e.\ function @{ML foo} is available in the present context
+ of this paragraph.
+*}
+
+
+section {* Canonical argument order \label{sec:canonical-argument-order} *}
+
+text {* Standard ML is a language in the tradition of @{text
+ "\<lambda>"}-calculus and \emph{higher-order functional programming},
+ similar to OCaml, Haskell, or Isabelle/Pure and HOL as logical
+ languages. Getting acquainted with the native style of representing
+ functions in that setting can save a lot of extra boiler-plate of
+ redundant shuffling of arguments, auxiliary abstractions etc.
+
+ Functions are usually \emph{curried}: the idea of turning arguments
+ of type @{text "\<tau>\<^sub>i"} (for @{text "i \<in> {1, \<dots> n}"}) into a result of
+ type @{text "\<tau>"} is represented by the iterated function space
+ @{text "\<tau>\<^sub>1 \<rightarrow> \<dots> \<rightarrow> \<tau>\<^sub>n \<rightarrow> \<tau>"}. This is isomorphic to the well-known
+ encoding via tuples @{text "\<tau>\<^sub>1 \<times> \<dots> \<times> \<tau>\<^sub>n \<rightarrow> \<tau>"}, but the curried
+ version fits more smoothly into the basic calculus.\footnote{The
+ difference is even more significant in higher-order logic, because
+ the redundant tuple structure needs to be accommodated by formal
+ reasoning.}
+
+  Currying gives some flexibility due to \emph{partial application}.  A
+  function @{text "f: \<tau>\<^sub>1 \<rightarrow> \<tau>\<^sub>2 \<rightarrow> \<tau>"} can be applied to @{text "x: \<tau>\<^sub>1"}
+ and the remaining @{text "(f x): \<tau>\<^sub>2 \<rightarrow> \<tau>"} passed to another function
+ etc. How well this works in practice depends on the order of
+ arguments. In the worst case, arguments are arranged erratically,
+ and using a function in a certain situation always requires some
+  glue code.  Thus we would get exponentially many opportunities to
+ decorate the code with meaningless permutations of arguments.
+
+ This can be avoided by \emph{canonical argument order}, which
+ observes certain standard patterns and minimizes adhoc permutations
+ in their application. In Isabelle/ML, large portions of text can be
+ written without ever using @{text "swap: \<alpha> \<times> \<beta> \<rightarrow> \<beta> \<times> \<alpha>"}, or the
+ combinator @{text "C: (\<alpha> \<rightarrow> \<beta> \<rightarrow> \<gamma>) \<rightarrow> (\<beta> \<rightarrow> \<alpha> \<rightarrow> \<gamma>)"} that is not even
+ defined in our library.
+
+ \medskip The basic idea is that arguments that vary less are moved
+ further to the left than those that vary more. Two particularly
+ important categories of functions are \emph{selectors} and
+ \emph{updates}.
+
+ The subsequent scheme is based on a hypothetical set-like container
+ of type @{text "\<beta>"} that manages elements of type @{text "\<alpha>"}. Both
+ the names and types of the associated operations are canonical for
+ Isabelle/ML.
+
+ \medskip
+ \begin{tabular}{ll}
+ kind & canonical name and type \\\hline
+ selector & @{text "member: \<beta> \<rightarrow> \<alpha> \<rightarrow> bool"} \\
+ update & @{text "insert: \<alpha> \<rightarrow> \<beta> \<rightarrow> \<beta>"} \\
+ \end{tabular}
+ \medskip
+
+ Given a container @{text "B: \<beta>"}, the partially applied @{text
+ "member B"} is a predicate over elements @{text "\<alpha> \<rightarrow> bool"}, and
+ thus represents the intended denotation directly. It is customary
+ to pass the abstract predicate to further operations, not the
+ concrete container. The argument order makes it easy to use other
+ combinators: @{text "forall (member B) list"} will check a list of
+ elements for membership in @{text "B"} etc. Often the explicit
+ @{text "list"} is pointless and can be contracted to @{text "forall
+ (member B)"} to get directly a predicate again.
+
+ In contrast, an update operation varies the container, so it moves
+ to the right: @{text "insert a"} is a function @{text "\<beta> \<rightarrow> \<beta>"} to
+ insert a value @{text "a"}. These can be composed naturally as
+ @{text "insert c \<circ> insert b \<circ> insert a"}. The slightly awkward
+ inversion of the composition order is due to conventional
+ mathematical notation, which can be easily amended as explained
+ below.
+*}
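+
+text {* A simplified, monomorphic sketch of this scheme over plain
+  integer lists may help to see the argument order at work; the
+  library versions of @{ML member} and @{ML insert} additionally take
+  an explicit equality predicate.
+
+  \begin{verbatim}
+  fun member B x = List.exists (fn y => y = x) B;
+  fun insert x B = if member B x then B else x :: B;
+
+  val B = insert 3 (insert 2 (insert 1 []));
+  val all_in = forall (member B) [1, 2, 3];    (* true *)
+  \end{verbatim}
+*}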
+
+
+subsection {* Forward application and composition *}
+
+text {* Regular function application and infix notation works best for
+ relatively deeply structured expressions, e.g.\ @{text "h (f x y + g
+ z)"}. The important special case of \emph{linear transformation}
+ applies a cascade of functions @{text "f\<^sub>n (\<dots> (f\<^sub>1 x))"}. This
+ becomes hard to read and maintain if the functions are themselves
+ given as complex expressions. The notation can be significantly
+ improved by introducing \emph{forward} versions of application and
+ composition as follows:
+
+ \medskip
+ \begin{tabular}{lll}
+ @{text "x |> f"} & @{text "\<equiv>"} & @{text "f x"} \\
+ @{text "(f #> g) x"} & @{text "\<equiv>"} & @{text "x |> f |> g"} \\
+ \end{tabular}
+ \medskip
+
+ This enables to write conveniently @{text "x |> f\<^sub>1 |> \<dots> |> f\<^sub>n"} or
+ @{text "f\<^sub>1 #> \<dots> #> f\<^sub>n"} for its functional abstraction over @{text
+ "x"}.
+
+ \medskip There is an additional set of combinators to accommodate
+ multiple results (via pairs) that are passed on as multiple
+ arguments (via currying).
+
+ \medskip
+ \begin{tabular}{lll}
+ @{text "(x, y) |-> f"} & @{text "\<equiv>"} & @{text "f x y"} \\
+ @{text "(f #-> g) x"} & @{text "\<equiv>"} & @{text "x |> f |-> g"} \\
+ \end{tabular}
+ \medskip
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_op "|> ": "'a * ('a -> 'b) -> 'b"} \\
+ @{index_ML_op "|-> ": "('c * 'a) * ('c -> 'a -> 'b) -> 'b"} \\
+ @{index_ML_op "#> ": "('a -> 'b) * ('b -> 'c) -> 'a -> 'c"} \\
+ @{index_ML_op "#-> ": "('a -> 'c * 'b) * ('c -> 'b -> 'd) -> 'a -> 'd"} \\
+ \end{mldecls}
+
+ %FIXME description!?
+*}
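+
+text {* The following lines sketch the behaviour of these combinators
+  on trivial arithmetic functions (a sketch only, not part of the
+  library documentation):
+
+  \begin{verbatim}
+  val a = 1 |> (fn i => i + 1) |> (fn i => i * 3);      (* 6 *)
+  val f = (fn i => i + 1) #> (fn i => i * 3);
+  val b = f 1;                                          (* 6 *)
+  val c = (2, 3) |-> (fn x => fn y => x + y);           (* 5 *)
+  \end{verbatim}
+*}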
+
+
+subsection {* Canonical iteration *}
+
+text {* As explained above, a function @{text "f: \<alpha> \<rightarrow> \<beta> \<rightarrow> \<beta>"} can be
+ understood as update on a configuration of type @{text "\<beta>"},
+ parametrized by arguments of type @{text "\<alpha>"}. Given @{text "a: \<alpha>"}
+ the partial application @{text "(f a): \<beta> \<rightarrow> \<beta>"} operates
+ homogeneously on @{text "\<beta>"}. This can be iterated naturally over a
+  list of parameters @{text "[a\<^sub>1, \<dots>, a\<^sub>n]"} as @{text "f a\<^sub>1 #> \<dots> #> f
+  a\<^sub>n"}.  The latter expression is again a function @{text "\<beta> \<rightarrow> \<beta>"}.
+ It can be applied to an initial configuration @{text "b: \<beta>"} to
+ start the iteration over the given list of arguments: each @{text
+ "a"} in @{text "a\<^sub>1, \<dots>, a\<^sub>n"} is applied consecutively by updating a
+ cumulative configuration.
+
+ The @{text fold} combinator in Isabelle/ML lifts a function @{text
+ "f"} as above to its iterated version over a list of arguments.
+ Lifting can be repeated, e.g.\ @{text "(fold \<circ> fold) f"} iterates
+ over a list of lists as expected.
+
+ The variant @{text "fold_rev"} works inside-out over the list of
+ arguments, such that @{text "fold_rev f \<equiv> fold f \<circ> rev"} holds.
+
+ The @{text "fold_map"} combinator essentially performs @{text
+ "fold"} and @{text "map"} simultaneously: each application of @{text
+ "f"} produces an updated configuration together with a side-result;
+ the iteration collects all such side-results as a separate list.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML fold: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\
+ @{index_ML fold_rev: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\
+ @{index_ML fold_map: "('a -> 'b -> 'c * 'b) -> 'a list -> 'b -> 'c list * 'b"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML fold}~@{text f} lifts the parametrized update function
+ @{text "f"} to a list of parameters.
+
+ \item @{ML fold_rev}~@{text "f"} is similar to @{ML fold}~@{text
+ "f"}, but works inside-out.
+
+ \item @{ML fold_map}~@{text "f"} lifts the parametrized update
+ function @{text "f"} (with side-result) to a list of parameters and
+ cumulative side-results.
+
+ \end{description}
+
+ \begin{warn}
+ The literature on functional programming provides a multitude of
+ combinators called @{text "foldl"}, @{text "foldr"} etc. SML97
+ provides its own variations as @{ML List.foldl} and @{ML
+ List.foldr}, while the classic Isabelle library also has the
+ historic @{ML Library.foldl} and @{ML Library.foldr}. To avoid
+ further confusion, all of this should be ignored, and @{ML fold} (or
+ @{ML fold_rev}) used exclusively.
+ \end{warn}
+*}
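+
+text {* A tiny sketch of these combinators on numbers; the text buffer
+  example below shows a more realistic application.
+
+  \begin{verbatim}
+  val sum = fold (fn a => fn b => a + b) [1, 2, 3, 4] 0;        (* 10 *)
+
+  val (squares, sum') =
+    fold_map (fn a => fn b => (a * a, a + b)) [1, 2, 3] 0;
+    (* ([1, 4, 9], 6) *)
+  \end{verbatim}
+*}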
+
+text %mlex {* The following example shows how to fill a text buffer
+ incrementally by adding strings, either individually or from a given
+ list.
+*}
+
+ML {*
+ val s =
+ Buffer.empty
+ |> Buffer.add "digits: "
+ |> fold (Buffer.add o string_of_int) (0 upto 9)
+ |> Buffer.content;
+
+ @{assert} (s = "digits: 0123456789");
+*}
+
+text {* Note how @{ML "fold (Buffer.add o string_of_int)"} above saves
+ an extra @{ML "map"} over the given list. This kind of peephole
+ optimization reduces both the code size and the tree structures in
+ memory (``deforestation''), but requires some practice to read and
+ write it fluently.
+
+ \medskip The next example elaborates the idea of canonical
+ iteration, demonstrating fast accumulation of tree content using a
+ text buffer.
+*}
+
+ML {*
+ datatype tree = Text of string | Elem of string * tree list;
+
+ fun slow_content (Text txt) = txt
+ | slow_content (Elem (name, ts)) =
+ "<" ^ name ^ ">" ^
+ implode (map slow_content ts) ^
+ "</" ^ name ^ ">"
+
+ fun add_content (Text txt) = Buffer.add txt
+ | add_content (Elem (name, ts)) =
+ Buffer.add ("<" ^ name ^ ">") #>
+ fold add_content ts #>
+ Buffer.add ("</" ^ name ^ ">");
+
+ fun fast_content tree =
+ Buffer.empty |> add_content tree |> Buffer.content;
+*}
+
+text {* The slow part of @{ML slow_content} is the @{ML implode} of
+ the recursive results, because it copies previously produced strings
+ again.
+
+ The incremental @{ML add_content} avoids this by operating on a
+ buffer that is passed through in a linear fashion. Using @{ML_text
+ "#>"} and contraction over the actual buffer argument saves some
+ additional boiler-plate. Of course, the two @{ML "Buffer.add"}
+ invocations with concatenated strings could have been split into
+ smaller parts, but this would have obfuscated the source without
+ making a big difference in allocations. Here we have done some
+ peephole-optimization for the sake of readability.
+
+ Another benefit of @{ML add_content} is its ``open'' form as a
+ function on buffers that can be continued in further linear
+ transformations, folding etc. Thus it is more compositional than
+  the naive @{ML slow_content}.  As a realistic example, compare the
+ old-style @{ML "Term.maxidx_of_term: term -> int"} with the newer
+ @{ML "Term.maxidx_term: term -> int -> int"} in Isabelle/Pure.
+
+  Note that @{ML fast_content} above is only defined as an example.  In
+ many practical situations, it is customary to provide the
+ incremental @{ML add_content} only and leave the initialization and
+ termination to the concrete application by the user.
+*}
+
+
+section {* Message output channels \label{sec:message-channels} *}
+
+text {* Isabelle provides output channels for different kinds of
+ messages: regular output, high-volume tracing information, warnings,
+ and errors.
+
+ Depending on the user interface involved, these messages may appear
+ in different text styles or colours. The standard output for
+ terminal sessions prefixes each line of warnings by @{verbatim
+ "###"} and errors by @{verbatim "***"}, but leaves anything else
+ unchanged.
+
+ Messages are associated with the transaction context of the running
+ Isar command. This enables the front-end to manage commands and
+ resulting messages together. For example, after deleting a command
+ from a given theory document version, the corresponding message
+ output can be retracted from the display.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML writeln: "string -> unit"} \\
+ @{index_ML tracing: "string -> unit"} \\
+ @{index_ML warning: "string -> unit"} \\
+ @{index_ML error: "string -> 'a"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML writeln}~@{text "text"} outputs @{text "text"} as regular
+ message. This is the primary message output operation of Isabelle
+ and should be used by default.
+
+ \item @{ML tracing}~@{text "text"} outputs @{text "text"} as special
+ tracing message, indicating potential high-volume output to the
+ front-end (hundreds or thousands of messages issued by a single
+ command). The idea is to allow the user-interface to downgrade the
+ quality of message display to achieve higher throughput.
+
+ Note that the user might have to take special actions to see tracing
+ output, e.g.\ switch to a different output window. So this channel
+ should not be used for regular output.
+
+ \item @{ML warning}~@{text "text"} outputs @{text "text"} as
+ warning, which typically means some extra emphasis on the front-end
+ side (color highlighting, icons, etc.).
+
+ \item @{ML error}~@{text "text"} raises exception @{ML ERROR}~@{text
+ "text"} and thus lets the Isar toplevel print @{text "text"} on the
+ error channel, which typically means some extra emphasis on the
+ front-end side (color highlighting, icons, etc.).
+
+ This assumes that the exception is not handled before the command
+ terminates. Handling exception @{ML ERROR}~@{text "text"} is a
+ perfectly legal alternative: it means that the error is absorbed
+ without any message output.
+
+ \begin{warn}
+ The actual error channel is accessed via @{ML Output.error_msg}, but
+ the interaction protocol of Proof~General \emph{crashes} if that
+ function is used in regular ML code: error output and toplevel
+ command failure always need to coincide.
+ \end{warn}
+
+ \end{description}
+
+ \begin{warn}
+ Regular Isabelle/ML code should output messages exclusively by the
+ official channels. Using raw I/O on \emph{stdout} or \emph{stderr}
+ instead (e.g.\ via @{ML TextIO.output}) is apt to cause problems in
+ the presence of parallel and asynchronous processing of Isabelle
+ theories. Such raw output might be displayed by the front-end in
+ some system console log, with a low chance that the user will ever
+ see it. Moreover, as a genuine side-effect on global process
+ channels, there is no proper way to retract output when Isar command
+ transactions are reset by the system.
+ \end{warn}
+
+ \begin{warn}
+ The message channels should be used in a message-oriented manner.
+ This means that multi-line output that logically belongs together is
+ issued by a \emph{single} invocation of @{ML writeln} etc.\ with the
+ functional concatenation of all message constituents.
+ \end{warn}
+*}
+
+text %mlex {* The following example demonstrates a multi-line
+ warning. Note that in some situations the user sees only the first
+ line, so the most important point should be made first.
+*}
+
+ML_command {*
+ warning (cat_lines
+ ["Beware the Jabberwock, my son!",
+ "The jaws that bite, the claws that catch!",
+ "Beware the Jubjub Bird, and shun",
+ "The frumious Bandersnatch!"]);
+*}
+
+
+section {* Exceptions \label{sec:exceptions} *}
+
+text {* The Standard ML semantics of strict functional evaluation
+ together with exceptions is rather well defined, but some delicate
+ points need to be observed to avoid that ML programs go wrong
+ despite static type-checking. Exceptions in Isabelle/ML are
+ subsequently categorized as follows.
+
+ \paragraph{Regular user errors.} These are meant to provide
+ informative feedback about malformed input etc.
+
+ The \emph{error} function raises the corresponding \emph{ERROR}
+ exception, with a plain text message as argument. \emph{ERROR}
+ exceptions can be handled internally, in order to be ignored, turned
+ into other exceptions, or cascaded by appending messages. If the
+ corresponding Isabelle/Isar command terminates with an \emph{ERROR}
+ exception state, the toplevel will print the result on the error
+ channel (see \secref{sec:message-channels}).
+
+ It is considered bad style to refer to internal function names or
+ values in ML source notation in user error messages.
+
+ Grammatical correctness of error messages can be improved by
+ \emph{omitting} final punctuation: messages are often concatenated
+ or put into a larger context (e.g.\ augmented with source position).
+  By not insisting on the final word at the origin of the error, the
+ system can perform its administrative tasks more easily and
+ robustly.
+
+ \paragraph{Program failures.} There is a handful of standard
+ exceptions that indicate general failure situations, or failures of
+ core operations on logical entities (types, terms, theorems,
+ theories, see \chref{ch:logic}).
+
+ These exceptions indicate a genuine breakdown of the program, so the
+ main purpose is to determine quickly what has happened where.
+ Traditionally, the (short) exception message would include the name
+ of an ML function, although this is no longer necessary, because the
+ ML runtime system prints a detailed source position of the
+ corresponding @{ML_text raise} keyword.
+
+ \medskip User modules can always introduce their own custom
+ exceptions locally, e.g.\ to organize internal failures robustly
+ without overlapping with existing exceptions. Exceptions that are
+ exposed in module signatures require extra care, though, and should
+ \emph{not} be introduced by default. Surprise by users of a module
+  can often be minimized by using plain user errors instead.
+
+ \paragraph{Interrupts.} These indicate arbitrary system events:
+ both the ML runtime system and the Isabelle/ML infrastructure signal
+ various exceptional situations by raising the special
+ \emph{Interrupt} exception in user code.
+
+ This is the one and only way that physical events can intrude an
+ Isabelle/ML program. Such an interrupt can mean out-of-memory,
+ stack overflow, timeout, internal signaling of threads, or the user
+ producing a console interrupt manually etc. An Isabelle/ML program
+ that intercepts interrupts becomes dependent on physical effects of
+ the environment. Even worse, exception handling patterns that are
+  too general by accident, e.g.\ by misspelled exception constructors,
+ will cover interrupts unintentionally and thus render the program
+ semantics ill-defined.
+
+ Note that the Interrupt exception dates back to the original SML90
+ language definition. It was excluded from the SML97 version to
+ avoid its malign impact on ML program semantics, but without
+ providing a viable alternative. Isabelle/ML recovers physical
+ interruptibility (which is an indispensable tool to implement
+ managed evaluation of command transactions), but requires user code
+ to be strictly transparent wrt.\ interrupts.
+
+ \begin{warn}
+ Isabelle/ML user code needs to terminate promptly on interruption,
+ without guessing at its meaning to the system infrastructure.
+ Temporary handling of interrupts for cleanup of global resources
+ etc.\ needs to be followed immediately by re-raising of the original
+ exception.
+ \end{warn}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML try: "('a -> 'b) -> 'a -> 'b option"} \\
+ @{index_ML can: "('a -> 'b) -> 'a -> bool"} \\
+ @{index_ML ERROR: "string -> exn"} \\
+ @{index_ML Fail: "string -> exn"} \\
+ @{index_ML Exn.is_interrupt: "exn -> bool"} \\
+ @{index_ML reraise: "exn -> 'a"} \\
+ @{index_ML exception_trace: "(unit -> 'a) -> 'a"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML try}~@{text "f x"} makes the partiality of evaluating
+ @{text "f x"} explicit via the option datatype. Interrupts are
+ \emph{not} handled here, i.e.\ this form serves as safe replacement
+ for the \emph{unsafe} version @{ML_text "(SOME"}~@{text "f
+ x"}~@{ML_text "handle _ => NONE)"} that is occasionally seen in
+ books about SML.
+
+ \item @{ML can} is similar to @{ML try} with more abstract result.
+
+ \item @{ML ERROR}~@{text "msg"} represents user errors; this
+ exception is normally raised indirectly via the @{ML error} function
+ (see \secref{sec:message-channels}).
+
+ \item @{ML Fail}~@{text "msg"} represents general program failures.
+
+ \item @{ML Exn.is_interrupt} identifies interrupts robustly, without
+ mentioning concrete exception constructors in user code. Handled
+ interrupts need to be re-raised promptly!
+
+ \item @{ML reraise}~@{text "exn"} raises exception @{text "exn"}
+ while preserving its implicit position information (if possible,
+ depending on the ML platform).
+
+ \item @{ML exception_trace}~@{ML_text "(fn () =>"}~@{text
+ "e"}@{ML_text ")"} evaluates expression @{text "e"} while printing
+ a full trace of its stack of nested exceptions (if possible,
+ depending on the ML platform).\footnote{In versions of Poly/ML the
+ trace will appear on raw stdout of the Isabelle process.}
+
+ Inserting @{ML exception_trace} into ML code temporarily is useful
+ for debugging, but not suitable for production code.
+
+ \end{description}
+*}
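+
+text {* A small sketch of @{ML try} and @{ML can}, using @{ML hd} on
+  an empty list merely as a convenient source of an exception:
+
+  \begin{verbatim}
+  val x = try hd ([]: int list);     (* NONE *)
+  val y = try hd [1, 2];             (* SOME 1 *)
+  val b = can hd ([]: int list);     (* false *)
+  \end{verbatim}
+*}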
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "assert"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ \begin{description}
+
+ \item @{text "@{assert}"} inlines a function
+ @{ML_type "bool -> unit"} that raises @{ML Fail} if the argument is
+ @{ML false}. Due to inlining the source position of failed
+ assertions is included in the error output.
+
+ \end{description}
+*}
+
+
+section {* Basic data types *}
+
+text {* The basis library proposal of SML97 needs to be treated with
+ caution. Many of its operations simply do not fit with important
+ Isabelle/ML conventions (like ``canonical argument order'', see
+ \secref{sec:canonical-argument-order}), others cause problems with
+ the parallel evaluation model of Isabelle/ML (such as @{ML
+ TextIO.print} or @{ML OS.Process.system}).
+
+ Subsequently we give a brief overview of important operations on
+ basic ML data types.
+*}
+
+
+subsection {* Characters *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type char} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type char} is \emph{not} used. The smallest textual
+ unit in Isabelle is represented as a ``symbol'' (see
+ \secref{sec:symbols}).
+
+ \end{description}
+*}
+
+
+subsection {* Integers *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type int} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type int} represents regular mathematical integers,
+ which are \emph{unbounded}. Overflow never happens in
+ practice.\footnote{The size limit for integer bit patterns in memory
+ is 64\,MB for 32-bit Poly/ML, and much higher for 64-bit systems.}
+ This works uniformly for all supported ML platforms (Poly/ML and
+ SML/NJ).
+
+ Literal integers in ML text are forced to be of this one true
+ integer type --- overloading of SML97 is disabled.
+
+ Structure @{ML_struct IntInf} of SML97 is obsolete and superseded by
+ @{ML_struct Int}. Structure @{ML_struct Integer} in @{file
+ "~~/src/Pure/General/integer.ML"} provides some additional
+ operations.
+
+ \end{description}
+*}
+
+
+subsection {* Time *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Time.time} \\
+ @{index_ML seconds: "real -> Time.time"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Time.time} represents time abstractly according
+ to the SML97 basis library definition. This is adequate for
+ internal ML operations, but awkward in concrete time specifications.
+
+ \item @{ML seconds}~@{text "s"} turns the concrete scalar @{text
+ "s"} (measured in seconds) into an abstract time value. Floating
+ point numbers are easy to use as context parameters (e.g.\ via
+ configuration options, see \secref{sec:config-options}) or
+ preferences that are maintained by external tools as well.
+
+ \end{description}
+*}
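+
+text {* A trivial sketch: @{ML seconds} turns a scalar into an
+  abstract time value, e.g.\ for timeout parameters.
+
+  \begin{verbatim}
+  val timeout = seconds 2.5;    (* 2.5 s as abstract Time.time value *)
+  \end{verbatim}
+*}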
+
+
+subsection {* Options *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Option.map: "('a -> 'b) -> 'a option -> 'b option"} \\
+ @{index_ML is_some: "'a option -> bool"} \\
+ @{index_ML is_none: "'a option -> bool"} \\
+ @{index_ML the: "'a option -> 'a"} \\
+ @{index_ML these: "'a list option -> 'a list"} \\
+ @{index_ML the_list: "'a option -> 'a list"} \\
+ @{index_ML the_default: "'a -> 'a option -> 'a"} \\
+ \end{mldecls}
+*}
+
+text {* Apart from @{ML Option.map} most operations defined in
+ structure @{ML_struct Option} are alien to Isabelle/ML. The
+ operations shown above are defined in @{file
+ "~~/src/Pure/General/basics.ML"}, among others. *}
+
+
+subsection {* Lists *}
+
+text {* Lists are ubiquitous in ML as simple and light-weight
+ ``collections'' for many everyday programming tasks. Isabelle/ML
+ provides important additions and improvements over operations that
+ are predefined in the SML97 library. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML cons: "'a -> 'a list -> 'a list"} \\
+ @{index_ML member: "('b * 'a -> bool) -> 'a list -> 'b -> bool"} \\
+ @{index_ML insert: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\
+ @{index_ML remove: "('b * 'a -> bool) -> 'b -> 'a list -> 'a list"} \\
+ @{index_ML update: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML cons}~@{text "x xs"} evaluates to @{text "x :: xs"}.
+
+ Tupled infix operators are a historical accident in Standard ML.
+ The curried @{ML cons} amends this, but it should be only used when
+ partial application is required.
+
+ \item @{ML member}, @{ML insert}, @{ML remove}, @{ML update} treat
+ lists as a set-like container that maintains the order of elements.
+ See @{file "~~/src/Pure/library.ML"} for the full specifications
+ (written in ML). There are some further derived operations like
+ @{ML union} or @{ML inter}.
+
+ Note that @{ML insert} is conservative about elements that are
+ already a @{ML member} of the list, while @{ML update} ensures that
+ the latest entry is always put in front. The latter discipline is
+ often more appropriate in declarations of context data
+ (\secref{sec:context-data}) that are issued by the user in Isar
+ source: more recent declarations normally take precedence over
+ earlier ones.
+
+ \end{description}
+*}
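+
+text {* The different treatment of existing elements by @{ML insert}
+  and @{ML update} can be sketched as follows (using @{ML_text "op ="}
+  as equality on integers):
+
+  \begin{verbatim}
+  val xs = [2, 1];
+  val xs1 = insert (op =) 1 xs;    (* [2, 1] -- already a member *)
+  val xs2 = update (op =) 1 xs;    (* [1, 2] -- moved to the front *)
+  \end{verbatim}
+*}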
+
+text %mlex {* Using canonical @{ML fold} together with @{ML cons}, or
+ similar standard operations, alternates the orientation of data.
+  This is quite natural and should not be altered forcibly by inserting
+  extra applications of @{ML rev}.  The alternative @{ML fold_rev} can
+  be used in the few situations where alternation should be
+ prevented.
+*}
+
+ML {*
+ val items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
+
+ val list1 = fold cons items [];
+ @{assert} (list1 = rev items);
+
+ val list2 = fold_rev cons items [];
+ @{assert} (list2 = items);
+*}
+
+text {* The subsequent example demonstrates how to \emph{merge} two
+ lists in a natural way. *}
+
+ML {*
+ fun merge_lists eq (xs, ys) = fold_rev (insert eq) ys xs;
+*}
+
+text {* Here the first list is treated conservatively: only the new
+ elements from the second list are inserted. The inside-out order of
+ insertion via @{ML fold_rev} attempts to preserve the order of
+ elements in the result.
+
+ This way of merging lists is typical for context data
+ (\secref{sec:context-data}). See also @{ML merge} as defined in
+ @{file "~~/src/Pure/library.ML"}.
+*}
+
+
+subsection {* Association lists *}
+
+text {* The operations for association lists interpret a concrete list
+ of pairs as a finite function from keys to values. Redundant
+ representations with multiple occurrences of the same key are
+ implicitly normalized: lookup and update only take the first
+ occurrence into account.
+*}
+
+text {*
+ \begin{mldecls}
+ @{index_ML AList.lookup: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> 'c option"} \\
+ @{index_ML AList.defined: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> bool"} \\
+ @{index_ML AList.update: "('a * 'a -> bool) -> 'a * 'b -> ('a * 'b) list -> ('a * 'b) list"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML AList.lookup}, @{ML AList.defined}, @{ML AList.update}
+ implement the main ``framework operations'' for mappings in
+ Isabelle/ML, following standard conventions for their names and
+ types.
+
+ Note that a function called @{text lookup} is obliged to express its
+ partiality via an explicit option element. There is no choice to
+ raise an exception, without changing the name to something like
+ @{text "the_element"} or @{text "get"}.
+
+ The @{text "defined"} operation is essentially a contraction of @{ML
+ is_some} and @{text "lookup"}, but this is sufficiently frequent to
+ justify its independent existence. This also gives the
+ implementation some opportunity for peep-hole optimization.
+
+ \end{description}
+
+ Association lists are adequate as simple and light-weight
+ implementation of finite mappings in many practical situations. A
+ more heavy-duty table structure is defined in @{file
+ "~~/src/Pure/General/table.ML"}; that version scales easily to
+ thousands or millions of elements.
+*}
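+
+text {* The following sketch illustrates these operations on a small
+  association list over strings and integers:
+
+  \begin{verbatim}
+  val tab = [("a", 1), ("b", 2)];
+
+  val x = AList.lookup (op =) tab "b";             (* SOME 2 *)
+  val ok = AList.defined (op =) tab "c";           (* false *)
+  val tab' = AList.update (op =) ("a", 0) tab;     (* [("a", 0), ("b", 2)] *)
+  \end{verbatim}
+*}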
+
+
+subsection {* Unsynchronized references *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type "'a Unsynchronized.ref"} \\
+ @{index_ML Unsynchronized.ref: "'a -> 'a Unsynchronized.ref"} \\
+ @{index_ML "!": "'a Unsynchronized.ref -> 'a"} \\
+ @{index_ML_op ":=": "'a Unsynchronized.ref * 'a -> unit"} \\
+ \end{mldecls}
+*}
+
+text {* Due to ubiquitous parallelism in Isabelle/ML (see also
+ \secref{sec:multi-threading}), the mutable reference cells of
+ Standard ML are notorious for causing problems. In a highly
+ parallel system, both correctness \emph{and} performance are easily
+ degraded when using mutable data.
+
+ The unwieldy name of @{ML Unsynchronized.ref} for the constructor
+ for references in Isabelle/ML emphasizes the inconveniences caused by
+ mutability. Existing operations @{ML "!"} and @{ML_op ":="} are
+ unchanged, but should be used with special precautions, say in a
+ strictly local situation that is guaranteed to be restricted to
+ sequential evaluation --- now and in the future.
+
+ \begin{warn}
+ Never @{ML_text "open Unsynchronized"}, not even in a local scope!
+ Pretending that mutable state is no problem is a very bad idea.
+ \end{warn}
+*}
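+
+text {* A minimal sketch of strictly local use, where the reference
+  never escapes a single sequential evaluation:
+
+  \begin{verbatim}
+  val r = Unsynchronized.ref 0;
+  val _ = r := ! r + 1;
+  val n = ! r;    (* 1 *)
+  \end{verbatim}
+*}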
+
+
+section {* Thread-safe programming \label{sec:multi-threading} *}
+
+text {* Multi-threaded execution has become an everyday reality in
+ Isabelle since Poly/ML 5.2.1 and Isabelle2008. Isabelle/ML provides
+ implicit and explicit parallelism by default, and there is no way
+ for user-space tools to ``opt out''. ML programs that are purely
+ functional, output messages only via the official channels
+ (\secref{sec:message-channels}), and do not intercept interrupts
+ (\secref{sec:exceptions}) can participate in the multi-threaded
+ environment immediately without further ado.
+
+ More ambitious tools with more fine-grained interaction with the
+ environment need to observe the principles explained below.
+*}
+
+
+subsection {* Multi-threading with shared memory *}
+
+text {* Multiple threads help to organize advanced operations of the
+ system, such as real-time conditions on command transactions,
+ sub-components with explicit communication, general asynchronous
+ interaction etc. Moreover, parallel evaluation is a prerequisite to
+ make adequate use of the CPU resources that are available on
+ multi-core systems.\footnote{Multi-core computing does not mean that
+ there are ``spare cycles'' to be wasted. It means that the
+ continued exponential speedup of CPU performance due to ``Moore's
+ Law'' follows different rules: clock frequency has reached its peak
+ around 2005, and applications need to be parallelized in order to
+ avoid a perceived loss of performance. See also
+ \cite{Sutter:2005}.}
+
+ Isabelle/Isar exploits the inherent structure of theories and proofs
+  to support \emph{implicit parallelism} to a large extent.  LCF-style
+  theorem proving provides almost ideal conditions for that, see also
+  \cite{Wenzel:2009}.  This means that significant parts of theory and
+  proof checking are parallelized by default.  A maximum speedup-factor
+ of 3.0 on 4 cores and 5.0 on 8 cores can be
+ expected.\footnote{Further scalability is limited due to garbage
+ collection, which is still sequential in Poly/ML 5.2/5.3/5.4. It
+ helps to provide initial heap space generously, using the
+ \texttt{-H} option. Initial heap size needs to be scaled-up
+ together with the number of CPU cores: approximately 1--2\,GB per
+  core.}
+
+ \medskip ML threads lack the memory protection of separate
+ processes, and operate concurrently on shared heap memory. This has
+ the advantage that results of independent computations are directly
+ available to other threads: abstract values can be passed without
+ copying or awkward serialization that is typically required for
+ separate processes.
+
+ To make shared-memory multi-threading work robustly and efficiently,
+ some programming guidelines need to be observed. While the ML
+ system is responsible to maintain basic integrity of the
+ representation of ML values in memory, the application programmer
+ needs to ensure that multi-threaded execution does not break the
+ intended semantics.
+
+ \begin{warn}
+ To participate in implicit parallelism, tools need to be
+ thread-safe. A single ill-behaved tool can affect the stability and
+ performance of the whole system.
+ \end{warn}
+
+ Apart from observing the principles of thread-safeness passively,
+ advanced tools may also exploit parallelism actively, e.g.\ by using
+ ``future values'' (\secref{sec:futures}) or the more basic library
+ functions for parallel list operations (\secref{sec:parlist}).
+
+ \begin{warn}
+ Parallel computing resources are managed centrally by the
+ Isabelle/ML infrastructure. User programs must not fork their own
+ ML threads to perform computations.
+ \end{warn}
+*}
+
+
+subsection {* Critical shared resources *}
+
+text {* Thread-safeness is mainly concerned with concurrent
+ read/write access to shared resources, which are outside the purely
+ functional world of ML. This covers the following in particular.
+
+ \begin{itemize}
+
+ \item Global references (or arrays), i.e.\ mutable memory cells that
+ persist over several invocations of associated
+ operations.\footnote{This is independent of the visibility of such
+ mutable values in the toplevel scope.}
+
+ \item Global state of the running Isabelle/ML process, i.e.\ raw I/O
+ channels, environment variables, current working directory.
+
+ \item Writable resources in the file-system that are shared among
+ different threads or external processes.
+
+ \end{itemize}
+
+ Isabelle/ML provides various mechanisms to avoid critical shared
+ resources in most situations. As last resort there are some
+ mechanisms for explicit synchronization. The following guidelines
+ help to make Isabelle/ML programs work smoothly in a concurrent
+ environment.
+
+ \begin{itemize}
+
+ \item Avoid global references altogether. Isabelle/Isar maintains a
+ uniform context that incorporates arbitrary data declared by user
+ programs (\secref{sec:context-data}). This context is passed as
+ plain value and user tools can get/map their own data in a purely
+ functional manner. Configuration options within the context
+ (\secref{sec:config-options}) provide simple drop-in replacements
+ for historic reference variables.
+
+ \item Keep components with local state information re-entrant.
+ Instead of poking initial values into (private) global references, a
+ new state record can be created on each invocation, and passed
+ through any auxiliary functions of the component. The state record
+ may well contain mutable references, without requiring any special
+ synchronizations, as long as each invocation gets its own copy.
+
+ \item Avoid raw output on @{text "stdout"} or @{text "stderr"}. The
+ Poly/ML library is thread-safe for each individual output operation,
+ but the ordering of parallel invocations is arbitrary. This means
+ raw output will appear on some system console with unpredictable
+ interleaving of atomic chunks.
+
+ Note that this does not affect regular message output channels
+ (\secref{sec:message-channels}). An official message is associated
+ with the command transaction from where it originates, independently
+ of other transactions. This means each running Isar command has
+ effectively its own set of message channels, and interleaving can
+ only happen when commands use parallelism internally (and only at
+ message boundaries).
+
+ \item Treat environment variables and the current working directory
+ of the running process as strictly read-only.
+
+ \item Restrict writing to the file-system to unique temporary files.
+ Isabelle already provides a temporary directory that is unique for
+ the running process, and there is a centralized source of unique
+ serial numbers in Isabelle/ML. Thus temporary files that are passed
+  to some external process will always be disjoint, and thus
+ thread-safe.
+
+ \end{itemize}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML File.tmp_path: "Path.T -> Path.T"} \\
+ @{index_ML serial_string: "unit -> string"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML File.tmp_path}~@{text "path"} relocates the base
+ component of @{text "path"} into the unique temporary directory of
+ the running Isabelle/ML process.
+
+ \item @{ML serial_string}~@{text "()"} creates a new serial number
+ that is unique over the runtime of the Isabelle/ML process.
+
+ \end{description}
+*}
+
+text %mlex {* The following example shows how to create unique
+ temporary file names.
+*}
+
+ML {*
+ val tmp1 = File.tmp_path (Path.basic ("foo" ^ serial_string ()));
+ val tmp2 = File.tmp_path (Path.basic ("foo" ^ serial_string ()));
+ @{assert} (tmp1 <> tmp2);
+*}
+
+
+subsection {* Explicit synchronization *}
+
+text {* Isabelle/ML also provides some explicit synchronization
+ mechanisms, for the rare situations where mutable shared resources
+  are really required.  These are based on the synchronization
+ primitives of Poly/ML, which have been adapted to the specific
+ assumptions of the concurrent Isabelle/ML environment. User code
+ must not use the Poly/ML primitives directly!
+
+ \medskip The most basic synchronization concept is a single
+ \emph{critical section} (also called ``monitor'' in the literature).
+ A thread that enters the critical section prevents all other threads
+ from doing the same. A thread that is already within the critical
+ section may re-enter it in an idempotent manner.
+
+ Such centralized locking is convenient, because it prevents
+ deadlocks by construction.
+
+ \medskip More fine-grained locking works via \emph{synchronized
+ variables}. An explicit state component is associated with
+ mechanisms for locking and signaling. There are operations to
+ await a condition, change the state, and signal the change to all
+ other waiting threads.
+
+ Here the synchronized access to the state variable is \emph{not}
+ re-entrant: direct or indirect nesting within the same thread will
+ cause a deadlock!
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML NAMED_CRITICAL: "string -> (unit -> 'a) -> 'a"} \\
+ @{index_ML CRITICAL: "(unit -> 'a) -> 'a"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type "'a Synchronized.var"} \\
+ @{index_ML Synchronized.var: "string -> 'a -> 'a Synchronized.var"} \\
+ @{index_ML Synchronized.guarded_access: "'a Synchronized.var ->
+ ('a -> ('b * 'a) option) -> 'b"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML NAMED_CRITICAL}~@{text "name e"} evaluates @{text "e ()"}
+ within the central critical section of Isabelle/ML. No other thread
+ may do so at the same time, but non-critical parallel execution will
+ continue. The @{text "name"} argument is used for tracing and might
+ help to spot sources of congestion.
+
+ Entering the critical section without contention is very fast, and
+ several basic system operations do so frequently. Each thread
+  should stay within the critical section only very briefly,
+ otherwise parallel performance may degrade.
+
+ \item @{ML CRITICAL} is the same as @{ML NAMED_CRITICAL} with empty
+ name argument.
+
+ \item Type @{ML_type "'a Synchronized.var"} represents synchronized
+ variables with state of type @{ML_type 'a}.
+
+ \item @{ML Synchronized.var}~@{text "name x"} creates a synchronized
+ variable that is initialized with value @{text "x"}. The @{text
+ "name"} is used for tracing.
+
+ \item @{ML Synchronized.guarded_access}~@{text "var f"} lets the
+ function @{text "f"} operate within a critical section on the state
+ @{text "x"} as follows: if @{text "f x"} produces @{ML NONE}, it
+ continues to wait on the internal condition variable, expecting that
+ some other thread will eventually change the content in a suitable
+ manner; if @{text "f x"} produces @{ML SOME}~@{text "(y, x')"} it is
+ satisfied and assigns the new state value @{text "x'"}, broadcasts a
+ signal to all waiting threads on the associated condition variable,
+ and returns the result @{text "y"}.
+
+ \end{description}
+
+ There are some further variants of the @{ML
+ Synchronized.guarded_access} combinator, see @{file
+ "~~/src/Pure/Concurrent/synchronized.ML"} for details.
+*}
+
+text %mlex {* The following example implements a counter that produces
+ positive integers that are unique over the runtime of the Isabelle
+ process:
+*}
+
+ML {*
+ local
+ val counter = Synchronized.var "counter" 0;
+ in
+ fun next () =
+ Synchronized.guarded_access counter
+ (fn i =>
+ let val j = i + 1
+ in SOME (j, j) end);
+ end;
+*}
+
+ML {*
+ val a = next ();
+ val b = next ();
+ @{assert} (a <> b);
+*}
+
+text {* \medskip See @{file "~~/src/Pure/Concurrent/mailbox.ML"} for how
+  to implement a mailbox as a synchronized variable over a purely
+ functional queue. *}
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Prelim.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,1237 @@
+theory Prelim
+imports Base
+begin
+
+chapter {* Preliminaries *}
+
+section {* Contexts \label{sec:context} *}
+
+text {*
+ A logical context represents the background that is required for
+ formulating statements and composing proofs. It acts as a medium to
+ produce formal content, depending on earlier material (declarations,
+ results etc.).
+
+ For example, derivations within the Isabelle/Pure logic can be
+ described as a judgment @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<phi>"}, which means that a
+ proposition @{text "\<phi>"} is derivable from hypotheses @{text "\<Gamma>"}
+ within the theory @{text "\<Theta>"}. There are logical reasons for
+ keeping @{text "\<Theta>"} and @{text "\<Gamma>"} separate: theories can be
+ liberal about supporting type constructors and schematic
+ polymorphism of constants and axioms, while the inner calculus of
+ @{text "\<Gamma> \<turnstile> \<phi>"} is strictly limited to Simple Type Theory (with
+ fixed type variables in the assumptions).
+
+ \medskip Contexts and derivations are linked by the following key
+ principles:
+
+ \begin{itemize}
+
+ \item Transfer: monotonicity of derivations admits results to be
+ transferred into a \emph{larger} context, i.e.\ @{text "\<Gamma> \<turnstile>\<^sub>\<Theta>
+ \<phi>"} implies @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta>\<^sub>' \<phi>"} for contexts @{text "\<Theta>'
+ \<supseteq> \<Theta>"} and @{text "\<Gamma>' \<supseteq> \<Gamma>"}.
+
+ \item Export: discharge of hypotheses admits results to be exported
+ into a \emph{smaller} context, i.e.\ @{text "\<Gamma>' \<turnstile>\<^sub>\<Theta> \<phi>"}
+ implies @{text "\<Gamma> \<turnstile>\<^sub>\<Theta> \<Delta> \<Longrightarrow> \<phi>"} where @{text "\<Gamma>' \<supseteq> \<Gamma>"} and
+ @{text "\<Delta> = \<Gamma>' - \<Gamma>"}. Note that @{text "\<Theta>"} remains unchanged here,
+ only the @{text "\<Gamma>"} part is affected.
+
+ \end{itemize}
+
+ \medskip By modeling the main characteristics of the primitive
+ @{text "\<Theta>"} and @{text "\<Gamma>"} above, and abstracting over any
+ particular logical content, we arrive at the fundamental notions of
+ \emph{theory context} and \emph{proof context} in Isabelle/Isar.
+ These implement a certain policy to manage arbitrary \emph{context
+ data}. There is a strongly-typed mechanism to declare new kinds of
+ data at compile time.
+
+ The internal bootstrap process of Isabelle/Pure eventually reaches a
+ stage where certain data slots provide the logical content of @{text
+ "\<Theta>"} and @{text "\<Gamma>"} sketched above, but this does not stop there!
+ Various additional data slots support all kinds of mechanisms that
+ are not necessarily part of the core logic.
+
+ For example, there would be data for canonical introduction and
+ elimination rules for arbitrary operators (depending on the
+ object-logic and application), which enables users to perform
+ standard proof steps implicitly (cf.\ the @{text "rule"} method
+ \cite{isabelle-isar-ref}).
+
+ \medskip Thus Isabelle/Isar is able to bring forth more and more
+ concepts successively. In particular, an object-logic like
+ Isabelle/HOL continues the Isabelle/Pure setup by adding specific
+ components for automated reasoning (classical reasoner, tableau
+ prover, structured induction etc.) and derived specification
+ mechanisms (inductive predicates, recursive functions etc.). All of
+ this is ultimately based on the generic data management by theory
+ and proof contexts introduced here.
+*}
+
+
+subsection {* Theory context \label{sec:context-theory} *}
+
+text {* A \emph{theory} is a data container with explicit name and
+ unique identifier. Theories are related by a (nominal) sub-theory
+ relation, which corresponds to the dependency graph of the original
+ construction; each theory is derived from a certain sub-graph of
+ ancestor theories. To this end, the system maintains a set of
+ symbolic ``identification stamps'' within each theory.
+
+ In order to avoid the full-scale overhead of explicit sub-theory
+ identification of arbitrary intermediate stages, a theory is
+ switched into @{text "draft"} mode under certain circumstances. A
+ draft theory acts like a linear type, where updates invalidate
+ earlier versions. An invalidated draft is called \emph{stale}.
+
+ The @{text "checkpoint"} operation produces a safe stepping stone
+ that will survive the next update without becoming stale: both the
+ old and the new theory remain valid and are related by the
+ sub-theory relation. Checkpointing essentially recovers purely
+ functional theory values, at the expense of some extra internal
+ bookkeeping.
+
+ The @{text "copy"} operation produces an auxiliary version that has
+ the same data content, but is unrelated to the original: updates of
+ the copy do not affect the original, nor does the sub-theory
+ relation hold.
+
+ The @{text "merge"} operation produces the least upper bound of two
+ theories, which actually degenerates into absorption of one theory
+ into the other (according to the nominal sub-theory relation).
+
+ The @{text "begin"} operation starts a new theory by importing
+ several parent theories and entering a special mode of nameless
+ incremental updates, until the final @{text "end"} operation is
+ performed.
+
+ \medskip The example in \figref{fig:ex-theory} below shows a theory
+ graph derived from @{text "Pure"}, with theory @{text "Length"}
+ importing @{text "Nat"} and @{text "List"}. The body of @{text
+ "Length"} consists of a sequence of updates, working mostly on
+ drafts internally, while transaction boundaries of Isar top-level
+ commands (\secref{sec:isar-toplevel}) are guaranteed to be safe
+ checkpoints.
+
+ \begin{figure}[htb]
+ \begin{center}
+ \begin{tabular}{rcccl}
+ & & @{text "Pure"} \\
+ & & @{text "\<down>"} \\
+ & & @{text "FOL"} \\
+ & $\swarrow$ & & $\searrow$ & \\
+ @{text "Nat"} & & & & @{text "List"} \\
+ & $\searrow$ & & $\swarrow$ \\
+ & & @{text "Length"} \\
+ & & \multicolumn{3}{l}{~~@{keyword "imports"}} \\
+ & & \multicolumn{3}{l}{~~@{keyword "begin"}} \\
+ & & $\vdots$~~ \\
+ & & @{text "\<bullet>"}~~ \\
+ & & $\vdots$~~ \\
+ & & @{text "\<bullet>"}~~ \\
+ & & $\vdots$~~ \\
+ & & \multicolumn{3}{l}{~~@{command "end"}} \\
+ \end{tabular}
+ \caption{A theory definition depending on ancestors}\label{fig:ex-theory}
+ \end{center}
+ \end{figure}
+
+ \medskip There is a separate notion of \emph{theory reference} for
+ maintaining a live link to an evolving theory context: updates on
+ drafts are propagated automatically. Dynamic updating stops when
+ the next @{text "checkpoint"} is reached.
+
+ Derived entities may store a theory reference in order to indicate
+ the formal context from which they are derived. This implicitly
+ assumes monotonic reasoning, because the referenced context may
+ become larger without further notice.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type theory} \\
+ @{index_ML Theory.eq_thy: "theory * theory -> bool"} \\
+ @{index_ML Theory.subthy: "theory * theory -> bool"} \\
+ @{index_ML Theory.checkpoint: "theory -> theory"} \\
+ @{index_ML Theory.copy: "theory -> theory"} \\
+ @{index_ML Theory.merge: "theory * theory -> theory"} \\
+ @{index_ML Theory.begin_theory: "string * Position.T -> theory list -> theory"} \\
+ @{index_ML Theory.parents_of: "theory -> theory list"} \\
+ @{index_ML Theory.ancestors_of: "theory -> theory list"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type theory_ref} \\
+ @{index_ML Theory.deref: "theory_ref -> theory"} \\
+ @{index_ML Theory.check_thy: "theory -> theory_ref"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type theory} represents theory contexts. This is
+ essentially a linear type, with explicit runtime checking.
+ Primitive theory operations destroy the original version, which then
+ becomes ``stale''. This can be prevented by explicit checkpointing,
+ which the system does at least at the boundary of toplevel command
+ transactions (\secref{sec:isar-toplevel}).
+
+ \item @{ML "Theory.eq_thy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} checks strict
+ identity of two theories.
+
+ \item @{ML "Theory.subthy"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} compares theories
+ according to the intrinsic graph structure of the construction.
+ This sub-theory relation is a nominal approximation of inclusion
+ (@{text "\<subseteq>"}) of the corresponding content (according to the
+ semantics of the ML modules that implement the data).
+
+ \item @{ML "Theory.checkpoint"}~@{text "thy"} produces a safe
+ stepping stone in the linear development of @{text "thy"}. This
+ changes the old theory, but the next update will result in two
+ related, valid theories.
+
+ \item @{ML "Theory.copy"}~@{text "thy"} produces a variant of @{text
+ "thy"} with the same data. The copy is not related to the original,
+ but the original is unchanged.
+
+ \item @{ML "Theory.merge"}~@{text "(thy\<^sub>1, thy\<^sub>2)"} absorbs one theory
+ into the other, without changing @{text "thy\<^sub>1"} or @{text "thy\<^sub>2"}.
+ This version of ad-hoc theory merge fails for unrelated theories!
+
+ \item @{ML "Theory.begin_theory"}~@{text "name parents"} constructs
+ a new theory based on the given parents. This ML function is
+ normally not invoked directly.
+
+ \item @{ML "Theory.parents_of"}~@{text "thy"} returns the direct
+ ancestors of @{text thy}.
+
+ \item @{ML "Theory.ancestors_of"}~@{text "thy"} returns all
+ ancestors of @{text thy} (not including @{text thy} itself).
+
+ \item Type @{ML_type theory_ref} represents a sliding reference to
+ an always valid theory; updates on the original are propagated
+ automatically.
+
+ \item @{ML "Theory.deref"}~@{text "thy_ref"} turns a @{ML_type
+ "theory_ref"} into an @{ML_type "theory"} value. As the referenced
+ theory evolves monotonically over time, later invocations of @{ML
+ "Theory.deref"} may refer to a larger context.
+
+ \item @{ML "Theory.check_thy"}~@{text "thy"} produces a @{ML_type
+ "theory_ref"} from a valid @{ML_type "theory"} value.
+
+ \end{description}
+*}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "theory"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation theory} nameref?
+ "}
+
+ \begin{description}
+
+ \item @{text "@{theory}"} refers to the background theory of the
+ current context --- as abstract value.
+
+ \item @{text "@{theory A}"} refers to an explicitly named ancestor
+ theory @{text "A"} of the background theory of the current context
+ --- as abstract value.
+
+ \end{description}
+*}
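+
+text %mlex {* The following (rather trivial) check exercises some of
+ the theory operations above on the compile-time theory context.  It
+ merely relies on the sub-theory relation being reflexive and holding
+ for the direct ancestors, as described above. *}
+
+ML_val {*
+  val thy = @{theory};
+  @{assert} (Theory.subthy (thy, thy));
+  @{assert} (forall (fn thy' => Theory.subthy (thy', thy)) (Theory.parents_of thy));
+  @{assert} (length (Theory.ancestors_of thy) >= length (Theory.parents_of thy));
+*}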
+
+
+subsection {* Proof context \label{sec:context-proof} *}
+
+text {* A proof context is a container for pure data with a
+ back-reference to the theory from which it is derived. The @{text
+ "init"} operation creates a proof context from a given theory.
+ Modifications to draft theories are propagated to the proof context
+ as usual, but there is also an explicit @{text "transfer"} operation
+ to force resynchronization with more substantial updates to the
+ underlying theory.
+
+ Entities derived in a proof context need to record logical
+ requirements explicitly, since there is no separate context
+ identification or symbolic inclusion as for theories. For example,
+ hypotheses used in primitive derivations (cf.\ \secref{sec:thms})
+ are recorded separately within the sequent @{text "\<Gamma> \<turnstile> \<phi>"}, just to
+ make double sure. Results could still leak into an alien proof
+ context due to programming errors, but Isabelle/Isar includes some
+ extra validity checks in critical positions, notably at the end of a
+ sub-proof.
+
+ Proof contexts may be manipulated arbitrarily, although the common
+ discipline is to follow block structure as a mental model: a given
+ context is extended consecutively, and results are exported back
+ into the original context. Note that an Isar proof state models
+ block-structured reasoning explicitly, using a stack of proof
+ contexts internally. For various technical reasons, the background
+ theory of an Isar proof state must not be changed while the proof is
+ still under construction!
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Proof.context} \\
+ @{index_ML Proof_Context.init_global: "theory -> Proof.context"} \\
+ @{index_ML Proof_Context.theory_of: "Proof.context -> theory"} \\
+ @{index_ML Proof_Context.transfer: "theory -> Proof.context -> Proof.context"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Proof.context} represents proof contexts.
+ Elements of this type are essentially pure values, with a sliding
+ reference to the background theory.
+
+ \item @{ML Proof_Context.init_global}~@{text "thy"} produces a proof context
+ derived from @{text "thy"}, initializing all data.
+
+ \item @{ML Proof_Context.theory_of}~@{text "ctxt"} selects the
+ background theory from @{text "ctxt"}, dereferencing its internal
+ @{ML_type theory_ref}.
+
+ \item @{ML Proof_Context.transfer}~@{text "thy ctxt"} promotes the
+ background theory of @{text "ctxt"} to the super theory @{text
+ "thy"}.
+
+ \end{description}
+*}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "context"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ \begin{description}
+
+ \item @{text "@{context}"} refers to \emph{the} context at
+ compile-time --- as abstract value. Independently of (local) theory
+ or proof mode, this always produces a meaningful result.
+
+ This is probably the most common antiquotation in interactive
+ experimentation with ML inside Isar.
+
+ \end{description}
+*}
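+
+text %mlex {* The following (trivial) example initializes a proof
+ context from the compile-time theory and selects the background
+ theory again; the assertion merely expresses that the background
+ theory of the result is (a version of) the original theory. *}
+
+ML_val {*
+  val ctxt = Proof_Context.init_global @{theory};
+  @{assert} (Theory.subthy (@{theory}, Proof_Context.theory_of ctxt));
+*}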
+
+
+subsection {* Generic contexts \label{sec:generic-context} *}
+
+text {*
+ A generic context is the disjoint sum of either a theory or proof
+ context. Occasionally, this enables uniform treatment of generic
+ context data, typically extra-logical information. Operations on
+ generic contexts include the usual injections, partial selections,
+ and combinators for lifting operations on either component of the
+ disjoint sum.
+
+ Moreover, there are total operations @{text "theory_of"} and @{text
+ "proof_of"} to convert a generic context into either kind: a theory
+ can always be selected from the sum, while a proof context might
+ have to be constructed by an ad-hoc @{text "init"} operation, which
+ incurs a small runtime overhead.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Context.generic} \\
+ @{index_ML Context.theory_of: "Context.generic -> theory"} \\
+ @{index_ML Context.proof_of: "Context.generic -> Proof.context"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Context.generic} is the direct sum of @{ML_type
+ "theory"} and @{ML_type "Proof.context"}, with the datatype
+ constructors @{ML "Context.Theory"} and @{ML "Context.Proof"}.
+
+ \item @{ML Context.theory_of}~@{text "context"} always produces a
+ theory from the generic @{text "context"}, using @{ML
+ "Proof_Context.theory_of"} as required.
+
+ \item @{ML Context.proof_of}~@{text "context"} always produces a
+ proof context from the generic @{text "context"}, using @{ML
+ "Proof_Context.init_global"} as required (note that this re-initializes the
+ context data with each invocation).
+
+ \end{description}
+*}
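+
+text %mlex {* The following (artificial) snippet wraps the
+ compile-time theory and proof context into generic contexts via the
+ constructors mentioned above, and recovers the underlying theory in
+ both cases; both values refer to the same background theory here. *}
+
+ML_val {*
+  val gthy = Context.Theory @{theory};
+  val gctxt = Context.Proof @{context};
+  @{assert} (Theory.subthy (Context.theory_of gthy, Context.theory_of gctxt));
+  @{assert} (Theory.subthy (Context.theory_of gctxt, Context.theory_of gthy));
+*}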
+
+
+subsection {* Context data \label{sec:context-data} *}
+
+text {* The main purpose of theory and proof contexts is to manage
+ arbitrary (pure) data. New data types can be declared incrementally
+ at compile time. There are separate declaration mechanisms for any
+ of the three kinds of contexts: theory, proof, generic.
+
+ \paragraph{Theory data} declarations need to implement the following
+ SML signature:
+
+ \medskip
+ \begin{tabular}{ll}
+ @{text "\<type> T"} & representing type \\
+ @{text "\<val> empty: T"} & empty default value \\
+ @{text "\<val> extend: T \<rightarrow> T"} & re-initialize on import \\
+ @{text "\<val> merge: T \<times> T \<rightarrow> T"} & join on import \\
+ \end{tabular}
+ \medskip
+
+ The @{text "empty"} value acts as initial default for \emph{any}
+ theory that does not declare actual data content; @{text "extend"}
+ acts like a unitary version of @{text "merge"}.
+
+ Implementing @{text "merge"} can be tricky. The general idea is
+ that @{text "merge (data\<^sub>1, data\<^sub>2)"} inserts those parts of @{text
+ "data\<^sub>2"} into @{text "data\<^sub>1"} that are not yet present, while
+ keeping the general order of things. The @{ML Library.merge}
+ function on plain lists may serve as canonical template.
+
+ Particularly note that shared parts of the data must not be
+ duplicated by naive concatenation, or a theory graph that is like a
+ chain of diamonds would cause an exponential blowup!
+
+ \paragraph{Proof context data} declarations need to implement the
+ following SML signature:
+
+ \medskip
+ \begin{tabular}{ll}
+ @{text "\<type> T"} & representing type \\
+ @{text "\<val> init: theory \<rightarrow> T"} & produce initial value \\
+ \end{tabular}
+ \medskip
+
+ The @{text "init"} operation is supposed to produce a pure value
+ from the given background theory and should be somehow
+ ``immediate''. Whenever a proof context is initialized, which
+ happens frequently, the system invokes the @{text "init"}
+ operation of \emph{all} proof data slots ever declared. This also
+ means that one needs to be economical about the total number of proof
+ data declarations in the system, i.e.\ each ML module should declare
+ at most one, sometimes two data slots for its internal use.
+ Repeated data declarations to simulate a record type should be
+ avoided!
+
+ \paragraph{Generic data} provides a hybrid interface for both theory
+ and proof data. The @{text "init"} operation for proof contexts is
+ predefined to select the current data value from the background
+ theory.
+
+ \bigskip Any of the above data declarations over type @{text "T"}
+ result in an ML structure with the following signature:
+
+ \medskip
+ \begin{tabular}{ll}
+ @{text "get: context \<rightarrow> T"} \\
+ @{text "put: T \<rightarrow> context \<rightarrow> context"} \\
+ @{text "map: (T \<rightarrow> T) \<rightarrow> context \<rightarrow> context"} \\
+ \end{tabular}
+ \medskip
+
+ These operations provide exclusive access for the particular
+ kind of context (theory, proof, or generic context). This interface
+ observes the ML discipline for types and scopes: there is no other
+ way to access the corresponding data slot of a context. By keeping
+ these operations private, an Isabelle/ML module may maintain
+ abstract values authentically. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_functor Theory_Data} \\
+ @{index_ML_functor Proof_Data} \\
+ @{index_ML_functor Generic_Data} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML_functor Theory_Data}@{text "(spec)"} declares data for
+ type @{ML_type theory} according to the specification provided as
+ argument structure. The resulting structure provides data init and
+ access operations as described above.
+
+ \item @{ML_functor Proof_Data}@{text "(spec)"} is analogous to
+ @{ML_functor Theory_Data} for type @{ML_type Proof.context}.
+
+ \item @{ML_functor Generic_Data}@{text "(spec)"} is analogous to
+ @{ML_functor Theory_Data} for type @{ML_type Context.generic}.
+
+ \end{description}
+*}
+
+text %mlex {*
+ The following artificial example demonstrates theory
+ data: we maintain a set of terms that are supposed to be wellformed
+ wrt.\ the enclosing theory. The public interface is as follows:
+*}
+
+ML {*
+ signature WELLFORMED_TERMS =
+ sig
+ val get: theory -> term list
+ val add: term -> theory -> theory
+ end;
+*}
+
+text {* The implementation uses private theory data internally, and
+ only exposes an operation that involves explicit argument checking
+ wrt.\ the given theory. *}
+
+ML {*
+ structure Wellformed_Terms: WELLFORMED_TERMS =
+ struct
+
+ structure Terms = Theory_Data
+ (
+ type T = term Ord_List.T;
+ val empty = [];
+ val extend = I;
+ fun merge (ts1, ts2) =
+ Ord_List.union Term_Ord.fast_term_ord ts1 ts2;
+ );
+
+ val get = Terms.get;
+
+ fun add raw_t thy =
+ let
+ val t = Sign.cert_term thy raw_t;
+ in
+ Terms.map (Ord_List.insert Term_Ord.fast_term_ord t) thy
+ end;
+
+ end;
+*}
+
+text {* Type @{ML_type "term Ord_List.T"} is used for reasonably
+ efficient representation of a set of terms: all operations are
+ linear in the number of stored elements. Here we assume that users
+ of this module do not care about the declaration order, since that
+ data structure forces its own arrangement of elements.
+
+ Observe how the @{ML_text merge} operation joins the data slots of
+ the two constituents: @{ML Ord_List.union} prevents duplication of
+ common data from different branches, thus avoiding the danger of
+ exponential blowup. Plain list append etc.\ must never be used for
+ theory data merges!
+
+ \medskip Our intended invariant is achieved as follows:
+ \begin{enumerate}
+
+ \item @{ML Wellformed_Terms.add} only admits terms that have passed
+ the @{ML Sign.cert_term} check of the given theory at that point.
+
+ \item Wellformedness in the sense of @{ML Sign.cert_term} is
+ monotonic wrt.\ the sub-theory relation. So our data can move
+ upwards in the hierarchy (via extension or merges), and maintain
+ wellformedness without further checks.
+
+ \end{enumerate}
+
+ Note that all basic operations of the inference kernel (which
+ includes @{ML Sign.cert_term}) observe this monotonicity principle,
+ but other user-space tools don't. For example, fully-featured
+ type-inference via @{ML Syntax.check_term} (cf.\
+ \secref{sec:term-check}) is not necessarily monotonic wrt.\ the
+ background theory, since constraints of term constants can be
+ modified by later declarations, for example.
+
+ In most cases, user-space context data does not have to take such
+ invariants too seriously. The situation is different in the
+ implementation of the inference kernel itself, which uses the very
+ same data mechanisms for types, constants, axioms etc.
+*}
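+
+text %mlex {* The following (purely artificial) sketch declares a
+ proof data slot that merely holds an integer, initialized to 0
+ independently of the background theory; the name @{ML_text My_Data}
+ is made up for this illustration.  It demonstrates the @{text
+ "get"}/@{text "map"} interface generated by the functor. *}
+
+ML {*
+  structure My_Data = Proof_Data
+  (
+    type T = int;
+    fun init _ = 0;
+  );
+*}
+
+ML_val {*
+  @{assert} (My_Data.get @{context} = 0);
+  @{assert} (My_Data.get (My_Data.map (fn i => i + 1) @{context}) = 1);
+*}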
+
+
+subsection {* Configuration options \label{sec:config-options} *}
+
+text {* A \emph{configuration option} is a named optional value of
+ some basic type (Boolean, integer, string) that is stored in the
+ context. It is a simple application of general context data
+ (\secref{sec:context-data}) that is sufficiently common to justify
+ customized setup, which includes some concrete declarations for
+ end-users using existing notation for attributes (cf.\
+ \secref{sec:attributes}).
+
+ For example, the predefined configuration option @{attribute
+ show_types} controls output of explicit type constraints for
+ variables in printed terms (cf.\ \secref{sec:read-print}). Its
+ value can be modified within Isar text like this:
+*}
+
+declare [[show_types = false]]
+ -- {* declaration within (local) theory context *}
+
+notepad
+begin
+ note [[show_types = true]]
+ -- {* declaration within proof (forward mode) *}
+ term x
+
+ have "x = x"
+ using [[show_types = false]]
+ -- {* declaration within proof (backward mode) *}
+ ..
+end
+
+text {* Configuration options that are not set explicitly hold a
+ default value that can depend on the application context. This
+ makes it possible to retrieve the value from another slot within the
+ context, or to fall back on a global preference mechanism, for example.
+
+ The operations to declare configuration options and get/map their
+ values are modeled as direct replacements for historic global
+ references, only that the context is made explicit. This allows
+ easy configuration of tools, without relying on the execution order
+ as required for old-style mutable references. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Config.get: "Proof.context -> 'a Config.T -> 'a"} \\
+ @{index_ML Config.map: "'a Config.T -> ('a -> 'a) -> Proof.context -> Proof.context"} \\
+ @{index_ML Attrib.setup_config_bool: "binding -> (Context.generic -> bool) ->
+ bool Config.T"} \\
+ @{index_ML Attrib.setup_config_int: "binding -> (Context.generic -> int) ->
+ int Config.T"} \\
+ @{index_ML Attrib.setup_config_real: "binding -> (Context.generic -> real) ->
+ real Config.T"} \\
+ @{index_ML Attrib.setup_config_string: "binding -> (Context.generic -> string) ->
+ string Config.T"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Config.get}~@{text "ctxt config"} gets the value of
+ @{text "config"} in the given context.
+
+ \item @{ML Config.map}~@{text "config f ctxt"} updates the context
+ by updating the value of @{text "config"}.
+
+ \item @{text "config ="}~@{ML Attrib.setup_config_bool}~@{text "name
+ default"} creates a named configuration option of type @{ML_type
+ bool}, with the given @{text "default"} depending on the application
+ context. The resulting @{text "config"} can be used to get/map its
+ value in a given context. There is an implicit update of the
+ background theory that registers the option as attribute with some
+ concrete syntax.
+
+ \item @{ML Attrib.setup_config_int}, @{ML Attrib.setup_config_real},
+ and @{ML Attrib.setup_config_string} work like @{ML
+ Attrib.setup_config_bool}, but for types @{ML_type int}, @{ML_type
+ real}, and @{ML_type string}, respectively.
+
+ \end{description}
+*}
+
+text %mlex {* The following example shows how to declare and use a
+ Boolean configuration option called @{text "my_flag"} with constant
+ default value @{ML false}. *}
+
+ML {*
+ val my_flag =
+ Attrib.setup_config_bool @{binding my_flag} (K false)
+*}
+
+text {* Now the user can refer to @{attribute my_flag} in
+ declarations, while ML tools can retrieve the current value from the
+ context via @{ML Config.get}. *}
+
+ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
+
+declare [[my_flag = true]]
+
+ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
+
+notepad
+begin
+ {
+ note [[my_flag = false]]
+ ML_val {* @{assert} (Config.get @{context} my_flag = false) *}
+ }
+ ML_val {* @{assert} (Config.get @{context} my_flag = true) *}
+end
+
+text {* Here is another example involving ML type @{ML_type real}
+ (floating-point numbers). *}
+
+ML {*
+ val airspeed_velocity =
+ Attrib.setup_config_real @{binding airspeed_velocity} (K 0.0)
+*}
+
+declare [[airspeed_velocity = 10]]
+declare [[airspeed_velocity = 9.9]]
+
+
+section {* Names \label{sec:names} *}
+
+text {* In principle, a name is just a string, but there are various
+ conventions for representing additional structure. For example,
+ ``@{text "Foo.bar.baz"}'' is considered as a long name consisting of
+ qualifier @{text "Foo.bar"} and base name @{text "baz"}. The
+ individual constituents of a name may have further substructure,
+ e.g.\ the string ``\verb,\,\verb,<alpha>,'' encodes as a single
+ symbol.
+
+ \medskip Subsequently, we shall introduce specific categories of
+ names. Roughly speaking these correspond to logical entities as
+ follows:
+ \begin{itemize}
+
+ \item Basic names (\secref{sec:basic-name}): free and bound
+ variables.
+
+ \item Indexed names (\secref{sec:indexname}): schematic variables.
+
+ \item Long names (\secref{sec:long-name}): constants of any kind
+ (type constructors, term constants, other concepts defined in user
+ space). Such entities are typically managed via name spaces
+ (\secref{sec:name-space}).
+
+ \end{itemize}
+*}
+
+
+subsection {* Strings of symbols \label{sec:symbols} *}
+
+text {* A \emph{symbol} constitutes the smallest textual unit in
+ Isabelle --- raw ML characters are normally not encountered at all!
+ Isabelle strings consist of a sequence of symbols, represented as a
+ packed string or an exploded list of strings. Each symbol is in
+ itself a small string, which has either one of the following forms:
+
+ \begin{enumerate}
+
+ \item a single ASCII character ``@{text "c"}'', for example
+ ``\verb,a,'',
+
+ \item a codepoint according to UTF-8 (non-ASCII byte sequence),
+
+ \item a regular symbol ``\verb,\,\verb,<,@{text "ident"}\verb,>,'',
+ for example ``\verb,\,\verb,<alpha>,'',
+
+ \item a control symbol ``\verb,\,\verb,<^,@{text "ident"}\verb,>,'',
+ for example ``\verb,\,\verb,<^bold>,'',
+
+ \item a raw symbol ``\verb,\,\verb,<^raw:,@{text text}\verb,>,''
+ where @{text text} consists of printable characters excluding
+ ``\verb,.,'' and ``\verb,>,'', for example
+ ``\verb,\,\verb,<^raw:$\sum_{i = 1}^n$>,'',
+
+ \item a numbered raw control symbol ``\verb,\,\verb,<^raw,@{text
+ n}\verb,>,'' where @{text n} consists of digits, for example
+ ``\verb,\,\verb,<^raw42>,''.
+
+ \end{enumerate}
+
+ The @{text "ident"} syntax for symbol names is @{text "letter
+ (letter | digit)\<^sup>*"}, where @{text "letter = A..Za..z"} and @{text
+ "digit = 0..9"}. There are infinitely many regular symbols and
+ control symbols, but a fixed collection of standard symbols is
+ treated specifically. For example, ``\verb,\,\verb,<alpha>,'' is
+ classified as a letter, which means it may occur within regular
+ Isabelle identifiers.
+
+ The character set underlying Isabelle symbols is 7-bit ASCII, but
+ 8-bit character sequences are passed-through unchanged. Unicode/UCS
+ data in UTF-8 encoding is processed in a non-strict fashion, such
+ that well-formed code sequences are recognized
+ accordingly.\footnote{Note that ISO-Latin-1 differs from UTF-8 only
+ in some special punctuation characters that even have replacements
+ within the standard collection of Isabelle symbols. Text consisting
+ of ASCII plus accented letters can be processed in either encoding.}
+ Unicode provides its own collection of mathematical symbols, but
+ within the core Isabelle/ML world there is no link to the standard
+ collection of Isabelle regular symbols.
+
+ \medskip Output of Isabelle symbols depends on the print mode
+ (\cite{isabelle-isar-ref}). For example, the standard {\LaTeX}
+ setup of the Isabelle document preparation system would present
+ ``\verb,\,\verb,<alpha>,'' as @{text "\<alpha>"}, and
+ ``\verb,\,\verb,<^bold>,\verb,\,\verb,<alpha>,'' as @{text "\<^bold>\<alpha>"}.
+ On-screen rendering usually works by mapping a finite subset of
+ Isabelle symbols to suitable Unicode characters.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type "Symbol.symbol": string} \\
+ @{index_ML Symbol.explode: "string -> Symbol.symbol list"} \\
+ @{index_ML Symbol.is_letter: "Symbol.symbol -> bool"} \\
+ @{index_ML Symbol.is_digit: "Symbol.symbol -> bool"} \\
+ @{index_ML Symbol.is_quasi: "Symbol.symbol -> bool"} \\
+ @{index_ML Symbol.is_blank: "Symbol.symbol -> bool"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type "Symbol.sym"} \\
+ @{index_ML Symbol.decode: "Symbol.symbol -> Symbol.sym"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type "Symbol.symbol"} represents individual Isabelle
+ symbols.
+
+ \item @{ML "Symbol.explode"}~@{text "str"} produces a symbol list
+ from the packed form. This function supersedes @{ML
+ "String.explode"} for virtually all purposes of manipulating text in
+ Isabelle!\footnote{The runtime overhead for exploded strings is
+ mainly that of the list structure: individual symbols that happen to
+ be a singleton string do not require extra memory in Poly/ML.}
+
+ \item @{ML "Symbol.is_letter"}, @{ML "Symbol.is_digit"}, @{ML
+ "Symbol.is_quasi"}, @{ML "Symbol.is_blank"} classify standard
+ symbols according to fixed syntactic conventions of Isabelle, cf.\
+ \cite{isabelle-isar-ref}.
+
+ \item Type @{ML_type "Symbol.sym"} is a concrete datatype that
+ represents the different kinds of symbols explicitly, with
+ constructors @{ML "Symbol.Char"}, @{ML "Symbol.Sym"}, @{ML
+ "Symbol.UTF8"}, @{ML "Symbol.Ctrl"}, @{ML "Symbol.Raw"}.
+
+ \item @{ML "Symbol.decode"} converts the string representation of a
+ symbol into the datatype version.
+
+ \end{description}
+
+ \paragraph{Historical note.} In the original SML90 standard the
+ primitive ML type @{ML_type char} did not exist, and the @{ML_text
+ "explode: string -> string list"} operation would produce a list of
+ singleton strings as does @{ML "raw_explode: string -> string list"}
+ in Isabelle/ML today. When SML97 came out, Isabelle did not adopt
+ its slightly anachronistic 8-bit characters, but the idea of
+ exploding a string into a list of small strings was extended to
+ ``symbols'' as explained above. Thus Isabelle sources can refer to
+ an infinite store of user-defined symbols, without having to worry
+ about the multitude of Unicode encodings. *}
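+
+text %mlex {* A trivial example, restricted to plain ASCII text for
+ simplicity: exploding a packed string yields singleton symbols that
+ are classified as letters. *}
+
+ML_val {*
+  val syms = Symbol.explode "abc";
+  @{assert} (syms = ["a", "b", "c"]);
+  @{assert} (forall Symbol.is_letter syms);
+*}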
+
+
+subsection {* Basic names \label{sec:basic-name} *}
+
+text {*
+ A \emph{basic name} essentially consists of a single Isabelle
+ identifier. There are conventions to mark separate classes of basic
+ names, by attaching a suffix of underscores: one underscore means
+ \emph{internal name}, two underscores means \emph{Skolem name},
+ three underscores means \emph{internal Skolem name}.
+
+ For example, the basic name @{text "foo"} has the internal version
+ @{text "foo_"}, with Skolem versions @{text "foo__"} and @{text
+ "foo___"}, respectively.
+
+ These special versions provide copies of the basic name space, apart
+ from anything that normally appears in the user text. For example,
+ system generated variables in Isar proof contexts are usually marked
+ as internal, which prevents mysterious names like @{text "xaa"} from
+ appearing in human-readable text.
+
+ \medskip Manipulating binding scopes often requires on-the-fly
+ renamings. A \emph{name context} contains a collection of already
+ used names. The @{text "declare"} operation adds names to the
+ context.
+
+ The @{text "invents"} operation derives a number of fresh names from
+ a given starting point. For example, the first three names derived
+ from @{text "a"} are @{text "a"}, @{text "b"}, @{text "c"}.
+
+ The @{text "variants"} operation produces fresh names by
+ incrementing tentative names as base-26 numbers (with digits @{text
+ "a..z"}) until all clashes are resolved. For example, name @{text
+ "foo"} results in variants @{text "fooa"}, @{text "foob"}, @{text
+ "fooc"}, \dots, @{text "fooaa"}, @{text "fooab"} etc.; each renaming
+ step picks the next unused variant from this sequence.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Name.internal: "string -> string"} \\
+ @{index_ML Name.skolem: "string -> string"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type Name.context} \\
+ @{index_ML Name.context: Name.context} \\
+ @{index_ML Name.declare: "string -> Name.context -> Name.context"} \\
+ @{index_ML Name.invent: "Name.context -> string -> int -> string list"} \\
+ @{index_ML Name.variant: "string -> Name.context -> string * Name.context"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML Variable.names_of: "Proof.context -> Name.context"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Name.internal}~@{text "name"} produces an internal name
+ by adding one underscore.
+
+ \item @{ML Name.skolem}~@{text "name"} produces a Skolem name by
+ adding two underscores.
+
+ \item Type @{ML_type Name.context} represents the context of already
+ used names; the initial value is @{ML "Name.context"}.
+
+ \item @{ML Name.declare}~@{text "name"} enters a used name into the
+ context.
+
+ \item @{ML Name.invent}~@{text "context name n"} produces @{text
+ "n"} fresh names derived from @{text "name"}.
+
+ \item @{ML Name.variant}~@{text "name context"} produces a fresh
+ variant of @{text "name"}; the result is declared to the context.
+
+ \item @{ML Variable.names_of}~@{text "ctxt"} retrieves the context
+ of declared type and term variable names. Projecting a proof
+ context down to a primitive name context is occasionally useful when
+ invoking lower-level operations. Regular management of ``fresh
+ variables'' is done by suitable operations of structure @{ML_struct
+ Variable}, which is also able to provide an official status of
+ ``locally fixed variable'' within the logical environment (cf.\
+ \secref{sec:variables}).
+
+ \end{description}
+*}
+
+text %mlex {* The following simple examples demonstrate how to produce
+ fresh names from the initial @{ML Name.context}. *}
+
+ML {*
+ val list1 = Name.invent Name.context "a" 5;
+ @{assert} (list1 = ["a", "b", "c", "d", "e"]);
+
+ val list2 =
+ #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] Name.context);
+ @{assert} (list2 = ["x", "xa", "a", "aa", "'a", "'aa"]);
+*}
+
+text {* \medskip The same works relative to the formal context as
+ follows. *}
+
+locale ex = fixes a b c :: 'a
+begin
+
+ML {*
+ val names = Variable.names_of @{context};
+
+ val list1 = Name.invent names "a" 5;
+ @{assert} (list1 = ["d", "e", "f", "g", "h"]);
+
+ val list2 =
+ #1 (fold_map Name.variant ["x", "x", "a", "a", "'a", "'a"] names);
+ @{assert} (list2 = ["x", "xa", "aa", "ab", "'aa", "'ab"]);
+*}
+
+end
+
+
+subsection {* Indexed names \label{sec:indexname} *}
+
+text {*
+ An \emph{indexed name} (or @{text "indexname"}) is a pair of a basic
+ name and a natural number. This representation allows efficient
+ renaming by incrementing the second component only. The canonical
+ way to rename two collections of indexnames apart from each other is
+ this: determine the maximum index @{text "maxidx"} of the first
+ collection, then increment all indexes of the second collection by
+ @{text "maxidx + 1"}; the maximum index of an empty collection is
+ @{text "-1"}.
+
+ Occasionally, basic names are injected into the same pair type of
+ indexed names: then @{text "(x, -1)"} is used to encode the basic
+ name @{text "x"}.
+
+ \medskip Isabelle syntax observes the following rules for
+ representing an indexname @{text "(x, i)"} as a packed string:
+
+ \begin{itemize}
+
+ \item @{text "?x"} if @{text "x"} does not end with a digit and @{text "i = 0"},
+
+ \item @{text "?xi"} if @{text "x"} does not end with a digit,
+
+ \item @{text "?x.i"} otherwise.
+
+ \end{itemize}
+
+ Indexnames may acquire large index numbers after several maxidx
+ shifts have been applied. Results are usually normalized towards
+ @{text "0"} at certain checkpoints, notably at the end of a proof.
+ This works by producing variants of the corresponding basic name
+ components. For example, the collection @{text "?x1, ?x7, ?x42"}
+ becomes @{text "?x, ?xa, ?xb"}.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type indexname: "string * int"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type indexname} represents indexed names. This is
+ an abbreviation for @{ML_type "string * int"}. The second component
+ is usually non-negative, except for situations where @{text "(x,
+ -1)"} is used to inject basic names into this type. Other negative
+ indexes should not be used.
+
+ \end{description}
+*}
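+
+text %mlex {* The following (trivial) snippet illustrates the plain
+ pair representation, including the @{text "-1"} encoding of basic
+ names; note that negative ML integer literals are written with
+ @{ML_text "~"}. *}
+
+ML_val {*
+  val x0 : indexname = ("x", 0);   (*packed representation: ?x*)
+  val x1 : indexname = ("x", 1);   (*packed representation: ?x1*)
+  val x  : indexname = ("x", ~1);  (*encodes the basic name x*)
+  @{assert} (x0 <> x1 andalso x0 <> x);
+*}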
+
+
+subsection {* Long names \label{sec:long-name} *}
+
+text {* A \emph{long name} consists of a sequence of non-empty name
+ components. The packed representation uses a dot as separator, as
+ in ``@{text "A.b.c"}''. The last component is called \emph{base
+ name}, the remaining prefix is called \emph{qualifier} (which may be
+ empty). The qualifier can be understood as the access path to the
+ named entity while passing through some nested block-structure,
+ although our free-form long names do not really enforce any strict
+ discipline.
+
+ For example, an item named ``@{text "A.b.c"}'' may be understood as
+ a local entity @{text "c"}, within a local structure @{text "b"},
+ within a global structure @{text "A"}. In practice, long names
+ usually represent 1--3 levels of qualification. User ML code should
+ not make any assumptions about the particular structure of long
+ names!
+
+ The empty name is commonly used as an indication of unnamed
+ entities, or entities that are not entered into the corresponding
+ name space, whenever this makes any sense. The basic operations on
+ long names map empty names again to empty names.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Long_Name.base_name: "string -> string"} \\
+ @{index_ML Long_Name.qualifier: "string -> string"} \\
+ @{index_ML Long_Name.append: "string -> string -> string"} \\
+ @{index_ML Long_Name.implode: "string list -> string"} \\
+ @{index_ML Long_Name.explode: "string -> string list"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Long_Name.base_name}~@{text "name"} returns the base name
+ of a long name.
+
+ \item @{ML Long_Name.qualifier}~@{text "name"} returns the qualifier
+ of a long name.
+
+ \item @{ML Long_Name.append}~@{text "name\<^isub>1 name\<^isub>2"} appends two long
+ names.
+
+ \item @{ML Long_Name.implode}~@{text "names"} and @{ML
+ Long_Name.explode}~@{text "name"} convert between the packed string
+ representation and the explicit list form of long names.
+
+ \end{description}
+*}
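+
+text %mlex {* These operations are plain string manipulations, as the
+ following (trivial) checks illustrate for the long name @{text
+ "A.b.c"} used above. *}
+
+ML_val {*
+  @{assert} (Long_Name.base_name "A.b.c" = "c");
+  @{assert} (Long_Name.qualifier "A.b.c" = "A.b");
+  @{assert} (Long_Name.append "A.b" "c" = "A.b.c");
+  @{assert} (Long_Name.explode "A.b.c" = ["A", "b", "c"]);
+  @{assert} (Long_Name.implode ["A", "b", "c"] = "A.b.c");
+*}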
+
+
+subsection {* Name spaces \label{sec:name-space} *}
+
+text {* A @{text "name space"} manages a collection of long names,
+ together with a mapping between partially qualified external names
+ and fully qualified internal names (in both directions). Note that
+ the corresponding @{text "intern"} and @{text "extern"} operations
+ are mostly used for parsing and printing only! The @{text
+ "declare"} operation augments a name space according to the accesses
+ determined by a given binding, and a naming policy from the context.
+
+ \medskip A @{text "binding"} specifies details about the prospective
+ long name of a newly introduced formal entity. It consists of a
+ base name, prefixes for qualification (separate ones for system
+ infrastructure and user-space mechanisms), a slot for the original
+ source position, and some additional flags.
+
+ \medskip A @{text "naming"} provides some additional details for
+ producing a long name from a binding. Normally, the naming is
+ implicit in the theory or proof context. The @{text "full"}
+ operation (and its variants for different context types) produces a
+ fully qualified internal name to be entered into a name space. The
+ main equation of this ``chemical reaction'' when binding new
+ entities in a context is as follows:
+
+ \medskip
+ \begin{tabular}{l}
+ @{text "binding + naming \<longrightarrow> long name + name space accesses"}
+ \end{tabular}
+
+ \bigskip As a general principle, there is a separate name space for
+ each kind of formal entity, e.g.\ fact, logical constant, type
+ constructor, type class. It is usually clear from the occurrence in
+ concrete syntax (or from the scope) which kind of entity a name
+ refers to. For example, the very same name @{text "c"} may be used
+ uniformly for a constant, type constructor, and type class.
+
+ There are common schemes to name derived entities systematically
+ according to the name of the main logical entity involved, e.g.\
+ fact @{text "c.intro"} for a canonical introduction rule related to
+ constant @{text "c"}. This technique of mapping names from one
+ space into another requires some care in order to avoid conflicts.
+ In particular, theorem names derived from a type constructor or type
+ class should get an additional suffix besides the usual
+ qualification. This leads to the following conventions for derived
+ names:
+
+ \medskip
+ \begin{tabular}{ll}
+ logical entity & fact name \\\hline
+ constant @{text "c"} & @{text "c.intro"} \\
+ type @{text "c"} & @{text "c_type.intro"} \\
+ class @{text "c"} & @{text "c_class.intro"} \\
+ \end{tabular}
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type binding} \\
+ @{index_ML Binding.empty: binding} \\
+ @{index_ML Binding.name: "string -> binding"} \\
+ @{index_ML Binding.qualify: "bool -> string -> binding -> binding"} \\
+ @{index_ML Binding.prefix: "bool -> string -> binding -> binding"} \\
+ @{index_ML Binding.conceal: "binding -> binding"} \\
+ @{index_ML Binding.print: "binding -> string"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type Name_Space.naming} \\
+ @{index_ML Name_Space.default_naming: Name_Space.naming} \\
+ @{index_ML Name_Space.add_path: "string -> Name_Space.naming -> Name_Space.naming"} \\
+ @{index_ML Name_Space.full_name: "Name_Space.naming -> binding -> string"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML_type Name_Space.T} \\
+ @{index_ML Name_Space.empty: "string -> Name_Space.T"} \\
+ @{index_ML Name_Space.merge: "Name_Space.T * Name_Space.T -> Name_Space.T"} \\
+ @{index_ML Name_Space.declare: "Context.generic -> bool ->
+ binding -> Name_Space.T -> string * Name_Space.T"} \\
+ @{index_ML Name_Space.intern: "Name_Space.T -> string -> string"} \\
+ @{index_ML Name_Space.extern: "Proof.context -> Name_Space.T -> string -> string"} \\
+ @{index_ML Name_Space.is_concealed: "Name_Space.T -> string -> bool"}
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type binding} represents the abstract concept of
+ name bindings.
+
+ \item @{ML Binding.empty} is the empty binding.
+
+ \item @{ML Binding.name}~@{text "name"} produces a binding with base
+ name @{text "name"}. Note that this lacks proper source position
+ information; see also the ML antiquotation @{ML_antiquotation
+ binding}.
+
+ \item @{ML Binding.qualify}~@{text "mandatory name binding"}
+ prefixes qualifier @{text "name"} to @{text "binding"}. The @{text
+ "mandatory"} flag tells if this name component always needs to be
+ given in name space accesses --- this is mostly @{text "false"} in
+ practice. Note that this part of qualification is typically used in
+ derived specification mechanisms.
+
+ \item @{ML Binding.prefix} is similar to @{ML Binding.qualify}, but
+ affects the system prefix. This part of extra qualification is
+ typically used in the infrastructure for modular specifications,
+ notably ``local theory targets'' (see also \chref{ch:local-theory}).
+
+ \item @{ML Binding.conceal}~@{text "binding"} indicates that the
+ binding shall refer to an entity that serves foundational purposes
+ only. This flag helps to mark implementation details of
+ specification mechanisms etc. Other tools should not depend on the
+ particulars of concealed entities (cf.\ @{ML
+ Name_Space.is_concealed}).
+
+ \item @{ML Binding.print}~@{text "binding"} produces a string
+ representation for human-readable output, together with some formal
+ markup that might get used in GUI front-ends, for example.
+
+ \item Type @{ML_type Name_Space.naming} represents the abstract
+ concept of a naming policy.
+
+ \item @{ML Name_Space.default_naming} is the default naming policy.
+ In a theory context, this is usually augmented by a path prefix
+ consisting of the theory name.
+
+ \item @{ML Name_Space.add_path}~@{text "path naming"} augments the
+ naming policy by extending its path component.
+
+ \item @{ML Name_Space.full_name}~@{text "naming binding"} turns a
+ name binding (usually a basic name) into the fully qualified
+ internal name, according to the given naming policy.
+
+ \item Type @{ML_type Name_Space.T} represents name spaces.
+
+ \item @{ML Name_Space.empty}~@{text "kind"} and @{ML Name_Space.merge}~@{text
+ "(space\<^isub>1, space\<^isub>2)"} are the canonical operations for
+ maintaining name spaces according to theory data management
+ (\secref{sec:context-data}); @{text "kind"} is a formal comment
+ to characterize the purpose of a name space.
+
+ \item @{ML Name_Space.declare}~@{text "context strict binding
+ space"} enters a name binding as fully qualified internal name into
+ the name space, using the naming of the context.
+
+ \item @{ML Name_Space.intern}~@{text "space name"} internalizes a
+ (partially qualified) external name.
+
+ This operation is mostly for parsing! Note that fully qualified
+ names stemming from declarations are produced via @{ML
+ "Name_Space.full_name"} and @{ML "Name_Space.declare"}
+ (or their derivatives for @{ML_type theory} and
+ @{ML_type Proof.context}).
+
+ \item @{ML Name_Space.extern}~@{text "ctxt space name"} externalizes a
+ (fully qualified) internal name.
+
+ This operation is mostly for printing! User code should not rely on
+ the precise result too much.
+
+ \item @{ML Name_Space.is_concealed}~@{text "space name"} indicates
+ whether @{text "name"} refers to a strictly private entity that
+ other tools are supposed to ignore!
+
+ \end{description}
+*}
+
+text %mlantiq {*
+ \begin{matharray}{rcl}
+ @{ML_antiquotation_def "binding"} & : & @{text ML_antiquotation} \\
+ \end{matharray}
+
+ @{rail "
+ @@{ML_antiquotation binding} name
+ "}
+
+ \begin{description}
+
+ \item @{text "@{binding name}"} produces a binding with base name
+ @{text "name"} and the source position taken from the concrete
+ syntax of this antiquotation. In many situations this is more
+ appropriate than the more basic @{ML Binding.name} function.
+
+ \end{description}
+*}
+
+text %mlex {* The following example yields the source position of some
+ concrete binding inlined into the text:
+*}
+
+ML {* Binding.pos_of @{binding here} *}
+
+text {* \medskip That position can also be printed in a message as
+ follows: *}
+
+ML_command {*
+ writeln
+ ("Look here" ^ Position.str_of (Binding.pos_of @{binding here}))
+*}
+
+text {* This illustrates a key virtue of formalized bindings as
+ opposed to raw specifications of base names: the system can use this
+ additional information for feedback given to the user (error
+ messages etc.). *}
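+
+text %mlex {* \medskip A related sketch (with a made-up path @{text
+ "Foo"} and binding @{text "bar"}) shows how a naming policy turns a
+ binding into a fully qualified internal name, following the
+ ``chemical reaction'' of \secref{sec:name-space}. *}
+
+ML_val {*
+  val naming = Name_Space.add_path "Foo" Name_Space.default_naming;
+  @{assert} (Name_Space.full_name naming @{binding bar} = "Foo.bar");
+*}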
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Proof.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,497 @@
+theory Proof
+imports Base
+begin
+
+chapter {* Structured proofs *}
+
+section {* Variables \label{sec:variables} *}
+
+text {*
+ Any variable that is not explicitly bound by @{text "\<lambda>"}-abstraction
+ is considered as ``free''. Logically, free variables act like
+ outermost universal quantification at the sequent level: @{text
+ "A\<^isub>1(x), \<dots>, A\<^isub>n(x) \<turnstile> B(x)"} means that the result
+ holds \emph{for all} values of @{text "x"}. Free variables for
+ terms (not types) can be fully internalized into the logic: @{text
+ "\<turnstile> B(x)"} and @{text "\<turnstile> \<And>x. B(x)"} are interchangeable, provided
+ that @{text "x"} does not occur elsewhere in the context.
+ Inspecting @{text "\<turnstile> \<And>x. B(x)"} more closely, we see that inside the
+ quantifier, @{text "x"} is essentially ``arbitrary, but fixed'',
+ while from outside it appears as a place-holder for instantiation
+ (thanks to @{text "\<And>"} elimination).
+
+ The Pure logic represents the idea of variables being either inside
+ or outside the current scope by providing separate syntactic
+ categories for \emph{fixed variables} (e.g.\ @{text "x"}) vs.\
+ \emph{schematic variables} (e.g.\ @{text "?x"}). Incidentally, a
+ universal result @{text "\<turnstile> \<And>x. B(x)"} has the HHF normal form @{text
+ "\<turnstile> B(?x)"}, which represents its generality without requiring an
+ explicit quantifier. The same principle works for type variables:
+ @{text "\<turnstile> B(?\<alpha>)"} represents the idea of ``@{text "\<turnstile> \<forall>\<alpha>. B(\<alpha>)"}''
+ without demanding a truly polymorphic framework.
+
+ \medskip Additional care is required to treat type variables in a
+ way that facilitates type-inference. In principle, term variables
+ depend on type variables, which means that type variables would have
+ to be declared first. For example, a raw type-theoretic framework
+ would demand the context to be constructed in stages as follows:
+ @{text "\<Gamma> = \<alpha>: type, x: \<alpha>, a: A(x\<^isub>\<alpha>)"}.
+
+ We allow a slightly less formalistic mode of operation: term
+ variables @{text "x"} are fixed without specifying a type yet
+ (essentially \emph{all} potential occurrences of some instance
+ @{text "x\<^isub>\<tau>"} are fixed); the first occurrence of @{text "x"}
+ within a specific term assigns its most general type, which is then
+ maintained consistently in the context. The above example becomes
+ @{text "\<Gamma> = x: term, \<alpha>: type, A(x\<^isub>\<alpha>)"}, where type @{text
+ "\<alpha>"} is fixed \emph{after} term @{text "x"}, and the constraint
+ @{text "x :: \<alpha>"} is an implicit consequence of the occurrence of
+ @{text "x\<^isub>\<alpha>"} in the subsequent proposition.
+
+ This twist of dependencies is also accommodated by the reverse
+ operation of exporting results from a context: a type variable
+ @{text "\<alpha>"} is considered fixed as long as it occurs in some fixed
+ term variable of the context. For example, exporting @{text "x:
+ term, \<alpha>: type \<turnstile> x\<^isub>\<alpha> \<equiv> x\<^isub>\<alpha>"} produces in the first step @{text "x: term
+ \<turnstile> x\<^isub>\<alpha> \<equiv> x\<^isub>\<alpha>"} for fixed @{text "\<alpha>"}, and only in the second step
+ @{text "\<turnstile> ?x\<^isub>?\<^isub>\<alpha> \<equiv> ?x\<^isub>?\<^isub>\<alpha>"} for schematic @{text "?x"} and @{text "?\<alpha>"}.
+ The following Isar source text illustrates this scenario.
+*}
+
+notepad
+begin
+ {
+ fix x -- {* all potential occurrences of some @{text "x::\<tau>"} are fixed *}
+ {
+ have "x::'a \<equiv> x" -- {* implicit type assignment by concrete occurrence *}
+ by (rule reflexive)
+ }
+ thm this -- {* result still with fixed type @{text "'a"} *}
+ }
+ thm this -- {* fully general result for arbitrary @{text "?x::?'a"} *}
+end
+
+text {* The Isabelle/Isar proof context manages the details of term
+ vs.\ type variables, with high-level principles for moving the
+ frontier between fixed and schematic variables.
+
+ The @{text "add_fixes"} operation explicitly declares fixed
+ variables; the @{text "declare_term"} operation absorbs a term into
+ a context by fixing new type variables and adding syntactic
+ constraints.
+
+ The @{text "export"} operation is able to perform the main work of
+ generalizing term and type variables as sketched above, assuming
+ that fixed variables and terms have been declared properly.
+
+ The @{text "import"} operation makes a generalized fact a genuine
+ part of the context, by inventing fixed variables for the schematic
+ ones. The effect can be reversed by using @{text "export"} later,
+ potentially with an extended context; the result is equivalent to
+ the original modulo renaming of schematic variables.
+
+ The @{text "focus"} operation provides a variant of @{text "import"}
+ for nested propositions (with explicit quantification): @{text
+ "\<And>x\<^isub>1 \<dots> x\<^isub>n. B(x\<^isub>1, \<dots>, x\<^isub>n)"} is
+ decomposed by inventing fixed variables @{text "x\<^isub>1, \<dots>,
+ x\<^isub>n"} for the body.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Variable.add_fixes: "
+ string list -> Proof.context -> string list * Proof.context"} \\
+ @{index_ML Variable.variant_fixes: "
+ string list -> Proof.context -> string list * Proof.context"} \\
+ @{index_ML Variable.declare_term: "term -> Proof.context -> Proof.context"} \\
+ @{index_ML Variable.declare_constraints: "term -> Proof.context -> Proof.context"} \\
+ @{index_ML Variable.export: "Proof.context -> Proof.context -> thm list -> thm list"} \\
+ @{index_ML Variable.polymorphic: "Proof.context -> term list -> term list"} \\
+ @{index_ML Variable.import: "bool -> thm list -> Proof.context ->
+ (((ctyp * ctyp) list * (cterm * cterm) list) * thm list) * Proof.context"} \\
+ @{index_ML Variable.focus: "term -> Proof.context ->
+ ((string * (string * typ)) list * term) * Proof.context"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML Variable.add_fixes}~@{text "xs ctxt"} fixes term
+ variables @{text "xs"}, returning the resulting internal names. By
+ default, the internal representation coincides with the external
+ one, which also means that the given variables must not be fixed
+ already. There is a different policy within a local proof body: the
+ given names are just hints for newly invented Skolem variables.
+
+ \item @{ML Variable.variant_fixes} is similar to @{ML
+ Variable.add_fixes}, but always produces fresh variants of the given
+ names.
+
+ \item @{ML Variable.declare_term}~@{text "t ctxt"} declares term
+ @{text "t"} to belong to the context. This automatically fixes new
+ type variables, but not term variables. Syntactic constraints for
+ type and term variables are declared uniformly, though.
+
+ \item @{ML Variable.declare_constraints}~@{text "t ctxt"} declares
+ syntactic constraints from term @{text "t"}, without making it part
+ of the context yet.
+
+ \item @{ML Variable.export}~@{text "inner outer thms"} generalizes
+ fixed type and term variables in @{text "thms"} according to the
+ difference of the @{text "inner"} and @{text "outer"} context,
+ following the principles sketched above.
+
+ \item @{ML Variable.polymorphic}~@{text "ctxt ts"} generalizes type
+ variables in @{text "ts"} as far as possible, even those occurring
+ in fixed term variables. The default policy of type-inference is to
+ fix newly introduced type variables, which is essentially reversed
+ with @{ML Variable.polymorphic}: here the given terms are detached
+ from the context as far as possible.
+
+ \item @{ML Variable.import}~@{text "open thms ctxt"} invents fixed
+ type and term variables for the schematic ones occurring in @{text
+ "thms"}. The @{text "open"} flag indicates whether the fixed names
+ should be accessible to the user, otherwise newly introduced names
+ are marked as ``internal'' (\secref{sec:names}).
+
+ \item @{ML Variable.focus}~@{text B} decomposes the outermost @{text
+ "\<And>"} prefix of proposition @{text "B"}.
+
+ \end{description}
+*}
+
+text %mlex {* The following example shows how to work with fixed term
+ and type parameters and with type-inference. *}
+
+ML {*
+ (*static compile-time context -- for testing only*)
+ val ctxt0 = @{context};
+
+ (*locally fixed parameters -- no type assignment yet*)
+ val ([x, y], ctxt1) = ctxt0 |> Variable.add_fixes ["x", "y"];
+
+ (*t1: most general fixed type; t1': most general arbitrary type*)
+ val t1 = Syntax.read_term ctxt1 "x";
+ val t1' = singleton (Variable.polymorphic ctxt1) t1;
+
+ (*term u enforces specific type assignment*)
+ val u = Syntax.read_term ctxt1 "(x::nat) \<equiv> y";
+
+ (*official declaration of u -- propagates constraints etc.*)
+ val ctxt2 = ctxt1 |> Variable.declare_term u;
+ val t2 = Syntax.read_term ctxt2 "x"; (*x::nat is enforced*)
+*}
+
+text {* In the above example, the starting context is derived from the
+ toplevel theory, which means that fixed variables are internalized
+ literally: @{text "x"} is mapped again to @{text "x"}, and
+ attempting to fix it again in the subsequent context is an error.
+ Alternatively, fixed parameters can be renamed explicitly as
+ follows: *}
+
+ML {*
+ val ctxt0 = @{context};
+ val ([x1, x2, x3], ctxt1) =
+ ctxt0 |> Variable.variant_fixes ["x", "x", "x"];
+*}
+
+text {* The following ML code can now work with the invented names of
+ @{text x1}, @{text x2}, @{text x3}, without depending on
+ the details of the system policy for introducing these variants.
+ Recall that within a proof body the system always invents fresh
+ ``skolem constants'', e.g.\ as follows: *}
+
+notepad
+begin
+ ML_prf %"ML" {*
+ val ctxt0 = @{context};
+
+ val ([x1], ctxt1) = ctxt0 |> Variable.add_fixes ["x"];
+ val ([x2], ctxt2) = ctxt1 |> Variable.add_fixes ["x"];
+ val ([x3], ctxt3) = ctxt2 |> Variable.add_fixes ["x"];
+
+ val ([y1, y2], ctxt4) =
+ ctxt3 |> Variable.variant_fixes ["y", "y"];
+ *}
+end
+
+text {* In this situation @{ML Variable.add_fixes} and @{ML
+ Variable.variant_fixes} are very similar, but identical name
+ proposals given in a row are only accepted by the second version.
+ *}
+
+
+section {* Assumptions \label{sec:assumptions} *}
+
+text {*
+ An \emph{assumption} is a proposition that is postulated in the
+ current context. Local conclusions may use assumptions as
+ additional facts, but this imposes implicit hypotheses that weaken
+ the overall statement.
+
+ Assumptions are restricted to fixed non-schematic statements, i.e.\
+ all generality needs to be expressed by explicit quantifiers.
+ Nevertheless, the result will be in HHF normal form with outermost
+ quantifiers stripped. For example, by assuming @{text "\<And>x :: \<alpha>. P
+ x"} we get @{text "\<And>x :: \<alpha>. P x \<turnstile> P ?x"} for schematic @{text "?x"}
+ of fixed type @{text "\<alpha>"}. Local derivations accumulate more and
+ more explicit references to hypotheses: @{text "A\<^isub>1, \<dots>,
+ A\<^isub>n \<turnstile> B"} where @{text "A\<^isub>1, \<dots>, A\<^isub>n"} needs to
+ be covered by the assumptions of the current context.
+
+ \medskip The @{text "add_assms"} operation augments the context by
+ local assumptions, which are parameterized by an arbitrary @{text
+ "export"} rule (see below).
+
+ The @{text "export"} operation moves facts from a (larger) inner
+ context into a (smaller) outer context, by discharging the
+ difference of the assumptions as specified by the associated export
+ rules. Note that the discharged portion is determined by the
+ difference of contexts, not the facts being exported! There is a
+ separate flag to indicate a goal context, where the result is meant
+ to refine an enclosing sub-goal of a structured proof state.
+
+ \medskip The most basic export rule discharges assumptions directly
+ by means of the @{text "\<Longrightarrow>"} introduction rule:
+ \[
+ \infer[(@{text "\<Longrightarrow>\<hyphen>intro"})]{@{text "\<Gamma> - A \<turnstile> A \<Longrightarrow> B"}}{@{text "\<Gamma> \<turnstile> B"}}
+ \]
+
+ The variant for goal refinements marks the newly introduced
+ premises, which causes the canonical Isar goal refinement scheme to
+ enforce unification with local premises within the goal:
+ \[
+ \infer[(@{text "#\<Longrightarrow>\<hyphen>intro"})]{@{text "\<Gamma> - A \<turnstile> #A \<Longrightarrow> B"}}{@{text "\<Gamma> \<turnstile> B"}}
+ \]
+
+ \medskip Alternative versions of assumptions may perform arbitrary
+ transformations on export, as long as the corresponding portion of
+ hypotheses is removed from the given facts. For example, a local
+ definition works by fixing @{text "x"} and assuming @{text "x \<equiv> t"},
+ with the following export rule to reverse the effect:
+ \[
+ \infer[(@{text "\<equiv>\<hyphen>expand"})]{@{text "\<Gamma> - (x \<equiv> t) \<turnstile> B t"}}{@{text "\<Gamma> \<turnstile> B x"}}
+ \]
+ This works, because the assumption @{text "x \<equiv> t"} was introduced in
+ a context with @{text "x"} being fresh, so @{text "x"} does not
+ occur in @{text "\<Gamma>"} here.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type Assumption.export} \\
+ @{index_ML Assumption.assume: "cterm -> thm"} \\
+ @{index_ML Assumption.add_assms:
+ "Assumption.export ->
+ cterm list -> Proof.context -> thm list * Proof.context"} \\
+ @{index_ML Assumption.add_assumes: "
+ cterm list -> Proof.context -> thm list * Proof.context"} \\
+ @{index_ML Assumption.export: "bool -> Proof.context -> Proof.context -> thm -> thm"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type Assumption.export} represents arbitrary export
+ rules, which is any function of type @{ML_type "bool -> cterm list
+ -> thm -> thm"}, where the @{ML_type "bool"} indicates goal mode,
+ and the @{ML_type "cterm list"} the collection of assumptions to be
+ discharged simultaneously.
+
+ \item @{ML Assumption.assume}~@{text "A"} turns proposition @{text
+ "A"} into a primitive assumption @{text "A \<turnstile> A'"}, where the
+ conclusion @{text "A'"} is in HHF normal form.
+
+ \item @{ML Assumption.add_assms}~@{text "r As"} augments the context
+ by assumptions @{text "As"} with export rule @{text "r"}. The
+ resulting facts are hypothetical theorems as produced by the raw
+ @{ML Assumption.assume}.
+
+ \item @{ML Assumption.add_assumes}~@{text "As"} is a special case of
+ @{ML Assumption.add_assms} where the export rule performs @{text
+ "\<Longrightarrow>\<hyphen>intro"} or @{text "#\<Longrightarrow>\<hyphen>intro"}, depending on goal
+ mode.
+
+ \item @{ML Assumption.export}~@{text "is_goal inner outer thm"}
+ exports result @{text "thm"} from the @{text "inner"} context
+ back into the @{text "outer"} one; @{text "is_goal = true"} means
+ this is a goal context. The result is in HHF normal form. Note
+ that @{ML "Proof_Context.export"} combines @{ML "Variable.export"}
+ and @{ML "Assumption.export"} in the canonical way.
+
+ \end{description}
+*}
+
+text %mlex {* The following example demonstrates how rules can be
+ derived by building up a context of assumptions first, and exporting
+ some local fact afterwards. We refer to @{theory Pure} equality
+ here for testing purposes.
+*}
+
+ML {*
+ (*static compile-time context -- for testing only*)
+ val ctxt0 = @{context};
+
+ val ([eq], ctxt1) =
+ ctxt0 |> Assumption.add_assumes [@{cprop "x \<equiv> y"}];
+ val eq' = Thm.symmetric eq;
+
+ (*back to original context -- discharges assumption*)
+ val r = Assumption.export false ctxt1 ctxt0 eq';
+*}
+
+text {* Note that the variables of the resulting rule are not
+ generalized. This would have required fixing them properly in the
+ context beforehand, and exporting wrt.\ variables afterwards (cf.\ @{ML
+ Variable.export} or the combined @{ML "Proof_Context.export"}). *}
+
+
+section {* Structured goals and results \label{sec:struct-goals} *}
+
+text {*
+ Local results are established by monotonic reasoning from facts
+ within a context. This allows common combinations of theorems,
+ e.g.\ via @{text "\<And>/\<Longrightarrow>"} elimination, resolution rules, or equational
+ reasoning, see \secref{sec:thms}. Unaccounted context manipulations
+ should be avoided, notably raw @{text "\<And>/\<Longrightarrow>"} introduction or ad-hoc
+ references to free variables or assumptions not present in the proof
+ context.
+
+ \medskip The @{text "SUBPROOF"} combinator allows one to structure a
+ tactical proof recursively by decomposing a selected sub-goal:
+ @{text "(\<And>x. A(x) \<Longrightarrow> B(x)) \<Longrightarrow> \<dots>"} is turned into @{text "B(x) \<Longrightarrow> \<dots>"}
+ after fixing @{text "x"} and assuming @{text "A(x)"}. This means
+ the tactic needs to solve the conclusion, but may use the premise as
+ a local fact, for locally fixed variables.
+
+ The family of @{text "FOCUS"} combinators is similar to @{text
+ "SUBPROOF"}, but allows to retain schematic variables and pending
+ subgoals in the resulting goal state.
+
+ The @{text "prove"} operation provides an interface for structured
+ backwards reasoning under program control, with some explicit sanity
+ checks of the result. The goal context can be augmented by
+ additional fixed variables (cf.\ \secref{sec:variables}) and
+ assumptions (cf.\ \secref{sec:assumptions}), which will be available
+ as local facts during the proof and discharged into implications in
+ the result. Type and term variables are generalized as usual,
+ according to the context.
+
+ The @{text "obtain"} operation produces results by eliminating
+ existing facts by means of a given tactic. This acts like a dual
+ conclusion: the proof demonstrates that the context may be augmented
+ by parameters and assumptions, without affecting any conclusions
+ that do not mention these parameters. See also
+ \cite{isabelle-isar-ref} for the user-level @{command obtain} and
+ @{command guess} elements. Final results, which may not refer to
+ the parameters in the conclusion, need to be exported explicitly into
+ the original context. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML SELECT_GOAL: "tactic -> int -> tactic"} \\
+ @{index_ML SUBPROOF: "(Subgoal.focus -> tactic) ->
+ Proof.context -> int -> tactic"} \\
+ @{index_ML Subgoal.FOCUS: "(Subgoal.focus -> tactic) ->
+ Proof.context -> int -> tactic"} \\
+ @{index_ML Subgoal.FOCUS_PREMS: "(Subgoal.focus -> tactic) ->
+ Proof.context -> int -> tactic"} \\
+ @{index_ML Subgoal.FOCUS_PARAMS: "(Subgoal.focus -> tactic) ->
+ Proof.context -> int -> tactic"} \\
+ @{index_ML Subgoal.focus: "Proof.context -> int -> thm -> Subgoal.focus * thm"} \\
+ @{index_ML Subgoal.focus_prems: "Proof.context -> int -> thm -> Subgoal.focus * thm"} \\
+ @{index_ML Subgoal.focus_params: "Proof.context -> int -> thm -> Subgoal.focus * thm"} \\
+ \end{mldecls}
+
+ \begin{mldecls}
+ @{index_ML Goal.prove: "Proof.context -> string list -> term list -> term ->
+ ({prems: thm list, context: Proof.context} -> tactic) -> thm"} \\
+ @{index_ML Goal.prove_multi: "Proof.context -> string list -> term list -> term list ->
+ ({prems: thm list, context: Proof.context} -> tactic) -> thm list"} \\
+ \end{mldecls}
+ \begin{mldecls}
+ @{index_ML Obtain.result: "(Proof.context -> tactic) -> thm list ->
+ Proof.context -> ((string * cterm) list * thm list) * Proof.context"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML SELECT_GOAL}~@{text "tac i"} confines a tactic to the
+ specified subgoal @{text "i"}. This introduces a nested goal state,
+ without decomposing the internal structure of the subgoal yet.
+
+ \item @{ML SUBPROOF}~@{text "tac ctxt i"} decomposes the structure
+ of the specified sub-goal, producing an extended context and a
+ reduced goal, which needs to be solved by the given tactic. All
+ schematic parameters of the goal are imported into the context as
+ fixed ones, which may not be instantiated in the sub-proof.
+
+ \item @{ML Subgoal.FOCUS}, @{ML Subgoal.FOCUS_PREMS}, and @{ML
+ Subgoal.FOCUS_PARAMS} are similar to @{ML SUBPROOF}, but are
+ slightly more flexible: only the specified parts of the subgoal are
+ imported into the context, and the body tactic may introduce new
+ subgoals and schematic variables.
+
+ \item @{ML Subgoal.focus}, @{ML Subgoal.focus_prems}, @{ML
+ Subgoal.focus_params} extract the focus information from a goal
+ state in the same way as the corresponding tacticals above. This is
+ occasionally useful to experiment without writing actual tactics
+ yet.
+
+ \item @{ML Goal.prove}~@{text "ctxt xs As C tac"} states goal @{text
+ "C"} in the context augmented by fixed variables @{text "xs"} and
+ assumptions @{text "As"}, and applies tactic @{text "tac"} to solve
+ it. The latter may depend on the local assumptions being presented
+ as facts. The result is in HHF normal form.
+
+ \item @{ML Goal.prove_multi} is similar to @{ML Goal.prove}, but
+ states several conclusions simultaneously. The goal is encoded by
+ means of Pure conjunction; @{ML Goal.conjunction_tac} will turn this
+ into a collection of individual subgoals.
+
+ \item @{ML Obtain.result}~@{text "tac thms ctxt"} eliminates the
+ given facts using a tactic, which results in additional fixed
+ variables and assumptions in the context. Final results need to be
+ exported explicitly.
+
+ \end{description}
+*}
+
+text %mlex {* The following minimal example illustrates how to access
+ the focus information of a structured goal state. *}
+
+notepad
+begin
+ fix A B C :: "'a \<Rightarrow> bool"
+
+ have "\<And>x. A x \<Longrightarrow> B x \<Longrightarrow> C x"
+ ML_val
+ {*
+ val {goal, context = goal_ctxt, ...} = @{Isar.goal};
+ val (focus as {params, asms, concl, ...}, goal') =
+ Subgoal.focus goal_ctxt 1 goal;
+ val [A, B] = #prems focus;
+ val [(_, x)] = #params focus;
+ *}
+ oops
+
+text {* \medskip The next example demonstrates forward-elimination in
+ a local context, using @{ML Obtain.result}. *}
+
+notepad
+begin
+ assume ex: "\<exists>x. B x"
+
+ ML_prf %"ML" {*
+ val ctxt0 = @{context};
+ val (([(_, x)], [B]), ctxt1) = ctxt0
+ |> Obtain.result (fn _ => etac @{thm exE} 1) [@{thm ex}];
+ *}
+ ML_prf %"ML" {*
+ singleton (Proof_Context.export ctxt1 ctxt0) @{thm refl};
+ *}
+ ML_prf %"ML" {*
+ Proof_Context.export ctxt1 ctxt0 [Thm.reflexive x]
+ handle ERROR msg => (warning msg; []);
+ *}
+end
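+
+text %mlex {* The following additional sketch (not part of the
+ official examples) illustrates @{ML Goal.prove}: a statement is
+ proved under a local assumption, which is discharged into the final
+ result.  The concrete proposition is arbitrary test data. *}
+
+ML {*
+  val ctxt0 = @{context};
+
+  (*goal "B x" with fixed x and local assumption "B x";
+    the tactic closes the goal by resolving with that premise*)
+  val r =
+    Goal.prove ctxt0 ["x"] [@{prop "B x"}] @{prop "B x"}
+      (fn {prems, ...} => resolve_tac prems 1);
+*}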
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Syntax.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,163 @@
+theory Syntax
+imports Base
+begin
+
+chapter {* Concrete syntax and type-checking *}
+
+text {* Pure @{text "\<lambda>"}-calculus as introduced in \chref{ch:logic} is
+ an adequate foundation for logical languages --- in the tradition of
+ \emph{higher-order abstract syntax} --- but end-users require
+ additional means for reading and printing of terms and types. This
+ important add-on outside the logical core is called \emph{inner
+ syntax} in Isabelle jargon, as opposed to the \emph{outer syntax} of
+ the theory and proof language (cf.\ \cite{isabelle-isar-ref}).
+
+ For example, according to \cite{church40} quantifiers are
+ represented as higher-order constants @{text "All :: ('a \<Rightarrow> bool) \<Rightarrow>
+ bool"} such that @{text "All (\<lambda>x::'a. B x)"} faithfully represents
+ the idea that is displayed as @{text "\<forall>x::'a. B x"} via @{keyword
+ "binder"} notation. Moreover, type-inference in the style of
+ Hindley-Milner \cite{hindleymilner} (and extensions) enables users
+ to write @{text "\<forall>x. B x"} concisely, when the type @{text "'a"} is
+ already clear from the context.\footnote{Type-inference taken to the
+ extreme can easily confuse users, though. Beginners often stumble
+ over unexpectedly general types inferred by the system.}
+
+ \medskip The main inner syntax operations are \emph{read} for
+ parsing together with type-checking, and \emph{pretty} for formatted
+ output. See also \secref{sec:read-print}.
+
+ Furthermore, the input and output syntax layers are sub-divided into
+ separate phases for \emph{concrete syntax} versus \emph{abstract
+ syntax}, see also \secref{sec:parse-unparse} and
+ \secref{sec:term-check}, respectively. This results in the
+ following decomposition of the main operations:
+
+ \begin{itemize}
+
+ \item @{text "read = parse; check"}
+
+ \item @{text "pretty = uncheck; unparse"}
+
+ \end{itemize}
+
+ Some specification package might thus intercept syntax processing at
+ a well-defined stage after @{text "parse"}, to augment the
+ resulting pre-term before full type-reconstruction is performed by
+ @{text "check"}, for example. Note that the formal status of bound
+ variables, versus free variables, versus constants must not be
+ changed here! *}
+
+
+section {* Reading and pretty printing \label{sec:read-print} *}
+
+text {* Read and print operations are roughly dual to each other, such
+ that for the user @{text "s' = pretty (read s)"} looks similar to
+ the original source text @{text "s"}, but the details depend on many
+ side-conditions. There are also explicit options to control
+ suppressing of type information in the output. The default
+ configuration routinely loses information, so @{text "t' = read
+ (pretty t)"} might fail, produce a differently typed term, or even a
+ completely different term in the face of syntactic overloading! *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\
+ @{index_ML Syntax.read_term: "Proof.context -> string -> term"} \\
+ @{index_ML Syntax.read_prop: "Proof.context -> string -> term"} \\
+ @{index_ML Syntax.pretty_typ: "Proof.context -> typ -> Pretty.T"} \\
+ @{index_ML Syntax.pretty_term: "Proof.context -> term -> Pretty.T"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item FIXME
+
+ \end{description}
+*}
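+
+text %mlex {* The following minimal sketch reads a term from an
+ arbitrary input string and pretty prints it again, using the static
+ compile-time context for testing purposes only. *}
+
+ML {*
+  val ctxt = @{context};
+
+  (*read: parsing together with type-checking*)
+  val t = Syntax.read_term ctxt "\<lambda>x. x";
+
+  (*pretty: formatted output, here turned into a plain string*)
+  val s = Pretty.string_of (Syntax.pretty_term ctxt t);
+*}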
+
+
+section {* Parsing and unparsing \label{sec:parse-unparse} *}
+
+text {* Parsing and unparsing converts between actual source text and
+ a certain \emph{pre-term} format, where all bindings and scopes are
+ resolved faithfully. Thus the names of free variables or constants
+ are already determined in the sense of the logical context, but type
+ information might still be missing. Pre-terms support an explicit
+ language of \emph{type constraints} that may be augmented by user
+ code to guide the later \emph{check} phase, for example.
+
+ Actual parsing is based on traditional lexical analysis and Earley
+ parsing for arbitrary context-free grammars. The user can specify
+ this via mixfix annotations. Moreover, there are \emph{syntax
+ translations} that can be augmented by the user, either
+ declaratively via @{command translations} or programmatically via
+ @{command parse_translation}, @{command print_translation} etc. The
+ final scope resolution is performed by the system, according to name
+ spaces for types, constants etc.\ determined by the context.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\
+ @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\
+ @{index_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\
+ @{index_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\
+ @{index_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item FIXME
+
+ \end{description}
+*}
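+
+text %mlex {* A minimal sketch of the parse phase in isolation: the
+ input string is arbitrary, and the resulting pre-term is passed on
+ to the check phase (\secref{sec:term-check}) explicitly. *}
+
+ML {*
+  val ctxt = @{context};
+
+  (*parse produces a pre-term without full type information*)
+  val pre = Syntax.parse_term ctxt "f x";
+
+  (*check performs type-reconstruction wrt. the context*)
+  val t = singleton (Syntax.check_terms ctxt) pre;
+*}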
+
+
+section {* Checking and unchecking \label{sec:term-check} *}
+
+text {* These operations define the transition from pre-terms to
+ fully-annotated terms in the sense of the logical core
+ (\chref{ch:logic}).
+
+ The \emph{check} phase is meant to subsume a variety of mechanisms
+ in the manner of ``type-inference'' or ``type-reconstruction'' or
+ ``type-improvement'', not just type-checking in the narrow sense.
+ The \emph{uncheck} phase is roughly dual: it prunes type-information
+ before pretty printing.
+
+ A typical add-on for the check/uncheck syntax layer is the @{command
+ abbreviation} mechanism. Here the user specifies syntactic
+ definitions that are managed by the system as polymorphic @{text
+ "let"} bindings. These are expanded during the @{text "check"}
+ phase, and contracted during the @{text "uncheck"} phase, without
+ affecting the type-assignment of the given terms.
+
+ \medskip The precise meaning of type checking depends on the context
+ --- additional check/uncheck plugins might be defined in user space!
+
+ For example, the @{command class} command defines a context where
+ @{text "check"} treats certain type instances of overloaded
+ constants according to the ``dictionary construction'' of its
+ logical foundation. This involves ``type improvement''
+ (specialization of slightly too general types) and replacement by
+ certain locale parameters. See also \cite{Haftmann-Wenzel:2009}.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\
+ @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\
+ @{index_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\
+ @{index_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\
+ @{index_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item FIXME
+
+ \end{description}
+*}
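+
+text %mlex {* The following sketch (with arbitrary input strings)
+ illustrates that checking several terms simultaneously propagates
+ type information between them: the constraint on the first
+ occurrence of @{text "x"} determines the type of the second one. *}
+
+ML {*
+  val ctxt = @{context};
+
+  val ts = map (Syntax.parse_term ctxt) ["x :: nat", "x"];
+  val ts' = Syntax.check_terms ctxt ts;
+*}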
+
+end
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/IsarImplementation/Tactic.thy Mon Aug 27 17:11:55 2012 +0200
@@ -0,0 +1,861 @@
+theory Tactic
+imports Base
+begin
+
+chapter {* Tactical reasoning *}
+
+text {* Tactical reasoning works by refining an initial claim in a
+ backwards fashion, until a solved form is reached. A @{text "goal"}
+ consists of several subgoals that need to be solved in order to
+ achieve the main statement; zero subgoals means that the proof may
+ be finished. A @{text "tactic"} is a refinement operation that maps
+ a goal to a lazy sequence of potential successors. A @{text
+ "tactical"} is a combinator for composing tactics. *}
+
+
+section {* Goals \label{sec:tactical-goals} *}
+
+text {*
+ Isabelle/Pure represents a goal as a theorem stating that the
+ subgoals imply the main goal: @{text "A\<^sub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^sub>n \<Longrightarrow>
+ C"}. The outermost goal structure is that of a Horn Clause: i.e.\
+ an iterated implication without any quantifiers\footnote{Recall that
+ outermost @{text "\<And>x. \<phi>[x]"} is always represented via schematic
+ variables in the body: @{text "\<phi>[?x]"}. These variables may get
+ instantiated during the course of reasoning.}. For @{text "n = 0"}
+ a goal is called ``solved''.
+
+ The structure of each subgoal @{text "A\<^sub>i"} is that of a
+ general Hereditary Harrop Formula @{text "\<And>x\<^sub>1 \<dots>
+ \<And>x\<^sub>k. H\<^sub>1 \<Longrightarrow> \<dots> \<Longrightarrow> H\<^sub>m \<Longrightarrow> B"}. Here @{text
+ "x\<^sub>1, \<dots>, x\<^sub>k"} are goal parameters, i.e.\
+ arbitrary-but-fixed entities of certain types, and @{text
+ "H\<^sub>1, \<dots>, H\<^sub>m"} are goal hypotheses, i.e.\ facts that may
+ be assumed locally. Together, this forms the goal context of the
+ conclusion @{text B} to be established. The goal hypotheses may
+ again be arbitrary Hereditary Harrop Formulas, although the level of
+ nesting rarely exceeds 1--2 in practice.
+
+ The main conclusion @{text C} is internally marked as a protected
+ proposition, which is represented explicitly by the notation @{text
+ "#C"} here. This ensures that the decomposition into subgoals and
+ main conclusion is well-defined for arbitrarily structured claims.
+
+ \medskip Basic goal management is performed via the following
+ Isabelle/Pure rules:
+
+ \[
+ \infer[@{text "(init)"}]{@{text "C \<Longrightarrow> #C"}}{} \qquad
+ \infer[@{text "(finish)"}]{@{text "C"}}{@{text "#C"}}
+ \]
+
+ \medskip The following low-level variants admit general reasoning
+ with protected propositions:
+
+ \[
+ \infer[@{text "(protect)"}]{@{text "#C"}}{@{text "C"}} \qquad
+ \infer[@{text "(conclude)"}]{@{text "A\<^sub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^sub>n \<Longrightarrow> C"}}{@{text "A\<^sub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^sub>n \<Longrightarrow> #C"}}
+ \]
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML Goal.init: "cterm -> thm"} \\
+ @{index_ML Goal.finish: "Proof.context -> thm -> thm"} \\
+ @{index_ML Goal.protect: "thm -> thm"} \\
+ @{index_ML Goal.conclude: "thm -> thm"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML "Goal.init"}~@{text C} initializes a tactical goal from
+ the well-formed proposition @{text C}.
+
+ \item @{ML "Goal.finish"}~@{text "ctxt thm"} checks whether theorem
+ @{text "thm"} is a solved goal (no subgoals), and concludes the
+ result by removing the goal protection. The context is only
+ required for printing error messages.
+
+ \item @{ML "Goal.protect"}~@{text "thm"} protects the full statement
+ of theorem @{text "thm"}.
+
+ \item @{ML "Goal.conclude"}~@{text "thm"} removes the goal
+ protection, even if there are pending subgoals.
+
+ \end{description}
+*}
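+
+text %mlex {* The following minimal sketch (using an arbitrary
+ proposition) runs through the life-cycle of a tactical goal: it is
+ initialized, refined by a single resolution step, and finished
+ again. *}
+
+ML {*
+  val ctxt = @{context};
+
+  (*initial goal state "x \<equiv> x \<Longrightarrow> #(x \<equiv> x)"*)
+  val st = Goal.init @{cprop "x \<equiv> x"};
+
+  (*refine the only subgoal by reflexivity*)
+  val st' = Seq.hd (resolve_tac [Drule.reflexive_thm] 1 st);
+
+  (*remove the goal protection*)
+  val res = Goal.finish ctxt st';
+*}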
+
+
+section {* Tactics\label{sec:tactics} *}
+
+text {* A @{text "tactic"} is a function @{text "goal \<rightarrow> goal\<^sup>*\<^sup>*"} that
+ maps a given goal state (represented as a theorem, cf.\
+ \secref{sec:tactical-goals}) to a lazy sequence of potential
+ successor states. The underlying sequence implementation is lazy
+ both in head and tail, and is purely functional in \emph{not}
+ supporting memoing.\footnote{The lack of memoing and the strict
+ nature of SML requires some care when working with low-level
+ sequence operations, to avoid duplicate or premature evaluation of
+ results. It also means that modified runtime behavior, such as
+ timeout, is very hard to achieve for general tactics.}
+
+ An \emph{empty result sequence} means that the tactic has failed: in
+ a compound tactic expression other tactics might be tried instead,
+ or the whole refinement step might fail outright, producing a
+ toplevel error message in the end. When implementing tactics from
+ scratch, one should take care to observe the basic protocol of
+ mapping regular error conditions to an empty result; only serious
+ faults should emerge as exceptions.
+
+ By enumerating \emph{multiple results}, a tactic can easily express
+ the potential outcome of an internal search process. There are also
+ combinators for building proof tools that involve search
+ systematically, see also \secref{sec:tacticals}.
+
+ \medskip As explained before, a goal state essentially consists of a
+ list of subgoals that imply the main goal (conclusion). Tactics may
+ operate on all subgoals or on a specified subgoal, but
+ must not change the main conclusion (apart from instantiating
+ schematic goal variables).
+
+ Tactics with explicit \emph{subgoal addressing} are of the form
+ @{text "int \<rightarrow> tactic"} and may be applied to a particular subgoal
+ (counting from 1). If the subgoal number is out of range, the
+ tactic should fail with an empty result sequence, but must not raise
+ an exception!
+
+ Operating on a particular subgoal means to replace it by an interval
+ of zero or more subgoals in the same place; other subgoals must not
+ be affected, apart from instantiating schematic variables ranging
+ over the whole goal state.
+
+ A common pattern of composing tactics with subgoal addressing is to
+ try the first one, and then the second one only if the subgoal has
+ not been solved yet. Special care is required here to avoid bumping
+ into unrelated subgoals that happen to come after the original
+ subgoal. Assuming that there is only a single initial subgoal is a
+ very common error when implementing tactics!
+
+ Tactics with internal subgoal addressing should expose the subgoal
+ index as @{text "int"} argument in full generality; a hardwired
+ subgoal 1 is not acceptable.
+
+ \medskip The main well-formedness conditions for proper tactics are
+ summarized as follows.
+
+ \begin{itemize}
+
+ \item General tactic failure is indicated by an empty result, only
+ serious faults may produce an exception.
+
+ \item The main conclusion must not be changed, apart from
+ instantiating schematic variables.
+
+ \item A tactic operates either uniformly on all subgoals, or
+ specifically on a selected subgoal (without bumping into unrelated
+ subgoals).
+
+ \item Range errors in subgoal addressing produce an empty result.
+
+ \end{itemize}
+
+ Some of these conditions are checked by higher-level goal
+ infrastructure (\secref{sec:struct-goals}); others are not checked
+ explicitly, and violating them merely results in ill-behaved tactics
+ experienced by the user (e.g.\ tactics that insist on being
+ applicable only to singleton goals, or prevent composition via
+ standard tacticals such as @{ML REPEAT}).
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_type tactic: "thm -> thm Seq.seq"} \\
+ @{index_ML no_tac: tactic} \\
+ @{index_ML all_tac: tactic} \\
+ @{index_ML print_tac: "string -> tactic"} \\[1ex]
+ @{index_ML PRIMITIVE: "(thm -> thm) -> tactic"} \\[1ex]
+ @{index_ML SUBGOAL: "(term * int -> tactic) -> int -> tactic"} \\
+ @{index_ML CSUBGOAL: "(cterm * int -> tactic) -> int -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item Type @{ML_type tactic} represents tactics. The
+ well-formedness conditions described above need to be observed. See
+ also @{file "~~/src/Pure/General/seq.ML"} for the underlying
+ implementation of lazy sequences.
+
+ \item Type @{ML_type "int -> tactic"} represents tactics with
+ explicit subgoal addressing, with well-formedness conditions as
+ described above.
+
+ \item @{ML no_tac} is a tactic that always fails, returning the
+ empty sequence.
+
+ \item @{ML all_tac} is a tactic that always succeeds, returning a
+ singleton sequence with unchanged goal state.
+
+ \item @{ML print_tac}~@{text "message"} is like @{ML all_tac}, but
+ prints a message together with the goal state on the tracing
+ channel.
+
+ \item @{ML PRIMITIVE}~@{text rule} turns a primitive inference rule
+ into a tactic with unique result. Exception @{ML THM} is considered
+ a regular tactic failure and produces an empty result; other
+ exceptions are passed through.
+
+ \item @{ML SUBGOAL}~@{text "(fn (subgoal, i) => tactic)"} is the
+ most basic form to produce a tactic with subgoal addressing. The
+ given abstraction over the subgoal term and subgoal number allows one to
+ peek at the relevant information of the full goal state. The
+ subgoal range is checked as required above.
+
+ \item @{ML CSUBGOAL} is similar to @{ML SUBGOAL}, but passes the
+ subgoal as @{ML_type cterm} instead of raw @{ML_type term}. This
+ avoids expensive re-certification in situations where the subgoal is
+ used directly for primitive inferences.
+
+ \end{description}
+*}
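+
+text %mlex {* The following sketch defines a hypothetical diagnostic
+ tactic via @{ML SUBGOAL}: it prints the selected subgoal on the
+ tracing channel and then leaves the goal state unchanged.  The name
+ @{text "peek_tac"} is arbitrary. *}
+
+ML {*
+  fun peek_tac ctxt =
+    SUBGOAL (fn (goal, i) =>
+      (tracing (string_of_int i ^ ": " ^ Syntax.string_of_term ctxt goal);
+        all_tac));
+*}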
+
+
+subsection {* Resolution and assumption tactics \label{sec:resolve-assume-tac} *}
+
+text {* \emph{Resolution} is the most basic mechanism for refining a
+ subgoal using a theorem as object-level rule.
+ \emph{Elim-resolution} is particularly suited for elimination rules:
+ it resolves with a rule, proves its first premise by assumption, and
+ finally deletes that assumption from any new subgoals.
+ \emph{Destruct-resolution} is like elim-resolution, but the given
+ destruction rules are first turned into canonical elimination
+ format. \emph{Forward-resolution} is like destruct-resolution, but
+ without deleting the selected assumption. The @{text "r/e/d/f"}
+ naming convention is maintained for several different kinds of
+ resolution rules and tactics.
+
+ Assumption tactics close a subgoal by unifying some of its premises
+ against its conclusion.
+
+ \medskip All the tactics in this section operate on a subgoal
+ designated by a positive integer. Other subgoals might be affected
+ indirectly, due to instantiation of schematic variables.
+
+ There are various sources of non-determinism, the tactic result
+ sequence enumerates all possibilities of the following choices (if
+ applicable):
+
+ \begin{enumerate}
+
+ \item selecting one of the rules given as argument to the tactic;
+
+ \item selecting a subgoal premise to eliminate, unifying it against
+ the first premise of the rule;
+
+ \item unifying the conclusion of the subgoal to the conclusion of
+ the rule.
+
+ \end{enumerate}
+
+ Recall that higher-order unification may produce multiple results
+ that are enumerated here.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML resolve_tac: "thm list -> int -> tactic"} \\
+ @{index_ML eresolve_tac: "thm list -> int -> tactic"} \\
+ @{index_ML dresolve_tac: "thm list -> int -> tactic"} \\
+ @{index_ML forward_tac: "thm list -> int -> tactic"} \\[1ex]
+ @{index_ML assume_tac: "int -> tactic"} \\
+ @{index_ML eq_assume_tac: "int -> tactic"} \\[1ex]
+ @{index_ML match_tac: "thm list -> int -> tactic"} \\
+ @{index_ML ematch_tac: "thm list -> int -> tactic"} \\
+ @{index_ML dmatch_tac: "thm list -> int -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML resolve_tac}~@{text "thms i"} refines the goal state
+ using the given theorems, which should normally be introduction
+ rules. The tactic resolves a rule's conclusion with subgoal @{text
+ i}, replacing it by the corresponding versions of the rule's
+ premises.
+
+ \item @{ML eresolve_tac}~@{text "thms i"} performs elim-resolution
+ with the given theorems, which should normally be elimination rules.
+
+ Note that @{ML "eresolve_tac [asm_rl]"} is equivalent to @{ML
+ assume_tac}, which facilitates mixing of assumption steps with
+ genuine eliminations.
+
+ \item @{ML dresolve_tac}~@{text "thms i"} performs
+ destruct-resolution with the given theorems, which should normally
+ be destruction rules. This replaces an assumption by the result of
+ applying one of the rules.
+
+ \item @{ML forward_tac} is like @{ML dresolve_tac} except that the
+ selected assumption is not deleted. It applies a rule to an
+ assumption, adding the result as a new assumption.
+
+ \item @{ML assume_tac}~@{text i} attempts to solve subgoal @{text i}
+ by assumption (modulo higher-order unification).
+
+ \item @{ML eq_assume_tac} is similar to @{ML assume_tac}, but checks
+ only for immediate @{text "\<alpha>"}-convertibility instead of using
+ unification. It succeeds (with a unique next state) if one of the
+ assumptions is equal to the subgoal's conclusion. Since it does not
+ instantiate variables, it cannot make other subgoals unprovable.
+
+ \item @{ML match_tac}, @{ML ematch_tac}, and @{ML dmatch_tac} are
+ similar to @{ML resolve_tac}, @{ML eresolve_tac}, and @{ML
+ dresolve_tac}, respectively, but do not instantiate schematic
+ variables in the goal state.
+
+ Flexible subgoals are not updated at will, but are left alone.
+ Strictly speaking, matching means to treat the unknowns in the goal
+ state as constants; these tactics merely discard unifiers that would
+ update the goal state.
+
+ \end{description}
+*}
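+
+text %mlex {* The following sketch spells out a small propositional
+ proof as an explicit sequence of resolution and assumption steps,
+ wrapped up via @{ML Goal.prove} (\secref{sec:struct-goals}).  The
+ proposition is arbitrary test data. *}
+
+ML {*
+  val r =
+    Goal.prove @{context} [] [] @{prop "A \<and> B \<longrightarrow> B \<and> A"}
+      (fn _ =>
+        resolve_tac @{thms impI} 1 THEN
+        eresolve_tac @{thms conjE} 1 THEN
+        resolve_tac @{thms conjI} 1 THEN
+        assume_tac 2 THEN
+        assume_tac 1);
+*}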
+
+
+subsection {* Explicit instantiation within a subgoal context *}
+
+text {* The main resolution tactics (\secref{sec:resolve-assume-tac})
+ use higher-order unification, which works well in many practical
+ situations despite its daunting theoretical properties.
+ Nonetheless, there are important problem classes where unguided
+ higher-order unification is not so useful. This typically involves
+ rules like universal elimination, existential introduction, or
+ equational substitution. Here the unification problem involves
+ fully flexible @{text "?P ?x"} schemes, which are hard to manage
+ without further hints.
+
+ By providing a (small) rigid term for @{text "?x"} explicitly, the
+ remaining unification problem is to assign a (large) term to @{text
+ "?P"}, according to the shape of the given subgoal. This is
+ sufficiently well-behaved in most practical situations.
+
+ \medskip Isabelle provides separate versions of the standard @{text
+ "r/e/d/f"} resolution tactics that allow to provide explicit
+ instantiations of unknowns of the given rule, wrt.\ terms that refer
+ to the implicit context of the selected subgoal.
+
+ An instantiation consists of a list of pairs of the form @{text
+ "(?x, t)"}, where @{text ?x} is a schematic variable occurring in
+ the given rule, and @{text t} is a term from the current proof
+ context, augmented by the local goal parameters of the selected
+ subgoal; cf.\ the @{text "focus"} operation described in
+ \secref{sec:variables}.
+
+ Entering the syntactic context of a subgoal is a brittle operation,
+ because its exact form is somewhat accidental, and the choice of
+ bound variable names depends on the presence of other local and
+ global names. Explicit renaming of subgoal parameters prior to
+ explicit instantiation might help to achieve a bit more robustness.
+
+ Type instantiations may be given as well, via pairs like @{text
+ "(?'a, \<tau>)"}. Type instantiations are distinguished from term
+ instantiations by the syntactic form of the schematic variable.
+ Types are instantiated before terms are. Since term instantiation
+ already performs simple type-inference, explicit type
+ instantiations are seldom necessary.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML res_inst_tac: "Proof.context -> (indexname * string) list -> thm -> int -> tactic"} \\
+ @{index_ML eres_inst_tac: "Proof.context -> (indexname * string) list -> thm -> int -> tactic"} \\
+ @{index_ML dres_inst_tac: "Proof.context -> (indexname * string) list -> thm -> int -> tactic"} \\
+ @{index_ML forw_inst_tac: "Proof.context -> (indexname * string) list -> thm -> int -> tactic"} \\
+ @{index_ML subgoal_tac: "Proof.context -> string -> int -> tactic"} \\
+ @{index_ML thin_tac: "Proof.context -> string -> int -> tactic"} \\
+ @{index_ML rename_tac: "string list -> int -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML res_inst_tac}~@{text "ctxt insts thm i"} instantiates the
+ rule @{text thm} with the instantiations @{text insts}, as described
+ above, and then performs resolution on subgoal @{text i}.
+
+ \item @{ML eres_inst_tac} is like @{ML res_inst_tac}, but performs
+ elim-resolution.
+
+ \item @{ML dres_inst_tac} is like @{ML res_inst_tac}, but performs
+ destruct-resolution.
+
+ \item @{ML forw_inst_tac} is like @{ML dres_inst_tac} except that
+ the selected assumption is not deleted.
+
+ \item @{ML subgoal_tac}~@{text "ctxt \<phi> i"} adds the proposition
+ @{text "\<phi>"} as local premise to subgoal @{text "i"}, and poses the
+ same as a new subgoal @{text "i + 1"} (in the original context).
+
+ \item @{ML thin_tac}~@{text "ctxt \<phi> i"} deletes the specified
+ premise from subgoal @{text i}. Note that @{text \<phi>} may contain
+ schematic variables, to abbreviate the intended proposition; the
+ first matching subgoal premise will be deleted. Removing useless
+ premises from a subgoal increases its readability and can make
+ search tactics run faster.
+
+ \item @{ML rename_tac}~@{text "names i"} renames the innermost
+ parameters of subgoal @{text i} according to the provided @{text
+ names} (which need to be distinct identifiers).
+
+ \end{description}
+
+ For historical reasons, the above instantiation tactics take
+ unparsed string arguments, which makes them hard to use in general
+ ML code. The slightly more advanced @{ML Subgoal.FOCUS} combinator
+ of \secref{sec:struct-goals} allows to refer to internal goal
+ structure with explicit context management.
+*}
+
+
+subsection {* Rearranging goal states *}
+
+text {* In rare situations there is a need to rearrange goal states:
+ either the overall collection of subgoals, or the local structure of
+ a subgoal. Various administrative tactics allow one to operate on the
+ concrete presentation of these conceptual sets of formulae. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML rotate_tac: "int -> int -> tactic"} \\
+ @{index_ML distinct_subgoals_tac: tactic} \\
+ @{index_ML flexflex_tac: tactic} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML rotate_tac}~@{text "n i"} rotates the premises of subgoal
+ @{text i} by @{text n} positions: from right to left if @{text n} is
+ positive, and from left to right if @{text n} is negative.
+
+ \item @{ML distinct_subgoals_tac} removes duplicate subgoals from a
+ proof state. This is potentially inefficient.
+
+ \item @{ML flexflex_tac} removes all flex-flex pairs from the proof
+ state by applying the trivial unifier. This drastic step loses
+ information. It is already part of the Isar infrastructure for
+ facts resulting from goals, and rarely needs to be invoked manually.
+
+ Flex-flex constraints arise from difficult cases of higher-order
+ unification. To prevent this, use @{ML res_inst_tac} to instantiate
+ some variables in a rule. Normally flex-flex constraints can be
+ ignored; they often disappear as unknowns get instantiated.
+
+ \end{description}
+*}
+
+section {* Tacticals \label{sec:tacticals} *}
+
+text {* A \emph{tactical} is a functional combinator for building up
+ complex tactics from simpler ones. Common tacticals perform
+ sequential composition, disjunctive choice, iteration, or goal
+ addressing. Various search strategies may be expressed via
+ tacticals.
+*}
+
+
+subsection {* Combining tactics *}
+
+text {* Sequential composition and alternative choices are the most
+ basic ways to combine tactics, similarly to ``@{verbatim ","}'' and
+ ``@{verbatim "|"}'' in Isar method notation. This corresponds to
+ @{ML_op "THEN"} and @{ML_op "ORELSE"} in ML, but there are further
+ possibilities for fine-tuning alternation of tactics such as @{ML_op
+ "APPEND"}. Further details become visible in ML due to explicit
+ subgoal addressing.
+*}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML_op "THEN": "tactic * tactic -> tactic"} \\
+ @{index_ML_op "ORELSE": "tactic * tactic -> tactic"} \\
+ @{index_ML_op "APPEND": "tactic * tactic -> tactic"} \\
+ @{index_ML "EVERY": "tactic list -> tactic"} \\
+ @{index_ML "FIRST": "tactic list -> tactic"} \\[0.5ex]
+
+ @{index_ML_op "THEN'": "('a -> tactic) * ('a -> tactic) -> 'a -> tactic"} \\
+ @{index_ML_op "ORELSE'": "('a -> tactic) * ('a -> tactic) -> 'a -> tactic"} \\
+ @{index_ML_op "APPEND'": "('a -> tactic) * ('a -> tactic) -> 'a -> tactic"} \\
+ @{index_ML "EVERY'": "('a -> tactic) list -> 'a -> tactic"} \\
+ @{index_ML "FIRST'": "('a -> tactic) list -> 'a -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{text "tac\<^sub>1"}~@{ML_op THEN}~@{text "tac\<^sub>2"} is the sequential
+ composition of @{text "tac\<^sub>1"} and @{text "tac\<^sub>2"}. Applied to a goal
+ state, it returns all states reachable in two steps by applying
+ @{text "tac\<^sub>1"} followed by @{text "tac\<^sub>2"}. First, it applies @{text
+ "tac\<^sub>1"} to the goal state, getting a sequence of possible next
+ states; then, it applies @{text "tac\<^sub>2"} to each of these and
+ concatenates the results to produce again one flat sequence of
+ states.
+
+ \item @{text "tac\<^sub>1"}~@{ML_op ORELSE}~@{text "tac\<^sub>2"} makes a choice
+ between @{text "tac\<^sub>1"} and @{text "tac\<^sub>2"}. Applied to a state, it
+ tries @{text "tac\<^sub>1"} and returns the result if non-empty; if @{text
+ "tac\<^sub>1"} fails then it uses @{text "tac\<^sub>2"}. This is a deterministic
+ choice: if @{text "tac\<^sub>1"} succeeds then @{text "tac\<^sub>2"} is excluded
+ from the result.
+
+ \item @{text "tac\<^sub>1"}~@{ML_op APPEND}~@{text "tac\<^sub>2"} concatenates the
+ possible results of @{text "tac\<^sub>1"} and @{text "tac\<^sub>2"}. Unlike
+ @{ML_op "ORELSE"} there is \emph{no commitment} to either tactic, so
+ @{ML_op "APPEND"} helps to avoid incompleteness during search, at
+ the cost of potential inefficiencies.
+
+ \item @{ML EVERY}~@{text "[tac\<^sub>1, \<dots>, tac\<^sub>n]"} abbreviates @{text
+ "tac\<^sub>1"}~@{ML_op THEN}~@{text "\<dots>"}~@{ML_op THEN}~@{text "tac\<^sub>n"}.
+ Note that @{ML "EVERY []"} is the same as @{ML all_tac}: it always
+ succeeds.
+
+ \item @{ML FIRST}~@{text "[tac\<^sub>1, \<dots>, tac\<^sub>n]"} abbreviates @{text
+ "tac\<^sub>1"}~@{ML_op ORELSE}~@{text "\<dots>"}~@{ML_op "ORELSE"}~@{text
+ "tac\<^sub>n"}. Note that @{ML "FIRST []"} is the same as @{ML no_tac}: it
+ always fails.
+
+ \item @{ML_op "THEN'"} is the lifted version of @{ML_op "THEN"}, for
+ tactics with explicit subgoal addressing. So @{text
+ "(tac\<^sub>1"}~@{ML_op THEN'}~@{text "tac\<^sub>2) i"} is the same as @{text
+ "(tac\<^sub>1 i"}~@{ML_op THEN}~@{text "tac\<^sub>2 i)"}.
+
+ The other primed tacticals work analogously.
+
+ \end{description}
+*}
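+
+text %mlex {* The following sketch composes a single proof step from
+ basic tactics: introduction rules followed by assumption on the
+ addressed subgoal, with a fallback to assumption alone.  The
+ particular rules are arbitrary. *}
+
+ML {*
+  val step_tac =
+    (resolve_tac @{thms conjI impI} THEN' assume_tac)
+    ORELSE' assume_tac;
+*}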
+
+
+subsection {* Repetition tacticals *}
+
+text {* These tacticals provide further control over repetition of
+ tactics, beyond the stylized forms of ``@{verbatim "?"}'' and
+ ``@{verbatim "+"}'' in Isar method expressions. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML "TRY": "tactic -> tactic"} \\
+ @{index_ML "REPEAT": "tactic -> tactic"} \\
+ @{index_ML "REPEAT1": "tactic -> tactic"} \\
+ @{index_ML "REPEAT_DETERM": "tactic -> tactic"} \\
+ @{index_ML "REPEAT_DETERM_N": "int -> tactic -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML TRY}~@{text "tac"} applies @{text "tac"} to the goal
+ state and returns the resulting sequence, if non-empty; otherwise it
+ returns the original state. Thus, it applies @{text "tac"} at most
+ once.
+
+ Note that for tactics with subgoal addressing, the combinator can be
+ applied via functional composition: @{ML "TRY"}~@{ML_op o}~@{text
+ "tac"}. There is no need for @{verbatim TRY'}.
+
+ \item @{ML REPEAT}~@{text "tac"} applies @{text "tac"} to the goal
+ state and, recursively, to each element of the resulting sequence.
+ The resulting sequence consists of those states that make @{text
+ "tac"} fail. Thus, it applies @{text "tac"} as many times as
+ possible (including zero times), and allows backtracking over each
+ invocation of @{text "tac"}. @{ML REPEAT} is more general than @{ML
+ REPEAT_DETERM}, but requires more space.
+
+ \item @{ML REPEAT1}~@{text "tac"} is like @{ML REPEAT}~@{text "tac"}
+ but it always applies @{text "tac"} at least once, failing if this
+ is impossible.
+
+ \item @{ML REPEAT_DETERM}~@{text "tac"} applies @{text "tac"} to the
+ goal state and, recursively, to the head of the resulting sequence.
+ It returns the first state to make @{text "tac"} fail. It is
+ deterministic, discarding alternative outcomes.
+
+ \item @{ML REPEAT_DETERM_N}~@{text "n tac"} is like @{ML
+ REPEAT_DETERM}~@{text "tac"} but the number of repetitions is bound
+ by @{text "n"} (where @{ML "~1"} means @{text "\<infinity>"}).
+
+ \end{description}
+*}
+
+text %mlex {* The basic tactics and tacticals considered above follow
+ some algebraic laws:
+
+ \begin{itemize}
+
+ \item @{ML all_tac} is the identity element of the tactical @{ML_op
+ "THEN"}.
+
+ \item @{ML no_tac} is the identity element of @{ML_op "ORELSE"} and
+ @{ML_op "APPEND"}. Also, it is a zero element for @{ML_op "THEN"},
+ which means that @{text "tac"}~@{ML_op THEN}~@{ML no_tac} is
+ equivalent to @{ML no_tac}.
+
+ \item @{ML TRY} and @{ML REPEAT} can be expressed as (recursive)
+ functions over more basic combinators (ignoring some internal
+ implementation tricks):
+
+ \end{itemize}
+*}
+
+ML {*
+ fun TRY tac = tac ORELSE all_tac;
+ fun REPEAT tac st = ((tac THEN REPEAT tac) ORELSE all_tac) st;
+*}
+
+text {* If @{text "tac"} can return multiple outcomes then so can @{ML
+ REPEAT}~@{text "tac"}. @{ML REPEAT} uses @{ML_op "ORELSE"} and not
+ @{ML_op "APPEND"}, it applies @{text "tac"} as many times as
+ possible in each outcome.
+
+ \begin{warn}
+ Note the explicit abstraction over the goal state in the ML
+ definition of @{ML REPEAT}. Recursive tacticals must be coded in
+ this awkward fashion to avoid infinite recursion of eager functional
+ evaluation in Standard ML. The following attempt would make @{ML
+ REPEAT}~@{text "tac"} loop:
+ \end{warn}
+*}
+
+ML {*
+ (*BAD -- does not terminate!*)
+ fun REPEAT tac = (tac THEN REPEAT tac) ORELSE all_tac;
+*}
+
+
+subsection {* Applying tactics to subgoal ranges *}
+
+text {* Tactics with explicit subgoal addressing
+ @{ML_type "int -> tactic"} can be used together with tacticals that
+ act like ``subgoal quantifiers'': guided by success of the body
+ tactic a certain range of subgoals is covered. Thus the body tactic
+ is applied to \emph{all} subgoals, \emph{some} subgoal etc.
+
+ Suppose that the goal state has @{text "n \<ge> 0"} subgoals. Many of
+ these tacticals address subgoal ranges counting downwards from
+ @{text "n"} towards @{text "1"}. This has the fortunate effect that
+ newly emerging subgoals are concatenated in the result, without
+ interfering with each other. Nonetheless, there might be situations
+ where a different order is desired. *}
+
+text %mlref {*
+ \begin{mldecls}
+ @{index_ML ALLGOALS: "(int -> tactic) -> tactic"} \\
+ @{index_ML SOMEGOAL: "(int -> tactic) -> tactic"} \\
+ @{index_ML FIRSTGOAL: "(int -> tactic) -> tactic"} \\
+ @{index_ML HEADGOAL: "(int -> tactic) -> tactic"} \\
+ @{index_ML REPEAT_SOME: "(int -> tactic) -> tactic"} \\
+ @{index_ML REPEAT_FIRST: "(int -> tactic) -> tactic"} \\
+ @{index_ML RANGE: "(int -> tactic) list -> int -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML ALLGOALS}~@{text "tac"} is equivalent to @{text "tac
+ n"}~@{ML_op THEN}~@{text "\<dots>"}~@{ML_op THEN}~@{text "tac 1"}. It
+ applies @{text tac} to all subgoals, counting downwards.
+
+ \item @{ML SOMEGOAL}~@{text "tac"} is equivalent to @{text "tac
+ n"}~@{ML_op ORELSE}~@{text "\<dots>"}~@{ML_op ORELSE}~@{text "tac 1"}. It
+ applies @{text "tac"} to one subgoal, counting downwards.
+
+ \item @{ML FIRSTGOAL}~@{text "tac"} is equivalent to @{text "tac
+ 1"}~@{ML_op ORELSE}~@{text "\<dots>"}~@{ML_op ORELSE}~@{text "tac n"}. It
+ applies @{text "tac"} to one subgoal, counting upwards.
+
+ \item @{ML HEADGOAL}~@{text "tac"} is equivalent to @{text "tac 1"}.
+ It applies @{text "tac"} unconditionally to the first subgoal.
+
+ \item @{ML REPEAT_SOME}~@{text "tac"} applies @{text "tac"} once or
+ more to a subgoal, counting downwards.
+
+ \item @{ML REPEAT_FIRST}~@{text "tac"} applies @{text "tac"} once or
+ more to a subgoal, counting upwards.
+
+ \item @{ML RANGE}~@{text "[tac\<^sub>1, \<dots>, tac\<^sub>k] i"} is equivalent to
+ @{text "tac\<^sub>k (i + k - 1)"}~@{ML_op THEN}~@{text "\<dots>"}~@{ML_op
+ THEN}~@{text "tac\<^sub>1 i"}. It applies the given list of tactics to the
+ corresponding range of subgoals, counting downwards.
+
+ \end{description}
+*}
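+
+text %mlex {* The following sketch (with an arbitrary proposition)
+ splits a conjunction and closes all emerging subgoals uniformly via
+ @{ML ALLGOALS}. *}
+
+ML {*
+  val r =
+    Goal.prove @{context} [] [] @{prop "x = x \<and> y = y"}
+      (fn _ =>
+        resolve_tac @{thms conjI} 1 THEN
+        ALLGOALS (resolve_tac @{thms refl}));
+*}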
+
+
+subsection {* Control and search tacticals *}
+
+text {* A predicate on theorems @{ML_type "thm -> bool"} can test
+ whether a goal state enjoys some desirable property --- such as
+ having no subgoals. Tactics that search for satisfactory goal
+ states are easy to express. The main search procedures,
+ depth-first, breadth-first and best-first, are provided as
+ tacticals. They generate the search tree by repeatedly applying a
+ given tactic. *}
+
+
+text %mlref ""
+
+subsubsection {* Filtering a tactic's results *}
+
+text {*
+ \begin{mldecls}
+ @{index_ML FILTER: "(thm -> bool) -> tactic -> tactic"} \\
+ @{index_ML CHANGED: "tactic -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML FILTER}~@{text "sat tac"} applies @{text "tac"} to the
+ goal state and returns a sequence consisting of those result goal
+ states that are satisfactory in the sense of @{text "sat"}.
+
+ \item @{ML CHANGED}~@{text "tac"} applies @{text "tac"} to the goal
+ state and returns precisely those states that differ from the
+ original state (according to @{ML Thm.eq_thm}). Thus @{ML
+ CHANGED}~@{text "tac"} always has some effect on the state.
+
+ \end{description}
+*}
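+
+text %mlex {* For example, the following hypothetical tactic tries
+ assumption on every subgoal, but succeeds only if at least one
+ subgoal is actually solved: *}
+
+ML {*
+  val changed_tac = CHANGED (ALLGOALS (TRY o assume_tac));
+*}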
+
+
+subsubsection {* Depth-first search *}
+
+text {*
+ \begin{mldecls}
+ @{index_ML DEPTH_FIRST: "(thm -> bool) -> tactic -> tactic"} \\
+ @{index_ML DEPTH_SOLVE: "tactic -> tactic"} \\
+ @{index_ML DEPTH_SOLVE_1: "tactic -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML DEPTH_FIRST}~@{text "sat tac"} returns the goal state if
+ @{text "sat"} returns true. Otherwise it applies @{text "tac"},
+ then recursively searches from each element of the resulting
+ sequence. The code uses a stack for efficiency, in effect applying
+ @{text "tac"}~@{ML_op THEN}~@{ML DEPTH_FIRST}~@{text "sat tac"} to
+ the state.
+
+ \item @{ML DEPTH_SOLVE}~@{text "tac"} uses @{ML DEPTH_FIRST} to
+ search for states having no subgoals.
+
+ \item @{ML DEPTH_SOLVE_1}~@{text "tac"} uses @{ML DEPTH_FIRST} to
+ search for states having fewer subgoals than the given state. Thus,
+ it insists upon solving at least one subgoal.
+
+ \end{description}
+*}
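+
+text %mlex {* The following sketch performs unguided depth-first
+ search with a few basic steps, insisting on solving the addressed
+ subgoal completely.  The rules and the proposition are arbitrary
+ test data. *}
+
+ML {*
+  val dfs_tac =
+    DEPTH_SOLVE_1
+      (assume_tac 1 ORELSE resolve_tac @{thms conjI impI} 1);
+
+  val r =
+    Goal.prove @{context} [] [] @{prop "A \<longrightarrow> A \<and> A"} (fn _ => dfs_tac);
+*}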
+
+
+subsubsection {* Other search strategies *}
+
+text {*
+ \begin{mldecls}
+ @{index_ML BREADTH_FIRST: "(thm -> bool) -> tactic -> tactic"} \\
+ @{index_ML BEST_FIRST: "(thm -> bool) * (thm -> int) -> tactic -> tactic"} \\
+ @{index_ML THEN_BEST_FIRST: "tactic -> (thm -> bool) * (thm -> int) -> tactic -> tactic"} \\
+ \end{mldecls}
+
+ These search strategies will find a solution if one exists.
+ However, they do not enumerate all solutions; they terminate after
+ the first satisfactory result from @{text "tac"}.
+
+ \begin{description}
+
+ \item @{ML BREADTH_FIRST}~@{text "sat tac"} uses breadth-first
+ search to find states for which @{text "sat"} is true. For most
+ applications, it is too slow.
+
+ \item @{ML BEST_FIRST}~@{text "(sat, dist) tac"} does a heuristic
+ search, using @{text "dist"} to estimate the distance from a
+ satisfactory state (in the sense of @{text "sat"}). It maintains a
+ list of states ordered by distance. It applies @{text "tac"} to the
+ head of this list; if the result contains any satisfactory states,
+ then it returns them. Otherwise, @{ML BEST_FIRST} adds the new
+ states to the list, and continues.
+
+ The distance function is typically @{ML size_of_thm}, which computes
+ the size of the state. The smaller the state, the fewer and simpler
+ subgoals it has.
+
+ \item @{ML THEN_BEST_FIRST}~@{text "tac\<^sub>0 (sat, dist) tac"} is like
+ @{ML BEST_FIRST}, except that the priority queue initially contains
+ the result of applying @{text "tac\<^sub>0"} to the goal state. This
+ tactical permits separate tactics for starting the search and
+ continuing the search.
+
+ \end{description}
+*}
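+
+text %mlex {* The following sketch employs best-first search towards
+ a solved goal state, using the size of the state as distance
+ estimate.  The ingredient tactics are arbitrary. *}
+
+ML {*
+  val best_tac =
+    BEST_FIRST (has_fewer_prems 1, size_of_thm)
+      (assume_tac 1 ORELSE resolve_tac @{thms conjI impI} 1);
+*}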
+
+
+subsubsection {* Auxiliary tacticals for searching *}
+
+text {*
+ \begin{mldecls}
+ @{index_ML COND: "(thm -> bool) -> tactic -> tactic -> tactic"} \\
+ @{index_ML IF_UNSOLVED: "tactic -> tactic"} \\
+ @{index_ML SOLVE: "tactic -> tactic"} \\
+ @{index_ML DETERM: "tactic -> tactic"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML COND}~@{text "sat tac\<^sub>1 tac\<^sub>2"} applies @{text "tac\<^sub>1"} to
+ the goal state if it satisfies predicate @{text "sat"}, and applies
+ @{text "tac\<^sub>2"} otherwise. It is a conditional tactical in that only one of
+ @{text "tac\<^sub>1"} and @{text "tac\<^sub>2"} is applied to a goal state.
+ However, both @{text "tac\<^sub>1"} and @{text "tac\<^sub>2"} are evaluated
+ because ML uses eager evaluation.
+
+ \item @{ML IF_UNSOLVED}~@{text "tac"} applies @{text "tac"} to the
+ goal state if it has any subgoals, and simply returns the goal state
+ otherwise. Many common tactics, such as @{ML resolve_tac}, fail if
+ applied to a goal state that has no subgoals.
+
+ \item @{ML SOLVE}~@{text "tac"} applies @{text "tac"} to the goal
+ state and then fails iff there are subgoals left.
+
+ \item @{ML DETERM}~@{text "tac"} applies @{text "tac"} to the goal
+ state and returns the head of the resulting sequence. @{ML DETERM}
+ limits the search space by making its argument deterministic.
+
+ \end{description}
+*}
+
+
+subsubsection {* Predicates and functions useful for searching *}
+
+text {*
+ \begin{mldecls}
+ @{index_ML has_fewer_prems: "int -> thm -> bool"} \\
+ @{index_ML Thm.eq_thm: "thm * thm -> bool"} \\
+ @{index_ML Thm.eq_thm_prop: "thm * thm -> bool"} \\
+ @{index_ML size_of_thm: "thm -> int"} \\
+ \end{mldecls}
+
+ \begin{description}
+
+ \item @{ML has_fewer_prems}~@{text "n thm"} reports whether @{text
+ "thm"} has fewer than @{text "n"} premises.
+
+ \item @{ML Thm.eq_thm}~@{text "(thm\<^sub>1, thm\<^sub>2)"} reports whether @{text
+ "thm\<^sub>1"} and @{text "thm\<^sub>2"} are equal. Both theorems must have
+ compatible background theories. Both theorems must have the same
+ conclusions, the same set of hypotheses, and the same set of sort
+ hypotheses. Names of bound variables are ignored as usual.
+
+ \item @{ML Thm.eq_thm_prop}~@{text "(thm\<^sub>1, thm\<^sub>2)"} reports whether
+ the propositions of @{text "thm\<^sub>1"} and @{text "thm\<^sub>2"} are equal.
+ Names of bound variables are ignored.
+
+ \item @{ML size_of_thm}~@{text "thm"} computes the size of @{text
+ "thm"}, namely the number of variables, constants and abstractions
+ in its conclusion. It may serve as a distance function for
+ @{ML BEST_FIRST}.
+
+ \end{description}
+*}
+
+end
--- a/doc-src/IsarImplementation/Thy/Base.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,8 +0,0 @@
-theory Base
-imports Main
-begin
-
-ML_file "../../antiquote_setup.ML"
-setup {* Antiquote_Setup.setup *}
-
-end
--- a/doc-src/IsarImplementation/Thy/Eq.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,85 +0,0 @@
-theory Eq
-imports Base
-begin
-
-chapter {* Equational reasoning *}
-
-text {* Equality is one of the most fundamental concepts of
- mathematics. The Isabelle/Pure logic (\chref{ch:logic}) provides a
- builtin relation @{text "\<equiv> :: \<alpha> \<Rightarrow> \<alpha> \<Rightarrow> prop"} that expresses equality
- of arbitrary terms (or propositions) at the framework level, as
- expressed by certain basic inference rules (\secref{sec:eq-rules}).
-
- Equational reasoning means to replace equals by equals, using
- reflexivity and transitivity to form chains of replacement steps,
- and congruence rules to access sub-structures. Conversions
- (\secref{sec:conv}) provide a convenient framework to compose basic
- equational steps to build specific equational reasoning tools.
-
- Higher-order matching is able to provide suitable instantiations for
- giving equality rules, which leads to the versatile concept of
- @{text "\<lambda>"}-term rewriting (\secref{sec:rewriting}). Internally
- this is based on the general-purpose Simplifier engine of Isabelle,
- which is more specific and more efficient than plain conversions.
-
- Object-logics usually introduce specific notions of equality or
- equivalence, and relate it with the Pure equality. This enables to
- re-use the Pure tools for equational reasoning for particular
- object-logic connectives as well.
-*}
-
-
-section {* Basic equality rules \label{sec:eq-rules} *}
-
-text {* FIXME *}
-
-
-section {* Conversions \label{sec:conv} *}
-
-text {* FIXME *}
-
-
-section {* Rewriting \label{sec:rewriting} *}
-
-text {* Rewriting normalizes a given term (theorem or goal) by
- replacing instances of given equalities @{text "t \<equiv> u"} in subterms.
- Rewriting continues until no rewrites are applicable to any subterm.
- This may be used to unfold simple definitions of the form @{text "f
- x\<^sub>1 \<dots> x\<^sub>n \<equiv> u"}, but is slightly more general than that.
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML rewrite_rule: "thm list -> thm -> thm"} \\
- @{index_ML rewrite_goals_rule: "thm list -> thm -> thm"} \\
- @{index_ML rewrite_goal_tac: "thm list -> int -> tactic"} \\
- @{index_ML rewrite_goals_tac: "thm list -> tactic"} \\
- @{index_ML fold_goals_tac: "thm list -> tactic"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item @{ML rewrite_rule}~@{text "rules thm"} rewrites the whole
- theorem by the given rules.
-
- \item @{ML rewrite_goals_rule}~@{text "rules thm"} rewrites the
- outer premises of the given theorem. Interpreting the same as a
- goal state (\secref{sec:tactical-goals}) it means to rewrite all
- subgoals (in the same manner as @{ML rewrite_goals_tac}).
-
- \item @{ML rewrite_goal_tac}~@{text "rules i"} rewrites subgoal
- @{text "i"} by the given rewrite rules.
-
- \item @{ML rewrite_goals_tac}~@{text "rules"} rewrites all subgoals
- by the given rewrite rules.
-
- \item @{ML fold_goals_tac}~@{text "rules"} essentially uses @{ML
- rewrite_goals_tac} with the symmetric form of each member of @{text
- "rules"}, re-ordered to fold longer expression first. This supports
- the idea of folding primitive definitions that appear in expanded form
- in the proof state.
-
- \end{description}
-*}
-
-end
--- a/doc-src/IsarImplementation/Thy/Integration.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,307 +0,0 @@
-theory Integration
-imports Base
-begin
-
-chapter {* System integration *}
-
-section {* Isar toplevel \label{sec:isar-toplevel} *}
-
-text {* The Isar toplevel may be considered the central hub of the
- Isabelle/Isar system, where all key components and sub-systems are
- integrated into a single read-eval-print loop of Isar commands,
- which also incorporates the underlying ML compiler.
-
- Isabelle/Isar departs from the original ``LCF system architecture''
- where ML was really The Meta Language for defining theories and
- conducting proofs. Instead, ML now only serves as the
- implementation language for the system (and user extensions), while
- the specific Isar toplevel supports the concepts of theory and proof
- development natively. This includes the graph structure of theories
- and the block structure of proofs, support for unlimited undo,
- facilities for tracing, debugging, timing, profiling etc.
-
- \medskip The toplevel maintains an implicit state, which is
- transformed by a sequence of transitions -- either interactively or
- in batch-mode. In interactive mode, Isar state transitions are
- encapsulated as safe transactions, such that both failure and undo
- are handled conveniently without destroying the underlying draft
- theory (cf.~\secref{sec:context-theory}). In batch mode,
- transitions operate in a linear (destructive) fashion, such that
- error conditions abort the present attempt to construct a theory or
- proof altogether.
-
- The toplevel state is a disjoint sum of empty @{text toplevel}, or
- @{text theory}, or @{text proof}. On entering the main Isar loop we
- start with an empty toplevel. A theory is commenced by giving a
- @{text \<THEORY>} header; within a theory we may issue theory
- commands such as @{text \<DEFINITION>}, or state a @{text
- \<THEOREM>} to be proven. Now we are within a proof state, with a
- rich collection of Isar proof commands for structured proof
- composition, or unstructured proof scripts. When the proof is
- concluded we get back to the theory, which is then updated by
- storing the resulting fact. Further theory declarations or theorem
- statements with proofs may follow, until we eventually conclude the
- theory development by issuing @{text \<END>}. The resulting theory
- is then stored within the theory database and we are back to the
- empty toplevel.
-
- In addition to these proper state transformations, there are also
- some diagnostic commands for peeking at the toplevel state without
- modifying it (e.g.\ \isakeyword{thm}, \isakeyword{term},
- \isakeyword{print-cases}).
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type Toplevel.state} \\
- @{index_ML Toplevel.UNDEF: "exn"} \\
- @{index_ML Toplevel.is_toplevel: "Toplevel.state -> bool"} \\
- @{index_ML Toplevel.theory_of: "Toplevel.state -> theory"} \\
- @{index_ML Toplevel.proof_of: "Toplevel.state -> Proof.state"} \\
- @{index_ML Toplevel.debug: "bool Unsynchronized.ref"} \\
- @{index_ML Toplevel.timing: "bool Unsynchronized.ref"} \\
- @{index_ML Toplevel.profiling: "int Unsynchronized.ref"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type Toplevel.state} represents Isar toplevel
- states, which are normally manipulated through the concept of
- toplevel transitions only (\secref{sec:toplevel-transition}). Also
- note that a raw toplevel state is subject to the same linearity
- restrictions as a theory context (cf.~\secref{sec:context-theory}).
-
- \item @{ML Toplevel.UNDEF} is raised for undefined toplevel
- operations. Many operations work only partially for certain cases,
- since @{ML_type Toplevel.state} is a sum type.
-
- \item @{ML Toplevel.is_toplevel}~@{text "state"} checks for an empty
- toplevel state.
-
- \item @{ML Toplevel.theory_of}~@{text "state"} selects the
- background theory of @{text "state"}, raises @{ML Toplevel.UNDEF}
- for an empty toplevel state.
-
- \item @{ML Toplevel.proof_of}~@{text "state"} selects the Isar proof
- state if available, otherwise raises @{ML Toplevel.UNDEF}.
-
- \item @{ML "Toplevel.debug := true"} makes the toplevel print further
- details about internal error conditions, exceptions being raised
- etc.
-
- \item @{ML "Toplevel.timing := true"} makes the toplevel print timing
- information for each Isar command being executed.
-
- \item @{ML Toplevel.profiling}~@{ML_text ":="}~@{text "n"} controls
- low-level profiling of the underlying ML runtime system. For
- Poly/ML, @{text "n = 1"} means time and @{text "n = 2"} space
- profiling.
-
- \end{description}
-*}
-
-text %mlantiq {*
- \begin{matharray}{rcl}
- @{ML_antiquotation_def "Isar.state"} & : & @{text ML_antiquotation} \\
- \end{matharray}
-
- \begin{description}
-
- \item @{text "@{Isar.state}"} refers to Isar toplevel state at that
- point --- as abstract value.
-
- This only works for diagnostic ML commands, such as @{command
- ML_val} or @{command ML_command}.
-
- \end{description}
-*}
-
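-text %mlex {* For instance, the following diagnostic snippet (a
- hypothetical example) peeks at the toplevel state at this point and
- retrieves the enclosing theory from it:
-*}
-
-ML_val {*
- val st = @{Isar.state};
- val thy = Toplevel.theory_of st; (*the background theory at this point*)
-*}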
-
-subsection {* Toplevel transitions \label{sec:toplevel-transition} *}
-
-text {*
- An Isar toplevel transition consists of a partial function on the
- toplevel state, with additional information for diagnostics and
- error reporting: there are fields for command name, source position,
- optional source text, as well as flags for interactive-only commands
- (which issue a warning in batch-mode), printing of result state,
- etc.
-
- The operational part is represented as the sequential union of a
- list of partial functions, which are tried in turn until the first
- one succeeds. This acts like an outer case-expression for various
- alternative state transitions. For example, \isakeyword{qed} works
- differently for local proofs vs.\ the global ending of the main
- proof.
-
- Toplevel transitions are composed via transition transformers.
- Internally, Isar commands are put together from an empty transition
- extended by name and source position. It is then left to the
- individual command parser to turn the given concrete syntax into a
- suitable transition transformer that adjoins actual operations on a
- theory or proof state etc.
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML Toplevel.print: "Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.no_timing: "Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.keep: "(Toplevel.state -> unit) ->
- Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.theory: "(theory -> theory) ->
- Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.theory_to_proof: "(theory -> Proof.state) ->
- Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.proof: "(Proof.state -> Proof.state) ->
- Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.proofs: "(Proof.state -> Proof.state Seq.seq) ->
- Toplevel.transition -> Toplevel.transition"} \\
- @{index_ML Toplevel.end_proof: "(bool -> Proof.state -> Proof.context) ->
- Toplevel.transition -> Toplevel.transition"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item @{ML Toplevel.print}~@{text "tr"} sets the print flag, which
- causes the toplevel loop to echo the result state (in interactive
- mode).
-
- \item @{ML Toplevel.no_timing}~@{text "tr"} indicates that the
- transition should never show timing information, e.g.\ because it is
- a diagnostic command.
-
- \item @{ML Toplevel.keep}~@{text "tr"} adjoins a diagnostic
- function.
-
- \item @{ML Toplevel.theory}~@{text "tr"} adjoins a theory
- transformer.
-
- \item @{ML Toplevel.theory_to_proof}~@{text "tr"} adjoins a global
- goal function, which turns a theory into a proof state. The theory
- may be changed before entering the proof; the generic Isar goal
- setup includes an argument that specifies how to apply the proven
- result to the theory, when the proof is finished.
-
- \item @{ML Toplevel.proof}~@{text "tr"} adjoins a deterministic
- proof command, with a singleton result.
-
- \item @{ML Toplevel.proofs}~@{text "tr"} adjoins a general proof
- command, with zero or more result states (represented as a lazy
- list).
-
- \item @{ML Toplevel.end_proof}~@{text "tr"} adjoins a concluding
- proof command, that returns the resulting theory, after storing the
- resulting facts in the context etc.
-
- \end{description}
-*}
-
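-text %mlex {* As a hypothetical sketch (not part of the Isabelle
- sources), a transition transformer that merely extends the name
- space prefix of the background theory could be phrased like this:
-*}
-
-ML {*
- (*adjoin a plain theory transformation to a given transition*)
- val my_tr: Toplevel.transition -> Toplevel.transition =
- Toplevel.theory (Sign.add_path "my_namespace");
-*}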
-
-section {* Theory database \label{sec:theory-database} *}
-
-text {*
- The theory database maintains a collection of theories, together
- with some administrative information about their original sources,
- which are held in an external store (i.e.\ some directory within the
- regular file system).
-
- The theory database is organized as a directed acyclic graph;
- entries are referenced by theory name. Although some additional
- interfaces allow a directory specification to be included as well, this
- is only a hint to the underlying theory loader. The internal theory
- name space is flat!
-
- Theory @{text A} is associated with the main theory file @{text
- A}\verb,.thy,, which needs to be accessible through the theory
- loader path. Any number of additional ML source files may be
- associated with each theory, by declaring these dependencies in the
- theory header as @{text \<USES>}, and loading them consecutively
- within the theory context. The system keeps track of incoming ML
- sources and associates them with the current theory.
-
- The basic internal actions of the theory database are @{text
- "update"} and @{text "remove"}:
-
- \begin{itemize}
-
- \item @{text "update A"} introduces a link of @{text "A"} with a
- @{text "theory"} value of the same name; it asserts that the theory
- sources are now consistent with that value;
-
- \item @{text "remove A"} deletes entry @{text "A"} from the theory
- database.
-
- \end{itemize}
-
- These actions are propagated to sub- or super-graphs of a theory
- entry as expected, in order to preserve global consistency of the
- state of all loaded theories with the sources of the external store.
- This implies certain causalities between actions: @{text "update"}
- or @{text "remove"} of an entry will @{text "remove"} all
- descendants.
-
- \medskip There are separate user-level interfaces to operate on the
- theory database directly or indirectly. The primitive actions then
- just happen automatically while working with the system. In
- particular, processing a theory header @{text "\<THEORY> A
- \<IMPORTS> B\<^sub>1 \<dots> B\<^sub>n \<BEGIN>"} ensures that the
- sub-graph of the collective imports @{text "B\<^sub>1 \<dots> B\<^sub>n"}
- is up-to-date, too. Earlier theories are reloaded as required, with
- @{text update} actions proceeding in topological order according to
- theory dependencies. There may also be a wave of implied @{text
- remove} actions for derived theory nodes until a stable situation
- is achieved eventually.
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML use_thy: "string -> unit"} \\
- @{index_ML use_thys: "string list -> unit"} \\
- @{index_ML Thy_Info.get_theory: "string -> theory"} \\
- @{index_ML Thy_Info.remove_thy: "string -> unit"} \\[1ex]
- @{index_ML Thy_Info.register_thy: "theory -> unit"} \\[1ex]
- @{ML_text "datatype action = Update | Remove"} \\
- @{index_ML Thy_Info.add_hook: "(Thy_Info.action -> string -> unit) -> unit"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item @{ML use_thy}~@{text A} ensures that theory @{text A} is fully
- up-to-date wrt.\ the external file store, reloading outdated
- ancestors as required. In batch mode, the simultaneous @{ML
- use_thys} should be used exclusively.
-
- \item @{ML use_thys} is similar to @{ML use_thy}, but handles
- several theories simultaneously. Thus it acts like processing the
- import header of a theory, without performing the merge of the
- result. By loading a whole sub-graph of theories like that, the
- intrinsic parallelism can be exploited by the system, to speed up
- loading.
-
- \item @{ML Thy_Info.get_theory}~@{text A} retrieves the theory value
- presently associated with name @{text A}. Note that the result
- might be outdated.
-
- \item @{ML Thy_Info.remove_thy}~@{text A} deletes theory @{text A} and all
- descendants from the theory database.
-
- \item @{ML Thy_Info.register_thy}~@{text "text thy"} registers an
- existing theory value with the theory loader database and updates
- source version information according to the current file-system
- state.
-
- \item @{ML "Thy_Info.add_hook"}~@{text f} registers function @{text
- f} as a hook for theory database actions. The function will be
- invoked with the action and theory name being involved; thus derived
- actions may be performed in associated system components, e.g.\
- maintaining the state of an editor for the theory sources.
-
- The kind and order of actions occurring in practice depends both on
- user interactions and the internal process of resolving theory
- imports. Hooks should not rely on a particular policy here! Any
- exceptions raised by the hook are ignored.
-
- \end{description}
-*}
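-
-text %mlex {* The following sketch registers a hook that merely
- prints each theory loader action as it happens:
-*}
-
-ML {*
- (*trace update/remove actions of the theory loader*)
- val _ =
- Thy_Info.add_hook (fn action => fn name =>
- (case action of
- Thy_Info.Update => writeln ("update " ^ name)
- | Thy_Info.Remove => writeln ("remove " ^ name)));
-*}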
-
-end
--- a/doc-src/IsarImplementation/Thy/Isar.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,584 +0,0 @@
-theory Isar
-imports Base
-begin
-
-chapter {* Isar language elements *}
-
-text {* The Isar proof language (see also
- \cite[\S2]{isabelle-isar-ref}) consists of three main categories of
- language elements as follows.
-
- \begin{enumerate}
-
- \item Proof \emph{commands} define the primary language of
- transactions of the underlying Isar/VM interpreter. Typical
- examples are @{command "fix"}, @{command "assume"}, @{command
- "show"}, @{command "proof"}, and @{command "qed"}.
-
- Composing proof commands according to the rules of the Isar/VM leads
- to expressions of structured proof text, such that both the machine
- and the human reader can give it a meaning as formal reasoning.
-
- \item Proof \emph{methods} define a secondary language of mixed
- forward-backward refinement steps involving facts and goals.
- Typical examples are @{method rule}, @{method unfold}, and @{method
- simp}.
-
- Methods can occur in certain well-defined parts of the Isar proof
- language, say as arguments to @{command "proof"}, @{command "qed"},
- or @{command "by"}.
-
- \item \emph{Attributes} define a tertiary language of small
- annotations to theorems being defined or referenced. Attributes can
- modify both the context and the theorem.
-
- Typical examples are @{attribute intro} (which affects the context),
- and @{attribute symmetric} (which affects the theorem).
-
- \end{enumerate}
-*}
-
-
-section {* Proof commands *}
-
-text {* A \emph{proof command} is a state transition of the Isar/VM
- proof interpreter.
-
- In principle, Isar proof commands could be defined in user-space as
- well. The system is built like that in the first place: some
- commands are primitive, while the others are defined as derived
- elements. Adding to the genuine structured proof language requires
- profound understanding of the Isar/VM machinery, though, so this is
- beyond the scope of this manual.
-
- What can be done realistically is to define some diagnostic commands
- that inspect the general state of the Isar/VM, and report some
- feedback to the user. Typically this involves checking of the
- linguistic \emph{mode} of a proof state, or peeking at the pending
- goals (if available).
-
- Another common application is to define a toplevel command that
- poses a problem to the user as Isar proof state and processes the
- final result relatively to the context. Thus a proof can be
- incorporated into the context of some user-space tool, without
- modifying the Isar proof language itself. *}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type Proof.state} \\
- @{index_ML Proof.assert_forward: "Proof.state -> Proof.state"} \\
- @{index_ML Proof.assert_chain: "Proof.state -> Proof.state"} \\
- @{index_ML Proof.assert_backward: "Proof.state -> Proof.state"} \\
- @{index_ML Proof.simple_goal: "Proof.state -> {context: Proof.context, goal: thm}"} \\
- @{index_ML Proof.goal: "Proof.state ->
- {context: Proof.context, facts: thm list, goal: thm}"} \\
- @{index_ML Proof.raw_goal: "Proof.state ->
- {context: Proof.context, facts: thm list, goal: thm}"} \\
- @{index_ML Proof.theorem: "Method.text option ->
- (thm list list -> Proof.context -> Proof.context) ->
- (term * term list) list list -> Proof.context -> Proof.state"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type Proof.state} represents Isar proof states.
- This is a block-structured configuration with proof context,
- linguistic mode, and optional goal. The latter consists of goal
- context, goal facts (``@{text "using"}''), and tactical goal state
- (see \secref{sec:tactical-goals}).
-
- The general idea is that the facts shall contribute to the
- refinement of some parts of the tactical goal --- how exactly is
- defined by the proof method that is applied in that situation.
-
- \item @{ML Proof.assert_forward}, @{ML Proof.assert_chain}, @{ML
- Proof.assert_backward} are partial identity functions that fail
- unless a certain linguistic mode is active, namely ``@{text
- "proof(state)"}'', ``@{text "proof(chain)"}'', ``@{text
- "proof(prove)"}'', respectively (using the terminology of
- \cite{isabelle-isar-ref}).
-
- It is advisable to study the implementations of existing proof commands
- for suitable modes to be asserted.
-
- \item @{ML Proof.simple_goal}~@{text "state"} returns the structured
- Isar goal (if available) in the form seen by ``simple'' methods
- (like @{method simp} or @{method blast}). The Isar goal facts are
- already inserted as premises into the subgoals, which are presented
- individually as in @{ML Proof.goal}.
-
- \item @{ML Proof.goal}~@{text "state"} returns the structured Isar
- goal (if available) in the form seen by regular methods (like
- @{method rule}). The auxiliary internal encoding of Pure
- conjunctions is split into individual subgoals as usual.
-
- \item @{ML Proof.raw_goal}~@{text "state"} returns the structured
- Isar goal (if available) in the raw internal form seen by ``raw''
- methods (like @{method induct}). This form is rarely appropriate
- for diagnostic tools; @{ML Proof.simple_goal} or @{ML Proof.goal}
- should be used in most situations.
-
- \item @{ML Proof.theorem}~@{text "before_qed after_qed statement ctxt"}
- initializes a toplevel Isar proof state within a given context.
-
- The optional @{text "before_qed"} method is applied at the end of
- the proof, just before extracting the result (this feature is rarely
- used).
-
- The @{text "after_qed"} continuation receives the extracted result
- in order to apply it to the final context in a suitable way (e.g.\
- storing named facts). Note that at this generic level the target
- context is specified as @{ML_type Proof.context}, but the usual
- wrapping of toplevel proofs into command transactions will provide a
- @{ML_type local_theory} here (\chref{ch:local-theory}). This
- affects the way how results are stored.
-
- The @{text "statement"} is given as a nested list of terms, each
- associated with optional @{keyword "is"} patterns as usual in the
- Isar source language. The original nested list structure over terms
- is turned into one over theorems when @{text "after_qed"} is
- invoked.
-
- \end{description}
-*}
-
-
-text %mlantiq {*
- \begin{matharray}{rcl}
- @{ML_antiquotation_def "Isar.goal"} & : & @{text ML_antiquotation} \\
- \end{matharray}
-
- \begin{description}
-
- \item @{text "@{Isar.goal}"} refers to the regular goal state (if
- available) of the current proof state managed by the Isar toplevel
- --- as abstract value.
-
- This only works for diagnostic ML commands, such as @{command
- ML_val} or @{command ML_command}.
-
- \end{description}
-*}
-
-text %mlex {* The following example peeks at a certain goal configuration. *}
-
-notepad
-begin
- have A and B and C
- ML_val {*
- val n = Thm.nprems_of (#goal @{Isar.goal});
- @{assert} (n = 3);
- *}
- oops
-
-text {* Here we see 3 individual subgoals in the same way as regular
- proof methods would do. *}
-
-
-section {* Proof methods *}
-
-text {* A @{text "method"} is a function @{text "context \<rightarrow> thm\<^sup>* \<rightarrow> goal
- \<rightarrow> (cases \<times> goal)\<^sup>*\<^sup>*"} that operates on the full Isar goal
- configuration with context, goal facts, and tactical goal state and
- enumerates possible follow-up goal states, with the potential
- addition of named extensions of the proof context (\emph{cases}).
- The latter feature is rarely used.
-
- This means a proof method is like a structurally enhanced tactic
- (cf.\ \secref{sec:tactics}). The well-formedness conditions for
- tactics need to hold for methods accordingly, with the following
- additions.
-
- \begin{itemize}
-
- \item Goal addressing is further limited to operate either
- uniformly on \emph{all} subgoals, or specifically on the
- \emph{first} subgoal.
-
- Exception: old-style tactic emulations that are embedded into the
- method space, e.g.\ @{method rule_tac}.
-
- \item A non-trivial method always needs to make progress: an
- identical follow-up goal state has to be avoided.\footnote{This
- enables the user to write method expressions like @{text "meth\<^sup>+"}
- without looping, while the trivial do-nothing case can be recovered
- via @{text "meth\<^sup>?"}.}
-
- Exception: trivial stuttering steps, such as ``@{method -}'' or
- @{method succeed}.
-
- \item Goal facts passed to the method must not be ignored. If there
- is no sensible use of facts outside the goal state, facts should be
- inserted into the subgoals that are addressed by the method.
-
- \end{itemize}
-
- \medskip Syntactically, the language of proof methods appears as
- arguments to Isar commands like @{command "by"} or @{command apply}.
- User-space additions are reasonably easy by plugging suitable
- method-valued parser functions into the framework, using the
- @{command method_setup} command, for example.
-
- To get a better idea about the range of possibilities, consider the
- following Isar proof schemes. This is the general form of
- structured proof text:
-
- \medskip
- \begin{tabular}{l}
- @{command from}~@{text "facts\<^sub>1"}~@{command have}~@{text "props"}~@{command using}~@{text "facts\<^sub>2"} \\
- @{command proof}~@{text "(initial_method)"} \\
- \quad@{text "body"} \\
- @{command qed}~@{text "(terminal_method)"} \\
- \end{tabular}
- \medskip
-
- The goal configuration consists of @{text "facts\<^sub>1"} and
- @{text "facts\<^sub>2"} appended in that order, and various @{text
- "props"} being claimed. The @{text "initial_method"} is invoked
- with facts and goals together and refines the problem to something
- that is handled recursively in the proof @{text "body"}. The @{text
- "terminal_method"} has another chance to finish any remaining
- subgoals, but it does not see the facts of the initial step.
-
- \medskip This pattern illustrates unstructured proof scripts:
-
- \medskip
- \begin{tabular}{l}
- @{command have}~@{text "props"} \\
- \quad@{command using}~@{text "facts\<^sub>1"}~@{command apply}~@{text "method\<^sub>1"} \\
- \quad@{command apply}~@{text "method\<^sub>2"} \\
- \quad@{command using}~@{text "facts\<^sub>3"}~@{command apply}~@{text "method\<^sub>3"} \\
- \quad@{command done} \\
- \end{tabular}
- \medskip
-
- The @{text "method\<^sub>1"} operates on the original claim while
- using @{text "facts\<^sub>1"}. Since the @{command apply} command
- structurally resets the facts, the @{text "method\<^sub>2"} will
- operate on the remaining goal state without facts. The @{text
- "method\<^sub>3"} will see again a collection of @{text
- "facts\<^sub>3"} that has been inserted into the script explicitly.
-
- \medskip Empirically, any Isar proof method can be categorized as
- follows.
-
- \begin{enumerate}
-
- \item \emph{Special method with cases} with named context additions
- associated with the follow-up goal state.
-
- Example: @{method "induct"}, which is also a ``raw'' method since it
- operates on the internal representation of simultaneous claims as
- Pure conjunction (\secref{sec:logic-aux}), instead of separate
- subgoals (\secref{sec:tactical-goals}).
-
- \item \emph{Structured method} with strong emphasis on facts outside
- the goal state.
-
- Example: @{method "rule"}, which captures the key ideas behind
- structured reasoning in Isar in purest form.
-
- \item \emph{Simple method} with weaker emphasis on facts, which are
- inserted into subgoals to emulate old-style tactical reasoning with
- facts as ``premises''.
-
- Examples: @{method "simp"}, @{method "blast"}, @{method "auto"}.
-
- \item \emph{Old-style tactic emulation} with detailed numeric goal
- addressing and explicit references to entities of the internal goal
- state (which are otherwise invisible from proper Isar proof text).
- The naming convention @{text "foo_tac"} makes this special
- non-standard status clear.
-
- Example: @{method "rule_tac"}.
-
- \end{enumerate}
-
- When implementing proof methods, it is advisable to study existing
- implementations carefully and imitate the typical ``boiler plate''
- for context-sensitive parsing and further combinators to wrap up
- tactic expressions as methods.\footnote{Aliases or abbreviations of
- the standard method combinators should be avoided. Note that from
- Isabelle99 until Isabelle2009 the system did provide various odd
- combinations of method wrappers that made user applications more
- complicated than necessary.}
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type Proof.method} \\
- @{index_ML METHOD_CASES: "(thm list -> cases_tactic) -> Proof.method"} \\
- @{index_ML METHOD: "(thm list -> tactic) -> Proof.method"} \\
- @{index_ML SIMPLE_METHOD: "tactic -> Proof.method"} \\
- @{index_ML SIMPLE_METHOD': "(int -> tactic) -> Proof.method"} \\
- @{index_ML Method.insert_tac: "thm list -> int -> tactic"} \\
- @{index_ML Method.setup: "binding -> (Proof.context -> Proof.method) context_parser ->
- string -> theory -> theory"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type Proof.method} represents proof methods as
- abstract type.
-
- \item @{ML METHOD_CASES}~@{text "(fn facts => cases_tactic)"} wraps
- @{text cases_tactic} depending on goal facts as proof method with
- cases; the goal context is passed via method syntax.
-
- \item @{ML METHOD}~@{text "(fn facts => tactic)"} wraps @{text
- tactic} depending on goal facts as regular proof method; the goal
- context is passed via method syntax.
-
- \item @{ML SIMPLE_METHOD}~@{text "tactic"} wraps a tactic that
- addresses all subgoals uniformly as a simple proof method. Goal facts
- are already inserted into all subgoals before @{text "tactic"} is
- applied.
-
- \item @{ML SIMPLE_METHOD'}~@{text "tactic"} wraps a tactic that
- addresses a specific subgoal as a simple proof method that operates
- on subgoal 1. Goal facts are inserted into the subgoal before the
- @{text "tactic"} is applied.
-
- \item @{ML Method.insert_tac}~@{text "facts i"} inserts @{text
- "facts"} into subgoal @{text "i"}. This is convenient to reproduce
- part of the @{ML SIMPLE_METHOD} or @{ML SIMPLE_METHOD'} wrapping
- within regular @{ML METHOD}, for example.
-
- \item @{ML Method.setup}~@{text "name parser description"} provides
- the functionality of the Isar command @{command method_setup} as ML
- function.
-
- \end{description}
-*}
-
-text %mlex {* See also @{command method_setup} in
- \cite{isabelle-isar-ref} which includes some abstract examples.
-
- \medskip The following toy examples illustrate how the goal facts
- and state are passed to proof methods. The pre-defined proof method
- called ``@{method tactic}'' wraps ML source of type @{ML_type
- tactic} (abstracted over @{ML_text facts}). This allows immediate
- experimentation without parsing of concrete syntax. *}
-
-notepad
-begin
- assume a: A and b: B
-
- have "A \<and> B"
- apply (tactic {* rtac @{thm conjI} 1 *})
- using a apply (tactic {* resolve_tac facts 1 *})
- using b apply (tactic {* resolve_tac facts 1 *})
- done
-
- have "A \<and> B"
- using a and b
- ML_val "@{Isar.goal}"
- apply (tactic {* Method.insert_tac facts 1 *})
- apply (tactic {* (rtac @{thm conjI} THEN_ALL_NEW atac) 1 *})
- done
-end
-
-text {* \medskip The next example implements a method that simplifies
- the first subgoal by rewrite rules given as arguments. *}
-
-method_setup my_simp = {*
- Attrib.thms >> (fn thms => fn ctxt =>
- SIMPLE_METHOD' (fn i =>
- CHANGED (asm_full_simp_tac
- (HOL_basic_ss addsimps thms) i)))
-*} "rewrite subgoal by given rules"
-
-text {* The concrete syntax wrapping of @{command method_setup} always
- passes through the proof context at the end of parsing, but it is
- not used in this example.
-
- The @{ML Attrib.thms} parser produces a list of theorems from the
- usual Isar syntax involving attribute expressions etc.\ (syntax
- category @{syntax thmrefs}) \cite{isabelle-isar-ref}. The resulting
- @{ML_text thms} are added to @{ML HOL_basic_ss} which already
- contains the basic Simplifier setup for HOL.
-
- The tactic @{ML asm_full_simp_tac} is the one that is also used in
- method @{method simp} by default. The extra wrapping by the @{ML
- CHANGED} tactical ensures progress of simplification: identical goal
- states are filtered out explicitly to make the raw tactic conform to
- standard Isar method behaviour.
-
- \medskip Method @{method my_simp} can be used in Isar proofs like
- this:
-*}
-
-notepad
-begin
- fix a b c
- assume a: "a = b"
- assume b: "b = c"
- have "a = c" by (my_simp a b)
-end
-
-text {* Here is a similar method that operates on all subgoals,
- instead of just the first one. *}
-
-method_setup my_simp_all = {*
- Attrib.thms >> (fn thms => fn ctxt =>
- SIMPLE_METHOD
- (CHANGED
- (ALLGOALS (asm_full_simp_tac
- (HOL_basic_ss addsimps thms)))))
-*} "rewrite all subgoals by given rules"
-
-notepad
-begin
- fix a b c
- assume a: "a = b"
- assume b: "b = c"
- have "a = c" and "c = b" by (my_simp_all a b)
-end
-
-text {* \medskip Apart from explicit arguments, common proof methods
- typically work with a default configuration provided by the context.
- As a shortcut to rule management we use a cheap solution via functor
- @{ML_functor Named_Thms} (see also @{file
- "~~/src/Pure/Tools/named_thms.ML"}). *}
-
-ML {*
- structure My_Simps =
- Named_Thms
- (val name = @{binding my_simp} val description = "my_simp rule")
-*}
-setup My_Simps.setup
-
-text {* This provides ML access to a list of theorems in canonical
- declaration order via @{ML My_Simps.get}. The user can add or
- delete rules via the attribute @{attribute my_simp}. The actual
- proof method is now defined as before, but we append the explicit
- arguments and the rules from the context. *}
-
-method_setup my_simp' = {*
- Attrib.thms >> (fn thms => fn ctxt =>
- SIMPLE_METHOD' (fn i =>
- CHANGED (asm_full_simp_tac
- (HOL_basic_ss addsimps (thms @ My_Simps.get ctxt)) i)))
-*} "rewrite subgoal by given rules and my_simp rules from the context"
-
-text {*
- \medskip Method @{method my_simp'} can be used in Isar proofs
- like this:
-*}
-
-notepad
-begin
- fix a b c
- assume [my_simp]: "a \<equiv> b"
- assume [my_simp]: "b \<equiv> c"
- have "a \<equiv> c" by my_simp'
-end
-
-text {* \medskip The @{method my_simp} variants defined above are
- ``simple'' methods, i.e.\ the goal facts are merely inserted as goal
- premises by the @{ML SIMPLE_METHOD'} or @{ML SIMPLE_METHOD} wrapper.
- For proof methods that are similar to the standard collection of
- @{method simp}, @{method blast}, @{method fast}, @{method auto}
- there is little more that can be done.
-
- Note that using the primary goal facts in the same manner as the
- method arguments obtained via concrete syntax or the context does
- not meet the requirement of ``strong emphasis on facts'' of regular
- proof methods, because rewrite rules as used above can be easily
- ignored. A proof text ``@{command using}~@{text "foo"}~@{command
- "by"}~@{text "my_simp"}'' where @{text "foo"} is not used would
- deceive the reader.
-
- \medskip The technical treatment of rules from the context requires
- further attention. Above we rebuild a fresh @{ML_type simpset} from
- the arguments and \emph{all} rules retrieved from the context on
- every invocation of the method. This does not scale to really large
- collections of rules, which easily emerge in the context of a big
- theory library, for example.
-
- This is an inherent limitation of the simplistic rule management via
- functor @{ML_functor Named_Thms}, because it lacks tool-specific
- storage and retrieval. More realistic applications require
- efficient index-structures that organize theorems in a customized
- manner, such as a discrimination net that is indexed by the
- left-hand sides of rewrite rules. For variations on the Simplifier,
- re-use of the existing type @{ML_type simpset} is adequate, but
- scalability would require it be maintained statically within the
- context data, not dynamically on each tool invocation. *}
-
-
-section {* Attributes \label{sec:attributes} *}
-
-text {* An \emph{attribute} is a function @{text "context \<times> thm \<rightarrow>
- context \<times> thm"}, which means both a (generic) context and a theorem
- can be modified simultaneously. In practice this mixed form is very
- rare, instead attributes are presented either as \emph{declaration
- attribute:} @{text "thm \<rightarrow> context \<rightarrow> context"} or \emph{rule
- attribute:} @{text "context \<rightarrow> thm \<rightarrow> thm"}.
-
- Attributes can have additional arguments via concrete syntax. There
- is a collection of context-sensitive parsers for various logical
- entities (types, terms, theorems). These already take care of
- applying morphisms to the arguments when attribute expressions are
- moved into a different context (see also \secref{sec:morphisms}).
-
- When implementing declaration attributes, it is important to operate
- exactly on the variant of the generic context that is provided by
- the system, which is either global theory context or local proof
- context. In particular, the background theory of a local context
- must not be modified in this situation! *}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type attribute} \\
- @{index_ML Thm.rule_attribute: "(Context.generic -> thm -> thm) -> attribute"} \\
- @{index_ML Thm.declaration_attribute: "
- (thm -> Context.generic -> Context.generic) -> attribute"} \\
- @{index_ML Attrib.setup: "binding -> attribute context_parser ->
- string -> theory -> theory"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type attribute} represents attributes as concrete
- type alias.
-
- \item @{ML Thm.rule_attribute}~@{text "(fn context => rule)"} wraps
- a context-dependent rule (mapping on @{ML_type thm}) as attribute.
-
- \item @{ML Thm.declaration_attribute}~@{text "(fn thm => decl)"}
- wraps a theorem-dependent declaration (mapping on @{ML_type
- Context.generic}) as attribute.
-
- \item @{ML Attrib.setup}~@{text "name parser description"} provides
- the functionality of the Isar command @{command attribute_setup} as
- ML function.
-
- \end{description}
-*}
-
-text %mlantiq {*
- \begin{matharray}{rcl}
- @{ML_antiquotation_def attributes} & : & @{text ML_antiquotation} \\
- \end{matharray}
-
- @{rail "
- @@{ML_antiquotation attributes} attributes
- "}
-
- \begin{description}
-
- \item @{text "@{attributes [\<dots>]}"} embeds attribute source
- representation into the ML text, which is particularly useful with
- declarations like @{ML Local_Theory.note}. Attribute names are
- internalized at compile time, but the source is unevaluated. This
- means attributes with formal arguments (types, terms, theorems) may
- be subject to odd effects of dynamic scoping!
-
- \end{description}
-*}
-
-text %mlex {* See also @{command attribute_setup} in
- \cite{isabelle-isar-ref} which includes some abstract examples. *}
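-
-text %mlex {* For instance, a rule attribute that resolves a theorem
- with a fixed rule may be sketched as follows (a hypothetical example
- that assumes a suitable theorem @{text "sym"} of the object-logic):
-*}
-
-attribute_setup my_sym = {*
- Scan.succeed (Thm.rule_attribute (fn _ => fn th => th RS @{thm sym}))
-*} "resolve theorem with the sym rule"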
-
-end
--- a/doc-src/IsarImplementation/Thy/Local_Theory.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,167 +0,0 @@
-theory Local_Theory
-imports Base
-begin
-
-chapter {* Local theory specifications \label{ch:local-theory} *}
-
-text {*
- A \emph{local theory} combines aspects of both theory and proof
- context (cf.\ \secref{sec:context}), such that definitional
- specifications may be given relatively to parameters and
- assumptions. A local theory is represented as a regular proof
- context, augmented by administrative data about the \emph{target
- context}.
-
- The target is usually derived from the background theory by adding
- local @{text "\<FIX>"} and @{text "\<ASSUME>"} elements, plus
- suitable modifications of non-logical context data (e.g.\ a special
- type-checking discipline). Once initialized, the target is ready to
- absorb definitional primitives: @{text "\<DEFINE>"} for terms and
- @{text "\<NOTE>"} for theorems. Such definitions may get
- transformed in a target-specific way, but the programming interface
- hides such details.
-
- Isabelle/Pure provides target mechanisms for locales, type-classes,
- type-class instantiations, and general overloading. In principle,
- users can implement new targets as well, but this rather arcane
- discipline is beyond the scope of this manual. In contrast,
- implementing derived definitional packages to be used within a local
- theory context is quite easy: the interfaces are even simpler and
- more abstract than the underlying primitives for raw theories.
-
- Many definitional packages for local theories are available in
- Isabelle. Although a few old packages only work for global
- theories, the standard way of implementing definitional packages in
- Isabelle is via the local theory interface.
-*}
-
-
-section {* Definitional elements *}
-
-text {*
- There are separate elements @{text "\<DEFINE> c \<equiv> t"} for terms, and
- @{text "\<NOTE> b = thm"} for theorems. Types are treated
- implicitly, according to Hindley-Milner discipline (cf.\
- \secref{sec:variables}). These definitional primitives essentially
- act like @{text "let"}-bindings within a local context that may
- already contain earlier @{text "let"}-bindings and some initial
- @{text "\<lambda>"}-bindings. Thus we gain \emph{dependent definitions}
- that are relative to an initial axiomatic context. The following
- diagram illustrates this idea of axiomatic elements versus
- definitional elements:
-
- \begin{center}
- \begin{tabular}{|l|l|l|}
- \hline
- & @{text "\<lambda>"}-binding & @{text "let"}-binding \\
- \hline
- types & fixed @{text "\<alpha>"} & arbitrary @{text "\<beta>"} \\
- terms & @{text "\<FIX> x :: \<tau>"} & @{text "\<DEFINE> c \<equiv> t"} \\
- theorems & @{text "\<ASSUME> a: A"} & @{text "\<NOTE> b = \<^BG>B\<^EN>"} \\
- \hline
- \end{tabular}
- \end{center}
-
- A user package merely needs to produce suitable @{text "\<DEFINE>"}
- and @{text "\<NOTE>"} elements according to the application. For
- example, a package for inductive definitions might first @{text
- "\<DEFINE>"} a certain predicate as some fixed-point construction,
- then @{text "\<NOTE>"} a proven result about monotonicity of the
- functor involved here, and then produce further derived concepts via
- additional @{text "\<DEFINE>"} and @{text "\<NOTE>"} elements.
-
- The cumulative sequence of @{text "\<DEFINE>"} and @{text "\<NOTE>"}
- produced at package runtime is managed by the local theory
- infrastructure by means of an \emph{auxiliary context}. Thus the
- system holds up the impression of working within a fully abstract
- situation with hypothetical entities: @{text "\<DEFINE> c \<equiv> t"}
- always results in a literal fact @{text "\<^BG>c \<equiv> t\<^EN>"}, where
- @{text "c"} is a fixed variable @{text "c"}. The details about
- global constants, name spaces etc. are handled internally.
-
- So the general structure of a local theory is a sandwich of three
- layers:
-
- \begin{center}
- \framebox{\quad auxiliary context \quad\framebox{\quad target context \quad\framebox{\quad background theory\quad}}}
- \end{center}
-
- When a definitional package is finished, the auxiliary context is
- reset to the target context. The target now holds definitions for
- terms and theorems that stem from the hypothetical @{text
- "\<DEFINE>"} and @{text "\<NOTE>"} elements, transformed by the
- particular target policy (see \cite[\S4--5]{Haftmann-Wenzel:2009}
- for details). *}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type local_theory: Proof.context} \\
- @{index_ML Named_Target.init: "(local_theory -> local_theory) ->
- string -> theory -> local_theory"} \\[1ex]
- @{index_ML Local_Theory.define: "(binding * mixfix) * (Attrib.binding * term) ->
- local_theory -> (term * (string * thm)) * local_theory"} \\
- @{index_ML Local_Theory.note: "Attrib.binding * thm list ->
- local_theory -> (string * thm list) * local_theory"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type local_theory} represents local theories.
- Although this is merely an alias for @{ML_type Proof.context}, it is
- semantically a subtype of the same: a @{ML_type local_theory} holds
- target information as special context data. Subtyping means that
- any value @{text "lthy:"}~@{ML_type local_theory} can also be used
- with operations expecting a regular @{text "ctxt:"}~@{ML_type
- Proof.context}.
-
- \item @{ML Named_Target.init}~@{text "before_exit name thy"}
- initializes a local theory derived from the given background theory.
- An empty name refers to a \emph{global theory} context, and a
- non-empty name refers to a @{command locale} or @{command class}
- context (a fully-qualified internal name is expected here). This is
- useful for experimentation --- normally the Isar toplevel already
- takes care to initialize the local theory context. The given @{text
- "before_exit"} function is invoked before leaving the context; in
- most situations plain identity @{ML I} is sufficient.
-
- \item @{ML Local_Theory.define}~@{text "((b, mx), (a, rhs))
- lthy"} defines a local entity according to the specification that is
- given relatively to the current @{text "lthy"} context. In
- particular the term of the RHS may refer to earlier local entities
- from the auxiliary context, or hypothetical parameters from the
- target context. The result is the newly defined term (which is
- always a fixed variable with exactly the same name as specified for
- the LHS), together with an equational theorem that states the
- definition as a hypothetical fact.
-
- Unless an explicit name binding is given for the RHS, the resulting
- fact will be called @{text "b_def"}. Any given attributes are
- applied to that same fact --- immediately in the auxiliary context
- \emph{and} in any transformed versions stemming from target-specific
- policies or any later interpretations of results from the target
- context (think of @{command locale} and @{command interpretation},
- for example). This means that attributes should usually be plain
- declarations such as @{attribute simp}, while non-trivial rules like
- @{attribute simplified} are better avoided.
-
- \item @{ML Local_Theory.note}~@{text "(a, ths) lthy"} is
- analogous to @{ML Local_Theory.define}, but defines facts instead of
- terms. There is also a slightly more general variant @{ML
- Local_Theory.notes} that defines several facts (with attribute
- expressions) simultaneously.
-
- This is essentially the internal version of the @{command lemmas}
- command, or @{command declare} if an empty name binding is given.
-
- \end{description}
-*}
-
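-text %mlex {* The following sketch (with hypothetical names) defines a
- local constant @{text "c"} as some given right-hand side and returns
- the defined term together with the definitional fact, which gets the
- default name @{text "c_def"} here:
-*}
-
-ML {*
- (*define c as the given rhs within the local theory context*)
- fun define_c rhs lthy =
- let
- val ((t, (_, def_thm)), lthy') =
- Local_Theory.define ((@{binding c}, NoSyn),
- ((Binding.empty, []), rhs)) lthy;
- in ((t, def_thm), lthy') end;
-*}
-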
-
-section {* Morphisms and declarations \label{sec:morphisms} *}
-
-text {* FIXME
-
- \medskip See also \cite{Chaieb-Wenzel:2007}.
-*}
-
-end
--- a/doc-src/IsarImplementation/Thy/Logic.thy Mon Aug 27 16:48:41 2012 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,1137 +0,0 @@
-theory Logic
-imports Base
-begin
-
-chapter {* Primitive logic \label{ch:logic} *}
-
-text {*
- The logical foundations of Isabelle/Isar are that of the Pure logic,
- which has been introduced as a Natural Deduction framework in
- \cite{paulson700}. This is essentially the same logic as ``@{text
- "\<lambda>HOL"}'' in the more abstract setting of Pure Type Systems (PTS)
- \cite{Barendregt-Geuvers:2001}, although there are some key
- differences in the specific treatment of simple types in
- Isabelle/Pure.
-
- Following type-theoretic parlance, the Pure logic consists of three
- levels of @{text "\<lambda>"}-calculus with corresponding arrows, @{text
- "\<Rightarrow>"} for syntactic function space (terms depending on terms), @{text
- "\<And>"} for universal quantification (proofs depending on terms), and
- @{text "\<Longrightarrow>"} for implication (proofs depending on proofs).
-
- Derivations are relative to a logical theory, which declares type
- constructors, constants, and axioms. Theory declarations support
- schematic polymorphism, which is strictly speaking outside the
- logic.\footnote{This is the deeper logical reason why the theory
- context @{text "\<Theta>"} is separate from the proof context @{text "\<Gamma>"}
- of the core calculus: type constructors, term constants, and facts
- (proof constants) may involve arbitrary type schemes, but the type
- of a locally fixed term parameter is also fixed!}
-*}
-
-
-section {* Types \label{sec:types} *}
-
-text {*
- The language of types is an uninterpreted order-sorted first-order
- algebra; types are qualified by ordered type classes.
-
- \medskip A \emph{type class} is an abstract syntactic entity
- declared in the theory context. The \emph{subclass relation} @{text
- "c\<^isub>1 \<subseteq> c\<^isub>2"} is specified by stating an acyclic
- generating relation; the transitive closure is maintained
- internally. The resulting relation is an ordering: reflexive,
- transitive, and antisymmetric.
-
- A \emph{sort} is a list of type classes written as @{text "s = {c\<^isub>1,
- \<dots>, c\<^isub>m}"}; it represents a symbolic intersection. Notationally, the
- curly braces are omitted for singleton intersections, i.e.\ any
- class @{text "c"} may be read as a sort @{text "{c}"}. The ordering
- on type classes is extended to sorts according to the meaning of
- intersections: @{text "{c\<^isub>1, \<dots> c\<^isub>m} \<subseteq> {d\<^isub>1, \<dots>, d\<^isub>n}"} iff @{text
- "\<forall>j. \<exists>i. c\<^isub>i \<subseteq> d\<^isub>j"}. The empty intersection @{text "{}"} refers to
- the universal sort, which is the largest element wrt.\ the sort
- order. Thus @{text "{}"} represents the ``full sort'', not the
- empty one! The intersection of all (finitely many) classes declared
- in the current theory is the least element wrt.\ the sort ordering.
-
- \medskip A \emph{fixed type variable} is a pair of a basic name
- (starting with a @{text "'"} character) and a sort constraint, e.g.\
- @{text "('a, s)"} which is usually printed as @{text "\<alpha>\<^isub>s"}.
- A \emph{schematic type variable} is a pair of an indexname and a
- sort constraint, e.g.\ @{text "(('a, 0), s)"} which is usually
- printed as @{text "?\<alpha>\<^isub>s"}.
-
- Note that \emph{all} syntactic components contribute to the identity
- of type variables: basic name, index, and sort constraint. The core
- logic handles type variables with the same name but different sorts
- as different, although the type-inference layer (which is outside
- the core) rejects anything like that.
-
- A \emph{type constructor} @{text "\<kappa>"} is a @{text "k"}-ary operator
- on types declared in the theory. Type constructor application is
- written postfix as @{text "(\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>k)\<kappa>"}. For
- @{text "k = 0"} the argument tuple is omitted, e.g.\ @{text "prop"}
- instead of @{text "()prop"}. For @{text "k = 1"} the parentheses
- are omitted, e.g.\ @{text "\<alpha> list"} instead of @{text "(\<alpha>)list"}.
- Further notation is provided for specific constructors, notably the
- right-associative infix @{text "\<alpha> \<Rightarrow> \<beta>"} instead of @{text "(\<alpha>,
- \<beta>)fun"}.
-
- The logical category \emph{type} is defined inductively over type
- variables and type constructors as follows: @{text "\<tau> = \<alpha>\<^isub>s | ?\<alpha>\<^isub>s |
- (\<tau>\<^sub>1, \<dots>, \<tau>\<^sub>k)\<kappa>"}.
-
- A \emph{type abbreviation} is a syntactic definition @{text
- "(\<^vec>\<alpha>)\<kappa> = \<tau>"} of an arbitrary type expression @{text "\<tau>"} over
- variables @{text "\<^vec>\<alpha>"}. Type abbreviations appear as type
- constructors in the syntax, but are expanded before entering the
- logical core.
-
- A \emph{type arity} declares the image behavior of a type
- constructor wrt.\ the algebra of sorts: @{text "\<kappa> :: (s\<^isub>1, \<dots>,
- s\<^isub>k)s"} means that @{text "(\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>k)\<kappa>"} is
- of sort @{text "s"} if every argument type @{text "\<tau>\<^isub>i"} is
- of sort @{text "s\<^isub>i"}. Arity declarations are implicitly
- completed, i.e.\ @{text "\<kappa> :: (\<^vec>s)c"} entails @{text "\<kappa> ::
- (\<^vec>s)c'"} for any @{text "c' \<supseteq> c"}.
-
- \medskip The sort algebra is always maintained as \emph{coregular},
- which means that type arities are consistent with the subclass
- relation: for any type constructor @{text "\<kappa>"}, and classes @{text
- "c\<^isub>1 \<subseteq> c\<^isub>2"}, and arities @{text "\<kappa> ::
- (\<^vec>s\<^isub>1)c\<^isub>1"} and @{text "\<kappa> ::
- (\<^vec>s\<^isub>2)c\<^isub>2"} holds @{text "\<^vec>s\<^isub>1 \<subseteq>
- \<^vec>s\<^isub>2"} component-wise.
-
- The key property of a coregular order-sorted algebra is that sort
- constraints can be solved in a most general fashion: for each type
- constructor @{text "\<kappa>"} and sort @{text "s"} there is a most general
- vector of argument sorts @{text "(s\<^isub>1, \<dots>, s\<^isub>k)"} such
- that a type scheme @{text "(\<alpha>\<^bsub>s\<^isub>1\<^esub>, \<dots>,
- \<alpha>\<^bsub>s\<^isub>k\<^esub>)\<kappa>"} is of sort @{text "s"}.
- Consequently, type unification has most general solutions (modulo
- equivalence of sorts), so type-inference produces primary types as
- expected \cite{nipkow-prehofer}.
-*}
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type class: string} \\
- @{index_ML_type sort: "class list"} \\
- @{index_ML_type arity: "string * sort list * sort"} \\
- @{index_ML_type typ} \\
- @{index_ML Term.map_atyps: "(typ -> typ) -> typ -> typ"} \\
- @{index_ML Term.fold_atyps: "(typ -> 'a -> 'a) -> typ -> 'a -> 'a"} \\
- \end{mldecls}
- \begin{mldecls}
- @{index_ML Sign.subsort: "theory -> sort * sort -> bool"} \\
- @{index_ML Sign.of_sort: "theory -> typ * sort -> bool"} \\
- @{index_ML Sign.add_type: "Proof.context -> binding * int * mixfix -> theory -> theory"} \\
- @{index_ML Sign.add_type_abbrev: "Proof.context ->
- binding * string list * typ -> theory -> theory"} \\
- @{index_ML Sign.primitive_class: "binding * class list -> theory -> theory"} \\
- @{index_ML Sign.primitive_classrel: "class * class -> theory -> theory"} \\
- @{index_ML Sign.primitive_arity: "arity -> theory -> theory"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type class} represents type classes.
-
- \item Type @{ML_type sort} represents sorts, i.e.\ finite
- intersections of classes. The empty list @{ML "[]: sort"} refers to
- the empty class intersection, i.e.\ the ``full sort''.
-
- \item Type @{ML_type arity} represents type arities. A triple
- @{text "(\<kappa>, \<^vec>s, s) : arity"} represents @{text "\<kappa> ::
- (\<^vec>s)s"} as described above.
-
- \item Type @{ML_type typ} represents types; this is a datatype with
- constructors @{ML TFree}, @{ML TVar}, @{ML Type}.
-
- \item @{ML Term.map_atyps}~@{text "f \<tau>"} applies the mapping @{text
- "f"} to all atomic types (@{ML TFree}, @{ML TVar}) occurring in
- @{text "\<tau>"}.
-
- \item @{ML Term.fold_atyps}~@{text "f \<tau>"} iterates the operation
- @{text "f"} over all occurrences of atomic types (@{ML TFree}, @{ML
- TVar}) in @{text "\<tau>"}; the type structure is traversed from left to
- right.
-
- \item @{ML Sign.subsort}~@{text "thy (s\<^isub>1, s\<^isub>2)"}
- tests the subsort relation @{text "s\<^isub>1 \<subseteq> s\<^isub>2"}.
-
- \item @{ML Sign.of_sort}~@{text "thy (\<tau>, s)"} tests whether type
- @{text "\<tau>"} is of sort @{text "s"}.
-
- \item @{ML Sign.add_type}~@{text "ctxt (\<kappa>, k, mx)"} declares a
- new type constructor @{text "\<kappa>"} with @{text "k"} arguments and
- optional mixfix syntax.
-
- \item @{ML Sign.add_type_abbrev}~@{text "ctxt (\<kappa>, \<^vec>\<alpha>, \<tau>)"}
- defines a new type abbreviation @{text "(\<^vec>\<alpha>)\<kappa> = \<tau>"}.
-
- \item @{ML Sign.primitive_class}~@{text "(c, [c\<^isub>1, \<dots>,
- c\<^isub>n])"} declares a new class @{text "c"}, together with class
- relations @{text "c \<subseteq> c\<^isub>i"}, for @{text "i = 1, \<dots>, n"}.
-
- \item @{ML Sign.primitive_classrel}~@{text "(c\<^isub>1,
- c\<^isub>2)"} declares the class relation @{text "c\<^isub>1 \<subseteq>
- c\<^isub>2"}.
-
- \item @{ML Sign.primitive_arity}~@{text "(\<kappa>, \<^vec>s, s)"} declares
- the arity @{text "\<kappa> :: (\<^vec>s)s"}.
-
- \end{description}
-*}
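-
-text %mlex {* A small check along these lines (a sketch that assumes
- the type @{text "nat"} and class @{text "type"} of the enclosing
- object-logic):
-*}
-
-ML_val {*
- (*nat belongs to the default sort of the object-logic*)
- @{assert} (Sign.of_sort @{theory} (@{typ nat}, @{sort type}));
-*}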
-
-text %mlantiq {*
- \begin{matharray}{rcl}
- @{ML_antiquotation_def "class"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "sort"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "type_name"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "type_abbrev"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "nonterminal"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "typ"} & : & @{text ML_antiquotation} \\
- \end{matharray}
-
- @{rail "
- @@{ML_antiquotation class} nameref
- ;
- @@{ML_antiquotation sort} sort
- ;
- (@@{ML_antiquotation type_name} |
- @@{ML_antiquotation type_abbrev} |
- @@{ML_antiquotation nonterminal}) nameref
- ;
- @@{ML_antiquotation typ} type
- "}
-
- \begin{description}
-
- \item @{text "@{class c}"} inlines the internalized class @{text
- "c"} --- as @{ML_type string} literal.
-
- \item @{text "@{sort s}"} inlines the internalized sort @{text "s"}
- --- as @{ML_type "string list"} literal.
-
- \item @{text "@{type_name c}"} inlines the internalized type
- constructor @{text "c"} --- as @{ML_type string} literal.
-
- \item @{text "@{type_abbrev c}"} inlines the internalized type
- abbreviation @{text "c"} --- as @{ML_type string} literal.
-
- \item @{text "@{nonterminal c}"} inlines the internalized syntactic
- type~/ grammar nonterminal @{text "c"} --- as @{ML_type string}
- literal.
-
- \item @{text "@{typ \<tau>}"} inlines the internalized type @{text "\<tau>"}
- --- as constructor term for datatype @{ML_type typ}.
-
- \end{description}
-*}
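-
-text %mlex {* These antiquotations turn into compile-time constants.
-  For example, in an Isabelle/HOL context the following bindings hold
-  internalized names and type values; the results indicated in the
-  comments are indicative only. *}
-
-ML_val {*
-  val c = @{type_name nat};   (*"Nat.nat"*)
-  val s = @{sort type};       (*["HOL.type"]*)
-  val T = @{typ "nat list"};
-    (*Type ("List.list", [Type ("Nat.nat", [])])*)
-*}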
-
-
-section {* Terms \label{sec:terms} *}
-
-text {*
- The language of terms is that of simply-typed @{text "\<lambda>"}-calculus
- with de-Bruijn indices for bound variables (cf.\ \cite{debruijn72}
- or \cite{paulson-ml2}), with the types being determined by the
- corresponding binders. In contrast, free variables and constants
- have an explicit name and type in each occurrence.
-
- \medskip A \emph{bound variable} is a natural number @{text "b"},
- which accounts for the number of intermediate binders between the
- variable occurrence in the body and its binding position. For
- example, the de-Bruijn term @{text "\<lambda>\<^bsub>bool\<^esub>. \<lambda>\<^bsub>bool\<^esub>. 1 \<and> 0"} would
- correspond to @{text "\<lambda>x\<^bsub>bool\<^esub>. \<lambda>y\<^bsub>bool\<^esub>. x \<and> y"} in a named
- representation. Note that a bound variable may be represented by
- different de-Bruijn indices at different occurrences, depending on
- the nesting of abstractions.
-
- A \emph{loose variable} is a bound variable that is outside the
- scope of local binders. The types (and names) for loose variables
-  can be managed as a separate context that is maintained as a stack
- of hypothetical binders. The core logic operates on closed terms,
- without any loose variables.
-
- A \emph{fixed variable} is a pair of a basic name and a type, e.g.\
- @{text "(x, \<tau>)"} which is usually printed @{text "x\<^isub>\<tau>"} here. A
- \emph{schematic variable} is a pair of an indexname and a type,
- e.g.\ @{text "((x, 0), \<tau>)"} which is likewise printed as @{text
- "?x\<^isub>\<tau>"}.
-
- \medskip A \emph{constant} is a pair of a basic name and a type,
- e.g.\ @{text "(c, \<tau>)"} which is usually printed as @{text "c\<^isub>\<tau>"}
- here. Constants are declared in the context as polymorphic families
- @{text "c :: \<sigma>"}, meaning that all substitution instances @{text
- "c\<^isub>\<tau>"} for @{text "\<tau> = \<sigma>\<vartheta>"} are valid.
-
- The vector of \emph{type arguments} of constant @{text "c\<^isub>\<tau>"} wrt.\
- the declaration @{text "c :: \<sigma>"} is defined as the codomain of the
- matcher @{text "\<vartheta> = {?\<alpha>\<^isub>1 \<mapsto> \<tau>\<^isub>1, \<dots>, ?\<alpha>\<^isub>n \<mapsto> \<tau>\<^isub>n}"} presented in
- canonical order @{text "(\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>n)"}, corresponding to the
- left-to-right occurrences of the @{text "\<alpha>\<^isub>i"} in @{text "\<sigma>"}.
- Within a given theory context, there is a one-to-one correspondence
- between any constant @{text "c\<^isub>\<tau>"} and the application @{text "c(\<tau>\<^isub>1,
- \<dots>, \<tau>\<^isub>n)"} of its type arguments. For example, with @{text "plus :: \<alpha>
- \<Rightarrow> \<alpha> \<Rightarrow> \<alpha>"}, the instance @{text "plus\<^bsub>nat \<Rightarrow> nat \<Rightarrow> nat\<^esub>"} corresponds to
- @{text "plus(nat)"}.
-
- Constant declarations @{text "c :: \<sigma>"} may contain sort constraints
- for type variables in @{text "\<sigma>"}. These are observed by
- type-inference as expected, but \emph{ignored} by the core logic.
- This means the primitive logic is able to reason with instances of
- polymorphic constants that the user-level type-checker would reject
- due to violation of type class restrictions.
-
- \medskip An \emph{atomic term} is either a variable or constant.
- The logical category \emph{term} is defined inductively over atomic
- terms, with abstraction and application as follows: @{text "t = b |
- x\<^isub>\<tau> | ?x\<^isub>\<tau> | c\<^isub>\<tau> | \<lambda>\<^isub>\<tau>. t | t\<^isub>1 t\<^isub>2"}. Parsing and printing takes care of
-  converting between this internal representation and an external one
-  with named bound variables.  Subsequently, we shall use the named
-  notation instead of the internal de-Bruijn representation.
-
- The inductive relation @{text "t :: \<tau>"} assigns a (unique) type to a
- term according to the structure of atomic terms, abstractions, and
-  applications:
- \[
- \infer{@{text "a\<^isub>\<tau> :: \<tau>"}}{}
- \qquad
- \infer{@{text "(\<lambda>x\<^sub>\<tau>. t) :: \<tau> \<Rightarrow> \<sigma>"}}{@{text "t :: \<sigma>"}}
- \qquad
- \infer{@{text "t u :: \<sigma>"}}{@{text "t :: \<tau> \<Rightarrow> \<sigma>"} & @{text "u :: \<tau>"}}
- \]
- A \emph{well-typed term} is a term that can be typed according to these rules.
-
- Typing information can be omitted: type-inference is able to
- reconstruct the most general type of a raw term, while assigning
- most general types to all of its variables and constants.
- Type-inference depends on a context of type constraints for fixed
- variables, and declarations for polymorphic constants.
-
-  The identity of atomic terms consists of both the name and the type
-  component.  This means that different variables @{text
- "x\<^bsub>\<tau>\<^isub>1\<^esub>"} and @{text "x\<^bsub>\<tau>\<^isub>2\<^esub>"} may become the same after
- type instantiation. Type-inference rejects variables of the same
- name, but different types. In contrast, mixed instances of
- polymorphic constants occur routinely.
-
- \medskip The \emph{hidden polymorphism} of a term @{text "t :: \<sigma>"}
- is the set of type variables occurring in @{text "t"}, but not in
- its type @{text "\<sigma>"}. This means that the term implicitly depends
-  on type arguments that are not accounted for in the result type, i.e.\
- there are different type instances @{text "t\<vartheta> :: \<sigma>"} and
- @{text "t\<vartheta>' :: \<sigma>"} with the same type. This slightly
- pathological situation notoriously demands additional care.
-
- \medskip A \emph{term abbreviation} is a syntactic definition @{text
- "c\<^isub>\<sigma> \<equiv> t"} of a closed term @{text "t"} of type @{text "\<sigma>"},
- without any hidden polymorphism. A term abbreviation looks like a
- constant in the syntax, but is expanded before entering the logical
- core. Abbreviations are usually reverted when printing terms, using
- @{text "t \<rightarrow> c\<^isub>\<sigma>"} as rules for higher-order rewriting.
-
- \medskip Canonical operations on @{text "\<lambda>"}-terms include @{text
- "\<alpha>\<beta>\<eta>"}-conversion: @{text "\<alpha>"}-conversion refers to capture-free
- renaming of bound variables; @{text "\<beta>"}-conversion contracts an
- abstraction applied to an argument term, substituting the argument
- in the body: @{text "(\<lambda>x. b)a"} becomes @{text "b[a/x]"}; @{text
- "\<eta>"}-conversion contracts vacuous application-abstraction: @{text
- "\<lambda>x. f x"} becomes @{text "f"}, provided that the bound variable
- does not occur in @{text "f"}.
-
- Terms are normally treated modulo @{text "\<alpha>"}-conversion, which is
- implicit in the de-Bruijn representation. Names for bound variables
- in abstractions are maintained separately as (meaningless) comments,
- mostly for parsing and printing. Full @{text "\<alpha>\<beta>\<eta>"}-conversion is
- commonplace in various standard operations (\secref{sec:obj-rules})
- that are based on higher-order unification and matching.
-*}
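-
-text %mlex {* The following sketch makes the de-Bruijn representation
-  concrete, re-using the example @{text "\<lambda>x\<^bsub>bool\<^esub>. \<lambda>y\<^bsub>bool\<^esub>. x \<and> y"} from
-  above (with HOL conjunction); the comment indicates the resulting
-  datatype value. *}
-
-ML_val {*
-  val t = @{term "\<lambda>x y. x \<and> y"};
-    (*Abs ("x", Type ("HOL.bool", []),
-        Abs ("y", Type ("HOL.bool", []),
-          Const ("HOL.conj", ...) $ Bound 1 $ Bound 0))*)
-*}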
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML_type term} \\
- @{index_ML_op "aconv": "term * term -> bool"} \\
- @{index_ML Term.map_types: "(typ -> typ) -> term -> term"} \\
- @{index_ML Term.fold_types: "(typ -> 'a -> 'a) -> term -> 'a -> 'a"} \\
- @{index_ML Term.map_aterms: "(term -> term) -> term -> term"} \\
- @{index_ML Term.fold_aterms: "(term -> 'a -> 'a) -> term -> 'a -> 'a"} \\
- \end{mldecls}
- \begin{mldecls}
- @{index_ML fastype_of: "term -> typ"} \\
- @{index_ML lambda: "term -> term -> term"} \\
- @{index_ML betapply: "term * term -> term"} \\
- @{index_ML incr_boundvars: "int -> term -> term"} \\
- @{index_ML Sign.declare_const: "Proof.context ->
- (binding * typ) * mixfix -> theory -> term * theory"} \\
- @{index_ML Sign.add_abbrev: "string -> binding * term ->
- theory -> (term * term) * theory"} \\
- @{index_ML Sign.const_typargs: "theory -> string * typ -> typ list"} \\
- @{index_ML Sign.const_instance: "theory -> string * typ list -> typ"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item Type @{ML_type term} represents de-Bruijn terms, with comments
- in abstractions, and explicitly named free variables and constants;
- this is a datatype with constructors @{ML Bound}, @{ML Free}, @{ML
- Var}, @{ML Const}, @{ML Abs}, @{ML_op "$"}.
-
- \item @{text "t"}~@{ML_text aconv}~@{text "u"} checks @{text
- "\<alpha>"}-equivalence of two terms. This is the basic equality relation
- on type @{ML_type term}; raw datatype equality should only be used
- for operations related to parsing or printing!
-
- \item @{ML Term.map_types}~@{text "f t"} applies the mapping @{text
- "f"} to all types occurring in @{text "t"}.
-
- \item @{ML Term.fold_types}~@{text "f t"} iterates the operation
- @{text "f"} over all occurrences of types in @{text "t"}; the term
- structure is traversed from left to right.
-
- \item @{ML Term.map_aterms}~@{text "f t"} applies the mapping @{text
- "f"} to all atomic terms (@{ML Bound}, @{ML Free}, @{ML Var}, @{ML
- Const}) occurring in @{text "t"}.
-
- \item @{ML Term.fold_aterms}~@{text "f t"} iterates the operation
- @{text "f"} over all occurrences of atomic terms (@{ML Bound}, @{ML
- Free}, @{ML Var}, @{ML Const}) in @{text "t"}; the term structure is
- traversed from left to right.
-
- \item @{ML fastype_of}~@{text "t"} determines the type of a
- well-typed term. This operation is relatively slow, despite the
- omission of any sanity checks.
-
- \item @{ML lambda}~@{text "a b"} produces an abstraction @{text
- "\<lambda>a. b"}, where occurrences of the atomic term @{text "a"} in the
- body @{text "b"} are replaced by bound variables.
-
- \item @{ML betapply}~@{text "(t, u)"} produces an application @{text
- "t u"}, with topmost @{text "\<beta>"}-conversion if @{text "t"} is an
- abstraction.
-
- \item @{ML incr_boundvars}~@{text "j"} increments a term's dangling
- bound variables by the offset @{text "j"}. This is required when
- moving a subterm into a context where it is enclosed by a different
- number of abstractions. Bound variables with a matching abstraction
- are unaffected.
-
- \item @{ML Sign.declare_const}~@{text "ctxt ((c, \<sigma>), mx)"} declares
- a new constant @{text "c :: \<sigma>"} with optional mixfix syntax.
-
- \item @{ML Sign.add_abbrev}~@{text "print_mode (c, t)"}
- introduces a new term abbreviation @{text "c \<equiv> t"}.
-
- \item @{ML Sign.const_typargs}~@{text "thy (c, \<tau>)"} and @{ML
- Sign.const_instance}~@{text "thy (c, [\<tau>\<^isub>1, \<dots>, \<tau>\<^isub>n])"}
- convert between two representations of polymorphic constants: full
- type instance vs.\ compact type arguments form.
-
- \end{description}
-*}
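-
-text %mlex {* A small sketch of term construction with the above
-  operations: @{ML lambda} builds an abstraction, @{ML betapply}
-  applies it with topmost @{text "\<beta>"}-contraction, and @{ML
-  fastype_of} recovers the type.  HOL addition on @{text "nat"} is an
-  arbitrary example here. *}
-
-ML_val {*
-  val x = Free ("x", @{typ nat});
-  val body = @{term "plus :: nat \<Rightarrow> nat \<Rightarrow> nat"} $ x $ x;
-
-  val f = lambda x body;   (*\<lambda>x. x + x, with x turned into Bound 0*)
-  val t = betapply (f, @{term "1 :: nat"});   (*\<beta>-contracted result*)
-  val T = fastype_of t;    (*nat*)
-*}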
-
-text %mlantiq {*
- \begin{matharray}{rcl}
- @{ML_antiquotation_def "const_name"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "const_abbrev"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "const"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "term"} & : & @{text ML_antiquotation} \\
- @{ML_antiquotation_def "prop"} & : & @{text ML_antiquotation} \\
- \end{matharray}
-
- @{rail "
- (@@{ML_antiquotation const_name} |
- @@{ML_antiquotation const_abbrev}) nameref
- ;
-  @@{ML_antiquotation const} nameref ('(' (type + ',') ')')?
- ;
- @@{ML_antiquotation term} term
- ;
- @@{ML_antiquotation prop} prop
- "}
-
- \begin{description}
-
- \item @{text "@{const_name c}"} inlines the internalized logical
- constant name @{text "c"} --- as @{ML_type string} literal.
-
- \item @{text "@{const_abbrev c}"} inlines the internalized
- abbreviated constant name @{text "c"} --- as @{ML_type string}
- literal.
-
- \item @{text "@{const c(\<^vec>\<tau>)}"} inlines the internalized
- constant @{text "c"} with precise type instantiation in the sense of
- @{ML Sign.const_instance} --- as @{ML Const} constructor term for
- datatype @{ML_type term}.
-
- \item @{text "@{term t}"} inlines the internalized term @{text "t"}
- --- as constructor term for datatype @{ML_type term}.
-
- \item @{text "@{prop \<phi>}"} inlines the internalized proposition
- @{text "\<phi>"} --- as constructor term for datatype @{ML_type term}.
-
- \end{description}
-*}
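-
-text %mlex {* Again these antiquotations are compile-time values.  In
-  an Isabelle/HOL context, for example (results in comments are
-  indicative): *}
-
-ML_val {*
-  val c = @{const_name plus};   (*"Groups.plus_class.plus"*)
-  val t = @{const plus (nat)};  (*the instance at type nat*)
-  val ok = t aconv @{term "plus :: nat \<Rightarrow> nat \<Rightarrow> nat"};   (*true*)
-*}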
-
-
-section {* Theorems \label{sec:thms} *}
-
-text {*
- A \emph{proposition} is a well-typed term of type @{text "prop"}, a
- \emph{theorem} is a proven proposition (depending on a context of
- hypotheses and the background theory). Primitive inferences include
- plain Natural Deduction rules for the primary connectives @{text
- "\<And>"} and @{text "\<Longrightarrow>"} of the framework. There is also a builtin
- notion of equality/equivalence @{text "\<equiv>"}.
-*}
-
-
-subsection {* Primitive connectives and rules \label{sec:prim-rules} *}
-
-text {*
- The theory @{text "Pure"} contains constant declarations for the
- primitive connectives @{text "\<And>"}, @{text "\<Longrightarrow>"}, and @{text "\<equiv>"} of
- the logical framework, see \figref{fig:pure-connectives}. The
- derivability judgment @{text "A\<^isub>1, \<dots>, A\<^isub>n \<turnstile> B"} is
- defined inductively by the primitive inferences given in
- \figref{fig:prim-rules}, with the global restriction that the
- hypotheses must \emph{not} contain any schematic variables. The
- builtin equality is conceptually axiomatized as shown in
- \figref{fig:pure-equality}, although the implementation works
- directly with derived inferences.
-
- \begin{figure}[htb]
- \begin{center}
- \begin{tabular}{ll}
- @{text "all :: (\<alpha> \<Rightarrow> prop) \<Rightarrow> prop"} & universal quantification (binder @{text "\<And>"}) \\
- @{text "\<Longrightarrow> :: prop \<Rightarrow> prop \<Rightarrow> prop"} & implication (right associative infix) \\
- @{text "\<equiv> :: \<alpha> \<Rightarrow> \<alpha> \<Rightarrow> prop"} & equality relation (infix) \\
- \end{tabular}
- \caption{Primitive connectives of Pure}\label{fig:pure-connectives}
- \end{center}
- \end{figure}
-
- \begin{figure}[htb]
- \begin{center}
- \[
- \infer[@{text "(axiom)"}]{@{text "\<turnstile> A"}}{@{text "A \<in> \<Theta>"}}
- \qquad
- \infer[@{text "(assume)"}]{@{text "A \<turnstile> A"}}{}
- \]
- \[
- \infer[@{text "(\<And>\<hyphen>intro)"}]{@{text "\<Gamma> \<turnstile> \<And>x. b[x]"}}{@{text "\<Gamma> \<turnstile> b[x]"} & @{text "x \<notin> \<Gamma>"}}
- \qquad
- \infer[@{text "(\<And>\<hyphen>elim)"}]{@{text "\<Gamma> \<turnstile> b[a]"}}{@{text "\<Gamma> \<turnstile> \<And>x. b[x]"}}
- \]
- \[
- \infer[@{text "(\<Longrightarrow>\<hyphen>intro)"}]{@{text "\<Gamma> - A \<turnstile> A \<Longrightarrow> B"}}{@{text "\<Gamma> \<turnstile> B"}}
- \qquad
- \infer[@{text "(\<Longrightarrow>\<hyphen>elim)"}]{@{text "\<Gamma>\<^sub>1 \<union> \<Gamma>\<^sub>2 \<turnstile> B"}}{@{text "\<Gamma>\<^sub>1 \<turnstile> A \<Longrightarrow> B"} & @{text "\<Gamma>\<^sub>2 \<turnstile> A"}}
- \]
- \caption{Primitive inferences of Pure}\label{fig:prim-rules}
- \end{center}
- \end{figure}
-
- \begin{figure}[htb]
- \begin{center}
- \begin{tabular}{ll}
- @{text "\<turnstile> (\<lambda>x. b[x]) a \<equiv> b[a]"} & @{text "\<beta>"}-conversion \\
- @{text "\<turnstile> x \<equiv> x"} & reflexivity \\
- @{text "\<turnstile> x \<equiv> y \<Longrightarrow> P x \<Longrightarrow> P y"} & substitution \\
- @{text "\<turnstile> (\<And>x. f x \<equiv> g x) \<Longrightarrow> f \<equiv> g"} & extensionality \\
- @{text "\<turnstile> (A \<Longrightarrow> B) \<Longrightarrow> (B \<Longrightarrow> A) \<Longrightarrow> A \<equiv> B"} & logical equivalence \\
- \end{tabular}
- \caption{Conceptual axiomatization of Pure equality}\label{fig:pure-equality}
- \end{center}
- \end{figure}
-
- The introduction and elimination rules for @{text "\<And>"} and @{text
- "\<Longrightarrow>"} are analogous to formation of dependently typed @{text
- "\<lambda>"}-terms representing the underlying proof objects. Proof terms
- are irrelevant in the Pure logic, though; they cannot occur within
- propositions. The system provides a runtime option to record
- explicit proof terms for primitive inferences. Thus all three
- levels of @{text "\<lambda>"}-calculus become explicit: @{text "\<Rightarrow>"} for
- terms, and @{text "\<And>/\<Longrightarrow>"} for proofs (cf.\
- \cite{Berghofer-Nipkow:2000:TPHOL}).
-
- Observe that locally fixed parameters (as in @{text
- "\<And>\<hyphen>intro"}) need not be recorded in the hypotheses, because
- the simple syntactic types of Pure are always inhabitable.
- ``Assumptions'' @{text "x :: \<tau>"} for type-membership are only
- present as long as some @{text "x\<^isub>\<tau>"} occurs in the statement
- body.\footnote{This is the key difference to ``@{text "\<lambda>HOL"}'' in
- the PTS framework \cite{Barendregt-Geuvers:2001}, where hypotheses
- @{text "x : A"} are treated uniformly for propositions and types.}
-
- \medskip The axiomatization of a theory is implicitly closed by
- forming all instances of type and term variables: @{text "\<turnstile>
- A\<vartheta>"} holds for any substitution instance of an axiom
- @{text "\<turnstile> A"}. By pushing substitutions through derivations
- inductively, we also get admissible @{text "generalize"} and @{text
- "instantiate"} rules as shown in \figref{fig:subst-rules}.
-
- \begin{figure}[htb]
- \begin{center}
- \[
- \infer{@{text "\<Gamma> \<turnstile> B[?\<alpha>]"}}{@{text "\<Gamma> \<turnstile> B[\<alpha>]"} & @{text "\<alpha> \<notin> \<Gamma>"}}
- \quad
- \infer[\quad@{text "(generalize)"}]{@{text "\<Gamma> \<turnstile> B[?x]"}}{@{text "\<Gamma> \<turnstile> B[x]"} & @{text "x \<notin> \<Gamma>"}}
- \]
- \[
- \infer{@{text "\<Gamma> \<turnstile> B[\<tau>]"}}{@{text "\<Gamma> \<turnstile> B[?\<alpha>]"}}
- \quad
- \infer[\quad@{text "(instantiate)"}]{@{text "\<Gamma> \<turnstile> B[t]"}}{@{text "\<Gamma> \<turnstile> B[?x]"}}
- \]
- \caption{Admissible substitution rules}\label{fig:subst-rules}
- \end{center}
- \end{figure}
-
- Note that @{text "instantiate"} does not require an explicit
- side-condition, because @{text "\<Gamma>"} may never contain schematic
- variables.
-
- In principle, variables could be substituted in hypotheses as well,
- but this would disrupt the monotonicity of reasoning: deriving
- @{text "\<Gamma>\<vartheta> \<turnstile> B\<vartheta>"} from @{text "\<Gamma> \<turnstile> B"} is
- correct, but @{text "\<Gamma>\<vartheta> \<supseteq> \<Gamma>"} does not necessarily hold:
- the result belongs to a different proof context.
-
- \medskip An \emph{oracle} is a function that produces axioms on the
- fly. Logically, this is an instance of the @{text "axiom"} rule
- (\figref{fig:prim-rules}), but there is an operational difference.
- The system always records oracle invocations within derivations of
- theorems by a unique tag.
-
- Axiomatizations should be limited to the bare minimum, typically as
- part of the initial logical basis of an object-logic formalization.
- Later on, theories are usually developed in a strictly definitional
- fashion, by stating only certain equalities over new constants.
-
- A \emph{simple definition} consists of a constant declaration @{text
- "c :: \<sigma>"} together with an axiom @{text "\<turnstile> c \<equiv> t"}, where @{text "t
- :: \<sigma>"} is a closed term without any hidden polymorphism. The RHS
- may depend on further defined constants, but not @{text "c"} itself.
- Definitions of functions may be presented as @{text "c \<^vec>x \<equiv>
- t"} instead of the puristic @{text "c \<equiv> \<lambda>\<^vec>x. t"}.
-
- An \emph{overloaded definition} consists of a collection of axioms
-  for the same constant, with at most one equation @{text
- "c((\<^vec>\<alpha>)\<kappa>) \<equiv> t"} for each type constructor @{text "\<kappa>"} (for
- distinct variables @{text "\<^vec>\<alpha>"}). The RHS may mention
- previously defined constants as above, or arbitrary constants @{text
- "d(\<alpha>\<^isub>i)"} for some @{text "\<alpha>\<^isub>i"} projected from @{text
- "\<^vec>\<alpha>"}. Thus overloaded definitions essentially work by
- primitive recursion over the syntactic structure of a single type
- argument. See also \cite[\S4.3]{Haftmann-Wenzel:2006:classes}.
-*}
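-
-text %mlex {* Anticipating the ML interfaces enumerated below, here is
-  a minimal sketch of a primitive derivation of @{text "\<turnstile> A \<Longrightarrow> A"}, for
-  an arbitrary example proposition @{text "A"} without schematic
-  variables. *}
-
-ML_val {*
-  val A = Thm.cterm_of @{theory} @{prop "x = (x::nat)"};
-  val th1 = Thm.assume A;             (*A \<turnstile> A*)
-  val th2 = Thm.implies_intr A th1;   (*\<turnstile> A \<Longrightarrow> A*)
-*}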
-
-text %mlref {*
- \begin{mldecls}
- @{index_ML Logic.all: "term -> term -> term"} \\
- @{index_ML Logic.mk_implies: "term * term -> term"} \\
- \end{mldecls}
- \begin{mldecls}
- @{index_ML_type ctyp} \\
- @{index_ML_type cterm} \\
- @{index_ML Thm.ctyp_of: "theory -> typ -> ctyp"} \\
- @{index_ML Thm.cterm_of: "theory -> term -> cterm"} \\
- @{index_ML Thm.apply: "cterm -> cterm -> cterm"} \\
- @{index_ML Thm.lambda: "cterm -> cterm -> cterm"} \\
- @{index_ML Thm.all: "cterm -> cterm -> cterm"} \\
- @{index_ML Drule.mk_implies: "cterm * cterm -> cterm"} \\
- \end{mldecls}
- \begin{mldecls}
- @{index_ML_type thm} \\
- @{index_ML proofs: "int Unsynchronized.ref"} \\
- @{index_ML Thm.transfer: "theory -> thm -> thm"} \\
- @{index_ML Thm.assume: "cterm -> thm"} \\
- @{index_ML Thm.forall_intr: "cterm -> thm -> thm"} \\
- @{index_ML Thm.forall_elim: "cterm -> thm -> thm"} \\
- @{index_ML Thm.implies_intr: "cterm -> thm -> thm"} \\
- @{index_ML Thm.implies_elim: "thm -> thm -> thm"} \\
- @{index_ML Thm.generalize: "string list * string list -> int -> thm -> thm"} \\
- @{index_ML Thm.instantiate: "(ctyp * ctyp) list * (cterm * cterm) list -> thm -> thm"} \\
- @{index_ML Thm.add_axiom: "Proof.context ->
- binding * term -> theory -> (string * thm) * theory"} \\
- @{index_ML Thm.add_oracle: "binding * ('a -> cterm) -> theory ->
- (string * ('a -> thm)) * theory"} \\
- @{index_ML Thm.add_def: "Proof.context -> bool -> bool ->
- binding * term -> theory -> (string * thm) * theory"} \\
- \end{mldecls}
- \begin{mldecls}
- @{index_ML Theory.add_deps: "Proof.context -> string ->
- string * typ -> (string * typ) list -> theory -> theory"} \\
- \end{mldecls}
-
- \begin{description}
-
- \item @{ML Logic.all}~@{text "a B"} produces a Pure quantification
- @{text "\<And>a. B"}, where occurrences of the atomic term @{text "a"} in
- the body proposition @{text "B"} are replaced by bound variables.
- (See also @{ML lambda} on terms.)
-
- \item @{ML Logic.mk_implies}~@{text "(A, B)"} produces a Pure
- implication @{text "A \<Longrightarrow> B"}.
-
- \item Types @{ML_type ctyp} and @{ML_type cterm} represent certified
- types and terms, respectively. These are abstract datatypes that
-  guarantee that their values have passed the full well-formedness (and
- well-typedness) checks, relative to the declarations of type
- constructors, constants etc.\ in the background theory. The
- abstract types @{ML_type ctyp} and @{ML_type cterm} are part of the
- same inference kernel that is mainly responsible for @{ML_type thm}.
- Thus syntactic operations on @{ML_type ctyp} and @{ML_type cterm}
- are located in the @{ML_struct Thm} module, even though theorems are
- not yet involved at that stage.
-
- \item @{ML Thm.ctyp_of}~@{text "thy \<tau>"} and @{ML
-  Thm.cterm_of}~@{text "thy t"} explicitly check types and terms,
-  respectively.  This also involves some basic normalizations, such as
-  expansion of type and term abbreviations from the theory context.
- Full re-certification is relatively slow and should be avoided in
- tight reasoning loops.
-
- \item @{ML Thm.apply}, @{ML Thm.lambda}, @{ML Thm.all}, @{ML
- Drule.mk_implies} etc.\ compose certified terms (or propositions)
- incrementally. This is equivalent to @{ML Thm.cterm_of} after
- unchecked @{ML_op "$"}, @{ML lambda}, @{ML Logic.all}, @{ML
- Logic.mk_implies} etc., but there can be a big difference in
- performance when large existing entities are composed by a few extra
- constructions on top. There are separate operations to decompose
- certified terms and theorems to produce certified terms again.
-
- \item Type @{ML_type thm} represents proven propositions. This is
- an abstract datatype that guarantees that its values have been
- constructed by basic principles of the @{ML_struct Thm} module.
- Every @{ML_type thm} value contains a sliding back-reference to the
- enclosing theory, cf.\ \secref{sec:context-theory}.
-
- \item @{ML proofs} specifies the detail of proof recording within
- @{ML_type thm} values: @{ML 0} records only the names of oracles,
- @{ML 1} records oracle names and propositions, @{ML 2} additionally
- records full proof terms. Officially named theorems that contribute
- to a result are recorded in any case.
-
- \item @{ML Thm.transfer}~@{text "thy thm"} transfers the given
- theorem to a \emph{larger} theory, see also \secref{sec:context}.
- This formal adjustment of the background context has no logical
- significance, but is occasionally required for formal reasons, e.g.\
- when theorems that are imported from more basic theories are used in
- the current situation.
-
- \item @{ML Thm.assume}, @{ML Thm.forall_intr}, @{ML
- Thm.forall_elim}, @{ML Thm.implies_intr}, and @{ML Thm.implies_elim}
- correspond to the primitive inferences of \figref{fig:prim-rules}.
-
- \item @{ML Thm.generalize}~@{text "(\<^vec>\<alpha>, \<^vec>x)"}
- corresponds to the @{text "generalize"} rules of
- \figref{fig:subst-rules}. Here collections of type and term
- variables are generalized simultaneously, specified by the given
- basic names.
-
- \item @{ML Thm.instantiate}~@{text "(\<^vec>\<alpha>\<^isub>s,
- \<^vec>x\<^isub>\<tau>)"} corresponds to the @{text "instantiate"} rules
- of \figref{fig:subst-rules}. Type variables are substituted before
- term variables. Note that the types in @{text "\<^vec>x\<^isub>\<tau>"}
- refer to the instantiated versions.
-
- \item @{ML Thm.add_axiom}~@{text "ctxt (name, A)"} declares an
- arbitrary proposition as axiom, and retrieves it as a theorem from
- the resulting theory, cf.\ @{text "axiom"} in
- \figref{fig:prim-rules}. Note that the low-level representation in
- the axiom table may differ slightly from the returned theorem.
-