tactic hammer documentation (from Jasmin)
authordesharna
Tue, 25 Feb 2025 17:43:58 +0100 (2 months ago)
changeset 82208 bab8158a02f0
parent 82207 7e45a83373c8
child 82209 a8e92f663481
tactic hammer documentation (from Jasmin)
src/Doc/Sledgehammer/document/root.tex
--- a/src/Doc/Sledgehammer/document/root.tex	Tue Feb 25 17:43:53 2025 +0100
+++ b/src/Doc/Sledgehammer/document/root.tex	Tue Feb 25 17:43:58 2025 +0100
@@ -101,9 +101,9 @@
 \section{Introduction}
 \label{introduction}
 
-Sledgehammer is a tool that applies automatic theorem provers (ATPs)
-and satisfiability-modulo-theories (SMT) solvers on the current goal, mostly
-to find proofs but also to refute the goal.%
+Sledgehammer is a tool that applies automatic theorem provers (ATPs),
+satisfiability-modulo-theories (SMT) solvers, and Isabelle proof methods on the
+current goal, mostly to find proofs but optionally also to refute the goal.%
 \footnote{The distinction between ATPs and SMT solvers is mostly
 historical but convenient.}
 %
@@ -115,17 +115,21 @@
 via the System\-On\-TPTP web service \cite{sutcliffe-2000}. The supported SMT
 solvers are CVC4 \cite{cvc4}, cvc5 \cite{barbosa-et-al-cvc5}, veriT
 \cite{bouton-et-al-2009}, and Z3 \cite{de-moura-2008}. These are always run
-locally.
+locally. The supported proof methods are \textit{algebra}, \textit{argo},
+\textit{auto}, \textit{blast}, \textit{fastforce}, \textit{force},
+\textit{linarith}, \textit{meson}, \textit{metis}, \textit{order},
+\textit{presburger}, \textit{satx}, and \textit{simp}. The proof method support
+is experimental and disabled by default.
 
-The problem passed to the automatic provers (or solvers) consists of your
-current goal together with a heuristic selection of hundreds of facts (theorems)
-from the current theory context, filtered by relevance.
+The problem passed to the ATPs, SMT solvers, or proof methods consists
+of your current goal together with a heuristic selection of facts (theorems)
+from the current theory context, filtered by likely relevance.
 
 The result of a successful proof search is some source text that typically
-reconstructs the proof within Isabelle. For ATPs, the reconstructed proof
-typically relies on the general-purpose \textit{metis} proof method, which
-integrates the Metis ATP in Isabelle/HOL with explicit inferences going through
-the kernel. Thus its results are correct by construction.
+reconstructs the proof within Isabelle. The reconstructed proof often relies on
+the general-purpose \textit{metis} proof method, which integrates the Metis ATP
+in Isabelle/HOL with explicit inferences going through the kernel. Thus its
+results are correct by construction.
 
 Sometimes the automatic provers might detect that the goal is inconsistent with
 the background facts---or even that the background facts are inconsistent
@@ -154,7 +158,7 @@
 \label{installation}
 
 Sledgehammer is part of Isabelle, so you do not need to install it. However, it
-relies on third-party automatic provers (ATPs and SMT solvers).
+relies on third-party ATPs and SMT solvers.
 
 Among the ATPs, agsyHOL, Alt-Ergo, E, LEO-II, Leo-III, Satallax, SPASS, Vampire,
 and Zipperposition can be run locally; in addition, agsyHOL, Alt-Ergo, E,
@@ -162,7 +166,7 @@
 available remotely via System\-On\-TPTP \cite{sutcliffe-2000}. The SMT solvers
 CVC4, cvc5, veriT, and Z3 can be run locally.
 
-There are three main ways to install automatic provers on your machine:
+There are three main ways to install ATPs or SMT solvers on your machine:
 
 \begin{sloppy}
 \begin{enum}
@@ -177,7 +181,7 @@
 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
 startup. Its value can be retrieved by executing \texttt{isabelle}
 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
-file with the absolute path to the prover. For example, if the
+file with the absolute path to the system. For example, if the
 \texttt{components} file does not exist yet and you extracted SPASS to
 \texttt{/usr/local/spass-3.8ds}, create it with the single line
 
@@ -322,15 +326,13 @@
 
 \begin{enum}
 \item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
-the automatic provers (ATPs and SMT solvers) that should be run whenever
-Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{cvc4 e
+the automatic provers (ATPs, SMT solvers, or proof methods) that should be run whenever
+Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{auto cvc4 e
 vampire zipperposition\/}'').
 
 \item[\labelitemi] \textbf{\textit{max\_facts}} (\S\ref{relevance-filter})
 specifies the maximum number of facts that should be passed to the provers. By
-default, the value is prover-dependent but varies between about 50 and 1000. If
-the provers time out, you can try lowering this value to, say, 25 or 50 and see
-if that helps.
+default, the value is prover-dependent and varies between 0 and about 1000.
 
 \item[\labelitemi] \textbf{\textit{isar\_proofs}} (\S\ref{output-format}) specifies
 that Isar proofs should be generated, in addition to one-line proofs. The length
@@ -421,10 +423,10 @@
 
 \point{Why does Metis fail to reconstruct the proof?}
 
-There are many reasons. If Metis runs seemingly forever, that is a sign that the
-proof is too difficult for it. Metis's search is complete for first-order logic
-with equality, so if the proof was found by a superposition-based ATP such as
-E, SPASS, or Vampire, Metis should \emph{eventually} find it---in principle.
+There are many reasons. If Metis runs seemingly forever, that is a sign that
+the proof is too difficult for it. Metis's search is complete for first-order
+logic with equality, but ATPs such as E, Vampire, and Zipperposition are
+higher-order, so Metis might fail at refinding their proofs.
 
 In some rare cases, \textit{metis} fails fairly quickly, and you get the error
 message ``One-line proof reconstruction failed.'' This indicates that
@@ -437,8 +439,8 @@
 \textit{mono\_tags} arguments to Metis?}
 
 The \textit{metis}~(\textit{full\_types}) proof method
-and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed
-versions of Metis. It is somewhat slower than \textit{metis}, but the proof
+and its relative \textit{metis}~(\textit{mono\_tags}) are fully-typed
+versions of Metis. They are somewhat slower than \textit{metis}, but the proof
 search is fully typed, and it also includes more powerful rules such as the
 axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in
 higher-order positions (e.g., in set comprehensions). The method is tried as a
@@ -478,10 +480,10 @@
 \point{Are the generated proofs minimal?}
 
 Automatic provers frequently use many more facts than are necessary.
-Sledgehammer includes a proof minimization tool that takes a set of facts returned by
-a given prover and repeatedly calls a prover or proof method with subsets of
-those facts to find a minimal set. Reducing the number of facts typically helps
-reconstruction and declutters the proof documents.
+Sledgehammer includes a proof minimization tool that takes a set of facts
+returned by a prover and repeatedly calls a prover or proof method with subsets
+of those facts to find a minimal set. Reducing the number of facts typically
+helps reconstruction and declutters the proof documents.
 
 
 \point{A strange error occurred---what should I do?}
@@ -490,15 +492,6 @@
 error to the author at \authoremail.
 
 
-\point{Auto can solve it---why not Sledgehammer?}
-
-Problems can be easy for \textit{auto} and difficult for automatic provers, but
-the reverse is also true, so do not be discouraged if your first attempts fail.
-Because the system refers to all theorems known to Isabelle, it is particularly
-suitable when your goal has a short proof but requires lemmas that you do not
-know about.
-
-
 \point{Why are there so many options?}
 
 Sledgehammer's philosophy is that it should work out of the box, without user
@@ -528,8 +521,8 @@
 
 \item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of
 automatic provers supported by Sledgehammer. See \S\ref{installation} and
-\S\ref{mode-of-operation} for more information on how to install automatic
-provers.
+\S\ref{mode-of-operation} for more information on how to install ATPs and SMT
+solvers.
 
 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
@@ -551,9 +544,7 @@
 theory to process all the available facts, learning from proofs generated by
 automatic provers. The prover to use and its timeout can be set using the
 \textit{prover} (\S\ref{mode-of-operation}) and \textit{timeout}
-(\S\ref{timeouts}) options. It is recommended to perform learning using a
-first-order ATP (such as E, SPASS, and Vampire) as opposed to a
-higher-order ATP or an SMT solver.
+(\S\ref{timeouts}) options.
 
 \item[\labelitemi] \textbf{\textit{relearn\_isar}:} Same as \textit{unlearn}
 followed by \textit{learn\_isar}.
@@ -687,12 +678,11 @@
 \begin{enum}
 \opnodefaultbrk{provers}{string}
 Specifies the automatic provers to use as a space-separated list (e.g.,
-``\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}'').
+``\textit{auto}~\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}'').
 Provers can be run locally or remotely; see \S\ref{installation} for
-installation instructions. By default, \textit{provers} is set to the list of
-all installed local provers.
+installation instructions.
 
-The following local provers are supported:
+The following local ATPs and SMT solvers are supported:
 
 \begin{sloppy}
 \begin{enum}
@@ -719,13 +709,13 @@
 set the environment variable \texttt{CVC5\_SOLVER} to the complete path of the
 executable, including the file name.
 
-\item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover
+\item[\labelitemi] \textbf{\textit{e}:} E is a higher-order superposition prover
 developed by Stephan Schulz \cite{schulz-2019}. To use E, set the environment
 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
-executable and \texttt{E\_VERSION} to the version number (e.g., ``1.8''), or
+executable and \texttt{E\_VERSION} to the version number (e.g., ``3.0''), or
 install the prebuilt E package from \download.
 
-\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a pure
+\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a first-order
 instantiation-based prover developed by Konstantin Korovin \cite{korovin-2009}.
 To use iProver, set the environment variable \texttt{IPROVER\_HOME} to the
 directory that contains the \texttt{iproveropt} executable. iProver depends on
@@ -750,25 +740,26 @@
 environment variable \texttt{SATALLAX\_HOME} to the directory that contains the
 \texttt{satallax} executable.
 
-\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution
+\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order superposition
 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
 that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the
 version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from
 \download.
 
-\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order
-resolution prover developed by Andrei Voronkov and his colleagues
+\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a higher-order
+superposition prover developed by Andrei Voronkov and his colleagues
 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g.,
-``4.2.2'').
+``4.8'').
 
-\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is an
-SMT solver developed by David D\'eharbe, Pascal Fontaine, and their colleagues.
-It is designed to produce detailed proofs for reconstruction in proof
-assistants. To use veriT, set the environment variable \texttt{ISABELLE\_VERIT}
-to the complete path of the executable, including the file name.
+\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is a
+first-order SMT solver developed by David D\'eharbe, Pascal Fontaine, and their
+colleagues. It is designed to produce detailed proofs for reconstruction in
+proof assistants. To use veriT, set the environment variable
+\texttt{ISABELLE\_VERIT} to the complete path of the executable, including the
+file name.
 
 \item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
 Microsoft Research \cite{de-moura-2008}. To use Z3, set the environment variable
@@ -780,12 +771,12 @@
 Cruanes, Petar Vukmirovi\'c, and colleagues. To use Zipperposition, set the
 environment variable \texttt{ZIPPERPOSITION\_HOME} to the directory that
 contains the \texttt{zipperposition} executable and
-\texttt{ZIPPERPOSITION\_VERSION} to the version number (e.g., ``2.0.1'').
+\texttt{ZIPPERPOSITION\_VERSION} to the version number (e.g., ``2.1'').
 \end{enum}
 
 \end{sloppy}
 
-Moreover, the following remote provers are supported:
+The following remote ATPs are supported:
 
 \begin{enum}
 \item[\labelitemi] \textbf{\textit{remote\_agsyhol}:} The remote version of
@@ -817,10 +808,20 @@
 version of Zipperposition runs on Geoff Sutcliffe's Miami servers.
 \end{enum}
 
-By default, Sledgehammer runs a subset of CVC4, E, SPASS, Vampire, veriT, Z3,
-and Zipperposition in parallel, either locally or remotely---depending on the
-number of processor cores available and on which provers are actually installed.
-It is generally beneficial to run several provers in parallel.
+By default, \textit{provers} is set to a subset of CVC4, E, SPASS, Vampire,
+veriT, Z3, and Zipperposition, to be run in parallel, either locally or
+remotely---depending on the number of processor cores available and on which
+provers are actually installed. Proof methods are currently not included, due
+to their experimental status. (Proof methods can nevertheless appear in
+Isabelle proofs that reconstruct proofs originally found by ATPs or SMT
+solvers.)
+
+The following proof methods are supported:\ \textbf{\textit{algebra}},
+\textbf{\textit{argo}}, \textbf{\textit{auto}}, \textbf{\textit{blast}},
+\textbf{\textit{fastforce}}, \textbf{\textit{force}},
+\textbf{\textit{linarith}}, \textbf{\textit{meson}}, \textbf{\textit{metis}},
+\textbf{\textit{order}}, \textbf{\textit{presburger}}, \textbf{\textit{satx}},
+\textbf{\textit{simp}}.
 
 \opnodefault{prover}{string}
 Alias for \textit{provers}.
@@ -936,7 +937,7 @@
 Specifies the maximum number of facts that may be returned by the relevance
 filter. If the option is set to \textit{smart} (the default), it effectively
 takes a value that was empirically found to be appropriate for the prover.
-Typical values lie between 50 and 1000.
+Typical values lie between 0 and 1000.
 
 \opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85}
 Specifies the thresholds above which facts are considered relevant by the
@@ -955,9 +956,9 @@
 Specifies the maximum number of monomorphic instances to generate beyond
 \textit{max\_facts}. The higher this limit is, the more monomorphic instances
 are potentially generated. Whether monomorphization takes place depends on the
-type encoding used. If the option is set to \textit{smart} (the default), it
-takes a value that was empirically found to be appropriate for the prover. For
-most provers, this value is 100.
+prover and possibly the specified type encoding. If the option is set to
+\textit{smart} (the default), it takes a value that was empirically found to be
+appropriate for the prover. For most provers, this value is 100.
 
 \nopagebreak
 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
@@ -966,9 +967,9 @@
 Specifies the maximum number of iterations for the monomorphization fixpoint
 construction. The higher this limit is, the more monomorphic instances are
 potentially generated. Whether monomorphization takes place depends on the
-type encoding used. If the option is set to \textit{smart} (the default), it
-takes a value that was empirically found to be appropriate for the prover.
-For most provers, this value is 3.
+prover and possibly the specified type encoding. If the option is set to
+\textit{smart} (the default), it takes a value that was empirically found to be
+appropriate for the prover.
 
 \nopagebreak
 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
@@ -1247,12 +1248,12 @@
 problem.
 \end{enum}
 
-Sledgehammer emits an error if the actual outcome differs from the expected outcome. This option is
-useful for regression testing.
+Sledgehammer emits an error if the actual outcome differs from the expected
+outcome. This option is useful for regression testing.
 
-The expected outcomes are not mutually exclusive. More specifically, \textit{some} is accepted
-whenever \textit{some\_preplayed} is accepted as the former has strictly fewer requirements
-than the later.
+The expected outcomes are not mutually exclusive. More specifically,
+\textit{some} is accepted whenever \textit{some\_preplayed} is accepted as the
+former has strictly fewer requirements than the later.
 
 \nopagebreak
 {\small See also \textit{timeout} (\S\ref{timeouts}).}
@@ -1267,7 +1268,7 @@
 Specifies the maximum number of seconds that the automatic provers should spend
 searching for a proof. This excludes problem preparation and is a soft limit.
 
-\opdefault{slices}{int}{\upshape 12 times the number of cores detected}
+\opdefault{slices}{int}{\upshape 24 times the number of cores detected}
 Specifies the number of time slices. Time slices are the basic unit for prover
 invocations. They are divided among the available provers. A single prover
 invocation can occupy a single slice, two slices, or more, depending on the
@@ -1299,9 +1300,9 @@
 \label{mirabelle}
 
 The \texttt{isabelle mirabelle} tool executes Sledgehammer or other advisory
-tools (e.g., Nitpick) or tactics (e.g., \textit{auto}) on all subgoals emerging
-in a theory. It is typically used to measure the success rate of a proof tool
-on some benchmark. Its command-line usage is as follows:
+tools (e.g., Nitpick) or proof methods (e.g., \textit{auto}) on all subgoals
+emerging in a theory. It is typically used to measure the success rate of a
+proof tool on some benchmark. Its command-line usage is as follows:
 
 {\small
 \begin{verbatim}