# HG changeset patch # User desharna # Date 1740501838 -3600 # Node ID bab8158a02f0f406b6db8edefe88741a4b18305a # Parent 7e45a83373c89af4cd33bca10bad60b45fd5df41 tactic hammer documentation (from Jasmin) diff -r 7e45a83373c8 -r bab8158a02f0 src/Doc/Sledgehammer/document/root.tex --- a/src/Doc/Sledgehammer/document/root.tex Tue Feb 25 17:43:53 2025 +0100 +++ b/src/Doc/Sledgehammer/document/root.tex Tue Feb 25 17:43:58 2025 +0100 @@ -101,9 +101,9 @@ \section{Introduction} \label{introduction} -Sledgehammer is a tool that applies automatic theorem provers (ATPs) -and satisfiability-modulo-theories (SMT) solvers on the current goal, mostly -to find proofs but also to refute the goal.% +Sledgehammer is a tool that applies automatic theorem provers (ATPs), +satisfiability-modulo-theories (SMT) solvers, and Isabelle proof methods on the +current goal, mostly to find proofs but optionally also to refute the goal.% \footnote{The distinction between ATPs and SMT solvers is mostly historical but convenient.} % @@ -115,17 +115,21 @@ via the System\-On\-TPTP web service \cite{sutcliffe-2000}. The supported SMT solvers are CVC4 \cite{cvc4}, cvc5 \cite{barbosa-et-al-cvc5}, veriT \cite{bouton-et-al-2009}, and Z3 \cite{de-moura-2008}. These are always run -locally. +locally. The supported proof methods are \textit{algebra}, \textit{argo}, +\textit{auto}, \textit{blast}, \textit{fastforce}, \textit{force}, +\textit{linarith}, \textit{meson}, \textit{metis}, \textit{order}, +\textit{presburger}, \textit{satx}, and \textit{simp}. The proof method support +is experimental and disabled by default. -The problem passed to the automatic provers (or solvers) consists of your -current goal together with a heuristic selection of hundreds of facts (theorems) -from the current theory context, filtered by relevance. +The problem passed to the ATPs, SMT solvers, or proof methods consists +of your current goal together with a heuristic selection of facts (theorems) +from the current theory context, filtered by likely relevance. The result of a successful proof search is some source text that typically -reconstructs the proof within Isabelle. For ATPs, the reconstructed proof -typically relies on the general-purpose \textit{metis} proof method, which -integrates the Metis ATP in Isabelle/HOL with explicit inferences going through -the kernel. Thus its results are correct by construction. +reconstructs the proof within Isabelle. The reconstructed proof often relies on +the general-purpose \textit{metis} proof method, which integrates the Metis ATP +in Isabelle/HOL with explicit inferences going through the kernel. Thus its +results are correct by construction. Sometimes the automatic provers might detect that the goal is inconsistent with the background facts---or even that the background facts are inconsistent @@ -154,7 +158,7 @@ \label{installation} Sledgehammer is part of Isabelle, so you do not need to install it. However, it -relies on third-party automatic provers (ATPs and SMT solvers). +relies on third-party ATPs and SMT solvers. Among the ATPs, agsyHOL, Alt-Ergo, E, LEO-II, Leo-III, Satallax, SPASS, Vampire, and Zipperposition can be run locally; in addition, agsyHOL, Alt-Ergo, E, @@ -162,7 +166,7 @@ available remotely via System\-On\-TPTP \cite{sutcliffe-2000}. The SMT solvers CVC4, cvc5, veriT, and Z3 can be run locally. -There are three main ways to install automatic provers on your machine: +There are three main ways to install ATPs or SMT solvers on your machine: \begin{sloppy} \begin{enum} @@ -177,7 +181,7 @@ \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at startup. Its value can be retrieved by executing \texttt{isabelle} \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.} -file with the absolute path to the prover. For example, if the +file with the absolute path to the system. For example, if the \texttt{components} file does not exist yet and you extracted SPASS to \texttt{/usr/local/spass-3.8ds}, create it with the single line @@ -322,15 +326,13 @@ \begin{enum} \item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies -the automatic provers (ATPs and SMT solvers) that should be run whenever -Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{cvc4 e +the automatic provers (ATPs, SMT solvers, or proof methods) that should be run whenever +Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{auto cvc4 e vampire zipperposition\/}''). \item[\labelitemi] \textbf{\textit{max\_facts}} (\S\ref{relevance-filter}) specifies the maximum number of facts that should be passed to the provers. By -default, the value is prover-dependent but varies between about 50 and 1000. If -the provers time out, you can try lowering this value to, say, 25 or 50 and see -if that helps. +default, the value is prover-dependent and varies between 0 and about 1000. \item[\labelitemi] \textbf{\textit{isar\_proofs}} (\S\ref{output-format}) specifies that Isar proofs should be generated, in addition to one-line proofs. The length @@ -421,10 +423,10 @@ \point{Why does Metis fail to reconstruct the proof?} -There are many reasons. If Metis runs seemingly forever, that is a sign that the -proof is too difficult for it. Metis's search is complete for first-order logic -with equality, so if the proof was found by a superposition-based ATP such as -E, SPASS, or Vampire, Metis should \emph{eventually} find it---in principle. +There are many reasons. If Metis runs seemingly forever, that is a sign that +the proof is too difficult for it. Metis's search is complete for first-order +logic with equality, but ATPs such as E, Vampire, and Zipperposition are +higher-order, so Metis might fail at refinding their proofs. In some rare cases, \textit{metis} fails fairly quickly, and you get the error message ``One-line proof reconstruction failed.'' This indicates that @@ -437,8 +439,8 @@ \textit{mono\_tags} arguments to Metis?} The \textit{metis}~(\textit{full\_types}) proof method -and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed -versions of Metis. It is somewhat slower than \textit{metis}, but the proof +and its relative \textit{metis}~(\textit{mono\_tags}) are fully-typed +versions of Metis. They are somewhat slower than \textit{metis}, but the proof search is fully typed, and it also includes more powerful rules such as the axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in higher-order positions (e.g., in set comprehensions). The method is tried as a @@ -478,10 +480,10 @@ \point{Are the generated proofs minimal?} Automatic provers frequently use many more facts than are necessary. -Sledgehammer includes a proof minimization tool that takes a set of facts returned by -a given prover and repeatedly calls a prover or proof method with subsets of -those facts to find a minimal set. Reducing the number of facts typically helps -reconstruction and declutters the proof documents. +Sledgehammer includes a proof minimization tool that takes a set of facts +returned by a prover and repeatedly calls a prover or proof method with subsets +of those facts to find a minimal set. Reducing the number of facts typically +helps reconstruction and declutters the proof documents. \point{A strange error occurred---what should I do?} @@ -490,15 +492,6 @@ error to the author at \authoremail. -\point{Auto can solve it---why not Sledgehammer?} - -Problems can be easy for \textit{auto} and difficult for automatic provers, but -the reverse is also true, so do not be discouraged if your first attempts fail. -Because the system refers to all theorems known to Isabelle, it is particularly -suitable when your goal has a short proof but requires lemmas that you do not -know about. - - \point{Why are there so many options?} Sledgehammer's philosophy is that it should work out of the box, without user @@ -528,8 +521,8 @@ \item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of automatic provers supported by Sledgehammer. See \S\ref{installation} and -\S\ref{mode-of-operation} for more information on how to install automatic -provers. +\S\ref{mode-of-operation} for more information on how to install ATPs and SMT +solvers. \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}. @@ -551,9 +544,7 @@ theory to process all the available facts, learning from proofs generated by automatic provers. The prover to use and its timeout can be set using the \textit{prover} (\S\ref{mode-of-operation}) and \textit{timeout} -(\S\ref{timeouts}) options. It is recommended to perform learning using a -first-order ATP (such as E, SPASS, and Vampire) as opposed to a -higher-order ATP or an SMT solver. +(\S\ref{timeouts}) options. \item[\labelitemi] \textbf{\textit{relearn\_isar}:} Same as \textit{unlearn} followed by \textit{learn\_isar}. @@ -687,12 +678,11 @@ \begin{enum} \opnodefaultbrk{provers}{string} Specifies the automatic provers to use as a space-separated list (e.g., -``\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}''). +``\textit{auto}~\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}''). Provers can be run locally or remotely; see \S\ref{installation} for -installation instructions. By default, \textit{provers} is set to the list of -all installed local provers. +installation instructions. -The following local provers are supported: +The following local ATPs and SMT solvers are supported: \begin{sloppy} \begin{enum} @@ -719,13 +709,13 @@ set the environment variable \texttt{CVC5\_SOLVER} to the complete path of the executable, including the file name. -\item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover +\item[\labelitemi] \textbf{\textit{e}:} E is a higher-order superposition prover developed by Stephan Schulz \cite{schulz-2019}. To use E, set the environment variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof} -executable and \texttt{E\_VERSION} to the version number (e.g., ``1.8''), or +executable and \texttt{E\_VERSION} to the version number (e.g., ``3.0''), or install the prebuilt E package from \download. -\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a pure +\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a first-order instantiation-based prover developed by Konstantin Korovin \cite{korovin-2009}. To use iProver, set the environment variable \texttt{IPROVER\_HOME} to the directory that contains the \texttt{iproveropt} executable. iProver depends on @@ -750,25 +740,26 @@ environment variable \texttt{SATALLAX\_HOME} to the directory that contains the \texttt{satallax} executable. -\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution +\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order superposition prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}. To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from \download. -\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order -resolution prover developed by Andrei Voronkov and his colleagues +\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a higher-order +superposition prover developed by Andrei Voronkov and his colleagues \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire} executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., -``4.2.2''). +``4.8''). -\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is an -SMT solver developed by David D\'eharbe, Pascal Fontaine, and their colleagues. -It is designed to produce detailed proofs for reconstruction in proof -assistants. To use veriT, set the environment variable \texttt{ISABELLE\_VERIT} -to the complete path of the executable, including the file name. +\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is a +first-order SMT solver developed by David D\'eharbe, Pascal Fontaine, and their +colleagues. It is designed to produce detailed proofs for reconstruction in +proof assistants. To use veriT, set the environment variable +\texttt{ISABELLE\_VERIT} to the complete path of the executable, including the +file name. \item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at Microsoft Research \cite{de-moura-2008}. To use Z3, set the environment variable @@ -780,12 +771,12 @@ Cruanes, Petar Vukmirovi\'c, and colleagues. To use Zipperposition, set the environment variable \texttt{ZIPPERPOSITION\_HOME} to the directory that contains the \texttt{zipperposition} executable and -\texttt{ZIPPERPOSITION\_VERSION} to the version number (e.g., ``2.0.1''). +\texttt{ZIPPERPOSITION\_VERSION} to the version number (e.g., ``2.1''). \end{enum} \end{sloppy} -Moreover, the following remote provers are supported: +The following remote ATPs are supported: \begin{enum} \item[\labelitemi] \textbf{\textit{remote\_agsyhol}:} The remote version of @@ -817,10 +808,20 @@ version of Zipperposition runs on Geoff Sutcliffe's Miami servers. \end{enum} -By default, Sledgehammer runs a subset of CVC4, E, SPASS, Vampire, veriT, Z3, -and Zipperposition in parallel, either locally or remotely---depending on the -number of processor cores available and on which provers are actually installed. -It is generally beneficial to run several provers in parallel. +By default, \textit{provers} is set to a subset of CVC4, E, SPASS, Vampire, +veriT, Z3, and Zipperposition, to be run in parallel, either locally or +remotely---depending on the number of processor cores available and on which +provers are actually installed. Proof methods are currently not included, due +to their experimental status. (Proof methods can nevertheless appear in +Isabelle proofs that reconstruct proofs originally found by ATPs or SMT +solvers.) + +The following proof methods are supported:\ \textbf{\textit{algebra}}, +\textbf{\textit{argo}}, \textbf{\textit{auto}}, \textbf{\textit{blast}}, +\textbf{\textit{fastforce}}, \textbf{\textit{force}}, +\textbf{\textit{linarith}}, \textbf{\textit{meson}}, \textbf{\textit{metis}}, +\textbf{\textit{order}}, \textbf{\textit{presburger}}, \textbf{\textit{satx}}, +\textbf{\textit{simp}}. \opnodefault{prover}{string} Alias for \textit{provers}. @@ -936,7 +937,7 @@ Specifies the maximum number of facts that may be returned by the relevance filter. If the option is set to \textit{smart} (the default), it effectively takes a value that was empirically found to be appropriate for the prover. -Typical values lie between 50 and 1000. +Typical values lie between 0 and 1000. \opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85} Specifies the thresholds above which facts are considered relevant by the @@ -955,9 +956,9 @@ Specifies the maximum number of monomorphic instances to generate beyond \textit{max\_facts}. The higher this limit is, the more monomorphic instances are potentially generated. Whether monomorphization takes place depends on the -type encoding used. If the option is set to \textit{smart} (the default), it -takes a value that was empirically found to be appropriate for the prover. For -most provers, this value is 100. +prover and possibly the specified type encoding. If the option is set to +\textit{smart} (the default), it takes a value that was empirically found to be +appropriate for the prover. For most provers, this value is 100. \nopagebreak {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).} @@ -966,9 +967,9 @@ Specifies the maximum number of iterations for the monomorphization fixpoint construction. The higher this limit is, the more monomorphic instances are potentially generated. Whether monomorphization takes place depends on the -type encoding used. If the option is set to \textit{smart} (the default), it -takes a value that was empirically found to be appropriate for the prover. -For most provers, this value is 3. +prover and possibly the specified type encoding. If the option is set to +\textit{smart} (the default), it takes a value that was empirically found to be +appropriate for the prover. \nopagebreak {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).} @@ -1247,12 +1248,12 @@ problem. \end{enum} -Sledgehammer emits an error if the actual outcome differs from the expected outcome. This option is -useful for regression testing. +Sledgehammer emits an error if the actual outcome differs from the expected +outcome. This option is useful for regression testing. -The expected outcomes are not mutually exclusive. More specifically, \textit{some} is accepted -whenever \textit{some\_preplayed} is accepted as the former has strictly fewer requirements -than the later. +The expected outcomes are not mutually exclusive. More specifically, +\textit{some} is accepted whenever \textit{some\_preplayed} is accepted as the +former has strictly fewer requirements than the later. \nopagebreak {\small See also \textit{timeout} (\S\ref{timeouts}).} @@ -1267,7 +1268,7 @@ Specifies the maximum number of seconds that the automatic provers should spend searching for a proof. This excludes problem preparation and is a soft limit. -\opdefault{slices}{int}{\upshape 12 times the number of cores detected} +\opdefault{slices}{int}{\upshape 24 times the number of cores detected} Specifies the number of time slices. Time slices are the basic unit for prover invocations. They are divided among the available provers. A single prover invocation can occupy a single slice, two slices, or more, depending on the @@ -1299,9 +1300,9 @@ \label{mirabelle} The \texttt{isabelle mirabelle} tool executes Sledgehammer or other advisory -tools (e.g., Nitpick) or tactics (e.g., \textit{auto}) on all subgoals emerging -in a theory. It is typically used to measure the success rate of a proof tool -on some benchmark. Its command-line usage is as follows: +tools (e.g., Nitpick) or proof methods (e.g., \textit{auto}) on all subgoals +emerging in a theory. It is typically used to measure the success rate of a +proof tool on some benchmark. Its command-line usage is as follows: {\small \begin{verbatim}