# HG changeset patch # User blanchet # Date 1305888479 -7200 # Node ID ec1ea24d49bcad9dac706cb7ebddd6ec549bf7af # Parent 391e41ac038b6efa1632c7ecdd0b37eb61fbb676 more FAQs diff -r 391e41ac038b -r ec1ea24d49bc doc-src/Sledgehammer/sledgehammer.tex --- a/doc-src/Sledgehammer/sledgehammer.tex Fri May 20 12:47:59 2011 +0200 +++ b/doc-src/Sledgehammer/sledgehammer.tex Fri May 20 12:47:59 2011 +0200 @@ -118,7 +118,7 @@ \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's \texttt{src/HOL/Metis\_Examples} directory. Comments and bug reports concerning Sledgehammer or this manual should be -directed to \authoremail. +directed to the author at \authoremail. \vskip2.5\smallskipamount @@ -283,7 +283,7 @@ the proof found by E looks perfect, so click it to finish the proof. You can ask Sledgehammer for an Isar text proof by passing the -\textit{isar\_proof} option: +\textit{isar\_proof} option (\S\ref{output-format}): \prew \textbf{sledgehammer} [\textit{isar\_proof}] @@ -356,14 +356,14 @@ \point{Why does Metis fail to reconstruct the proof?} -There can be many reasons. If Metis runs seemingly forever, that's a sign that -the proof is too difficult for it. Metis is complete, so it should eventually -find it, but that's little consolation. There are several possible solutions: +There are many reasons. If Metis runs seemingly forever, that is a sign that the +proof is too difficult for it. Metis is complete, so it should eventually find +it, but that's little consolation. There are several possible solutions: \begin{enum} -\item[$\bullet$] Try the \textit{isar\_proof} option to obtain a step-by-step -Isar proof where each step is justified by Metis. Since the steps are fairly -small, Metis is more likely to be able to replay them. +\item[$\bullet$] Try the \textit{isar\_proof} option (\S\ref{output-format}) to +obtain a step-by-step Isar proof where each step is justified by Metis. Since +the steps are fairly small, Metis is more likely to be able to replay them. \item[$\bullet$] Try the \textit{smt} proof method instead of \textit{metis}. It is usually stronger, but you need to have Z3 available to replay the proofs, @@ -375,75 +375,118 @@ \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate. \end{enum} -% * sometimes Metis runs into some error, e.g. a type error. then it tries -% again with metisFT, where FT stands for ``full type information' -% * metisFT is much slower, but its proof search is fully typed, and it also -% includes more powerful rules such as the axiom ``$x = \mathit{True} -% \mathrel{\lor} x = \mathit{False}$'' for reasoning in higher-order places -% (e.g., in set comprehensions) -% -% * finally, in some cases the ATP proof is simply type-incorrect. -% Sledgehammer drops some type information to speed up the search. Try -% Sledgehammer again with full type information: \textit{full\_types} -% (\S\ref{problem-encoding}), or choose a specific type encoding with -% \textit{type\_sys} (\S\ref{problem-encoding}). Older versions of -% Sledgehammer were frequent victims of this problem. Now this should very -% seldom be an issue, but if you notice too many unsound proofs, contact -% -%\point{How can I easily tell whether a Sledgehammer proof is sound?} -% -%Easiest way: Once it's found: ... by (metis facts) -%try -%sledgehammer [full\_types] (facts) -% -%should usually give unprovable or refind the proof fairly quickly -% -%Same trick if you believe that there exists a proof with certain facts. -% -%\point{Which facts does Sledgehammer select?} -% -% * heuristic -% * and several hundreds -% * show them: debug -% * influence it with sledgehammer (add: xxx) -% -% * S/h good at finding short proofs combining a handful of existing lemmas -% * for deeper proofs, you must restrict the number of facts, e.g. -% max\_relevant = 50 -% * but then proof reconstruction is an issue -% -%\point{Why are the Isar proofs generated by Sledgehammer so ugly?} -% -% * experimental -% * working on this -% * there is a large body of research into transforming resolution proofs into -% natural deduction proofs (e.g., Isar proofs) -% * meantime: isar\_shrink\_factor -% -% -%\point{Should I let Sledgehammer minimize the number of lemmas?} -% -% * in general, yes -% * proofs involving fewer lemmas tend to be shorter as well, and hence easier -% to re-find by Metis -% * but the opposite is sometimes the case +In some rare cases, Metis fails fairly quickly. This usually indicates that +Sledgehammer found a type-incorrect proof. Sledgehammer erases some type +information to speed up the search. Try Sledgehammer again with full type +information: \textit{full\_types} (\S\ref{problem-encoding}), or choose a +specific type encoding with \textit{type\_sys} (\S\ref{problem-encoding}). Older +versions of Sledgehammer were frequent victims of this problem. Now this should +very seldom be an issue, but if you notice many unsound proofs, contact the +author at \authoremail. + +\point{How can I tell whether a Sledgehammer proof is sound?} + +First, if \emph{metis} (or \emph{metisFT}) can reconstruct it, the proof is +sound (modulo soundness of Isabelle's inference kernel). If it fails or runs +seemingly forever, you can try + +\prew +\textbf{apply}~\textbf{--} \\ +\textbf{sledgehammer} [\textit{type\_sys} = \textit{poly\_tags}] (\textit{metis\_facts}) +\postw + +where \textit{metis\_facts} is the list of facts appearing in the suggested +Metis call. The automatic provers should be able to refind the proof very +quickly if it is sound, and the \textit{type\_sys} $=$ \textit{poly\_tags} +option (\S\ref{problem-encoding}) ensures that no unsound proofs are found. + +The \textit{full\_types} option (\S\ref{problem-encoding}) can also be used +here, but it is unsound in extremely rare degenerate cases such as the +following: + +\prew +\textbf{lemma} ``$\forall x\> y\Colon{'}a.\ x = y \,\Longrightarrow \exists f\> g\Colon\mathit{nat} \Rightarrow {'}a.\ f \not= g$'' \\ +\textbf{sledgehammer} [\textit{full\_types}] (\textit{nat.distinct\/}(1)) +\postw + +\point{How does Sledgehammer select the facts that should be passed to the +automatic provers?} + +Briefly, the relevance filter assigns a score to every available fact (lemma, +theorem, definition, or axiom)\ based upon how many constants that fact shares +with the conjecture; this process iterates to include facts relevant to those +just accepted, but with a decay factor to ensure termination. The constants are +weighted to give unusual ones greater significance. The relevance filter copes +best when the conjecture contains some unusual constants; if all the constants +are common, it is unable to discriminate among the hundreds of facts that are +picked up. The relevance filter is also memoryless: It has no information about +how many times a particular fact has been used in a proof, and it cannot learn. -% Why is Sledgehammer automatically minimizing sometimes? -% * some provers (e.g. CVC3 and Yices) -% * also, sometimes E finds a proof but doesn't give a proof object +The number of facts included in a problem varies from prover to prover, since +some provers get overwhelmed quicker than others. You can show the number of +facts given using the \textit{verbose} option (\S\ref{output-format}) and the +actual facts using \textit{debug} (\S\ref{output-format}). + +Sledgehammer is good at finding short proofs combining a handful of existing +lemmas. If you are looking for longer proofs, you must typically restrict the +number of facts, by setting \textit{max\_relevant} to, say, 50 or 100. + +\point{Why are the Isar proofs generated by Sledgehammer so ugly?} + +The current implementation is experimental and explodes exponentially in the +worst case. Work on a new implementation has begun. There is a large body of +research into transforming resolution proofs into natural deduction proofs (such +as Isar proofs), which we hope to leverage. In the meantime, a workaround is to +set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger +value or to try several provers and keep the nicest-looking proof. + +\point{Should I let Sledgehammer minimize the number of lemmas?} + +In general, minimization is a good idea, because proofs involving fewer lemmas +tend to be shorter as well, and hence easier to re-find by Metis. But the +opposite is sometimes the case. + +\point{Why does the minimizer sometimes starts of its own?} + +There are two scenarios in which this can happen. First, some provers (e.g., +CVC3 and Yices) do not provide proofs or provide incomplete proofs. The +minimizer is then invoked to find out which facts are actually needed from the +(large) set of facts that was initinally given to the prover. Second, if a +prover returns a proof with lots of facts, the minimizer is invoked +automatically since Metis is unlikely to refind the proof. + +\point{What is metisFT?} + +The \textit{metisFT} proof method is the fully-typed version of Metis. It is +much slower than \textit{metis}, but the proof search is fully typed, and it +also includes more powerful rules such as the axiom ``$x = \mathit{True} +\mathrel{\lor} x = \mathit{False}$'' for reasoning in higher-order places (e.g., +in set comprehensions). The method kicks in automatically as a fallback when +\textit{metis} fails, and it is sometimes generated by Sledgehammer instead of +\textit{metis} if the proof obviously requires type information. + +If you see the warning + +\prew +\textsl +Metis: Falling back on ``\textit{metisFT}''. +\postw + +in a successful Metis proof, you can advantageously replace the \textit{metis} +call with \textit{metisFT}. \point{I got a strange error from Sledgehammer---what should I do?} Sledgehammer tries to give informative error messages. Please report any strange -error to \authoremail. This applies double if you get the message +error to the author at \authoremail. This applies double if you get the message -\begin{quote} +\prew \slshape The prover found a type-unsound proof involving ``\textit{foo}'', ``\textit{bar}'', ``\textit{baz}'' even though a supposedly type-sound encoding was used (or, less likely, your axioms are inconsistent). You might want to report this to the Isabelle developers. -\end{quote} +\postw \point{Auto can solve it---why not Sledgehammer?} @@ -452,6 +495,13 @@ Because the system refers to all theorems known to Isabelle, it is particularly suitable when your goal has a short proof from lemmas that you don't know about. +\point{Why are there so many options?} + +Sledgehammer's philosophy should work out of the box, without user guidance. +Many of the options are meant to be used mostly by the Sledgehammer developers +for experimentation purposes. Of course, feel free to experiment with them if +you are so inclined. + \section{Command Syntax} \label{command-syntax}