--- a/doc-src/Sledgehammer/sledgehammer.tex Fri May 20 12:47:59 2011 +0200
+++ b/doc-src/Sledgehammer/sledgehammer.tex Fri May 20 12:47:59 2011 +0200
@@ -118,7 +118,7 @@
\textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
\texttt{src/HOL/Metis\_Examples} directory.
Comments and bug reports concerning Sledgehammer or this manual should be
-directed to \authoremail.
+directed to the author at \authoremail.
\vskip2.5\smallskipamount
@@ -283,7 +283,7 @@
the proof found by E looks perfect, so click it to finish the proof.
You can ask Sledgehammer for an Isar text proof by passing the
-\textit{isar\_proof} option:
+\textit{isar\_proof} option (\S\ref{output-format}):
\prew
\textbf{sledgehammer} [\textit{isar\_proof}]
@@ -356,14 +356,14 @@
\point{Why does Metis fail to reconstruct the proof?}
-There can be many reasons. If Metis runs seemingly forever, that's a sign that
-the proof is too difficult for it. Metis is complete, so it should eventually
-find it, but that's little consolation. There are several possible solutions:
+There are many reasons. If Metis runs seemingly forever, that is a sign that the
+proof is too difficult for it. Metis is complete, so it should eventually find
+it, but that's little consolation. There are several possible solutions:
\begin{enum}
-\item[$\bullet$] Try the \textit{isar\_proof} option to obtain a step-by-step
-Isar proof where each step is justified by Metis. Since the steps are fairly
-small, Metis is more likely to be able to replay them.
+\item[$\bullet$] Try the \textit{isar\_proof} option (\S\ref{output-format}) to
+obtain a step-by-step Isar proof where each step is justified by Metis. Since
+the steps are fairly small, Metis is more likely to be able to replay them.
\item[$\bullet$] Try the \textit{smt} proof method instead of \textit{metis}. It
is usually stronger, but you need to have Z3 available to replay the proofs,
@@ -375,75 +375,118 @@
\textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.
\end{enum}
-% * sometimes Metis runs into some error, e.g. a type error. then it tries
-% again with metisFT, where FT stands for ``full type information'
-% * metisFT is much slower, but its proof search is fully typed, and it also
-% includes more powerful rules such as the axiom ``$x = \mathit{True}
-% \mathrel{\lor} x = \mathit{False}$'' for reasoning in higher-order places
-% (e.g., in set comprehensions)
-%
-% * finally, in some cases the ATP proof is simply type-incorrect.
-% Sledgehammer drops some type information to speed up the search. Try
-% Sledgehammer again with full type information: \textit{full\_types}
-% (\S\ref{problem-encoding}), or choose a specific type encoding with
-% \textit{type\_sys} (\S\ref{problem-encoding}). Older versions of
-% Sledgehammer were frequent victims of this problem. Now this should very
-% seldom be an issue, but if you notice too many unsound proofs, contact
-%
-%\point{How can I easily tell whether a Sledgehammer proof is sound?}
-%
-%Easiest way: Once it's found: ... by (metis facts)
-%try
-%sledgehammer [full\_types] (facts)
-%
-%should usually give unprovable or refind the proof fairly quickly
-%
-%Same trick if you believe that there exists a proof with certain facts.
-%
-%\point{Which facts does Sledgehammer select?}
-%
-% * heuristic
-% * and several hundreds
-% * show them: debug
-% * influence it with sledgehammer (add: xxx)
-%
-% * S/h good at finding short proofs combining a handful of existing lemmas
-% * for deeper proofs, you must restrict the number of facts, e.g.
-% max\_relevant = 50
-% * but then proof reconstruction is an issue
-%
-%\point{Why are the Isar proofs generated by Sledgehammer so ugly?}
-%
-% * experimental
-% * working on this
-% * there is a large body of research into transforming resolution proofs into
-% natural deduction proofs (e.g., Isar proofs)
-% * meantime: isar\_shrink\_factor
-%
-%
-%\point{Should I let Sledgehammer minimize the number of lemmas?}
-%
-% * in general, yes
-% * proofs involving fewer lemmas tend to be shorter as well, and hence easier
-% to re-find by Metis
-% * but the opposite is sometimes the case
+In some rare cases, Metis fails fairly quickly. This usually indicates that
+Sledgehammer found a type-incorrect proof. Sledgehammer erases some type
+information to speed up the search. Try Sledgehammer again with full type
+information: \textit{full\_types} (\S\ref{problem-encoding}), or choose a
+specific type encoding with \textit{type\_sys} (\S\ref{problem-encoding}). Older
+versions of Sledgehammer were frequent victims of this problem. Now this should
+very seldom be an issue, but if you notice many unsound proofs, contact the
+author at \authoremail.
+
+\point{How can I tell whether a Sledgehammer proof is sound?}
+
+First, if \emph{metis} (or \emph{metisFT}) can reconstruct it, the proof is
+sound (modulo soundness of Isabelle's inference kernel). If it fails or runs
+seemingly forever, you can try
+
+\prew
+\textbf{apply}~\textbf{--} \\
+\textbf{sledgehammer} [\textit{type\_sys} = \textit{poly\_tags}] (\textit{metis\_facts})
+\postw
+
+where \textit{metis\_facts} is the list of facts appearing in the suggested
+Metis call. The automatic provers should be able to refind the proof very
+quickly if it is sound, and the \textit{type\_sys} $=$ \textit{poly\_tags}
+option (\S\ref{problem-encoding}) ensures that no unsound proofs are found.
+
+The \textit{full\_types} option (\S\ref{problem-encoding}) can also be used
+here, but it is unsound in extremely rare degenerate cases such as the
+following:
+
+\prew
+\textbf{lemma} ``$\forall x\> y\Colon{'}a.\ x = y \,\Longrightarrow \exists f\> g\Colon\mathit{nat} \Rightarrow {'}a.\ f \not= g$'' \\
+\textbf{sledgehammer} [\textit{full\_types}] (\textit{nat.distinct\/}(1))
+\postw
+
+\point{How does Sledgehammer select the facts that should be passed to the
+automatic provers?}
+
+Briefly, the relevance filter assigns a score to every available fact (lemma,
+theorem, definition, or axiom)\ based upon how many constants that fact shares
+with the conjecture; this process iterates to include facts relevant to those
+just accepted, but with a decay factor to ensure termination. The constants are
+weighted to give unusual ones greater significance. The relevance filter copes
+best when the conjecture contains some unusual constants; if all the constants
+are common, it is unable to discriminate among the hundreds of facts that are
+picked up. The relevance filter is also memoryless: It has no information about
+how many times a particular fact has been used in a proof, and it cannot learn.
-% Why is Sledgehammer automatically minimizing sometimes?
-% * some provers (e.g. CVC3 and Yices)
-% * also, sometimes E finds a proof but doesn't give a proof object
+The number of facts included in a problem varies from prover to prover, since
+some provers get overwhelmed quicker than others. You can show the number of
+facts given using the \textit{verbose} option (\S\ref{output-format}) and the
+actual facts using \textit{debug} (\S\ref{output-format}).
+
+Sledgehammer is good at finding short proofs combining a handful of existing
+lemmas. If you are looking for longer proofs, you must typically restrict the
+number of facts, by setting \textit{max\_relevant} to, say, 50 or 100.
+
+\point{Why are the Isar proofs generated by Sledgehammer so ugly?}
+
+The current implementation is experimental and explodes exponentially in the
+worst case. Work on a new implementation has begun. There is a large body of
+research into transforming resolution proofs into natural deduction proofs (such
+as Isar proofs), which we hope to leverage. In the meantime, a workaround is to
+set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger
+value or to try several provers and keep the nicest-looking proof.
+
+\point{Should I let Sledgehammer minimize the number of lemmas?}
+
+In general, minimization is a good idea, because proofs involving fewer lemmas
+tend to be shorter as well, and hence easier to re-find by Metis. But the
+opposite is sometimes the case.
+
+\point{Why does the minimizer sometimes starts of its own?}
+
+There are two scenarios in which this can happen. First, some provers (e.g.,
+CVC3 and Yices) do not provide proofs or provide incomplete proofs. The
+minimizer is then invoked to find out which facts are actually needed from the
+(large) set of facts that was initinally given to the prover. Second, if a
+prover returns a proof with lots of facts, the minimizer is invoked
+automatically since Metis is unlikely to refind the proof.
+
+\point{What is metisFT?}
+
+The \textit{metisFT} proof method is the fully-typed version of Metis. It is
+much slower than \textit{metis}, but the proof search is fully typed, and it
+also includes more powerful rules such as the axiom ``$x = \mathit{True}
+\mathrel{\lor} x = \mathit{False}$'' for reasoning in higher-order places (e.g.,
+in set comprehensions). The method kicks in automatically as a fallback when
+\textit{metis} fails, and it is sometimes generated by Sledgehammer instead of
+\textit{metis} if the proof obviously requires type information.
+
+If you see the warning
+
+\prew
+\textsl
+Metis: Falling back on ``\textit{metisFT}''.
+\postw
+
+in a successful Metis proof, you can advantageously replace the \textit{metis}
+call with \textit{metisFT}.
\point{I got a strange error from Sledgehammer---what should I do?}
Sledgehammer tries to give informative error messages. Please report any strange
-error to \authoremail. This applies double if you get the message
+error to the author at \authoremail. This applies double if you get the message
-\begin{quote}
+\prew
\slshape
The prover found a type-unsound proof involving ``\textit{foo}'',
``\textit{bar}'', ``\textit{baz}'' even though a supposedly type-sound encoding
was used (or, less likely, your axioms are inconsistent). You might want to
report this to the Isabelle developers.
-\end{quote}
+\postw
\point{Auto can solve it---why not Sledgehammer?}
@@ -452,6 +495,13 @@
Because the system refers to all theorems known to Isabelle, it is particularly
suitable when your goal has a short proof from lemmas that you don't know about.
+\point{Why are there so many options?}
+
+Sledgehammer's philosophy should work out of the box, without user guidance.
+Many of the options are meant to be used mostly by the Sledgehammer developers
+for experimentation purposes. Of course, feel free to experiment with them if
+you are so inclined.
+
\section{Command Syntax}
\label{command-syntax}