% author: hoelzl
% date: Thu, 09 Dec 2010 10:22:17 +0100
% changeset 41098 (ababba14c965); parent 40942 (e08fa125c268); child 41208 (1b28c43a7074)
\documentclass[a4paper,12pt]{article}
\usepackage[T1]{fontenc}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage[english,french]{babel}
\usepackage{color}
\usepackage{footmisc}
\usepackage{graphicx}
%\usepackage{mathpazo}
\usepackage{multicol}
\usepackage{stmaryrd}
%\usepackage[scaled=.85]{beramono}
\usepackage{../isabelle,../iman,../pdfsetup}

%\oddsidemargin=4.6mm
%\evensidemargin=4.6mm
%\textwidth=150mm
%\topmargin=4.6mm
%\headheight=0mm
%\headsep=0mm
%\textheight=234mm

\def\Colon{\mathord{:\mkern-1.5mu:}}
%\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
%\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
\def\lparr{\mathopen{(\mkern-4mu\mid}}
\def\rparr{\mathclose{\mid\mkern-4mu)}}

\def\unk{{?}}
\def\undef{(\lambda x.\; \unk)}
%\def\unr{\textit{others}}
\def\unr{\ldots}
\def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
\def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}

\urlstyle{tt}

\begin{document}

\selectlanguage{english}

\title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
Hammering Away \\[\smallskipamount]
\Large A User's Guide to Sledgehammer for Isabelle/HOL}
\author{\hbox{} \\
Jasmin Christian Blanchette \\
{\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\
\hbox{}}

\maketitle

\tableofcontents

\setlength{\parskip}{.7em plus .2em minus .1em}
\setlength{\parindent}{0pt}
\setlength{\abovedisplayskip}{\parskip}
\setlength{\abovedisplayshortskip}{.9\parskip}
\setlength{\belowdisplayskip}{\parskip}
\setlength{\belowdisplayshortskip}{.9\parskip}

% General-purpose enum environment with correct spacing
\newenvironment{enum}%
{\begin{list}{}{%
\setlength{\topsep}{.1\parskip}%
\setlength{\partopsep}{.1\parskip}%
\setlength{\itemsep}{\parskip}%
\advance\itemsep by-\parsep}}
{\end{list}}

\def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
\advance\rightskip by\leftmargin}
\def\post{\vskip0pt plus1ex\endgroup}

\def\prew{\pre\advance\rightskip by-\leftmargin}
\def\postw{\post}

\section{Introduction}
\label{introduction}

Sledgehammer is a tool that applies first-order automatic theorem provers (ATPs)
and satisfiability-modulo-theories (SMT) solvers to the current goal. The
supported ATPs are E \cite{schulz-2002}, SPASS \cite{weidenbach-et-al-2009},
Vampire \cite{riazanov-voronkov-2002}, SInE-E \cite{sine}, and SNARK
\cite{snark}. The ATPs are run either locally or remotely via the
System\-On\-TPTP web service \cite{sutcliffe-2000}. In addition to the ATPs, the
SMT solver Z3 \cite{z3} is used, and you can tell Sledgehammer to try Yices
\cite{yices} and CVC3 \cite{cvc3} as well.

The problem passed to the automatic provers consists of your current goal
together with a heuristic selection of hundreds of facts (theorems) from the
current theory context, filtered by relevance. Because jobs are run in the
background, you can continue to work on your proof by other means. Provers can
be run in parallel. Any reply (which may arrive half a minute later) will appear
in the Proof General response buffer.

The result of a successful proof search is some source text that usually (but
not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
proof relies on the general-purpose Metis prover \cite{metis}, which is fully
integrated into Isabelle/HOL, with explicit inferences going through the kernel.
Thus its results are correct by construction.

In this manual, we will explicitly invoke the \textbf{sledgehammer} command.
Sledgehammer also provides an automatic mode that can be enabled via the
``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof General. In
this mode, Sledgehammer is run on every newly entered theorem. The time limit
for Auto Sledgehammer and other automatic tools can be set using the ``Auto
Tools Time Limit'' option.

\newbox\boxA
\setbox\boxA=\hbox{\texttt{nospam}}

To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
imported; this is rarely a problem in practice since it is part of
\textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
\texttt{src/HOL/Metis\_Examples} directory.
Comments and bug reports concerning Sledgehammer or this manual should be
directed to
\texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak
in.\allowbreak tum.\allowbreak de}.

\vskip2.5\smallskipamount

%\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
%suggesting several textual improvements.

\section{Installation}
\label{installation}

Sledgehammer is part of Isabelle, so you don't need to install it. However, it
relies on third-party automatic theorem provers (ATPs) and SMT solvers.
Currently, E, SPASS, and Vampire can be run locally; in addition, E, Vampire,
SInE-E, and SNARK are available remotely via SystemOnTPTP \cite{sutcliffe-2000}.
If you want better performance, you should install E and SPASS locally.

There are three main ways to install ATPs on your machine:

\begin{enum}
\item[$\bullet$] If you installed an official Isabelle package with everything
inside, it should already include properly set up executables for E and SPASS,
ready to use.%
\footnote{Vampire's license prevents us from doing the same for this otherwise
wonderful tool.}

\item[$\bullet$] Alternatively, you can download the Isabelle-aware E and SPASS
binary packages from Isabelle's download page. Extract the archives, then add a
line to your \texttt{\char`\~/.isabelle/etc/components} file with the absolute
path to E or SPASS. For example, if the \texttt{components} file does not exist
yet and you extracted SPASS to \texttt{/usr/local/spass-3.7}, create the
\texttt{components} file with the single line

\prew
\texttt{/usr/local/spass-3.7}
\postw

in it.

\item[$\bullet$] If you prefer to build E or SPASS yourself, or obtained a
Vampire executable from somewhere (e.g., \url{http://www.vprover.org/}),
set the environment variable \texttt{E\_HOME}, \texttt{SPASS\_HOME}, or
\texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},
\texttt{SPASS}, or \texttt{vampire} executable. Sledgehammer has been tested
with E 1.0 and 1.2, SPASS 3.5 and 3.7, and Vampire 1.0%
\footnote{Following the rewrite of Vampire, the counter for version numbers was
reset to 0; hence the new Vampire 1.0 is more recent than Vampire 11.5.}%
. Since the ATPs' output formats are neither documented nor stable, other
versions of the ATPs might or might not work well with Sledgehammer.
\end{enum}
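
As a sketch of the third option, the variables could be set in the shell that
launches Isabelle or, assuming your setup reads it, in
\texttt{\char`\~/.isabelle/etc/settings}. The directory names below are
hypothetical; replace them with the directories that actually contain your
\texttt{eproof} and \texttt{SPASS} executables:

\prew
\texttt{E\_HOME=/opt/eprover/bin} \\
\texttt{SPASS\_HOME=/opt/spass/bin}
\postw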

To check whether E and SPASS are installed, follow the example in
\S\ref{first-steps}.

Remote ATP invocation via the SystemOnTPTP web service requires Perl with the
World Wide Web Library (\texttt{libwww-perl}) installed. If you must use a proxy
server to access the Internet, set the \texttt{http\_proxy} environment variable
to the proxy, either in the environment in which Isabelle is launched or in your
\texttt{\char`\~/.isabelle/etc/settings} file. Here are a few examples:

\prew
\texttt{http\_proxy=http://proxy.example.org} \\
\texttt{http\_proxy=http://proxy.example.org:8080} \\
\texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
\postw

\section{First Steps}
\label{first-steps}

To illustrate Sledgehammer in context, let us start a theory file and
attempt to prove a simple lemma:

\prew
\textbf{theory}~\textit{Scratch} \\
\textbf{imports}~\textit{Main} \\
\textbf{begin} \\[2\smallskipamount]
%
\textbf{lemma} ``$[a] = [b] \,\longleftrightarrow\, a = b$'' \\
\textbf{sledgehammer}
\postw

Instead of issuing the \textbf{sledgehammer} command, you can also find
Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof
General or press the Emacs key sequence C-c C-a C-s.
Either way, Sledgehammer produces the following output after a few seconds:

\prew
\slshape
Sledgehammer: ``\textit{e}'' for subgoal 1: \\
$([a] = [b]) = (a = b)$ \\
Try this command: \textbf{by} (\textit{metis hd.simps}). \\
To minimize the number of lemmas, try this: \\
\textbf{sledgehammer} \textit{minimize} [\textit{prover} = \textit{e}] (\textit{hd.simps}). \\[3\smallskipamount]
%
Sledgehammer: ``\textit{spass}'' for subgoal 1: \\
$([a] = [b]) = (a = b)$ \\
Try this command: \textbf{by} (\textit{metis insert\_Nil last\_ConsL}). \\
To minimize the number of lemmas, try this: \\
\textbf{sledgehammer} \textit{minimize} [\textit{prover} = \textit{spass}] (\textit{insert\_Nil last\_ConsL}). \\[3\smallskipamount]
%
Sledgehammer: ``\textit{vampire}'' for subgoal 1: \\
$([a] = [b]) = (a = b)$ \\
Try this command: \textbf{by} (\textit{metis eq\_commute last\_snoc}) \\
To minimize the number of lemmas, try this: \\
\textbf{sledgehammer} \textit{minimize} [\textit{prover} = \textit{vampire}]~(\textit{eq\_commute last\_snoc}). \\[3\smallskipamount]
%
Sledgehammer: ``\textit{remote\_sine\_e}'' for subgoal 1: \\
$([a] = [b]) = (a = b)$ \\
Try this command: \textbf{by} (\textit{metis hd.simps}) \\
To minimize the number of lemmas, try this: \\
\textbf{sledgehammer} \textit{minimize} [\textit{prover} = \textit{remote\_sine\_e}]~(\textit{hd.simps}). \\[3\smallskipamount]
%
Sledgehammer: ``\textit{remote\_z3}'' for subgoal 1: \\
$([a] = [b]) = (a = b)$ \\
Try this command: \textbf{by} (\textit{metis hd.simps}) \\
To minimize the number of lemmas, try this: \\
\textbf{sledgehammer} \textit{minimize} [\textit{prover} = \textit{remote\_z3}]~(\textit{hd.simps}).
\postw

Sledgehammer ran E, SPASS, Vampire, SInE-E, and Z3 in parallel. Depending on
which provers are installed and how many processor cores are available, some of
the provers might be missing or present with a \textit{remote\_} prefix.

For each successful prover, Sledgehammer gives a one-liner proof that uses the
\textit{metis} or \textit{smt} method. You can click the proof to insert it into
the theory text. You can click the ``\textbf{sledgehammer} \textit{minimize}''
command if you want to look for a shorter (and probably faster) proof. But here
the proof found by E looks perfect, so click it to finish the proof.

You can ask Sledgehammer for an Isar text proof by passing the
\textit{isar\_proof} option:

\prew
\textbf{sledgehammer} [\textit{isar\_proof}]
\postw

When Isar proof construction is successful, it can yield proofs that are more
readable and also faster than the \textit{metis} one-liners. This feature is
experimental and is only available for ATPs.

\section{Hints}
\label{hints}

For best results, first simplify your problem by calling \textit{auto} or at
least \textit{safe} followed by \textit{simp\_all}. None of the ATPs contain
arithmetic decision procedures. They are not especially good at heavy rewriting,
but because they regard equations as undirected, they often prove theorems that
require the reverse orientation of a \textit{simp} rule. Higher-order problems
can be tackled, but the success rate is better for first-order problems. Hence,
you may get better results if you first simplify the problem to remove
higher-order features.

Note that problems can be easy for \textit{auto} and difficult for ATPs, but the
reverse is also true, so don't be discouraged if your first attempts fail.
Because the system refers to all theorems known to Isabelle, it is particularly
suitable when your goal has a short proof from lemmas that you don't know about.
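
As a sketch of the simplification advice at the start of this section, a proof
attempt could preprocess the goal before invoking Sledgehammer. The lemma
statement is elided, and whether these particular methods help depends on the
goal at hand:

\prew
\textbf{lemma} ``$\ldots$'' \\
\textbf{apply}~\textit{safe} \\
\textbf{apply}~\textit{simp\_all} \\
\textbf{sledgehammer}
\postw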

\section{Command Syntax}
\label{command-syntax}

Sledgehammer can be invoked at any point when there is an open goal by entering
the \textbf{sledgehammer} command in the theory file. Its general syntax is as
follows:

\prew
\textbf{sledgehammer} \textit{subcommand\/$^?$ options\/$^?$ facts\_override\/$^?$ num\/$^?$}
\postw

For convenience, Sledgehammer is also available in the ``Commands'' submenu of
the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c
C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with
no arguments in the theory text.

In the general syntax, the \textit{subcommand} may be any of the following:

\begin{enum}
\item[$\bullet$] \textbf{\textit{run} (the default):} Runs Sledgehammer on
subgoal number \textit{num} (1 by default), with the given options and facts.

\item[$\bullet$] \textbf{\textit{minimize}:} Attempts to minimize the provided facts
(specified in the \textit{facts\_override} argument) to obtain a simpler proof
involving fewer facts. The options and goal number are as for \textit{run}.

\item[$\bullet$] \textbf{\textit{messages}:} Redisplays recent messages issued
by Sledgehammer. This allows you to examine results that might have been lost
due to Sledgehammer's asynchronous nature. The \textit{num} argument specifies a
limit on the number of messages to display (5 by default).

\item[$\bullet$] \textbf{\textit{available\_provers}:} Prints the list of
installed provers. See \S\ref{installation} and \S\ref{mode-of-operation} for
more information on how to install automatic provers.

\item[$\bullet$] \textbf{\textit{running\_provers}:} Prints information about
currently running automatic provers, including elapsed runtime and remaining
time until timeout.

\item[$\bullet$] \textbf{\textit{kill\_provers}:} Terminates all running
automatic provers.

\item[$\bullet$] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
\end{enum}
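
For instance, the following invocations exercise some of these subcommands. The
subgoal number and the message limit are arbitrary illustrative values:

\prew
\textbf{sledgehammer} \textit{available\_provers} \\
\textbf{sledgehammer} \textit{run} 2 \\
\textbf{sledgehammer} \textit{messages} 10 \\
\textbf{sledgehammer} \textit{kill\_provers}
\postw

In order, these list the installed provers, run Sledgehammer on the second
subgoal, redisplay up to ten recent messages, and terminate all running provers.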
|
320 |
||
321 |
Sledgehammer's behavior can be influenced by various \textit{options}, which can |
|
322 |
be specified in brackets after the \textbf{sledgehammer} command. The |
|
323 |
\textit{options} are a list of key--value pairs of the form ``[$k_1 = v_1, |
|
324 |
\ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true}'' is optional. For |
|
325 |
example: |
|
326 |
||
327 |
\prew |
|
328 |
\textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120$\,s$] |
|
329 |
\postw |
|
330 |
||
331 |
Default values can be set using \textbf{sledgehammer\_\allowbreak params}: |
|
332 |
||
333 |
\prew |
|
334 |
\textbf{sledgehammer\_params} \textit{options} |
|
335 |
\postw |
|
336 |
||
337 |
The supported options are described in \S\ref{option-reference}. |
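
As an illustration, a declaration along the following lines would make E and
SPASS the default provers and set the default \textit{timeout} to one minute.
The particular values are only an example:

\prew
\textbf{sledgehammer\_params} [\textit{provers} = \textit{e spass}, \,\textit{timeout} = 60$\,s$]
\postw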

The \textit{facts\_override} argument lets you alter the set of facts that go
through the relevance filter. It may be of the form ``(\textit{facts})'', where
\textit{facts} is a space-separated list of Isabelle facts (theorems, local
assumptions, etc.), in which case the relevance filter is bypassed and the given
facts are used. It may also be of the form ``(\textit{add}:\ \textit{facts}$_1$)'',
``(\textit{del}:\ \textit{facts}$_2$)'', or ``(\textit{add}:\ \textit{facts}$_1$\
\textit{del}:\ \textit{facts}$_2$)'', where the relevance filter is instructed to
proceed as usual except that it should consider \textit{facts}$_1$
highly relevant and \textit{facts}$_2$ fully irrelevant.
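
For example, reusing fact names from \S\ref{first-steps} purely for
illustration, the first command below bypasses the relevance filter and passes
only \textit{hd.simps}, whereas the second one lets the filter run but biases
it:

\prew
\textbf{sledgehammer} (\textit{hd.simps}) \\
\textbf{sledgehammer} (\textit{add}:\ \textit{hd.simps} \textit{del}:\ \textit{last\_snoc})
\postw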

You can instruct Sledgehammer to run automatically on newly entered theorems by
enabling the ``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof
General. For automatic runs, only the first prover set using \textit{provers}
(\S\ref{mode-of-operation}) is considered, \textit{verbose}
(\S\ref{output-format}) and \textit{debug} (\S\ref{output-format}) are disabled,
fewer facts are passed to the prover, and \textit{timeout}
(\S\ref{mode-of-operation}) is superseded by the ``Auto Tools Time Limit'' in
Proof General's ``Isabelle'' menu. Sledgehammer's output is also more concise.

\section{Option Reference}
\label{option-reference}

\def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
\def\qty#1{$\left<\textit{#1}\right>$}
\def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1}}\right>}$}
\def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{true}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
\def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{false}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
\def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\quad [\textit{smart}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
\def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\quad [\textit{smart}]\hfill\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
\def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
\def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\quad [\textit{#3}]} \nopagebreak\\[\parskip]}
\def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
\def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
\def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

Sledgehammer's options are categorized as follows:\ mode of operation
(\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
relevance filter (\S\ref{relevance-filter}), output format
(\S\ref{output-format}), and authentication (\S\ref{authentication}).

The descriptions below refer to the following syntactic quantities:

\begin{enum}
\item[$\bullet$] \qtybf{string}: A string.
\item[$\bullet$] \qtybf{bool\/}: \textit{true} or \textit{false}.
\item[$\bullet$] \qtybf{bool\_or\_smart\/}: \textit{true}, \textit{false}, or
\textit{smart}.
\item[$\bullet$] \qtybf{int\/}: An integer.
\item[$\bullet$] \qtybf{float\_pair\/}: A pair of floating-point numbers
(e.g., 0.6 0.95).
\item[$\bullet$] \qtybf{int\_or\_smart\/}: An integer or \textit{smart}.
\item[$\bullet$] \qtybf{float\_or\_none\/}: An integer (e.g., 60) or
floating-point number (e.g., 0.5) expressing a number of seconds, or the keyword
\textit{none} ($\infty$ seconds).
\end{enum}

Default values are indicated in square brackets. Boolean options have a negated
counterpart (e.g., \textit{blocking} vs.\ \textit{non\_blocking}). When setting
Boolean options, ``= \textit{true}'' may be omitted.
398 |
||
399 |
\subsection{Mode of Operation} |
|
400 |
\label{mode-of-operation} |
|
401 |
||
402 |
\begin{enum} |
|
40059
6ad9081665db
use consistent terminology in Sledgehammer: "prover = ATP or SMT solver or ..."
blanchet
parents:
39335
diff
changeset
|
403 |
\opnodefault{provers}{string} |
6ad9081665db
use consistent terminology in Sledgehammer: "prover = ATP or SMT solver or ..."
blanchet
parents:
39335
diff
changeset
|
404 |
Specifies the automatic provers to use as a space-separated list (e.g., |
6ad9081665db
use consistent terminology in Sledgehammer: "prover = ATP or SMT solver or ..."
blanchet
parents:
39335
diff
changeset
|
405 |
``\textit{e}~\textit{spass}''). The following provers are supported: |
36926 | 406 |
|
407 |
\begin{enum} |
|
408 |
\item[$\bullet$] \textbf{\textit{e}:} E is an ATP developed by Stephan Schulz |
|
409 |
\cite{schulz-2002}. To use E, set the environment variable |
|
410 |
\texttt{E\_HOME} to the directory that contains the \texttt{eproof} executable, |
|
411 |
or install the prebuilt E package from Isabelle's download page. See |
|
412 |
\S\ref{installation} for details. |
|
413 |
||
414 |
\item[$\bullet$] \textbf{\textit{spass}:} SPASS is an ATP developed by Christoph |
|
415 |
Weidenbach et al.\ \cite{weidenbach-et-al-2009}. To use SPASS, set the |
|
416 |
environment variable \texttt{SPASS\_HOME} to the directory that contains the |
|
417 |
\texttt{SPASS} executable, or install the prebuilt SPASS package from Isabelle's |
|
37414
d0cea0796295
expect SPASS 3.7, and give a friendly warning if an older version is used
blanchet
parents:
36926
diff
changeset
|
418 |
download page. Sledgehammer requires version 3.5 or above. See |
d0cea0796295
expect SPASS 3.7, and give a friendly warning if an older version is used
blanchet
parents:
36926
diff
changeset
|
419 |
\S\ref{installation} for details. |
36926 | 420 |
|
421 |
\item[$\bullet$] \textbf{\textit{vampire}:} Vampire is an ATP developed by |
|
422 |
Andrei Voronkov and his colleagues \cite{riazanov-voronkov-2002}. To use |
|
423 |
Vampire, set the environment variable \texttt{VAMPIRE\_HOME} to the directory |
|
40942 | 424 |
that contains the \texttt{vampire} executable. Sledgehammer has been tested with |
425 |
versions 11, 0.6, and 1.0. |
|
426 |
||
427 |
\item[$\bullet$] \textbf{\textit{z3}:} Z3 is an SMT solver developed at |
|
428 |
Microsoft Research \cite{z3}. To use Z3, set the environment variable |
|
429 |
\texttt{Z3\_SOLVER} to the complete path of the executable, including the file |
|
430 |
name. Sledgehammer has been tested with 2.7 to 2.15. |
|
36926 | 431 |
|
40942 | 432 |
\item[$\bullet$] \textbf{\textit{yices}:} Yices is an SMT solver developed at |
433 |
SRI \cite{yices}. To use Yices, set the environment variable |
|
434 |
\texttt{YICES\_SOLVER} to the complete path of the executable, including the |
|
435 |
file name. Sledgehammer has been tested with version 1.0. |
|
436 |
||
437 |
\item[$\bullet$] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by |
|
438 |
Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3, |
|
439 |
set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the |
|
440 |
executable, including the file name. Sledgehammer has been tested with version |
|
441 |
2.2. |
|
40073 | 442 |
|
38601 | 443 |
\item[$\bullet$] \textbf{\textit{remote\_e}:} The remote version of E runs |
36926 | 444 |
on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. |
445 |
||
446 |
\item[$\bullet$] \textbf{\textit{remote\_vampire}:} The remote version of |
|
38601 | 447 |
Vampire runs on Geoff Sutcliffe's Miami servers. Version 9 is used. |
36926 | 448 |
|
38601 | 449 |
\item[$\bullet$] \textbf{\textit{remote\_sine\_e}:} SInE-E is a metaprover |
450 |
developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of |
|
451 |
SInE runs on Geoff Sutcliffe's Miami servers. |
|
452 |
||
453 |
\item[$\bullet$] \textbf{\textit{remote\_snark}:} SNARK is a prover |
|
454 |
developed by Stickel et al.\ \cite{snark}. The remote version of |
|
455 |
SNARK runs on Geoff Sutcliffe's Miami servers. |
|
40073 | 456 |
|
40942 | 457 |
\item[$\bullet$] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on |
458 |
servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to |
|
459 |
point). |
|
40073 | 460 |
|
40942 | 461 |
\item[$\bullet$] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs
on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
point).
\end{enum} |
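
The environment variables required by the local SMT solvers can be set in
several ways. One possibility, sketched here under the assumption that the
executables are installed in \texttt{/usr/local/bin} (a hypothetical location
chosen only for illustration), is to add lines such as the following to
\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER/etc/settings}:

\begin{quote}
\texttt{YICES\_SOLVER=/usr/local/bin/yices} \\
\texttt{CVC3\_SOLVER=/usr/local/bin/cvc3}
\end{quote}

The paths must be adapted to the actual installation. Each variable has to name
the executable itself, not merely the directory that contains it.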

By default, Sledgehammer will run E, SPASS, Vampire, SInE-E, and Z3 (or whatever
the SMT module's \emph{smt\_solver} configuration option is set to) in
parallel, either locally or remotely, depending on the number of processor
cores available. For historical reasons, the default value of this option can be
overridden using the option ``Sledgehammer: Provers'' from the ``Isabelle'' menu
in Proof General.

It is a good idea to run several provers in parallel, although this can slow
down your machine. Running E, SPASS, Vampire, and SInE-E together for 5 seconds
yields a better success rate than running the most effective of these (Vampire)
for 120 seconds \cite{boehme-nipkow-2010}.

\opnodefault{prover}{string}
Alias for \textit{provers}.

\opnodefault{atps}{string}
Legacy alias for \textit{provers}.

\opnodefault{atp}{string}
Legacy alias for \textit{provers}.

\opdefault{timeout}{float\_or\_none}{\upshape 30}
Specifies the maximum number of seconds that the automatic provers should spend
searching for a proof. For historical reasons, the default value of this option
can be overridden using the option ``Sledgehammer: Time Limit'' from the
``Isabelle'' menu in Proof General.

\opfalse{blocking}{non\_blocking}
Specifies whether the \textbf{sledgehammer} command should operate
synchronously. The asynchronous (non-blocking) mode lets the user start proving
the putative theorem manually while Sledgehammer looks for a proof, but it can
also be more confusing.

\opfalse{overlord}{no\_overlord}
Specifies whether Sledgehammer should put its temporary files in
\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
debugging Sledgehammer but also unsafe if several instances of the tool are run
simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
safely remove them after Sledgehammer has run.

\nopagebreak
{\small See also \textit{debug} (\S\ref{output-format}).}
\end{enum} |
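
As a rough illustration of how the mode-of-operation options combine, an
invocation along the following lines would run two local provers and one remote
prover synchronously for at most 60 seconds. The choice of provers and the
timeout value are arbitrary and serve only as an example:

\begin{quote}
\textbf{sledgehammer} [\textit{provers} = \textit{e spass remote\_vampire},
\textit{timeout} = 60, \textit{blocking}]
\end{quote}

Omitting \textit{blocking} lets the command return immediately, so that a
manual proof attempt can proceed while the provers are still running.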
|
\subsection{Problem Encoding}
\label{problem-encoding}

\begin{enum} |
|
\opfalse{explicit\_apply}{implicit\_apply}
Specifies whether function application should be encoded as an explicit
``apply'' operator in ATP problems. If the option is set to \textit{false}, each
function will be directly applied to as many arguments as possible. Enabling
this option can sometimes help discover higher-order proofs that otherwise would
not be found.

\opfalse{full\_types}{partial\_types}
Specifies whether full type information is encoded in ATP problems. Enabling
this option can prevent the discovery of type-incorrect proofs, but it also
tends to slow down the ATPs significantly. For historical reasons, the default
value of this option can be overridden using the option ``Sledgehammer: Full
Types'' from the ``Isabelle'' menu in Proof General.
\end{enum} |
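
To make the difference between the two application encodings concrete, consider
a HOL term $f~x~y$. The following is only a schematic sketch: the operator name
$\mathit{app}$ is a placeholder chosen for this illustration and is not
necessarily the symbol that appears in the generated problems.

\begin{quote}
\textit{implicit\_apply}: \quad $f(x,\, y)$ \\
\textit{explicit\_apply}: \quad $\mathit{app}(\mathit{app}(f,\, x),\, y)$
\end{quote}

In the second form, $f$ is an ordinary first-order term, so the ATP can reason
about it like any other value. Both encoding options can be enabled for a
single invocation as follows:

\begin{quote}
\textbf{sledgehammer} [\textit{explicit\_apply}, \textit{full\_types}]
\end{quote}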

\subsection{Relevance Filter}
\label{relevance-filter}

\begin{enum} |
|
\opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}
Specifies the thresholds above which facts are considered relevant by the
relevance filter. The first threshold is used for the first iteration of the
relevance filter and the second threshold is used for the last iteration (if it
is reached). The effective threshold is quadratically interpolated for the other
iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
are relevant and 1 means that only theorems that refer to previously seen
constants are relevant.

\opsmart{max\_relevant}{int\_or\_smart}
Specifies the maximum number of facts that may be returned by the relevance
filter. If the option is set to \textit{smart}, it is set to a value that was
empirically found to be appropriate for the prover. A typical value would be
300.
\end{enum} |
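
As a sketch of how the relevance filter options might be adjusted, the
following invocation raises both thresholds and caps the number of selected
facts. The specific values are arbitrary and are not recommendations:

\begin{quote}
\textbf{sledgehammer} [\textit{relevance\_thresholds} = 0.5~0.9,
\textit{max\_relevant} = 150]
\end{quote}

Higher thresholds make the filter more selective, which, together with a lower
\textit{max\_relevant}, results in smaller problems being passed to the
provers.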

\subsection{Output Format}
\label{output-format}

\begin{enum} |
|
\opfalse{verbose}{quiet}
Specifies whether the \textbf{sledgehammer} command should explain what it does.

\opfalse{debug}{no\_debug}
Specifies whether Sledgehammer should display additional debugging information
beyond what \textit{verbose} already displays. Enabling \textit{debug} also
enables \textit{verbose} behind the scenes.

\nopagebreak
{\small See also \textit{overlord} (\S\ref{mode-of-operation}).}

\opfalse{isar\_proof}{no\_isar\_proof}
Specifies whether Isar proofs should be output in addition to one-liner
\textit{metis} proofs. Isar proof construction is still experimental and often
fails; however, Isar proofs are usually faster and sometimes more robust than
\textit{metis} proofs.

\opdefault{isar\_shrink\_factor}{int}{\upshape 1}
Specifies the granularity of the Isar proof. A value of $n$ indicates that each
Isar proof step should correspond to a group of up to $n$ consecutive proof
steps in the ATP proof.

\end{enum} |
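
For example, the following invocation (a sketch; the value 2 is arbitrary) asks
for verbose output and for an Isar proof in which each step may cover up to two
consecutive steps of the ATP proof:

\begin{quote}
\textbf{sledgehammer} [\textit{verbose}, \textit{isar\_proof},
\textit{isar\_shrink\_factor} = 2]
\end{quote}

With this setting, an ATP proof of six steps would give rise to an Isar proof
of at least three steps, since no Isar step may group more than two ATP steps.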
|
\subsection{Authentication}
\label{authentication}

\begin{enum} |
|
\opnodefault{expect}{string}
Specifies the expected outcome, which must be one of the following:

\begin{enum} |
|
\item[$\bullet$] \textbf{\textit{some}:} Sledgehammer found a (potentially
unsound) proof.
\item[$\bullet$] \textbf{\textit{none}:} Sledgehammer found no proof.
\item[$\bullet$] \textbf{\textit{unknown}:} Sledgehammer encountered some
problem.
\end{enum} |

Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
(otherwise) if the actual outcome differs from the expected outcome. This option
is useful for regression testing.

\nopagebreak
{\small See also \textit{blocking} (\S\ref{mode-of-operation}).}
\end{enum} |
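
In a regression test, \textit{expect} would typically be combined with
\textit{blocking} so that a mismatch is reported as an error rather than as a
warning. The following is a minimal sketch of such an invocation, placed after
the statement of a lemma that Sledgehammer is expected to prove:

\begin{quote}
\textbf{sledgehammer} [\textit{blocking}, \textit{expect} = \textit{some}]
\end{quote}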

\let\em=\sl
\bibliography{../manual}{}
\bibliographystyle{abbrv}

\end{document}