| author | blanchet |
| Sat, 11 Sep 2010 12:31:05 +0200 | |
| changeset 39326 | 0b68add21e3d |
| parent 39320 | 5d578004be23 |
| child 39335 | 87a9ff4d5817 |
| permissions | -rw-r--r-- |
| 36926 | 1 |
\documentclass[a4paper,12pt]{article}
|
2 |
\usepackage[T1]{fontenc}
|
|
3 |
\usepackage{amsmath}
|
|
4 |
\usepackage{amssymb}
|
|
5 |
\usepackage[english,french]{babel}
|
|
6 |
\usepackage{color}
|
|
7 |
\usepackage{footmisc}
|
|
8 |
\usepackage{graphicx}
|
|
9 |
%\usepackage{mathpazo}
|
|
10 |
\usepackage{multicol}
|
|
11 |
\usepackage{stmaryrd}
|
|
12 |
%\usepackage[scaled=.85]{beramono}
|
|
13 |
\usepackage{../iman,../pdfsetup}
|
|
14 |
||
15 |
%\oddsidemargin=4.6mm |
|
16 |
%\evensidemargin=4.6mm |
|
17 |
%\textwidth=150mm |
|
18 |
%\topmargin=4.6mm |
|
19 |
%\headheight=0mm |
|
20 |
%\headsep=0mm |
|
21 |
%\textheight=234mm |
|
22 |
||
23 |
\def\Colon{\mathord{:\mkern-1.5mu:}}
|
|
24 |
%\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
|
|
25 |
%\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
|
|
26 |
\def\lparr{\mathopen{(\mkern-4mu\mid}}
|
|
27 |
\def\rparr{\mathclose{\mid\mkern-4mu)}}
|
|
28 |
||
29 |
\def\unk{{?}}
|
|
30 |
\def\undef{(\lambda x.\; \unk)}
|
|
31 |
%\def\unr{\textit{others}}
|
|
32 |
\def\unr{\ldots}
|
|
33 |
\def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
|
|
34 |
\def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
|
|
35 |
||
36 |
\urlstyle{tt}
|
|
37 |
||
38 |
\begin{document}
|
|
39 |
||
40 |
\selectlanguage{english}
|
|
41 |
||
42 |
\title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
|
|
43 |
Hammering Away \\[\smallskipamount] |
|
44 |
\Large A User's Guide to Sledgehammer for Isabelle/HOL} |
|
45 |
\author{\hbox{} \\
|
|
46 |
Jasmin Christian Blanchette \\ |
|
47 |
{\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\
|
|
48 |
\hbox{}}
|
|
49 |
||
50 |
\maketitle |
|
51 |
||
52 |
\tableofcontents |
|
53 |
||
54 |
\setlength{\parskip}{.7em plus .2em minus .1em}
|
|
55 |
\setlength{\parindent}{0pt}
|
|
56 |
\setlength{\abovedisplayskip}{\parskip}
|
|
57 |
\setlength{\abovedisplayshortskip}{.9\parskip}
|
|
58 |
\setlength{\belowdisplayskip}{\parskip}
|
|
59 |
\setlength{\belowdisplayshortskip}{.9\parskip}
|
|
60 |
||
61 |
% General-purpose enum environment with correct spacing |
|
62 |
\newenvironment{enum}%
|
|
63 |
{\begin{list}{}{%
|
|
64 |
\setlength{\topsep}{.1\parskip}%
|
|
65 |
\setlength{\partopsep}{.1\parskip}%
|
|
66 |
\setlength{\itemsep}{\parskip}%
|
|
67 |
\advance\itemsep by-\parsep}} |
|
68 |
{\end{list}}
|
|
69 |
||
70 |
\def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
|
|
71 |
\advance\rightskip by\leftmargin} |
|
72 |
\def\post{\vskip0pt plus1ex\endgroup}
|
|
73 |
||
74 |
\def\prew{\pre\advance\rightskip by-\leftmargin}
|
|
75 |
\def\postw{\post}
|
|
76 |
||
77 |
\section{Introduction}
|
|
78 |
\label{introduction}
|
|
79 |
||
80 |
Sledgehammer is a tool that applies first-order automatic theorem provers (ATPs) |
|
81 |
on the current goal. The supported ATPs are E \cite{schulz-2002}, SPASS
|
|
82 |
\cite{weidenbach-et-al-2009}, and Vampire \cite{riazanov-voronkov-2002}, which
|
|
83 |
can be run locally or remotely via the SystemOnTPTP web service |
|
84 |
\cite{sutcliffe-2000}.
|
|
85 |
||
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
86 |
The problem passed to ATPs consists of your current goal together with a |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
87 |
heuristic selection of hundreds of facts (theorems) from the current theory |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
88 |
context, filtered by relevance. Because jobs are run in the background, you can |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
89 |
continue to work on your proof by other means. Provers can be run in parallel. |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
90 |
Any reply (which may arrive minutes later) will appear in the Proof General |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
91 |
response buffer. |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
92 |
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
93 |
The result of a successful ATP proof search is some source text that usually |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
94 |
(but not always) reconstructs the proof within Isabelle, without requiring the |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
95 |
ATPs again. The reconstructed proof relies on the general-purpose Metis prover |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
96 |
\cite{metis}, which is fully integrated into Isabelle/HOL, with explicit
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
97 |
inferences going through the kernel. Thus its results are correct by |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
98 |
construction. |
| 36926 | 99 |
|
| 39320 | 100 |
In this manual, we will explicitly invoke the \textbf{sledgehammer} command.
|
101 |
Sledgehammer also provides an automatic mode that can be enabled via the |
|
102 |
``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof General. In |
|
103 |
this mode, Sledgehammer is run on every newly entered theorem. The time limit |
|
104 |
for Auto Sledgehammer and other automatic tools can be set using the ``Auto |
|
105 |
Tools Time Limit'' option. |
|
106 |
||
| 36926 | 107 |
\newbox\boxA |
108 |
\setbox\boxA=\hbox{\texttt{nospam}}
|
|
109 |
||
110 |
Examples of Sledgehammer use can be found in Isabelle's |
|
111 |
\texttt{src/HOL/Metis\_Examples} directory.
|
|
112 |
Comments and bug reports concerning Sledgehammer or this manual should be |
|
113 |
directed to |
|
114 |
\texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak
|
|
115 |
in.\allowbreak tum.\allowbreak de}. |
|
116 |
||
117 |
\vskip2.5\smallskipamount |
|
118 |
||
119 |
%\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
|
|
120 |
%suggesting several textual improvements. |
|
121 |
||
122 |
\section{Installation}
|
|
123 |
\label{installation}
|
|
124 |
||
125 |
Sledgehammer is part of Isabelle, so you don't need to install it. However, it |
|
126 |
relies on third-party automatic theorem provers (ATPs). Currently, E, SPASS, and |
|
127 |
Vampire are supported. All of these are available remotely via SystemOnTPTP |
|
128 |
\cite{sutcliffe-2000}, but if you want better performance you will need to
|
|
129 |
install at least E and SPASS locally. |
|
130 |
||
| 38043 | 131 |
There are three main ways to install ATPs on your machine: |
| 36926 | 132 |
|
133 |
\begin{enum}
|
|
134 |
\item[$\bullet$] If you installed an official Isabelle package with everything |
|
135 |
inside, it should already include properly setup executables for E and SPASS, |
|
| 38043 | 136 |
ready to use.% |
137 |
\footnote{Vampire's license prevents us from doing the same for this otherwise
|
|
138 |
wonderful tool.} |
|
| 36926 | 139 |
|
| 38043 | 140 |
\item[$\bullet$] Alternatively, you can download the Isabelle-aware E and SPASS |
| 36926 | 141 |
binary packages from Isabelle's download page. Extract the archives, then add a |
142 |
line to your \texttt{\char`\~/.isabelle/etc/components} file with the absolute path to
|
|
| 38043 | 143 |
E or SPASS. For example, if the \texttt{components} does not exist
|
| 36926 | 144 |
yet and you extracted SPASS to \texttt{/usr/local/spass-3.7}, create
|
| 38043 | 145 |
the \texttt{components} file with the single line
|
| 36926 | 146 |
|
147 |
\prew |
|
148 |
\texttt{/usr/local/spass-3.7}
|
|
149 |
\postw |
|
150 |
||
| 38043 | 151 |
in it. |
152 |
||
153 |
\item[$\bullet$] If you prefer to build E or SPASS yourself, or obtained a |
|
154 |
Vampire executable from somewhere (e.g., \url{http://www.vprover.org/}),
|
|
155 |
set the environment variable \texttt{E\_HOME}, \texttt{SPASS\_HOME}, or
|
|
156 |
\texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},
|
|
| 38063 | 157 |
\texttt{SPASS}, or \texttt{vampire} executable. Sledgehammer has been tested
|
158 |
with E 1.0 and 1.2, SPASS 3.5 and 3.7, and Vampire 1.0% |
|
159 |
\footnote{Following the rewrite of Vampire, the counter for version numbers was
|
|
160 |
reset to 0; hence the new Vampire 1.0 is more recent than Vampire 11.5.}% |
|
161 |
. Since the ATPs' output formats are neither documented nor stable, other |
|
162 |
versions of the ATPs might or might not work well with Sledgehammer. |
|
| 36926 | 163 |
\end{enum}
|
164 |
||
165 |
To check whether E and SPASS are installed, follow the example in |
|
166 |
\S\ref{first-steps}.
|
|
167 |
||
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
168 |
Remote ATP invocation via the SystemOnTPTP web service requires Perl with the |
|
39152
f09b378cb252
make remote ATP invocation work for those people who need to go through a proxy;
blanchet
parents:
38997
diff
changeset
|
169 |
World Wide Web Library (\texttt{libwww-perl}) installed. If you must use a proxy
|
|
f09b378cb252
make remote ATP invocation work for those people who need to go through a proxy;
blanchet
parents:
38997
diff
changeset
|
170 |
server to access the Internet, set the \texttt{http\_proxy} environment variable
|
| 39153 | 171 |
to the proxy, either in the environment in which Isabelle is launched or in your |
172 |
\texttt{\char`\~/.isabelle/etc/settings} file. Here are a few examples:
|
|
|
39152
f09b378cb252
make remote ATP invocation work for those people who need to go through a proxy;
blanchet
parents:
38997
diff
changeset
|
173 |
|
|
f09b378cb252
make remote ATP invocation work for those people who need to go through a proxy;
blanchet
parents:
38997
diff
changeset
|
174 |
\prew |
| 39153 | 175 |
\texttt{http\_proxy=http://proxy.example.org} \\
|
176 |
\texttt{http\_proxy=http://proxy.example.org:8080} \\
|
|
177 |
\texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
|
|
|
39152
f09b378cb252
make remote ATP invocation work for those people who need to go through a proxy;
blanchet
parents:
38997
diff
changeset
|
178 |
\postw |
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
179 |
|
| 36926 | 180 |
\section{First Steps}
|
181 |
\label{first-steps}
|
|
182 |
||
183 |
To illustrate Sledgehammer in context, let us start a theory file and |
|
184 |
attempt to prove a simple lemma: |
|
185 |
||
186 |
\prew |
|
187 |
\textbf{theory}~\textit{Scratch} \\
|
|
188 |
\textbf{imports}~\textit{Main} \\
|
|
189 |
\textbf{begin} \\[2\smallskipamount]
|
|
190 |
% |
|
191 |
\textbf{lemma} ``$[a] = [b] \,\longleftrightarrow\, a = b$'' \\
|
|
192 |
\textbf{sledgehammer}
|
|
193 |
\postw |
|
194 |
||
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
195 |
Instead of issuing the \textbf{sledgehammer} command, you can also find
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
196 |
Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
197 |
General or press the Emacs key sequence C-c C-a C-s. |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
198 |
Either way, Sledgehammer produces the following output after a few seconds: |
| 36926 | 199 |
|
200 |
\prew |
|
201 |
\slshape |
|
202 |
Sledgehammer: ATP ``\textit{e}'' for subgoal 1: \\
|
|
203 |
$([a] = [b]) = (a = b)$ \\ |
|
204 |
Try this command: \textbf{by} (\textit{metis hd.simps}). \\
|
|
| 38043 | 205 |
To minimize the number of lemmas, try this: \\ |
| 36926 | 206 |
\textbf{sledgehammer} \textit{minimize} [\textit{atp} = \textit{e}] (\textit{hd.simps}). \\[3\smallskipamount]
|
207 |
% |
|
208 |
Sledgehammer: ATP ``\textit{spass}'' for subgoal 1: \\
|
|
209 |
$([a] = [b]) = (a = b)$ \\ |
|
210 |
Try this command: \textbf{by} (\textit{metis insert\_Nil last\_ConsL}). \\
|
|
| 38043 | 211 |
To minimize the number of lemmas, try this: \\ |
| 36926 | 212 |
\textbf{sledgehammer} \textit{minimize} [\textit{atp} = \textit{spass}] (\textit{insert\_Nil last\_ConsL}). \\[3\smallskipamount]
|
213 |
% |
|
214 |
Sledgehammer: ATP ``\textit{remote\_vampire}'' for subgoal 1: \\
|
|
215 |
$([a] = [b]) = (a = b)$ \\ |
|
216 |
Try this command: \textbf{by} (\textit{metis One\_nat\_def\_raw empty\_replicate} \\
|
|
217 |
\phantom{Try this command: \textbf{by} (\textit{metis~}}\textit{insert\_Nil last\_ConsL replicate\_Suc}). \\
|
|
| 38043 | 218 |
To minimize the number of lemmas, try this: \\ |
| 36926 | 219 |
\textbf{sledgehammer} \textit{minimize} [\textit{atp} = \textit{remote\_vampire}] \\
|
220 |
\phantom{\textbf{sledgehammer}~}(\textit{One\_nat\_def\_raw empty\_replicate insert\_Nil} \\
|
|
221 |
\phantom{\textbf{sledgehammer}~(}\textit{last\_ConsL replicate\_Suc}).
|
|
222 |
\postw |
|
223 |
||
224 |
Sledgehammer ran E, SPASS, and the remote version of Vampire in parallel. If E |
|
| 38043 | 225 |
is not installed (\S\ref{installation}), you will see references to
|
226 |
its remote American cousin \textit{remote\_e} instead of
|
|
227 |
\textit{e}; and if SPASS is not installed, it will not appear in the output.
|
|
| 36926 | 228 |
|
229 |
Based on each ATP proof, Sledgehammer gives a one-liner proof that uses the |
|
230 |
\textit{metis} method. You can click them and insert them into the theory text.
|
|
231 |
You can click the ``\textbf{sledgehammer} \textit{minimize}'' command if you
|
|
232 |
want to look for a shorter (and faster) proof. But here the proof found by E |
|
233 |
looks perfect, so click it to finish the proof. |
|
234 |
||
235 |
You can ask Sledgehammer for an Isar text proof by passing the |
|
236 |
\textit{isar\_proof} option:
|
|
237 |
||
238 |
\prew |
|
239 |
\textbf{sledgehammer} [\textit{isar\_proof}]
|
|
240 |
\postw |
|
241 |
||
242 |
When Isar proof construction is successful, it can yield proofs that are more |
|
243 |
readable and also faster than the \textit{metis} one-liners. This feature is
|
|
244 |
experimental. |
|
245 |
||
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
246 |
\section{Hints}
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
247 |
\label{hints}
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
248 |
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
249 |
For best results, first simplify your problem by calling \textit{auto} or at
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
250 |
least \textit{safe} followed by \textit{simp\_all}. None of the ATPs contain
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
251 |
arithmetic decision procedures. They are not especially good at heavy rewriting, |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
252 |
but because they regard equations as undirected, they often prove theorems that |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
253 |
require the reverse orientation of a \textit{simp} rule. Higher-order problems
|
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
254 |
can be tackled, but the success rate is better for first-order problems. Hence, |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
255 |
you may get better results if you first simplify the problem to remove |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
256 |
higher-order features. |
|
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
257 |
|
| 39320 | 258 |
Note that problems can be easy for \textit{auto} and difficult for ATPs, but the
|
259 |
reverse is also true, so don't be discouraged if your first attempts fail. |
|
260 |
Because the system refers to all theorems known to Isabelle, it is particularly |
|
261 |
suitable when your goal has a short proof from lemmas that you don't know about. |
|
|
37517
19ba7ec5f1e3
steal some of http://isabelle.in.tum.de/sledgehammer.html and add it to the docs
blanchet
parents:
37498
diff
changeset
|
262 |
|
| 36926 | 263 |
\section{Command Syntax}
|
264 |
\label{command-syntax}
|
|
265 |
||
266 |
Sledgehammer can be invoked at any point when there is an open goal by entering |
|
267 |
the \textbf{sledgehammer} command in the theory file. Its general syntax is as
|
|
268 |
follows: |
|
269 |
||
270 |
\prew |
|
271 |
\textbf{sledgehammer} \textit{subcommand\/$^?$ options\/$^?$ facts\_override\/$^?$ num\/$^?$}
|
|
272 |
\postw |
|
273 |
||
274 |
For convenience, Sledgehammer is also available in the ``Commands'' submenu of |
|
275 |
the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c |
|
276 |
C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with
|
|
277 |
no arguments in the theory text. |
|
278 |
||
279 |
In the general syntax, the \textit{subcommand} may be any of the following:
|
|
280 |
||
281 |
\begin{enum}
|
|
282 |
\item[$\bullet$] \textbf{\textit{run} (the default):} Runs Sledgehammer on subgoal number
|
|
283 |
\textit{num} (1 by default), with the given options and facts.
|
|
284 |
||
285 |
\item[$\bullet$] \textbf{\textit{minimize}:} Attempts to minimize the provided facts
|
|
286 |
(specified in the \textit{facts\_override} argument) to obtain a simpler proof
|
|
287 |
involving fewer facts. The options and goal number are as for \textit{run}.
|
|
288 |
||
289 |
\item[$\bullet$] \textbf{\textit{messages}:} Redisplays recent messages issued by
|
|
290 |
Sledgehammer. This allows you to examine results that might have been lost due |
|
291 |
to Sledgehammer's asynchronous nature. The \textit{num} argument specifies a
|
|
292 |
limit on the number of messages to display (5 by default). |
|
293 |
||
294 |
\item[$\bullet$] \textbf{\textit{available\_atps}:} Prints the list of installed ATPs.
|
|
295 |
See \S\ref{installation} and \S\ref{mode-of-operation} for more information on
|
|
296 |
how to install ATPs. |
|
297 |
||
298 |
\item[$\bullet$] \textbf{\textit{running\_atps}:} Prints information about currently
|
|
299 |
running ATPs, including elapsed runtime and remaining time until timeout. |
|
300 |
||
301 |
\item[$\bullet$] \textbf{\textit{kill\_atps}:} Terminates all running ATPs.
|
|
302 |
||
303 |
\item[$\bullet$] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
|
|
304 |
ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
|
|
305 |
\end{enum}
|
|
306 |
||
307 |
Sledgehammer's behavior can be influenced by various \textit{options}, which can
|
|
308 |
be specified in brackets after the \textbf{sledgehammer} command. The
|
|
309 |
\textit{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
|
|
310 |
\ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true}'' is optional. For
|
|
311 |
example: |
|
312 |
||
313 |
\prew |
|
314 |
\textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120$\,s$]
|
|
315 |
\postw |
|
316 |
||
317 |
Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
|
|
318 |
||
319 |
\prew |
|
320 |
\textbf{sledgehammer\_params} \textit{options}
|
|
321 |
\postw |
|
322 |
||
323 |
The supported options are described in \S\ref{option-reference}.
|
|
324 |
||
325 |
The \textit{facts\_override} argument lets you alter the set of facts that go
|
|
326 |
through the relevance filter. It may be of the form ``(\textit{facts})'', where
|
|
327 |
\textit{facts} is a space-separated list of Isabelle facts (theorems, local
|
|
328 |
assumptions, etc.), in which case the relevance filter is bypassed and the given |
|
| 39320 | 329 |
facts are used. It may also be of the form ``(\textit{add}:\ \textit{facts}$_1$)'',
|
330 |
``(\textit{del}:\ \textit{facts}$_2$)'', or ``(\textit{add}:\ \textit{facts}$_1$\
|
|
331 |
\textit{del}:\ \textit{facts}$_2$)'', where the relevance filter is instructed to
|
|
| 36926 | 332 |
proceed as usual except that it should consider \textit{facts}$_1$
|
333 |
highly-relevant and \textit{facts}$_2$ fully irrelevant.
|
|
334 |
||
| 39320 | 335 |
You can instruct Sledgehammer to run automatically on newly entered theorems by |
336 |
enabling the ``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof |
|
337 |
General. For automatic runs, only the first ATP set using \textit{atps}
|
|
338 |
(\S\ref{mode-of-operation}) is considered, \textit{verbose}
|
|
339 |
(\S\ref{output-format}) and \textit{debug} (\S\ref{output-format}) are disabled,
|
|
340 |
fewer facts are passed to the ATP, and \textit{timeout} (\S\ref{timeouts}) is
|
|
341 |
superseded by the ``Auto Tools Time Limit'' in Proof General's ``Isabelle'' |
|
342 |
menu. Sledgehammer's output is also more concise. |
|
343 |
||
| 36926 | 344 |
\section{Option Reference}
|
345 |
\label{option-reference}
|
|
346 |
||
347 |
\def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
|
|
348 |
\def\qty#1{$\left<\textit{#1}\right>$}
|
|
349 |
\def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1}}\right>}$}
|
|
350 |
\def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{true}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
|
|
351 |
\def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{false}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
|
|
352 |
\def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\quad [\textit{smart}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
|
|
353 |
\def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\quad [\textit{smart}]\hfill\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
|
|
354 |
\def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
|
|
355 |
\def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\quad [\textit{#3}]} \nopagebreak\\[\parskip]}
|
|
356 |
\def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
|
|
357 |
\def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
|
|
358 |
\def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
|
|
359 |
||
360 |
Sledgehammer's options are categorized as follows:\ mode of operation |
|
| 38984 | 361 |
(\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
|
362 |
relevance filter (\S\ref{relevance-filter}), output format
|
|
363 |
(\S\ref{output-format}), and authentication (\S\ref{authentication}).
|
|
| 36926 | 364 |
|
365 |
The descriptions below refer to the following syntactic quantities: |
|
366 |
||
367 |
\begin{enum}
|
|
368 |
\item[$\bullet$] \qtybf{string}: A string.
|
|
369 |
\item[$\bullet$] \qtybf{bool\/}: \textit{true} or \textit{false}.
|
|
370 |
\item[$\bullet$] \qtybf{bool\_or\_smart\/}: \textit{true}, \textit{false}, or \textit{smart}.
|
|
371 |
\item[$\bullet$] \qtybf{int\/}: An integer.
|
|
| 38591 | 372 |
\item[$\bullet$] \qtybf{int\_or\_smart\/}: An integer or \textit{smart}.
|
| 36926 | 373 |
\item[$\bullet$] \qtybf{time}: An integer followed by $\textit{min}$ (minutes), $s$ (seconds), or \textit{ms}
|
374 |
(milliseconds), or the keyword \textit{none} ($\infty$ years).
|
|
375 |
\end{enum}
|
|
376 |
||
377 |
Default values are indicated in square brackets. Boolean options have a negated |
|
| 38984 | 378 |
counterpart (e.g., \textit{blocking} vs.\ \textit{non\_blocking}). When setting
|
| 36926 | 379 |
Boolean options, ``= \textit{true}'' may be omitted.
|
380 |
||
381 |
\subsection{Mode of Operation}
|
|
382 |
\label{mode-of-operation}
|
|
383 |
||
384 |
\begin{enum}
|
|
385 |
\opnodefault{atps}{string}
|
|
386 |
Specifies the ATPs (automated theorem provers) to use as a space-separated list |
|
387 |
(e.g., ``\textit{e}~\textit{spass}''). The following ATPs are supported:
|
|
388 |
||
389 |
\begin{enum}
|
|
390 |
\item[$\bullet$] \textbf{\textit{e}:} E is an ATP developed by Stephan Schulz
|
|
391 |
\cite{schulz-2002}. To use E, set the environment variable
|
|
392 |
\texttt{E\_HOME} to the directory that contains the \texttt{eproof} executable,
|
|
393 |
or install the prebuilt E package from Isabelle's download page. See |
|
394 |
\S\ref{installation} for details.
|
|
395 |
||
396 |
\item[$\bullet$] \textbf{\textit{spass}:} SPASS is an ATP developed by Christoph
|
|
397 |
Weidenbach et al.\ \cite{weidenbach-et-al-2009}. To use SPASS, set the
|
|
398 |
environment variable \texttt{SPASS\_HOME} to the directory that contains the
|
|
399 |
\texttt{SPASS} executable, or install the prebuilt SPASS package from Isabelle's
|
|
|
37414
d0cea0796295
expect SPASS 3.7, and give a friendly warning if an older version is used
blanchet
parents:
36926
diff
changeset
|
400 |
download page. Sledgehammer requires version 3.5 or above. See |
|
d0cea0796295
expect SPASS 3.7, and give a friendly warning if an older version is used
blanchet
parents:
36926
diff
changeset
|
401 |
\S\ref{installation} for details.
|
| 36926 | 402 |
|
403 |
\item[$\bullet$] \textbf{\textit{vampire}:} Vampire is an ATP developed by
|
|
404 |
Andrei Voronkov and his colleagues \cite{riazanov-voronkov-2002}. To use
|
|
405 |
Vampire, set the environment variable \texttt{VAMPIRE\_HOME} to the directory
|
|
406 |
that contains the \texttt{vampire} executable.
|
|
407 |
||
| 38601 | 408 |
\item[$\bullet$] \textbf{\textit{remote\_e}:} The remote version of E runs
|
| 36926 | 409 |
on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
|
410 |
||
411 |
\item[$\bullet$] \textbf{\textit{remote\_vampire}:} The remote version of
|
|
| 38601 | 412 |
Vampire runs on Geoff Sutcliffe's Miami servers. Version 9 is used. |
| 36926 | 413 |
|
| 38601 | 414 |
\item[$\bullet$] \textbf{\textit{remote\_sine\_e}:} SInE-E is a metaprover
|
415 |
developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of
|
|
416 |
SInE runs on Geoff Sutcliffe's Miami servers. |
|
417 |
||
418 |
\item[$\bullet$] \textbf{\textit{remote\_snark}:} SNARK is a prover
|
|
419 |
developed by Stickel et al.\ \cite{snark}. The remote version of
|
|
420 |
SNARK runs on Geoff Sutcliffe's Miami servers. |
|
| 36926 | 421 |
\end{enum}
|
422 |
||
| 38601 | 423 |
By default, Sledgehammer will run E, SPASS, Vampire, and SInE-E in parallel. |
424 |
For at most two of E, SPASS, and Vampire, it will use any locally installed |
|
425 |
version if available. For historical reasons, the default value of this option |
|
426 |
can be overridden using the option ``Sledgehammer: ATPs'' from the ``Isabelle'' |
|
427 |
menu in Proof General. |
|
| 36926 | 428 |
|
429 |
It is a good idea to run several ATPs in parallel, although it could slow down |
|
| 38601 | 430 |
your machine. Running E, SPASS, and Vampire together for 5 seconds yields about |
431 |
the same success rate as running the most effective of these (Vampire) for 120 |
|
432 |
seconds \cite{boehme-nipkow-2010}.
|
|
| 36926 | 433 |
|
434 |
\opnodefault{atp}{string}
|
|
435 |
Alias for \textit{atps}.
|
|
436 |
||
| 38984 | 437 |
\opdefault{timeout}{time}{$\mathbf{60}$ s}
|
438 |
Specifies the maximum amount of time that the ATPs should spend searching for a |
|
439 |
proof. For historical reasons, the default value of this option can be |
|
440 |
overridden using the option ``Sledgehammer: Time Limit'' from the ``Isabelle'' |
|
441 |
menu in Proof General. |
|
442 |
||
| 38983 | 443 |
\opfalse{blocking}{non\_blocking}
|
444 |
Specifies whether the \textbf{sledgehammer} command should operate
|
|
445 |
synchronously. The asynchronous (non-blocking) mode lets the user start proving |
|
446 |
the putative theorem manually while Sledgehammer looks for a proof, but it can |
|
447 |
also be more confusing. |
|
448 |
||
| 36926 | 449 |
\opfalse{overlord}{no\_overlord}
|
450 |
Specifies whether Sledgehammer should put its temporary files in |
|
451 |
\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
|
|
452 |
debugging Sledgehammer but also unsafe if several instances of the tool are run |
|
453 |
simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
|
|
454 |
safely remove them after Sledgehammer has run. |
|
455 |
||
456 |
\nopagebreak |
|
457 |
{\small See also \textit{debug} (\S\ref{output-format}).}
|
|
458 |
\end{enum}
|
|
459 |
||
460 |
\subsection{Problem Encoding}
|
|
461 |
\label{problem-encoding}
|
|
462 |
||
463 |
\begin{enum}
|
|
464 |
\opfalse{explicit\_apply}{implicit\_apply}
|
|
465 |
Specifies whether function application should be encoded as an explicit |
|
466 |
``apply'' operator. If the option is set to \textit{false}, each function will
|
|
467 |
be directly applied to as many arguments as possible. Enabling this option can |
|
468 |
sometimes help discover higher-order proofs that otherwise would not be found. |
|
469 |
||
470 |
\opfalse{full\_types}{partial\_types}
|
|
471 |
Specifies whether full-type information is exported. Enabling this option can |
|
472 |
prevent the discovery of type-incorrect proofs, but it also tends to slow down |
|
473 |
the ATPs significantly. For historical reasons, the default value of this option |
|
474 |
can be overridden using the option ``Sledgehammer: ATPs'' from the ``Isabelle'' |
|
475 |
menu in Proof General. |
|
| 38591 | 476 |
\end{enum}
|
| 36926 | 477 |
|
| 38591 | 478 |
\subsection{Relevance Filter}
|
479 |
\label{relevance-filter}
|
|
480 |
||
481 |
\begin{enum}
|
|
| 38940 | 482 |
\opdefault{relevance\_thresholds}{int\_pair}{45~85}
|
| 38746 | 483 |
Specifies the thresholds above which facts are considered relevant by the |
484 |
relevance filter. The first threshold is used for the first iteration of the |
|
485 |
relevance filter and the second threshold is used for the last iteration (if it |
|
486 |
is reached). The effective threshold is quadratically interpolated for the other |
|
487 |
iterations. Each threshold ranges from 0 to 100, where 0 means that all theorems |
|
488 |
are relevant and 100 only theorems that refer to previously seen constants. |
|
| 36926 | 489 |
|
| 38746 | 490 |
\opdefault{max\_relevant}{int\_or\_smart}{\textit{smart}}
|
491 |
Specifies the maximum number of facts that may be returned by the relevance |
|
492 |
filter. If the option is set to \textit{smart}, it is set to a value that was
|
|
493 |
empirically found to be appropriate for the ATP. A typical value would be 300. |
|
| 38591 | 494 |
|
| 38997 | 495 |
%\opsmartx{theory\_relevant}{theory\_irrelevant}
|
496 |
%Specifies whether the theory from which a fact comes should be taken into |
|
497 |
%consideration by the relevance filter. If the option is set to \textit{smart},
|
|
498 |
%it is taken to be \textit{true} for SPASS and \textit{false} for the other ATPs;
|
|
499 |
%empirical results suggest that these are the best settings. |
|
| 36926 | 500 |
|
| 38741 | 501 |
%\opfalse{defs\_relevant}{defs\_irrelevant}
|
502 |
%Specifies whether the definition of constants occurring in the formula to prove |
|
503 |
%should be considered particularly relevant. Enabling this option tends to lead |
|
504 |
%to larger problems and typically slows down the ATPs. |
|
| 36926 | 505 |
\end{enum}
|
506 |
||
507 |
\subsection{Output Format}
|
|
508 |
\label{output-format}
|
|
509 |
||
510 |
\begin{enum}
|
|
511 |
||
512 |
\opfalse{verbose}{quiet}
|
|
513 |
Specifies whether the \textbf{sledgehammer} command should explain what it does.
|
|
514 |
||
515 |
\opfalse{debug}{no\_debug}
|
|
516 |
Specifies whether Nitpick should display additional debugging information beyond |
|
517 |
what \textit{verbose} already displays. Enabling \textit{debug} also enables
|
|
518 |
\textit{verbose} behind the scenes.
|
|
519 |
||
520 |
\nopagebreak |
|
521 |
{\small See also \textit{overlord} (\S\ref{mode-of-operation}).}
|
|
522 |
||
523 |
\opfalse{isar\_proof}{no\_isar\_proof}
|
|
524 |
Specifies whether Isar proofs should be output in addition to one-liner |
|
525 |
\textit{metis} proofs. Isar proof construction is still experimental and often
|
|
526 |
fails; however, they are usually faster and sometimes more robust than |
|
527 |
\textit{metis} proofs.
|
|
528 |
||
529 |
\opdefault{isar\_shrink\_factor}{int}{1}
|
|
530 |
Specifies the granularity of the Isar proof. A value of $n$ indicates that each |
|
531 |
Isar proof step should correspond to a group of up to $n$ consecutive proof |
|
532 |
steps in the ATP proof. |
|
533 |
||
534 |
\end{enum}
|
|
535 |
||
| 38984 | 536 |
\subsection{Authentication}
|
537 |
\label{authentication}
|
|
538 |
||
539 |
\begin{enum}
|
|
540 |
\opnodefault{expect}{string}
|
|
541 |
Specifies the expected outcome, which must be one of the following: |
|
| 36926 | 542 |
|
543 |
\begin{enum}
|
|
| 38984 | 544 |
\item[$\bullet$] \textbf{\textit{some}:} Sledgehammer found a (potentially unsound) proof.
|
545 |
\item[$\bullet$] \textbf{\textit{none}:} Sledgehammer found no proof.
|
|
546 |
\item[$\bullet$] \textbf{\textit{unknown}:} Sledgehammer encountered some problem.
|
|
547 |
\end{enum}
|
|
548 |
||
549 |
Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
|
|
550 |
(otherwise) if the actual outcome differs from the expected outcome. This option |
|
551 |
is useful for regression testing. |
|
552 |
||
553 |
\nopagebreak |
|
554 |
{\small See also \textit{blocking} (\S\ref{mode-of-operation}).}
|
|
| 36926 | 555 |
\end{enum}
|
556 |
||
557 |
\let\em=\sl |
|
558 |
\bibliography{../manual}{}
|
|
559 |
\bibliographystyle{abbrv}
|
|
560 |
||
561 |
\end{document}
|