
doc-src/Sledgehammer/sledgehammer.tex

author | haftmann |

Wed Sep 07 23:38:52 2011 +0200 (2011-09-07) | |

changeset 44818 | 27ba81ad0890 |

parent 44816 | efa1f532508b |

child 45048 | 59ca831deef4 |

permissions | -rw-r--r-- |

theory of saturated naturals contributed by Peter Gammie

1 \documentclass[a4paper,12pt]{article}

2 \usepackage[T1]{fontenc}

3 \usepackage{amsmath}

4 \usepackage{amssymb}

5 \usepackage[english,french]{babel}

6 \usepackage{color}

7 \usepackage{footmisc}

8 \usepackage{graphicx}

9 %\usepackage{mathpazo}

10 \usepackage{multicol}

11 \usepackage{stmaryrd}

12 %\usepackage[scaled=.85]{beramono}

13 \usepackage{../../lib/texinputs/isabelle,../iman,../pdfsetup}

15 \def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}

16 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}

18 %\oddsidemargin=4.6mm

19 %\evensidemargin=4.6mm

20 %\textwidth=150mm

21 %\topmargin=4.6mm

22 %\headheight=0mm

23 %\headsep=0mm

24 %\textheight=234mm

26 \def\Colon{\mathord{:\mkern-1.5mu:}}

27 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}

28 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}

29 \def\lparr{\mathopen{(\mkern-4mu\mid}}

30 \def\rparr{\mathclose{\mid\mkern-4mu)}}

32 \def\unk{{?}}

33 \def\undef{(\lambda x.\; \unk)}

34 %\def\unr{\textit{others}}

35 \def\unr{\ldots}

36 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}

37 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}

39 \urlstyle{tt}

41 \begin{document}

43 \selectlanguage{english}

45 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]

46 Hammering Away \\[\smallskipamount]

47 \Large A User's Guide to Sledgehammer for Isabelle/HOL}

48 \author{\hbox{} \\

49 Jasmin Christian Blanchette \\

50 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]

51 {\normalsize with contributions from} \\[4\smallskipamount]

52 Lawrence C. Paulson \\

53 {\normalsize Computer Laboratory, University of Cambridge} \\

54 \hbox{}}

56 \maketitle

58 \tableofcontents

60 \setlength{\parskip}{.7em plus .2em minus .1em}

61 \setlength{\parindent}{0pt}

62 \setlength{\abovedisplayskip}{\parskip}

63 \setlength{\abovedisplayshortskip}{.9\parskip}

64 \setlength{\belowdisplayskip}{\parskip}

65 \setlength{\belowdisplayshortskip}{.9\parskip}

67 % General-purpose enum environment with correct spacing

68 \newenvironment{enum}%

69 {\begin{list}{}{%

70 \setlength{\topsep}{.1\parskip}%

71 \setlength{\partopsep}{.1\parskip}%

72 \setlength{\itemsep}{\parskip}%

73 \advance\itemsep by-\parsep}}

74 {\end{list}}

76 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin

77 \advance\rightskip by\leftmargin}

78 \def\post{\vskip0pt plus1ex\endgroup}

80 \def\prew{\pre\advance\rightskip by-\leftmargin}

81 \def\postw{\post}

83 \section{Introduction}

84 \label{introduction}

86 Sledgehammer is a tool that applies automatic theorem provers (ATPs)

87 and satisfiability-modulo-theories (SMT) solvers on the current goal. The

88 supported ATPs are E \cite{schulz-2002}, E-SInE \cite{sine}, E-ToFoF

89 \cite{tofof}, LEO-II \cite{leo2}, Satallax \cite{satallax}, SNARK \cite{snark},

90 SPASS \cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002}, and

91 Waldmeister \cite{waldmeister}. The ATPs are run either locally or remotely via

92 the System\-On\-TPTP web service \cite{sutcliffe-2000}. In addition to the ATPs,

93 the SMT solver Z3 \cite{z3} is used by default, and you can tell Sledgehammer

94 to try CVC3 \cite{cvc3} and Yices \cite{yices} as well; these are run either

95 locally or on a server at the TU M\"unchen.

97 The problem passed to the automatic provers consists of your current goal

98 together with a heuristic selection of hundreds of facts (theorems) from the

99 current theory context, filtered by relevance. Because jobs are run in the

100 background, you can continue to work on your proof by other means. Provers can

101 be run in parallel. Any reply (which may arrive half a minute later) will appear

102 in the Proof General response buffer.

104 The result of a successful proof search is some source text that usually (but

105 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed

106 proof relies on the general-purpose Metis prover, which is fully integrated into

107 Isabelle/HOL, with explicit inferences going through the kernel. Thus its

108 results are correct by construction.

110 In this manual, we will explicitly invoke the \textbf{sledgehammer} command.

111 Sledgehammer also provides an automatic mode that can be enabled via the ``Auto

112 Sledgehammer'' option in Proof General's ``Isabelle'' menu. In this mode,

113 Sledgehammer is run on every newly entered theorem. The time limit for Auto

114 Sledgehammer and other automatic tools can be set using the ``Auto Tools Time

115 Limit'' option.

117 \newbox\boxA

118 \setbox\boxA=\hbox{\texttt{nospam}}

120 \newcommand\authoremail{\texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak

121 in.\allowbreak tum.\allowbreak de}}

123 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is

124 imported---this is rarely a problem in practice since it is part of

125 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's

126 \texttt{src/HOL/Metis\_Examples} directory.

127 Comments and bug reports concerning Sledgehammer or this manual should be

128 directed to the author at \authoremail.

130 \vskip2.5\smallskipamount

132 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for

133 %suggesting several textual improvements.

135 \section{Installation}

136 \label{installation}

138 Sledgehammer is part of Isabelle, so you don't need to install it. However, it

139 relies on third-party automatic theorem provers (ATPs) and SMT solvers.

141 \subsection{Installing ATPs}

143 Currently, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in

144 addition, E, E-SInE, E-ToFoF, LEO-II, Satallax, SNARK, Waldmeister, and Vampire

145 are available remotely via System\-On\-TPTP \cite{sutcliffe-2000}. If you want

146 better performance, you should at least install E and SPASS locally.

148 There are three main ways to install ATPs on your machine:

150 \begin{enum}

151 \item[$\bullet$] If you installed an official Isabelle package with everything

152 inside, it should already include properly set up executables for E and SPASS,

153 ready to use.%

154 \footnote{Vampire's license prevents us from doing the same for this otherwise

155 wonderful tool.}

157 \item[$\bullet$] Alternatively, you can download the Isabelle-aware E and SPASS

158 binary packages from Isabelle's download page. Extract the archives, then add a

159 line to your \texttt{\$ISABELLE\_HOME\_USER/etc/components}%

160 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at

161 startup. Its value can be retrieved by invoking \texttt{isabelle}

162 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}

163 file with the absolute

164 path to E or SPASS. For example, if the \texttt{components} file does not exist yet

165 and you extracted SPASS to \texttt{/usr/local/spass-3.7}, create the

166 \texttt{components} file with the single line

168 \prew

169 \texttt{/usr/local/spass-3.7}

170 \postw

172 in it.

174 \item[$\bullet$] If you prefer to build E or SPASS yourself, or obtained a

175 Vampire executable from somewhere (e.g., \url{http://www.vprover.org/}),

176 set the environment variable \texttt{E\_HOME}, \texttt{SPASS\_HOME}, or

177 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},

178 \texttt{SPASS}, or \texttt{vampire} executable. Sledgehammer has been tested

179 with E 1.0 to 1.3, SPASS 3.5 and 3.7, and Vampire 0.6, 1.0, and 1.8%

180 \footnote{Following the rewrite of Vampire, the counter for version numbers was

181 reset to 0; hence the (new) Vampire versions 0.6, 1.0, and 1.8 are more recent

182 than, say, Vampire 9.0 or 11.5.}%

183 . Since the ATPs' output formats are neither documented nor stable, other

184 versions of the ATPs might or might not work well with Sledgehammer. Ideally,

185 also set \texttt{E\_VERSION}, \texttt{SPASS\_VERSION}, or

186 \texttt{VAMPIRE\_VERSION} to the ATP's version number (e.g., ``1.4'').

187 \end{enum}
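As a rough sketch of the third approach, the entries below show what the
corresponding lines might look like in your Isabelle settings file. The
directories and version numbers are hypothetical placeholders that must be
adapted to your installation; setting the same variables in the environment in
which Isabelle is launched should work equally well.

\prew
% Hypothetical paths and versions; adapt them to your installation.
\texttt{E\_HOME=/usr/local/eprover} \\
\texttt{E\_VERSION=1.3} \\
\texttt{SPASS\_HOME=/usr/local/spass-3.7/bin} \\
\texttt{SPASS\_VERSION=3.7}
\postw

Each \texttt{\_HOME} variable must point to the directory that actually contains
the prover's executable, which may differ from the directory where you unpacked
the archive.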

189 To check whether E and SPASS are successfully installed, follow the example in

190 \S\ref{first-steps}. If the remote versions of E and SPASS are used (identified

191 by the prefix ``\emph{remote\_}''), or if the local versions fail to solve the

192 easy goal presented there, this is a sign that something is wrong with your

193 installation.

195 Remote ATP invocation via the SystemOnTPTP web service requires Perl with the

196 World Wide Web Library (\texttt{libwww-perl}) installed. If you must use a proxy

197 server to access the Internet, set the \texttt{http\_proxy} environment variable

198 to the proxy, either in the environment in which Isabelle is launched or in your

199 \texttt{\char`\~/\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few examples:

201 \prew

202 \texttt{http\_proxy=http://proxy.example.org} \\

203 \texttt{http\_proxy=http://proxy.example.org:8080} \\

204 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}

205 \postw

207 \subsection{Installing SMT Solvers}

209 CVC3, Yices, and Z3 can be run locally or (for CVC3 and Z3) remotely on a TU

210 M\"unchen server. If you want better performance and the ability to replay

211 proofs that rely on the \emph{smt} proof method, you should at least install Z3

212 locally.

214 There are two main ways of installing SMT solvers locally.

216 \begin{enum}

217 \item[$\bullet$] If you installed an official Isabelle package with everything

218 inside, it should already include properly set up executables for CVC3 and Z3,

219 ready to use.%

220 \footnote{Yices's license prevents us from doing the same for this otherwise

221 wonderful tool.}

222 For Z3, you additionally need to set the environment variable

223 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a noncommercial

224 user.

226 \item[$\bullet$] Otherwise, follow the instructions documented in the \emph{SMT}

227 theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}).

228 \end{enum}
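For instance, assuming that you qualify as a noncommercial user and that you
keep such variables in your Isabelle settings file (rather than in the
environment in which Isabelle is launched), the confirmation for Z3 might be
recorded as follows:

\prew
% Only applicable if you are a noncommercial user of Z3.
\texttt{Z3\_NON\_COMMERCIAL=yes}
\postw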

230 \section{First Steps}

231 \label{first-steps}

233 To illustrate Sledgehammer in context, let us start a theory file and

234 attempt to prove a simple lemma:

236 \prew

237 \textbf{theory}~\textit{Scratch} \\

238 \textbf{imports}~\textit{Main} \\

239 \textbf{begin} \\[2\smallskipamount]

240 %

241 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\

242 \textbf{sledgehammer}

243 \postw

245 Instead of issuing the \textbf{sledgehammer} command, you can also find

246 Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof

247 General or press the Emacs key sequence C-c C-a C-s.

248 Either way, Sledgehammer produces the following output after a few seconds:

250 \prew

251 \slshape

252 Sledgehammer: ``\textit{e}'' on goal \\

253 $[a] = [b] \,\Longrightarrow\, a = b$ \\

254 Try this: \textbf{by} (\textit{metis last\_ConsL}) (64 ms). \\[3\smallskipamount]

255 %

256 Sledgehammer: ``\textit{vampire}'' on goal \\

257 $[a] = [b] \,\Longrightarrow\, a = b$ \\

258 Try this: \textbf{by} (\textit{metis hd.simps}) (14 ms). \\[3\smallskipamount]

259 %

260 Sledgehammer: ``\textit{spass}'' on goal \\

261 $[a] = [b] \,\Longrightarrow\, a = b$ \\

262 Try this: \textbf{by} (\textit{metis list.inject}) (17 ms). \\[3\smallskipamount]

263 %

264 Sledgehammer: ``\textit{remote\_waldmeister}'' on goal \\

265 $[a] = [b] \,\Longrightarrow\, a = b$ \\

266 Try this: \textbf{by} (\textit{metis hd.simps}) (15 ms). \\[3\smallskipamount]

267 %

268 Sledgehammer: ``\textit{remote\_e\_sine}'' on goal \\

269 $[a] = [b] \,\Longrightarrow\, a = b$ \\

270 Try this: \textbf{by} (\textit{metis hd.simps}) (18 ms). \\[3\smallskipamount]

271 %

272 Sledgehammer: ``\textit{remote\_z3}'' on goal \\

273 $[a] = [b] \,\Longrightarrow\, a = b$ \\

274 Try this: \textbf{by} (\textit{metis list.inject}) (20 ms).

275 \postw

277 Sledgehammer ran E, E-SInE, SPASS, Vampire, Waldmeister, and Z3 in parallel.

278 Depending on which provers are installed and how many processor cores are

279 available, some of the provers might be missing or present with a

280 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,

281 where the goal's conclusion is a (universally quantified) equation.

283 For each successful prover, Sledgehammer gives a one-liner proof that uses Metis

284 or the \textit{smt} proof method. For Metis, approximate timings are shown in

285 parentheses, indicating how fast the call is. You can click the proof to insert

286 it into the theory text.

288 In addition, you can ask Sledgehammer for an Isar text proof by passing the

289 \textit{isar\_proof} option (\S\ref{output-format}):

291 \prew

292 \textbf{sledgehammer} [\textit{isar\_proof}]

293 \postw

295 When Isar proof construction is successful, it can yield proofs that are more

296 readable and also faster than the Metis one-liners. This feature is experimental

297 and is only available for ATPs.

299 \section{Hints}

300 \label{hints}

302 This section presents a few hints that should help you get the most out of

303 Sledgehammer and Metis. Frequently (and infrequently) asked questions are

304 answered in \S\ref{frequently-asked-questions}.

306 \newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}

308 \point{Presimplify the goal}

310 For best results, first simplify your problem by calling \textit{auto} or at

311 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide

312 arithmetic decision procedures, but the ATPs typically do not (or if they do,

313 Sledgehammer does not use them yet). Apart from Waldmeister, they are not

314 especially good at heavy rewriting, but because they regard equations as

315 undirected, they often prove theorems that require the reverse orientation of a

316 \textit{simp} rule. Higher-order problems can be tackled, but the success rate

317 is better for first-order problems. Hence, you may get better results if you

318 first simplify the problem to remove higher-order features.

320 \point{Make sure at least E, SPASS, Vampire, and Z3 are installed}

322 Locally installed provers are faster and more reliable than those running on

323 servers. See \S\ref{installation} for details on how to install them.

325 \point{Familiarize yourself with the most important options}

327 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of

328 the options are very specialized, but serious users of the tool should at least

329 familiarize themselves with the following options:

331 \begin{enum}

332 \item[$\bullet$] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies

333 the automatic provers (ATPs and SMT solvers) that should be run whenever

334 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass

335 remote\_vampire}''). For convenience, you can omit ``\textit{provers}~=''

336 and simply write the prover names as a space-separated list (e.g., ``\textit{e

337 spass remote\_vampire}'').

339 \item[$\bullet$] \textbf{\textit{max\_relevant}} (\S\ref{relevance-filter})

340 specifies the maximum number of facts that should be passed to the provers. By

341 default, the value is prover-dependent but varies between about 150 and 1000. If

342 the provers time out, you can try lowering this value to, say, 100 or 50 and see

343 if that helps.

345 \item[$\bullet$] \textbf{\textit{isar\_proof}} (\S\ref{output-format}) specifies

346 that Isar proofs should be generated, instead of one-liner Metis proofs. The

347 length of the Isar proofs can be controlled by setting

348 \textit{isar\_shrink\_factor} (\S\ref{output-format}).

350 \item[$\bullet$] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the

351 provers' time limit. It is set to 30 seconds by default, but since Sledgehammer runs

352 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds

353 if you are the kind of user who can think clearly while ATPs are active.

354 \end{enum}

356 Options can be set globally using \textbf{sledgehammer\_params}

357 (\S\ref{command-syntax}). The command also prints the list of all available

358 options with their current value. Fact selection can be influenced by specifying

359 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call

360 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''

361 to force Sledgehammer to run only with $\textit{my\_facts}$.
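To illustrate how these pieces fit together, here is a small sketch. The name
\textit{my\_facts} stands for facts of your own, and the particular values are
arbitrary choices made for the sake of the example:

\prew
% my_facts is a placeholder; the numeric values are illustrative only.
\textbf{sledgehammer\_params} [\textit{provers} = \textit{e spass remote\_vampire}, \,\textit{timeout} = 60] \\
\textbf{sledgehammer} [\textit{max\_relevant} = 50] (\textit{add}: \textit{my\_facts})
\postw

The first line changes the defaults for subsequent invocations, whereas the
second line lowers the fact limit for a single call and asks the relevance
filter to include \textit{my\_facts}.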

363 \section{Frequently Asked Questions}

364 \label{frequently-asked-questions}

366 This section answers frequently (and infrequently) asked questions about

367 Sledgehammer. It is a good idea to skim over it now even if you don't have any

368 questions at this stage. And if you have any further questions not listed here,

369 send them to the author at \authoremail.

371 \point{Why does Metis fail to reconstruct the proof?}

373 There are many reasons. If Metis runs seemingly forever, that is a sign that the

374 proof is too difficult for it. Metis's search is complete, so it should

375 eventually find it, but that's little consolation. There are several possible

376 solutions:

378 \begin{enum}

379 \item[$\bullet$] Try the \textit{isar\_proof} option (\S\ref{output-format}) to

380 obtain a step-by-step Isar proof where each step is justified by Metis. Since

381 the steps are fairly small, Metis is more likely to be able to replay them.

383 \item[$\bullet$] Try the \textit{smt} proof method instead of Metis. It is

384 usually stronger, but you need to have Z3 available to replay the proofs, trust

385 the SMT solver, or use certificates. See the documentation in the \emph{SMT}

386 theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.

388 \item[$\bullet$] Try the \textit{blast} or \textit{auto} proof methods, passing

389 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},

390 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.

391 \end{enum}
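As a sketch, the last two alternatives might take one of the following shapes,
where \textit{my\_facts} is a placeholder for the facts that appear in the
suggested Metis call. Whether any of these attempts succeeds depends on the
goal at hand:

\prew
% my_facts is a placeholder for the facts from the suggested Metis call.
\textbf{by} (\textit{smt} \textit{my\_facts}) \\[2\smallskipamount]
\textbf{using} \textit{my\_facts} \textbf{by} \textit{blast} \\[2\smallskipamount]
\textbf{by} (\textit{auto} \textit{simp}: \textit{my\_facts})
\postw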

393 In some rare cases, Metis fails fairly quickly, and you get the error message

395 \prew

396 \slshape

397 Proof reconstruction failed.

398 \postw

400 This message usually indicates that Sledgehammer found a type-incorrect proof.

401 This was a frequent issue with older versions of Sledgehammer, which did not

402 supply enough typing information to the ATPs by default. If you notice many

403 unsound proofs and are not using \textit{type\_enc} (\S\ref{problem-encoding}),

404 contact the author at \authoremail.

406 \point{How can I tell whether a generated proof is sound?}

408 First, if Metis can reconstruct it, the proof is sound (assuming Isabelle's

409 inference kernel is sound). If it fails or runs seemingly forever, you can try

411 \prew

412 \textbf{apply}~\textbf{--} \\

413 \textbf{sledgehammer} [\textit{sound}] (\textit{metis\_facts})

414 \postw

416 where \textit{metis\_facts} is the list of facts appearing in the suggested

417 Metis call. The automatic provers should be able to re-find the proof quickly if

418 it is sound, and the \textit{sound} option (\S\ref{problem-encoding}) ensures

419 that no unsound proofs are found.

421 \point{Which facts are passed to the automatic provers?}

423 The relevance filter assigns a score to every available fact (lemma, theorem,

424 definition, or axiom)\ based upon how many constants that fact shares with the

425 conjecture. This process iterates to include facts relevant to those just

426 accepted, but with a decay factor to ensure termination. The constants are

427 weighted to give unusual ones greater significance. The relevance filter copes

428 best when the conjecture contains some unusual constants; if all the constants

429 are common, it is unable to discriminate among the hundreds of facts that are

430 picked up. The relevance filter is also memoryless: It has no information about

431 how many times a particular fact has been used in a proof, and it cannot learn.

433 The number of facts included in a problem varies from prover to prover, since

434 some provers get overwhelmed more easily than others. You can show the number of

435 facts given using the \textit{verbose} option (\S\ref{output-format}) and the

436 actual facts using \textit{debug} (\S\ref{output-format}).

438 Sledgehammer is good at finding short proofs combining a handful of existing

439 lemmas. If you are looking for longer proofs, you must typically restrict the

440 number of facts, by setting the \textit{max\_relevant} option

441 (\S\ref{relevance-filter}) to, say, 25 or 50.
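For example, the following invocation limits the problem to at most 25 facts
and, thanks to \textit{verbose}, also reports how many facts each prover is
given. The value 25 is merely one plausible starting point:

\prew
\textbf{sledgehammer} [\textit{verbose}, \,\textit{max\_relevant} = 25]
\postw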

443 You can also influence which facts are actually selected in a number of ways. If

444 you simply want to ensure that a fact is included, you can specify it using the

445 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:

446 %

447 \prew

448 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})

449 \postw

450 %

451 The specified facts then replace the least relevant facts that would otherwise be

452 included; the other selected facts remain the same.

453 If you want to direct the selection in a particular direction, you can specify

454 the facts via \textbf{using}:

455 %

456 \prew

457 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\

458 \textbf{sledgehammer}

459 \postw

460 %

461 The facts are then more likely to be selected than otherwise, and if they are

462 selected at iteration $j$ they also influence which facts are selected at

463 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try

464 %

465 \prew

466 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\

467 \textbf{apply}~\textbf{--} \\

468 \textbf{sledgehammer}

469 \postw

471 \point{Why are the generated Isar proofs so ugly/detailed/broken?}

473 The current implementation is experimental and explodes exponentially in the

474 worst case. Work on a new implementation has begun. There is a large body of

475 research into transforming resolution proofs into natural deduction proofs (such

476 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to

477 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger

478 value or to try several provers and keep the nicest-looking proof.

480 \point{What are the \textit{full\_types} and \textit{no\_types} arguments to

481 Metis?}

483 The \textit{metis}~(\textit{full\_types}) proof method is the fully typed

484 version of Metis. It is somewhat slower than \textit{metis}, but the proof

485 search is fully typed, and it also includes more powerful rules such as the

486 axiom ``$x = \mathit{True} \mathrel{\lor} x = \mathit{False}$'' for reasoning in

487 higher-order places (e.g., in set comprehensions). The method kicks in

488 automatically as a fallback when \textit{metis} fails, and it is sometimes

489 generated by Sledgehammer instead of \textit{metis} if the proof obviously

490 requires type information or if \textit{metis} failed when Sledgehammer

491 preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with

492 various options for up to 4 seconds to ensure that the generated one-line proofs

493 actually work and to display timing information. This can be configured using

494 the \textit{preplay\_timeout} option (\S\ref{timeouts}).)

496 At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})

497 uses no type information at all during the proof search, which is more efficient

498 but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally

499 generated by Sledgehammer.

501 Incidentally, if you see the warning

503 \prew

504 \slshape

505 Metis: Falling back on ``\textit{metis} (\textit{full\_types})''.

506 \postw

508 in a successful Metis proof, you can advantageously pass the

509 \textit{full\_types} option to \textit{metis} directly.
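For instance, supposing that the one-liner that triggered the warning was
\textbf{by} (\textit{metis list.inject}), where the fact name merely serves as
an illustration, the call with the option passed directly would read

\prew
\textbf{by} (\textit{metis} (\textit{full\_types}) \textit{list.inject})
\postw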

511 \point{Are generated proofs minimal?}

513 Automatic provers frequently use many more facts than are necessary.

514 Sledgehammer includes a minimization tool that takes a set of facts returned by a

515 given prover and repeatedly calls the same prover or Metis with subsets of those

516 axioms in order to find a minimal set. Reducing the number of axioms typically

517 improves Metis's speed and success rate, while also removing superfluous clutter

518 from the proof scripts.

520 In earlier versions of Sledgehammer, generated proofs were systematically

521 accompanied by a suggestion to invoke the minimization tool. This step is now

522 performed implicitly if it can be done in a reasonable amount of time (something

523 that can be guessed from the number of facts in the original proof and the time

524 it took to find it or replay it).

526 In addition, some provers (notably CVC3, Satallax, and Yices) do not provide

527 proofs or sometimes produce incomplete proofs. The minimizer is then invoked to

528 find out which facts are actually needed from the (large) set of facts that was

529 initially given to the prover. Finally, if a prover returns a proof with lots

530 of facts, the minimizer is invoked automatically since Metis would be unlikely

531 to re-find the proof.

533 \point{A strange error occurred---what should I do?}

535 Sledgehammer tries to give informative error messages. Please report any strange

536 error to the author at \authoremail. This applies doubly if you get the message

538 \prew

539 \slshape

540 The prover found a type-unsound proof involving ``\textit{foo}'',

541 ``\textit{bar}'', and ``\textit{baz}'' even though a supposedly type-sound

542 encoding was used (or, less likely, your axioms are inconsistent). You might

543 want to report this to the Isabelle developers.

544 \postw

546 \point{Auto can solve it---why not Sledgehammer?}

548 Problems can be easy for \textit{auto} and difficult for automatic provers, but

549 the reverse is also true, so don't be discouraged if your first attempts fail.

550 Because the system refers to all theorems known to Isabelle, it is particularly

551 suitable when your goal has a short proof from lemmas that you don't know about.

553 \point{Why are there so many options?}

555 Sledgehammer's philosophy is that it should work out of the box, without user guidance.

556 Many of the options are meant to be used mostly by the Sledgehammer developers

557 for experimentation purposes. Of course, feel free to experiment with them if

558 you are so inclined.

560 \section{Command Syntax}

561 \label{command-syntax}

563 Sledgehammer can be invoked at any point when there is an open goal by entering

564 the \textbf{sledgehammer} command in the theory file. Its general syntax is as

565 follows:

567 \prew

568 \textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$

569 \postw

571 For convenience, Sledgehammer is also available in the ``Commands'' submenu of

572 the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c

573 C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with

574 no arguments in the theory text.

576 In the general syntax, the \qty{subcommand} may be any of the following:

578 \begin{enum}

579 \item[$\bullet$] \textbf{\textit{run} (the default):} Runs Sledgehammer on

580 subgoal number \qty{num} (1 by default), with the given options and facts.

582 \item[$\bullet$] \textbf{\textit{min}:} Attempts to minimize the facts

583 specified in the \qty{facts\_override} argument to obtain a simpler proof

584 involving fewer facts. The options and goal number are as for \textit{run}.

586 \item[$\bullet$] \textbf{\textit{messages}:} Redisplays recent messages issued

587 by Sledgehammer. This allows you to examine results that might have been lost

588 due to Sledgehammer's asynchronous nature. The \qty{num} argument specifies a

589 limit on the number of messages to display (5 by default).

591 \item[$\bullet$] \textbf{\textit{supported\_provers}:} Prints the list of

592 automatic provers supported by Sledgehammer. See \S\ref{installation} and

593 \S\ref{mode-of-operation} for more information on how to install automatic

594 provers.

596 \item[$\bullet$] \textbf{\textit{running\_provers}:} Prints information about

597 currently running automatic provers, including elapsed runtime and remaining

598 time until timeout.

600 \item[$\bullet$] \textbf{\textit{kill\_provers}:} Terminates all running

601 automatic provers.

603 \item[$\bullet$] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote

604 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.

605 \end{enum}
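Following the general syntax given above, a few typical invocations might look
as follows. The numbers are arbitrary and serve only as an illustration:

\prew
% Illustrative values: subgoal 2, and a limit of 10 messages.
\textbf{sledgehammer} 2 \\
\textbf{sledgehammer} \textit{messages} 10 \\
\textbf{sledgehammer} \textit{running\_provers} \\
\textbf{sledgehammer} \textit{kill\_provers}
\postw

The first line runs Sledgehammer on the second subgoal using the implicit
\textit{run} subcommand, and the second redisplays up to ten recent messages.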

607 Sledgehammer's behavior can be influenced by various \qty{options}, which can be

608 specified in brackets after the \textbf{sledgehammer} command. The

609 \qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,

610 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true}'' is optional. For

611 example:

613 \prew

614 \textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120]

615 \postw

617 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:

619 \prew

620 \textbf{sledgehammer\_params} \qty{options}

621 \postw

623 The supported options are described in \S\ref{option-reference}.

625 The \qty{facts\_override} argument lets you alter the set of facts that go

626 through the relevance filter. It may be of the form ``(\qty{facts})'', where

627 \qty{facts} is a space-separated list of Isabelle facts (theorems, local

628 assumptions, etc.), in which case the relevance filter is bypassed and the given

629 facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',

630 ``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\

631 \textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to

632 proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}

633 highly relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
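The different forms of \qty{facts\_override} can be sketched as follows. The
facts \textit{hd.simps} and \textit{tl.simps} are used purely as examples:

\prew
% hd.simps and tl.simps are example facts only.
\textbf{sledgehammer} (\textit{hd.simps} \textit{tl.simps}) \\
\textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{del}: \textit{tl.simps}) \\
\textbf{sledgehammer} \textit{min} (\textit{hd.simps} \textit{tl.simps})
\postw

The first call bypasses the relevance filter and uses only the two given facts,
the second treats \textit{hd.simps} as highly relevant and \textit{tl.simps} as
fully irrelevant, and the third attempts to minimize the given set of facts.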

635 You can instruct Sledgehammer to run automatically on newly entered theorems by

636 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu.

637 For automatic runs, only the first prover set using \textit{provers}

638 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover,

639 \textit{slicing} (\S\ref{mode-of-operation}) is disabled, \textit{sound}

640 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format})

641 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout}

642 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof

643 General's ``Isabelle'' menu. Sledgehammer's output is also more concise.

645 The \textit{metis} proof method has the syntax

647 \prew

648 \textbf{\textit{metis}}~(\qty{type\_enc})${}^?$~\qty{facts}${}^?$

649 \postw

651 where \qty{type\_enc} is a type encoding specification with the same semantics

652 as Sledgehammer's \textit{type\_enc} option (\S\ref{problem-encoding}) and

653 \qty{facts} is a list of arbitrary facts. In addition to the values listed in

654 \S\ref{problem-encoding}, \qty{type\_enc} may also be \textit{full\_types}, in

655 which case an appropriate type-sound encoding is chosen, \textit{partial\_types}

656 (the default type-unsound encoding), or \textit{no\_types}, a synonym for

657 \textit{erased}.

659 \section{Option Reference}

660 \label{option-reference}

662 \def\defl{\{}

663 \def\defr{\}}

665 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}

666 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

667 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

668 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

669 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}

670 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}

671 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}

672 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}

673 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

674 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

676 Sledgehammer's options are categorized as follows:\ mode of operation

677 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),

678 relevance filter (\S\ref{relevance-filter}), output format

679 (\S\ref{output-format}), authentication (\S\ref{authentication}), and timeouts

680 (\S\ref{timeouts}).

682 The descriptions below refer to the following syntactic quantities:

684 \begin{enum}

685 \item[$\bullet$] \qtybf{string}: A string.

686 \item[$\bullet$] \qtybf{bool\/}: \textit{true} or \textit{false}.

687 \item[$\bullet$] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or

688 \textit{smart}.

689 \item[$\bullet$] \qtybf{int\/}: An integer.

690 %\item[$\bullet$] \qtybf{float\/}: A floating-point number (e.g., 2.5).

691 \item[$\bullet$] \qtybf{float\_pair\/}: A pair of floating-point numbers

692 (e.g., 0.6 0.95).

693 \item[$\bullet$] \qtybf{smart\_int\/}: An integer or \textit{smart}.

694 \item[$\bullet$] \qtybf{float\_or\_none\/}: A floating-point number (e.g., 60 or

695 0.5) expressing a number of seconds, or the keyword \textit{none} ($\infty$

696 seconds).

697 \end{enum}

699 Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options

700 have a negated counterpart (e.g., \textit{blocking} vs.\

701 \textit{non\_blocking}). When setting them, ``= \textit{true}'' may be omitted.
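
As a small sketch, assuming that options are passed as a comma-separated list
in square brackets after the command, the first two invocations below should be
equivalent, and the third uses the negated counterparts instead:

\prew
% explicit form, short form, and negated form of two Boolean options
\textbf{sledgehammer}~[\textit{blocking} = \textit{true}, \textit{verbose} = \textit{true}] \\
\textbf{sledgehammer}~[\textit{blocking}, \textit{verbose}] \\
\textbf{sledgehammer}~[\textit{non\_blocking}, \textit{quiet}]
\postw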

703 \subsection{Mode of Operation}

704 \label{mode-of-operation}

706 \begin{enum}

707 \opnodefaultbrk{provers}{string}

708 Specifies the automatic provers to use as a space-separated list (e.g.,

709 ``\textit{e}~\textit{spass}~\textit{remote\_vampire}''). The following local

710 provers are supported:

712 \begin{enum}

713 \item[$\bullet$] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by

714 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,

715 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the

716 executable, including the file name. Sledgehammer has been tested with version

717 2.2.

719 \item[$\bullet$] \textbf{\textit{e}:} E is a first-order resolution prover

720 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment

721 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}

722 executable, or install the prebuilt E package from Isabelle's download page. See

723 \S\ref{installation} for details.

725 \item[$\bullet$] \textbf{\textit{leo2}:} LEO-II is an automatic

726 higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},

727 with support for the TPTP many-typed higher-order syntax (THF0).

729 \item[$\bullet$] \textbf{\textit{metis}:} Although it is much less powerful than

730 the external provers, Metis itself can be used for proof search.

732 \item[$\bullet$] \textbf{\textit{metis\_full\_types}:} Fully typed version of

733 Metis, corresponding to \textit{metis} (\textit{full\_types}).

735 \item[$\bullet$] \textbf{\textit{metis\_no\_types}:} Untyped version of Metis,

736 corresponding to \textit{metis} (\textit{no\_types}).

738 \item[$\bullet$] \textbf{\textit{satallax}:} Satallax is an automatic

739 higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with

740 support for the TPTP many-typed higher-order syntax (THF0).

742 \item[$\bullet$] \textbf{\textit{spass}:} SPASS is a first-order resolution

743 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.

744 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory

745 that contains the \texttt{SPASS} executable, or install the prebuilt SPASS

746 package from Isabelle's download page. Sledgehammer requires version 3.5 or

747 above. See \S\ref{installation} for details.

749 \item[$\bullet$] \textbf{\textit{vampire}:} Vampire is a first-order resolution

750 prover developed by Andrei Voronkov and his colleagues

751 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable

752 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}

753 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., ``1.8'').

754 Sledgehammer has been tested with versions 0.6, 1.0, and 1.8. Vampire 1.8

755 supports the TPTP many-typed first-order format (TFF0).

757 \item[$\bullet$] \textbf{\textit{yices}:} Yices is an SMT solver developed at

758 SRI \cite{yices}. To use Yices, set the environment variable

759 \texttt{YICES\_SOLVER} to the complete path of the executable, including the

760 file name. Sledgehammer has been tested with version 1.0.

762 \item[$\bullet$] \textbf{\textit{z3}:} Z3 is an SMT solver developed at

763 Microsoft Research \cite{z3}. To use Z3, set the environment variable

764 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file

765 name, and set \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a

766 noncommercial user. Sledgehammer has been tested with versions 2.7 to 2.18.

768 \item[$\bullet$] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to be

769 an ATP, exploiting Z3's support for the TPTP untyped and many-typed first-order

770 formats (FOF and TFF0). It is included for experimental purposes. It requires

771 version 3.0 or above.

772 \end{enum}
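
As a rough sketch, the environment variables mentioned above could be set with
lines such as the following, for example in a shell startup file or in
Isabelle's settings file. All paths are placeholders chosen for illustration,
not defaults, and should be replaced by the locations on your system:

\prew
% placeholder paths; adapt them to your installation
\texttt{E\_HOME=/path/to/e/bin} \\
\texttt{SPASS\_HOME=/path/to/spass/bin} \\
\texttt{VAMPIRE\_HOME=/path/to/vampire} \\
\texttt{VAMPIRE\_VERSION=1.8} \\
\texttt{Z3\_SOLVER=/path/to/z3/bin/z3} \\
\texttt{Z3\_NON\_COMMERCIAL=yes}
\postw

Note that \texttt{E\_HOME}, \texttt{SPASS\_HOME}, and \texttt{VAMPIRE\_HOME}
name a directory, whereas \texttt{Z3\_SOLVER} (like \texttt{CVC3\_SOLVER} and
\texttt{YICES\_SOLVER}) names the executable itself.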

774 In addition, the following remote provers are supported:

776 \begin{enum}

777 \item[$\bullet$] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs

778 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to

779 point).

781 \item[$\bullet$] \textbf{\textit{remote\_e}:} The remote version of E runs

782 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.

784 \item[$\bullet$] \textbf{\textit{remote\_e\_sine}:} E-SInE is a metaprover

785 developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of

786 E-SInE runs on Geoff Sutcliffe's Miami servers.

788 \item[$\bullet$] \textbf{\textit{remote\_e\_tofof}:} E-ToFoF is a metaprover

789 developed by Geoff Sutcliffe \cite{tofof} based on E.

790 This ATP supports the TPTP many-typed first-order format (TFF0). The

791 remote version of E-ToFoF runs on Geoff Sutcliffe's Miami servers.

793 \item[$\bullet$] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II

794 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.

796 \item[$\bullet$] \textbf{\textit{remote\_satallax}:} The remote version of

797 Satallax runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.

799 \item[$\bullet$] \textbf{\textit{remote\_snark}:} SNARK is a first-order

800 resolution prover developed by Stickel et al.\ \cite{snark}. It supports the

801 TPTP many-typed first-order format (TFF0). The remote version of SNARK runs on

802 Geoff Sutcliffe's Miami servers.

804 \item[$\bullet$] \textbf{\textit{remote\_vampire}:} The remote version of

805 Vampire runs on Geoff Sutcliffe's Miami servers. Version 1.8 is used.

807 \item[$\bullet$] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit

808 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be

809 used to prove universally quantified equations using unconditional equations,

810 corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister

811 runs on Geoff Sutcliffe's Miami servers.

813 \item[$\bullet$] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on

814 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to

815 point).

817 \item[$\bullet$] \textbf{\textit{remote\_z3\_tptp}:} The remote version of ``Z3

818 with TPTP syntax'' runs on Geoff Sutcliffe's Miami servers.

819 \end{enum}

821 By default, Sledgehammer runs E, E-SInE, SPASS, Vampire, Z3 (or whatever

822 the SMT module's \textit{smt\_solver} configuration option is set to), and (if

823 appropriate) Waldmeister in parallel---either locally or remotely, depending on

824 the number of processor cores available. For historical reasons, the default

825 value of this option can be overridden using the option ``Sledgehammer:

826 Provers'' in Proof General's ``Isabelle'' menu.

828 It is generally a good idea to run several provers in parallel. Running E,

829 SPASS, and Vampire for 5~seconds yields a similar success rate to running the

830 most effective of these for 120~seconds \cite{boehme-nipkow-2010}.

832 For the \textit{min} subcommand, the default prover is \textit{metis}. If

833 several provers are set, the first one is used.

835 \opnodefault{prover}{string}

836 Alias for \textit{provers}.

838 %\opnodefault{atps}{string}

839 %Legacy alias for \textit{provers}.

841 %\opnodefault{atp}{string}

842 %Legacy alias for \textit{provers}.

844 \opfalse{blocking}{non\_blocking}

845 Specifies whether the \textbf{sledgehammer} command should operate

846 synchronously. The asynchronous (non-blocking) mode lets the user start proving

847 the putative theorem manually while Sledgehammer looks for a proof, but it can

848 also be more confusing. Irrespective of the value of this option, Sledgehammer

849 is always run synchronously for the new jEdit-based user interface or if

850 \textit{debug} (\S\ref{output-format}) is enabled.

852 \optrue{slicing}{no\_slicing}

853 Specifies whether the time allocated to a prover should be sliced into several

854 segments, each of which has its own set of possibly prover-dependent options.

855 For SPASS and Vampire, the first slice tries the fast but incomplete

856 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,

857 up to three slices are tried, with different weighted search strategies and

858 numbers of facts. For SMT solvers, several slices are tried with the same options

859 each time but fewer and fewer facts. According to benchmarks with a timeout of

860 30 seconds, slicing is a valuable optimization, and you should probably leave it

861 enabled unless you are conducting experiments. This option is implicitly

862 disabled for (short) automatic runs.

864 \nopagebreak

865 {\small See also \textit{verbose} (\S\ref{output-format}).}

867 \opfalse{overlord}{no\_overlord}

868 Specifies whether Sledgehammer should put its temporary files in

869 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for

870 debugging Sledgehammer but also unsafe if several instances of the tool are run

871 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may

872 safely remove them after Sledgehammer has run.

874 \nopagebreak

875 {\small See also \textit{debug} (\S\ref{output-format}).}

876 \end{enum}
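
To make the options of this subsection concrete, here is a minimal sketch of
an invocation that restricts Sledgehammer to two local provers and one remote
prover, runs it synchronously, and disables slicing. It assumes that options
are given as a comma-separated list in square brackets after the command:

\prew
% three provers, synchronous mode, no time slicing
\textbf{sledgehammer}~[\textit{provers} = \textit{e}~\textit{spass}~\textit{remote\_vampire}, \textit{blocking}, \textit{no\_slicing}]
\postw

Disabling \textit{slicing} in this way is mainly of interest for experiments,
since the option is described above as a valuable optimization.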

878 \subsection{Problem Encoding}

879 \label{problem-encoding}

881 \begin{enum}

882 \opdefault{type\_enc}{string}{smart}

883 Specifies the type encoding to use in ATP problems. Some of the type encodings

884 are unsound, meaning that they can give rise to spurious proofs

885 (unreconstructible using Metis). The supported type encodings are listed below,

886 with an indication of their soundness in parentheses:

888 \begin{enum}

889 \item[$\bullet$] \textbf{\textit{erased} (very unsound):} No type information is

890 supplied to the ATP. Types are simply erased.

892 \item[$\bullet$] \textbf{\textit{poly\_guards} (sound):} Types are encoded using

893 a predicate \textit{has\_\allowbreak type\/}$(\tau, t)$ that guards bound

894 variables. Constants are annotated with their types, supplied as additional

895 arguments, to resolve overloading.

897 \item[$\bullet$] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is

898 tagged with its type using a function $\mathit{type\/}(\tau, t)$.

900 \item[$\bullet$] \textbf{\textit{poly\_args} (unsound):}

901 As with \textit{poly\_guards}, constants are annotated with their types to

902 resolve overloading, but otherwise no type information is encoded. This

903 coincides with the default encoding used by the \textit{metis} command.

905 \item[$\bullet$]

906 \textbf{%

907 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\

908 \textit{raw\_mono\_args} (unsound):} \\

909 Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},

910 respectively, but the problem is additionally monomorphized, meaning that type

911 variables are instantiated with heuristically chosen ground types.

912 Monomorphization can simplify reasoning but also leads to larger fact bases,

913 which can slow down the ATPs.

915 \item[$\bullet$]

916 \textbf{%

917 \textit{mono\_guards}, \textit{mono\_tags} (sound);

918 \textit{mono\_args} (unsound):} \\

919 Similar to

920 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and

921 \textit{raw\_mono\_args}, respectively, but types are mangled in constant names

922 instead of being supplied as ground term arguments. The binary predicate

923 $\mathit{has\_type\/}(\tau, t)$ becomes a unary predicate

924 $\mathit{has\_type\_}\tau(t)$, and the binary function

925 $\mathit{type\/}(\tau, t)$ becomes a unary function

926 $\mathit{type\_}\tau(t)$.

928 \item[$\bullet$] \textbf{\textit{mono\_simple} (sound):} Exploits simple

929 first-order types if the prover supports the TFF0 or THF0 syntax; otherwise,

930 falls back on \textit{mono\_guards}. The problem is monomorphized.

932 \item[$\bullet$] \textbf{\textit{mono\_simple\_higher} (sound):} Exploits simple

933 higher-order types if the prover supports the THF0 syntax; otherwise, falls back

934 on \textit{mono\_simple} or \textit{mono\_guards}. The problem is monomorphized.

936 \item[$\bullet$]

937 \textbf{%

938 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\

939 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\

940 \textit{mono\_simple}? (quasi-sound):} \\

941 The type encodings \textit{poly\_guards}, \textit{poly\_tags},

942 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},

943 \textit{mono\_tags}, and \textit{mono\_simple} are fully

944 typed and sound. For each of these, Sledgehammer also provides a lighter,

945 virtually sound variant identified by a question mark (`\hbox{?}')\ that detects

946 and erases monotonic types, notably infinite types. (For \textit{mono\_simple},

947 the types are not actually erased but rather replaced by a shared uniform type

948 of individuals.) As argument to the \textit{metis} proof method, the question

949 mark is replaced by a \hbox{``\textit{\_query}''} suffix. If the \emph{sound}

950 option is enabled, these encodings are fully sound.

952 \item[$\bullet$]

953 \textbf{%

954 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\

955 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\

956 (quasi-sound):} \\

957 Even lighter versions of the `\hbox{?}' encodings. As argument to the

958 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by

959 \hbox{``\textit{\_query\_query}''}.

961 \item[$\bullet$]

962 \textbf{%

963 \textit{poly\_guards}@?, \textit{poly\_tags}@?, \textit{raw\_mono\_guards}@?, \\

964 \textit{raw\_mono\_tags}@? (quasi-sound):} \\

965 Alternative versions of the `\hbox{??}' encodings. As argument to the

966 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by

967 \hbox{``\textit{\_at\_query}''}.

969 \item[$\bullet$]

970 \textbf{%

971 \textit{poly\_guards}!, \textit{poly\_tags}!, \textit{raw\_mono\_guards}!, \\

972 \textit{raw\_mono\_tags}!, \textit{mono\_guards}!, \textit{mono\_tags}!, \\

973 \textit{mono\_simple}!, \textit{mono\_simple\_higher}! (mildly unsound):} \\

974 The type encodings \textit{poly\_guards}, \textit{poly\_tags},

975 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},

976 \textit{mono\_tags}, \textit{mono\_simple}, and \textit{mono\_simple\_higher}

977 also admit a mildly unsound (but very efficient) variant identified by an

978 exclamation mark (`\hbox{!}') that detects and erases all types except

979 those that are clearly finite (e.g., \textit{bool}). (For \textit{mono\_simple}

980 and \textit{mono\_simple\_higher}, the types are not actually erased but rather

981 replaced by a shared uniform type of individuals.) As argument to the

982 \textit{metis} proof method, the exclamation mark is replaced by the suffix

983 \hbox{``\textit{\_bang}''}.

985 \item[$\bullet$]

986 \textbf{%

987 \textit{poly\_guards}!!, \textit{poly\_tags}!!, \textit{raw\_mono\_guards}!!, \\

988 \textit{raw\_mono\_tags}!!, \textit{mono\_guards}!!, \textit{mono\_tags}!! \\

989 (mildly unsound):} \\

990 Even lighter versions of the `\hbox{!}' encodings. As argument to the

991 \textit{metis} proof method, the `\hbox{!!}' suffix is replaced by

992 \hbox{``\textit{\_bang\_bang}''}.

994 \item[$\bullet$]

995 \textbf{%

996 \textit{poly\_guards}@!, \textit{poly\_tags}@!, \textit{raw\_mono\_guards}@!, \\

997 \textit{raw\_mono\_tags}@! (mildly unsound):} \\

998 Alternative versions of the `\hbox{!!}' encodings. As argument to the

999 \textit{metis} proof method, the `\hbox{@!}' suffix is replaced by

1000 \hbox{``\textit{\_at\_bang}''}.

1002 \item[$\bullet$] \textbf{\textit{smart}:} The actual encoding used depends on

1003 the ATP and should be the most efficient virtually sound encoding for that ATP.

1004 \end{enum}

1006 For SMT solvers, the type encoding is always \textit{mono\_simple}, irrespective

1007 of the value of this option.

1009 \nopagebreak

1010 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})

1011 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}

1013 \opfalse{sound}{unsound}

1014 Specifies whether Sledgehammer should run in its fully sound mode. In that mode,

1015 quasi-sound type encodings (which are the default) are made fully sound, at the

1016 cost of some clutter in the generated problems. This option is ignored if

1017 \textit{type\_enc} is explicitly set to an unsound encoding.

1018 \end{enum}
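
The following sketch shows how the encoding options above might be used. The
first line selects a sound polymorphic encoding explicitly, the second keeps
the default \textit{smart} encoding but asks for the fully sound mode, and the
third passes the quasi-sound \textit{poly\_guards}? encoding to the
\textit{metis} proof method, spelled with the \textit{\_query} suffix. The fact
name \textit{my\_lemma} is a hypothetical placeholder, and the bracketed option
syntax is assumed:

\prew
% my_lemma is a hypothetical fact name
\textbf{sledgehammer}~[\textit{type\_enc} = \textit{poly\_tags}] \\
\textbf{sledgehammer}~[\textit{sound}] \\
\textbf{by}~(\textit{metis}~(\textit{poly\_guards\_query})~\textit{my\_lemma})
\postw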

1020 \subsection{Relevance Filter}

1021 \label{relevance-filter}

1023 \begin{enum}

1024 \opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}

1025 Specifies the thresholds above which facts are considered relevant by the

1026 relevance filter. The first threshold is used for the first iteration of the

1027 relevance filter and the second threshold is used for the last iteration (if it

1028 is reached). The effective threshold is quadratically interpolated for the other

1029 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems

1030 are relevant and 1 means that only theorems that refer to previously seen constants are relevant.

1032 \opdefault{max\_relevant}{smart\_int}{smart}

1033 Specifies the maximum number of facts that may be returned by the relevance

1034 filter. If the option is set to \textit{smart}, it is set to a value that was

1035 empirically found to be appropriate for the prover. A typical value would be

1036 250.

1038 \opdefault{max\_new\_mono\_instances}{int}{\upshape 200}

1039 Specifies the maximum number of monomorphic instances to generate beyond

1040 \textit{max\_relevant}. The higher this limit is, the more monomorphic instances

1041 are potentially generated. Whether monomorphization takes place depends on the

1042 type encoding used.

1044 \nopagebreak

1045 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}

1047 \opdefault{max\_mono\_iters}{int}{\upshape 3}

1048 Specifies the maximum number of iterations for the monomorphization fixpoint

1049 construction. The higher this limit is, the more monomorphic instances are

1050 potentially generated. Whether monomorphization takes place depends on the

1051 type encoding used.

1053 \nopagebreak

1054 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}

1055 \end{enum}
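
As an example of tuning the relevance filter, the following sketch raises both
thresholds, so that fewer facts pass, and caps the number of facts returned.
The numeric values are illustrative only and are not recommendations; the
bracketed option syntax is assumed:

\prew
% illustrative values, not recommended settings
\textbf{sledgehammer}~[\textit{relevance\_thresholds} = 0.6~0.95, \textit{max\_relevant} = 100]
\postw

The two numbers of the \qtybf{float\_pair} are separated by a space, the first
applying to the first iteration of the filter and the second to the last.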

1057 \subsection{Output Format}

1058 \label{output-format}

1060 \begin{enum}

1062 \opfalse{verbose}{quiet}

1063 Specifies whether the \textbf{sledgehammer} command should explain what it does.

1064 This option is implicitly disabled for automatic runs.

1066 \opfalse{debug}{no\_debug}

1067 Specifies whether Sledgehammer should display additional debugging information

1068 beyond what \textit{verbose} already displays. Enabling \textit{debug} also

1069 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})

1070 behind the scenes. The \textit{debug} option is implicitly disabled for

1071 automatic runs.

1073 \nopagebreak

1074 {\small See also \textit{overlord} (\S\ref{mode-of-operation}).}

1076 \opfalse{isar\_proof}{no\_isar\_proof}

1077 Specifies whether Isar proofs should be output in addition to one-liner

1078 \textit{metis} proofs. Isar proof construction is still experimental and often

1079 fails; however, Isar proofs are usually faster and sometimes more robust than

1080 \textit{metis} proofs.

1082 \opdefault{isar\_shrink\_factor}{int}{\upshape 1}

1083 Specifies the granularity of the Isar proof. A value of $n$ indicates that each

1084 Isar proof step should correspond to a group of up to $n$ consecutive proof

1085 steps in the ATP proof.

1086 \end{enum}
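
For instance, assuming the bracketed option syntax, the following invocation
would request an Isar proof in which each step may merge up to two consecutive
steps of the ATP proof:

\prew
% request an Isar proof with coarser steps
\textbf{sledgehammer}~[\textit{isar\_proof}, \textit{isar\_shrink\_factor} = 2]
\postw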

1088 \subsection{Authentication}

1089 \label{authentication}

1091 \begin{enum}

1092 \opnodefault{expect}{string}

1093 Specifies the expected outcome, which must be one of the following:

1095 \begin{enum}

1096 \item[$\bullet$] \textbf{\textit{some}:} Sledgehammer found a (potentially

1097 unsound) proof.

1098 \item[$\bullet$] \textbf{\textit{none}:} Sledgehammer found no proof.

1099 \item[$\bullet$] \textbf{\textit{timeout}:} Sledgehammer timed out.

1100 \item[$\bullet$] \textbf{\textit{unknown}:} Sledgehammer encountered some

1101 problem.

1102 \end{enum}

1104 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning

1105 (otherwise) if the actual outcome differs from the expected outcome. This option

1106 is useful for regression testing.

1108 \nopagebreak

1109 {\small See also \textit{blocking} (\S\ref{mode-of-operation}) and

1110 \textit{timeout} (\S\ref{timeouts}).}

1111 \end{enum}
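
A regression test might therefore look like the following sketch, which
expects a proof to be found within 10 seconds and runs in blocking mode so
that a mismatch is reported as an error rather than a warning. The timeout
value is an arbitrary illustration, and the bracketed option syntax is assumed:

\prew
% fails with an error if no proof is found
\textbf{sledgehammer}~[\textit{expect} = \textit{some}, \textit{blocking}, \textit{timeout} = 10]
\postw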

1113 \subsection{Timeouts}

1114 \label{timeouts}

1116 \begin{enum}

1117 \opdefault{timeout}{float\_or\_none}{\upshape 30}

1118 Specifies the maximum number of seconds that the automatic provers should spend

1119 searching for a proof. This excludes problem preparation and is a soft limit.

1120 For historical reasons, the default value of this option can be overridden using

1121 the option ``Sledgehammer: Time Limit'' in Proof General's ``Isabelle'' menu.

1123 \opdefault{preplay\_timeout}{float\_or\_none}{\upshape 4}

1124 Specifies the maximum number of seconds that Metis should spend trying to

1125 ``preplay'' the found proof. If this option is set to 0, no preplaying takes

1126 place, and no timing information is displayed next to the suggested Metis calls.

1127 \end{enum}
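
As a brief sketch, again assuming the bracketed option syntax, the first
invocation below gives the provers 60 seconds and turns off preplaying, and
the second removes the time limit altogether by using the keyword
\textit{none}:

\prew
% 60-second limit without preplaying, then no limit at all
\textbf{sledgehammer}~[\textit{timeout} = 60, \textit{preplay\_timeout} = 0] \\
\textbf{sledgehammer}~[\textit{timeout} = \textit{none}]
\postw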

1129 \let\em=\sl

1130 \bibliography{../manual}{}

1131 \bibliographystyle{abbrv}

1133 \end{document}