doc-src/Sledgehammer/sledgehammer.tex
author blanchet
Sun Apr 22 14:16:46 2012 +0200 (2012-04-22)
changeset 47672 1bf4fa90cd03
parent 47642 9a9218111085
child 47963 46c73ad5f7c0
permissions -rw-r--r--
fixed typos
     1 \documentclass[a4paper,12pt]{article}
     2 \usepackage[T1]{fontenc}
     3 \usepackage{amsmath}
     4 \usepackage{amssymb}
     5 \usepackage[english,french]{babel}
     6 \usepackage{color}
     7 \usepackage{footmisc}
     8 \usepackage{graphicx}
     9 %\usepackage{mathpazo}
    10 \usepackage{multicol}
    11 \usepackage{stmaryrd}
    12 %\usepackage[scaled=.85]{beramono}
    13 \usepackage{../../lib/texinputs/isabelle,../iman,../pdfsetup}
    14 
    15 \newcommand\download{\url{http://www21.in.tum.de/~blanchet/\#software}}
    16 
    17 \def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}
    18 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}
    19 
    20 \newcommand\const[1]{\textsf{#1}}
    21 
    22 %\oddsidemargin=4.6mm
    23 %\evensidemargin=4.6mm
    24 %\textwidth=150mm
    25 %\topmargin=4.6mm
    26 %\headheight=0mm
    27 %\headsep=0mm
    28 %\textheight=234mm
    29 
    30 \def\Colon{\mathord{:\mkern-1.5mu:}}
    31 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
    32 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
    33 \def\lparr{\mathopen{(\mkern-4mu\mid}}
    34 \def\rparr{\mathclose{\mid\mkern-4mu)}}
    35 
    36 \def\unk{{?}}
    37 \def\undef{(\lambda x.\; \unk)}
    38 %\def\unr{\textit{others}}
    39 \def\unr{\ldots}
    40 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
    41 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
    42 
    43 \urlstyle{tt}
    44 
    45 \begin{document}
    46 
    47 %%% TYPESETTING
    48 %\renewcommand\labelitemi{$\bullet$}
    49 \renewcommand\labelitemi{\raise.065ex\hbox{\small\textbullet}}
    50 
    51 \selectlanguage{english}
    52 
    53 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
    54 Hammering Away \\[\smallskipamount]
    55 \Large A User's Guide to Sledgehammer for Isabelle/HOL}
    56 \author{\hbox{} \\
    57 Jasmin Christian Blanchette \\
    58 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]
    59 {\normalsize with contributions from} \\[4\smallskipamount]
    60 Lawrence C. Paulson \\
    61 {\normalsize Computer Laboratory, University of Cambridge} \\
    62 \hbox{}}
    63 
    64 \maketitle
    65 
    66 \tableofcontents
    67 
    68 \setlength{\parskip}{.7em plus .2em minus .1em}
    69 \setlength{\parindent}{0pt}
    70 \setlength{\abovedisplayskip}{\parskip}
    71 \setlength{\abovedisplayshortskip}{.9\parskip}
    72 \setlength{\belowdisplayskip}{\parskip}
    73 \setlength{\belowdisplayshortskip}{.9\parskip}
    74 
    75 % General-purpose enum environment with correct spacing
    76 \newenvironment{enum}%
    77     {\begin{list}{}{%
    78         \setlength{\topsep}{.1\parskip}%
    79         \setlength{\partopsep}{.1\parskip}%
    80         \setlength{\itemsep}{\parskip}%
    81         \advance\itemsep by-\parsep}}
    82     {\end{list}}
    83 
    84 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
    85 \advance\rightskip by\leftmargin}
    86 \def\post{\vskip0pt plus1ex\endgroup}
    87 
    88 \def\prew{\pre\advance\rightskip by-\leftmargin}
    89 \def\postw{\post}
    90 
    91 \section{Introduction}
    92 \label{introduction}
    93 
    94 Sledgehammer is a tool that applies automatic theorem provers (ATPs)
    95 and satisfiability-modulo-theories (SMT) solvers on the current goal.%
    96 \footnote{The distinction between ATPs and SMT solvers is convenient but mostly
    97 historical. The two communities are converging, with more and more ATPs
    98 supporting typical SMT features such as arithmetic and sorts, and a few SMT
    99 solvers parsing ATP syntaxes. There is also a strong technological connection
   100 between instantiation-based ATPs (such as iProver and iProver-Eq) and SMT
   101 solvers.}
   102 %
   103 The supported ATPs are E \cite{schulz-2002}, E-SInE \cite{sine}, E-ToFoF
   104 \cite{tofof}, iProver \cite{korovin-2009}, iProver-Eq
   105 \cite{korovin-sticksel-2010}, LEO-II \cite{leo2}, Satallax \cite{satallax},
   106 SNARK \cite{snark}, SPASS \cite{weidenbach-et-al-2009}, Vampire
   107 \cite{riazanov-voronkov-2002}, and Waldmeister \cite{waldmeister}. The ATPs are
   108 run either locally or remotely via the System\-On\-TPTP web service
   109 \cite{sutcliffe-2000}. In addition to the ATPs, the SMT solvers Z3 \cite{z3} is
   110 used by default, and you can tell Sledgehammer to try Alt-Ergo \cite{alt-ergo},
   111 CVC3 \cite{cvc3}, and Yices \cite{yices} as well; these are run either locally
   112 or (for CVC3 and Z3) on a server at the TU M\"unchen.
   113 
   114 The problem passed to the automatic provers consists of your current goal
   115 together with a heuristic selection of hundreds of facts (theorems) from the
   116 current theory context, filtered by relevance. Because jobs are run in the
   117 background, you can continue to work on your proof by other means. Provers can
   118 be run in parallel. Any reply (which may arrive half a minute later) will appear
   119 in the Proof General response buffer.
   120 
   121 The result of a successful proof search is some source text that usually (but
   122 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
   123 proof relies on the general-purpose \textit{metis} proof method, which
   124 integrates the Metis ATP in Isabelle/HOL with explicit inferences going through
   125 the kernel. Thus its results are correct by construction.
   126 
   127 In this manual, we will explicitly invoke the \textbf{sledgehammer} command.
   128 Sledgehammer also provides an automatic mode that can be enabled via the ``Auto
   129 Sledgehammer'' option in Proof General's ``Isabelle'' menu. In this mode,
   130 Sledgehammer is run on every newly entered theorem. The time limit for Auto
   131 Sledgehammer and other automatic tools can be set using the ``Auto Tools Time
   132 Limit'' option.
   133 
   134 \newbox\boxA
   135 \setbox\boxA=\hbox{\texttt{NOSPAM}}
   136 
   137 \newcommand\authoremail{\texttt{blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak
   138 in.\allowbreak tum.\allowbreak de}}
   139 
   140 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
   141 imported---this is rarely a problem in practice since it is part of
   142 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
   143 \texttt{src/HOL/Metis\_Examples} directory.
   144 Comments and bug reports concerning Sledgehammer or this manual should be
   145 directed to the author at \authoremail.
   146 
   147 \vskip2.5\smallskipamount
   148 
   149 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
   150 %suggesting several textual improvements.
   151 
   152 \section{Installation}
   153 \label{installation}
   154 
   155 Sledgehammer is part of Isabelle, so you don't need to install it. However, it
   156 relies on third-party automatic provers (ATPs and SMT solvers).
   157 
   158 Among the ATPs, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in
   159 addition, E, E-SInE, E-ToFoF, iProver, iProver-Eq, LEO-II, Satallax, SNARK,
   160 Vampire, and Waldmeister are available remotely via System\-On\-TPTP
   161 \cite{sutcliffe-2000}. If you want better performance, you should at least
   162 install E and SPASS locally.
   163 
   164 Among the SMT solvers, Alt-Ergo, CVC3, Yices, and Z3 can be run locally, and
   165 CVC3 and Z3 can be run remotely on a TU M\"unchen server. If you want better
   166 performance and get the ability to replay proofs that rely on the \emph{smt}
   167 proof method without an Internet connection, you should at least install Z3
   168 locally.
   169 
   170 There are three main ways to install automatic provers on your machine:
   171 
   172 \begin{sloppy}
   173 \begin{enum}
   174 \item[\labelitemi] If you installed an official Isabelle package, it should
   175 already include properly setup executables for CVC3, E, SPASS, and Z3, ready to use.%
   176 \footnote{Vampire's and Yices's licenses prevent us from doing the same for
   177 these otherwise remarkable tools.}
   178 For Z3, you must additionally set the variable
   179 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
   180 noncommercial user, either in the environment in which Isabelle is
   181 launched or in your
   182 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file.
   183 
   184 \item[\labelitemi] Alternatively, you can download the Isabelle-aware CVC3, E,
   185 SPASS, and Z3 binary packages from \download. Extract the archives, then add a
   186 line to your \texttt{\$ISABELLE\_HOME\_USER\slash etc\slash components}%
   187 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
   188 startup. Its value can be retrieved by executing \texttt{isabelle}
   189 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
   190 file with the absolute path to CVC3, E, SPASS, or Z3. For example, if the
   191 \texttt{components} file does not exist yet and you extracted SPASS to
   192 \texttt{/usr/local/spass-3.8ds}, create it with the single line
   193 
   194 \prew
   195 \texttt{/usr/local/spass-3.8ds}
   196 \postw
   197 
   198 in it.
   199 
   200 \item[\labelitemi] If you prefer to build E, LEO-II, Satallax, or SPASS
   201 manually, or found a Vampire executable somewhere (e.g.,
   202 \url{http://www.vprover.org/}), set the environment variable \texttt{E\_HOME},
   203 \texttt{LEO2\_HOME}, \texttt{SATALLAX\_HOME}, \texttt{SPASS\_HOME}, or
   204 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},
   205 \texttt{leo}, \texttt{satallax}, \texttt{SPASS}, or \texttt{vampire} executable.
   206 Sledgehammer has been tested with E 1.0 to 1.4, LEO-II 1.2.9, Satallax 2.2 and~2.3,
   207 SPASS 3.5, 3.7, and 3.8ds, and Vampire 0.6, 1.0, and 1.8%
   208 \footnote{Following the rewrite of Vampire, the counter for version numbers was
   209 reset to 0; hence the (new) Vampire versions 0.6, 1.0, and 1.8 are more recent
   210 than 9.0 or 11.5.}%
   211 . Since the ATPs' output formats are neither documented nor stable, other
   212 versions might not work well with Sledgehammer. Ideally,
   213 you should also set \texttt{E\_VERSION}, \texttt{LEO2\_VERSION},
   214 \texttt{SATALLAX\_VERSION}, \texttt{SPASS\_VERSION}, or
   215 \texttt{VAMPIRE\_VERSION} to the prover's version number (e.g., ``1.4'').
   216 
   217 Similarly, if you want to build Alt-Ergo or CVC3, or found a
   218 Yices or Z3 executable somewhere (e.g.,
   219 \url{http://yices.csl.sri.com/download.shtml} or
   220 \url{http://research.microsoft.com/en-us/um/redmond/projects/z3/download.html}),
   221 set the environment variable \texttt{CVC3\_\allowbreak SOLVER},
   222 \texttt{YICES\_SOLVER}, or \texttt{Z3\_SOLVER} to the complete path of
   223 the executable, \emph{including the file name}. Sledgehammer has been tested
   224 with Alt-Ergo 0.93, CVC3 2.2 and 2.4.1, Yices 1.0.28 and 1.0.33, and Z3 3.0 to 3.2.
   225 Since the SMT solvers' output formats are somewhat unstable, other
   226 versions of the solvers might not work well with Sledgehammer. Ideally,
   227 also set \texttt{CVC3\_VERSION}, \texttt{YICES\_VERSION}, or
   228 \texttt{Z3\_VERSION} to the solver's version number (e.g., ``3.2'').
   229 \end{enum}
   230 \end{sloppy}
   231 
   232 To check whether E, SPASS, Vampire, and/or Z3 are successfully installed, try
   233 out the example in \S\ref{first-steps}. If the remote versions of any of these
   234 provers is used (identified by the prefix ``\emph{remote\_\/}''), or if the
   235 local versions fail to solve the easy goal presented there, something must be
   236 wrong with the installation.
   237 
   238 Remote prover invocation requires Perl with the World Wide Web Library
   239 (\texttt{libwww-perl}) installed. If you must use a proxy server to access the
   240 Internet, set the \texttt{http\_proxy} environment variable to the proxy, either
   241 in the environment in which Isabelle is launched or in your
   242 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few
   243 examples:
   244 
   245 \prew
   246 \texttt{http\_proxy=http://proxy.example.org} \\
   247 \texttt{http\_proxy=http://proxy.example.org:8080} \\
   248 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
   249 \postw
   250 
   251 \section{First Steps}
   252 \label{first-steps}
   253 
   254 To illustrate Sledgehammer in context, let us start a theory file and
   255 attempt to prove a simple lemma:
   256 
   257 \prew
   258 \textbf{theory}~\textit{Scratch} \\
   259 \textbf{imports}~\textit{Main} \\
   260 \textbf{begin} \\[2\smallskipamount]
   261 %
   262 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\
   263 \textbf{sledgehammer}
   264 \postw
   265 
   266 Instead of issuing the \textbf{sledgehammer} command, you can also find
   267 Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof
   268 General or press the Emacs key sequence C-c C-a C-s.
   269 Either way, Sledgehammer produces the following output after a few seconds:
   270 
   271 \prew
   272 \slshape
   273 Sledgehammer: ``\textit{e\/}'' on goal \\
   274 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   275 Try this: \textbf{by} (\textit{metis last\_ConsL}) (64 ms). \\[3\smallskipamount]
   276 %
   277 Sledgehammer: ``\textit{z3\/}'' on goal \\
   278 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   279 Try this: \textbf{by} (\textit{metis list.inject}) (20 ms). \\[3\smallskipamount]
   280 %
   281 Sledgehammer: ``\textit{vampire\/}'' on goal \\
   282 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   283 Try this: \textbf{by} (\textit{metis hd.simps}) (14 ms). \\[3\smallskipamount]
   284 %
   285 Sledgehammer: ``\textit{spass\/}'' on goal \\
   286 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   287 Try this: \textbf{by} (\textit{metis list.inject}) (17 ms). \\[3\smallskipamount]
   288 %
   289 Sledgehammer: ``\textit{remote\_waldmeister\/}'' on goal \\
   290 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   291 Try this: \textbf{by} (\textit{metis hd.simps}) (15 ms). \\[3\smallskipamount]
   292 %
   293 Sledgehammer: ``\textit{remote\_e\_sine\/}'' on goal \\
   294 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   295 Try this: \textbf{by} (\textit{metis hd.simps}) (18 ms).
   296 \postw
   297 
   298 Sledgehammer ran E, E-SInE, SPASS, Vampire, Waldmeister, and Z3 in parallel.
   299 Depending on which provers are installed and how many processor cores are
   300 available, some of the provers might be missing or present with a
   301 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,
   302 where the goal's conclusion is a (universally quantified) equation.
   303 
   304 For each successful prover, Sledgehammer gives a one-liner proof that uses
   305 the \textit{metis} or \textit{smt} proof method. Approximate timings are shown
   306 in parentheses, indicating how fast the call is. You can click the proof to
   307 insert it into the theory text.
   308 
   309 In addition, you can ask Sledgehammer for an Isar text proof by passing the
   310 \textit{isar\_proof} option (\S\ref{output-format}):
   311 
   312 \prew
   313 \textbf{sledgehammer} [\textit{isar\_proof}]
   314 \postw
   315 
   316 When Isar proof construction is successful, it can yield proofs that are more
   317 readable and also faster than the \textit{metis} or \textit{smt} one-liners.
   318 This feature is experimental and is only available for ATPs.
   319 
   320 \section{Hints}
   321 \label{hints}
   322 
   323 This section presents a few hints that should help you get the most out of
   324 Sledgehammer. Frequently asked questions are answered in
   325 \S\ref{frequently-asked-questions}.
   326 
   327 %\newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}
   328 \newcommand\point[1]{\subsection{\emph{#1}}}
   329 
   330 \point{Presimplify the goal}
   331 
   332 For best results, first simplify your problem by calling \textit{auto} or at
   333 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide
   334 arithmetic decision procedures, but the ATPs typically do not (or if they do,
   335 Sledgehammer does not use it yet). Apart from Waldmeister, they are not
   336 especially good at heavy rewriting, but because they regard equations as
   337 undirected, they often prove theorems that require the reverse orientation of a
   338 \textit{simp} rule. Higher-order problems can be tackled, but the success rate
   339 is better for first-order problems. Hence, you may get better results if you
   340 first simplify the problem to remove higher-order features.
   341 
   342 \point{Make sure E, SPASS, Vampire, and Z3 are locally installed}
   343 
   344 Locally installed provers are faster and more reliable than those running on
   345 servers. See \S\ref{installation} for details on how to install them.
   346 
   347 \point{Familiarize yourself with the most important options}
   348 
   349 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of
   350 the options are very specialized, but serious users of the tool should at least
   351 familiarize themselves with the following options:
   352 
   353 \begin{enum}
   354 \item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
   355 the automatic provers (ATPs and SMT solvers) that should be run whenever
   356 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass
   357 remote\_vampire\/}''). For convenience, you can omit ``\textit{provers}~=''
   358 and simply write the prover names as a space-separated list (e.g., ``\textit{e
   359 spass remote\_vampire\/}'').
   360 
   361 \item[\labelitemi] \textbf{\textit{max\_relevant}} (\S\ref{relevance-filter})
   362 specifies the maximum number of facts that should be passed to the provers. By
   363 default, the value is prover-dependent but varies between about 150 and 1000. If
   364 the provers time out, you can try lowering this value to, say, 100 or 50 and see
   365 if that helps.
   366 
   367 \item[\labelitemi] \textbf{\textit{isar\_proof}} (\S\ref{output-format}) specifies
   368 that Isar proofs should be generated, instead of one-liner \textit{metis} or
   369 \textit{smt} proofs. The length of the Isar proofs can be controlled by setting
   370 \textit{isar\_shrink\_factor} (\S\ref{output-format}).
   371 
   372 \item[\labelitemi] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the
   373 provers' time limit. It is set to 30 seconds, but since Sledgehammer runs
   374 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds
   375 if you are the kind of user who can think clearly while ATPs are active.
   376 \end{enum}
   377 
   378 Options can be set globally using \textbf{sledgehammer\_params}
   379 (\S\ref{command-syntax}). The command also prints the list of all available
   380 options with their current value. Fact selection can be influenced by specifying
   381 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call
   382 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''
   383 to force Sledgehammer to run only with $\textit{my\_facts}$.
   384 
   385 \section{Frequently Asked Questions}
   386 \label{frequently-asked-questions}
   387 
   388 This sections answers frequently (and infrequently) asked questions about
   389 Sledgehammer. It is a good idea to skim over it now even if you don't have any
   390 questions at this stage. And if you have any further questions not listed here,
   391 send them to the author at \authoremail.
   392 
   393 \point{Which facts are passed to the automatic provers?}
   394 
   395 The relevance filter assigns a score to every available fact (lemma, theorem,
   396 definition, or axiom) based upon how many constants that fact shares with the
   397 conjecture. This process iterates to include facts relevant to those just
   398 accepted, but with a decay factor to ensure termination. The constants are
   399 weighted to give unusual ones greater significance. The relevance filter copes
   400 best when the conjecture contains some unusual constants; if all the constants
   401 are common, it is unable to discriminate among the hundreds of facts that are
   402 picked up. The relevance filter is also memoryless: It has no information about
   403 how many times a particular fact has been used in a proof, and it cannot learn.
   404 
   405 The number of facts included in a problem varies from prover to prover, since
   406 some provers get overwhelmed more easily than others. You can show the number of
   407 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
   408 actual facts using \textit{debug} (\S\ref{output-format}).
   409 
   410 Sledgehammer is good at finding short proofs combining a handful of existing
   411 lemmas. If you are looking for longer proofs, you must typically restrict the
   412 number of facts, by setting the \textit{max\_relevant} option
   413 (\S\ref{relevance-filter}) to, say, 25 or 50.
   414 
   415 You can also influence which facts are actually selected in a number of ways. If
   416 you simply want to ensure that a fact is included, you can specify it using the
   417 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:
   418 %
   419 \prew
   420 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})
   421 \postw
   422 %
   423 The specified facts then replace the least relevant facts that would otherwise be
   424 included; the other selected facts remain the same.
   425 If you want to direct the selection in a particular direction, you can specify
   426 the facts via \textbf{using}:
   427 %
   428 \prew
   429 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
   430 \textbf{sledgehammer}
   431 \postw
   432 %
   433 The facts are then more likely to be selected than otherwise, and if they are
   434 selected at iteration $j$ they also influence which facts are selected at
   435 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try
   436 %
   437 \prew
   438 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
   439 \textbf{apply}~\textbf{--} \\
   440 \textbf{sledgehammer}
   441 \postw
   442 
   443 \point{Why does Metis fail to reconstruct the proof?}
   444 
   445 There are many reasons. If Metis runs seemingly forever, that is a sign that the
   446 proof is too difficult for it. Metis's search is complete, so it should
   447 eventually find it, but that's little consolation. There are several possible
   448 solutions:
   449 
   450 \begin{enum}
   451 \item[\labelitemi] Try the \textit{isar\_proof} option (\S\ref{output-format}) to
   452 obtain a step-by-step Isar proof where each step is justified by \textit{metis}.
   453 Since the steps are fairly small, \textit{metis} is more likely to be able to
   454 replay them.
   455 
   456 \item[\labelitemi] Try the \textit{smt} proof method instead of \textit{metis}. It
   457 is usually stronger, but you need to either have Z3 available to replay the
   458 proofs, trust the SMT solver, or use certificates. See the documentation in the
   459 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.
   460 
   461 \item[\labelitemi] Try the \textit{blast} or \textit{auto} proof methods, passing
   462 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},
   463 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.
   464 \end{enum}
   465 
   466 In some rare cases, \textit{metis} fails fairly quickly, and you get the error
   467 message
   468 
   469 \prew
   470 \slshape
   471 One-line proof reconstruction failed.
   472 \postw
   473 
   474 This message indicates that Sledgehammer determined that the goal is provable,
   475 but the proof is, for technical reasons, beyond \textit{metis}'s power. You can
   476 then try again with the \textit{strict} option (\S\ref{problem-encoding}).
   477 
   478 If the goal is actually unprovable and you did not specify an unsound encoding
   479 using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are
   480 strongly encouraged to report this to the author at \authoremail.
   481 
   482 \point{Why are the generated Isar proofs so ugly/broken?}
   483 
   484 The current implementation of the Isar proof feature,
   485 enabled by the \textit{isar\_proof} option (\S\ref{output-format}),
   486 is highly experimental. Work on a new implementation has begun. There is a large body of
   487 research into transforming resolution proofs into natural deduction proofs (such
   488 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to
   489 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger
   490 value or to try several provers and keep the nicest-looking proof.
   491 
   492 \point{How can I tell whether a suggested proof is sound?}
   493 
   494 Earlier versions of Sledgehammer often suggested unsound proofs---either proofs
   495 of nontheorems or simply proofs that rely on type-unsound inferences. This
   496 is a thing of the past, unless you explicitly specify an unsound encoding
   497 using \textit{type\_enc} (\S\ref{problem-encoding}).
   498 %
   499 Officially, the only form of ``unsoundness'' that lurks in the sound
   500 encodings is related to missing characteristic theorems of datatypes. For
   501 example,
   502 
   503 \prew
   504 \textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\
   505 \textbf{sledgehammer} ()
   506 \postw
   507 
   508 suggests an argumentless \textit{metis} call that fails. However, the conjecture
   509 does actually hold, and the \textit{metis} call can be repaired by adding
   510 \textit{list.distinct}.
   511 %
   512 We hope to address this problem in a future version of Isabelle. In the
   513 meantime, you can avoid it by passing the \textit{strict} option
   514 (\S\ref{problem-encoding}).
   515 
   516 \point{What are the \textit{full\_types}, \textit{no\_types}, and
   517 \textit{mono\_tags} arguments to Metis?}
   518 
   519 The \textit{metis}~(\textit{full\_types}) proof method
   520 and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed
   521 version of Metis. It is somewhat slower than \textit{metis}, but the proof
   522 search is fully typed, and it also includes more powerful rules such as the
   523 axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in
   524 higher-order places (e.g., in set comprehensions). The method kicks in
   525 automatically as a fallback when \textit{metis} fails, and it is sometimes
   526 generated by Sledgehammer instead of \textit{metis} if the proof obviously
   527 requires type information or if \textit{metis} failed when Sledgehammer
   528 preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with
   529 various options for up to 3 seconds each time to ensure that the generated
   530 one-line proofs actually work and to display timing information. This can be
   531 configured using the \textit{preplay\_timeout} and \textit{dont\_preplay}
   532 options (\S\ref{timeouts}).)
   533 %
   534 At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})
   535 uses no type information at all during the proof search, which is more efficient
   536 but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally
   537 generated by Sledgehammer.
   538 %
   539 See the \textit{type\_enc} option (\S\ref{problem-encoding}) for details.
   540 
   541 Incidentally, if you ever see warnings such as
   542 
   543 \prew
   544 \slshape
   545 Metis: Falling back on ``\textit{metis} (\textit{full\_types})''.
   546 \postw
   547 
   548 for a successful \textit{metis} proof, you can advantageously pass the
   549 \textit{full\_types} option to \textit{metis} directly.
   550 
   551 \point{And what are the \textit{lifting} and \textit{hide\_lams} arguments
   552 to Metis?}
   553 
   554 Orthogonally to the encoding of types, it is important to choose an appropriate
   555 translation of $\lambda$-abstractions. Metis supports three translation schemes,
   556 in decreasing order of power: Curry combinators (the default),
   557 $\lambda$-lifting, and a ``hiding'' scheme that disables all reasoning under
   558 $\lambda$-abstractions. The more powerful schemes also give the automatic
   559 provers more rope to hang themselves. See the \textit{lam\_trans} option (\S\ref{problem-encoding}) for details.
   560 
   561 \point{Are generated proofs minimal?}
   562 
   563 Automatic provers frequently use many more facts than are necessary.
   564 Sledgehammer inclues a minimization tool that takes a set of facts returned by a
   565 given prover and repeatedly calls the same prover, \textit{metis}, or
   566 \textit{smt} with subsets of those axioms in order to find a minimal set.
   567 Reducing the number of axioms typically improves Metis's speed and success rate,
   568 while also removing superfluous clutter from the proof scripts.
   569 
   570 In earlier versions of Sledgehammer, generated proofs were systematically
   571 accompanied by a suggestion to invoke the minimization tool. This step is now
   572 performed implicitly if it can be done in a reasonable amount of time (something
   573 that can be guessed from the number of facts in the original proof and the time
   574 it took to find or preplay it).
   575 
   576 In addition, some provers (e.g., Yices) do not provide proofs or sometimes
   577 produce incomplete proofs. The minimizer is then invoked to find out which facts
   578 are actually needed from the (large) set of facts that was initially given to
   579 the prover. Finally, if a prover returns a proof with lots of facts, the
   580 minimizer is invoked automatically since Metis would be unlikely to re-find the
   581 proof.
   582 %
   583 Automatic minimization can be forced or disabled using the \textit{minimize}
   584 option (\S\ref{mode-of-operation}).
   585 
   586 \point{A strange error occurred---what should I do?}
   587 
   588 Sledgehammer tries to give informative error messages. Please report any strange
   589 error to the author at \authoremail. This applies double if you get the message
   590 
   591 \prew
   592 \slshape
   593 The prover found a type-unsound proof involving ``\textit{foo\/}'',
   594 ``\textit{bar\/}'', and ``\textit{baz\/}'' even though a supposedly type-sound
   595 encoding was used (or, less likely, your axioms are inconsistent). You might
   596 want to report this to the Isabelle developers.
   597 \postw
   598 
   599 \point{Auto can solve it---why not Sledgehammer?}
   600 
   601 Problems can be easy for \textit{auto} and difficult for automatic provers, but
   602 the reverse is also true, so don't be discouraged if your first attempts fail.
   603 Because the system refers to all theorems known to Isabelle, it is particularly
   604 suitable when your goal has a short proof from lemmas that you don't know about.
   605 
   606 \point{Why are there so many options?}
   607 
   608 Sledgehammer's philosophy should work out of the box, without user guidance.
   609 Many of the options are meant to be used mostly by the Sledgehammer developers
   610 for experimentation purposes. Of course, feel free to experiment with them if
   611 you are so inclined.
   612 
   613 \section{Command Syntax}
   614 \label{command-syntax}
   615 
   616 \subsection{Sledgehammer}
   617 
   618 Sledgehammer can be invoked at any point when there is an open goal by entering
   619 the \textbf{sledgehammer} command in the theory file. Its general syntax is as
   620 follows:
   621 
   622 \prew
   623 \textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$
   624 \postw
   625 
   626 For convenience, Sledgehammer is also available in the ``Commands'' submenu of
   627 the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c
   628 C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with
   629 no arguments in the theory text.
   630 
   631 In the general syntax, the \qty{subcommand} may be any of the following:
   632 
   633 \begin{enum}
   634 \item[\labelitemi] \textbf{\textit{run} (the default):} Runs Sledgehammer on
   635 subgoal number \qty{num} (1 by default), with the given options and facts.
   636 
   637 \item[\labelitemi] \textbf{\textit{min}:} Attempts to minimize the facts
   638 specified in the \qty{facts\_override} argument to obtain a simpler proof
   639 involving fewer facts. The options and goal number are as for \textit{run}.
   640 
   641 \item[\labelitemi] \textbf{\textit{messages}:} Redisplays recent messages issued
   642 by Sledgehammer. This allows you to examine results that might have been lost
   643 due to Sledgehammer's asynchronous nature. The \qty{num} argument specifies a
   644 limit on the number of messages to display (10 by default).
   645 
   646 \item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of
   647 automatic provers supported by Sledgehammer. See \S\ref{installation} and
   648 \S\ref{mode-of-operation} for more information on how to install automatic
   649 provers.
   650 
   651 \item[\labelitemi] \textbf{\textit{running\_provers}:} Prints information about
   652 currently running automatic provers, including elapsed runtime and remaining
   653 time until timeout.
   654 
   655 \item[\labelitemi] \textbf{\textit{kill\_provers}:} Terminates all running
   656 automatic provers.
   657 
   658 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
   659 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
   660 \end{enum}
   661 
   662 Sledgehammer's behavior can be influenced by various \qty{options}, which can be
   663 specified in brackets after the \textbf{sledgehammer} command. The
   664 \qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
   665 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true\/}'' is optional. For
   666 example:
   667 
   668 \prew
   669 \textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120]
   670 \postw
   671 
   672 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
   673 
   674 \prew
   675 \textbf{sledgehammer\_params} \qty{options}
   676 \postw
   677 
   678 The supported options are described in \S\ref{option-reference}.
   679 
   680 The \qty{facts\_override} argument lets you alter the set of facts that go
   681 through the relevance filter. It may be of the form ``(\qty{facts})'', where
   682 \qty{facts} is a space-separated list of Isabelle facts (theorems, local
   683 assumptions, etc.), in which case the relevance filter is bypassed and the given
   684 facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',
   685 ``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\
   686 \textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to
   687 proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}
   688 highly-relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
   689 
   690 You can instruct Sledgehammer to run automatically on newly entered theorems by
   691 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu.
   692 For automatic runs, only the first prover set using \textit{provers}
   693 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover,
   694 \textit{slice} (\S\ref{mode-of-operation}) is disabled, \textit{strict}
   695 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format})
   696 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout}
   697 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof
   698 General's ``Isabelle'' menu. Sledgehammer's output is also more concise.
   699 
   700 \subsection{Metis}
   701 
   702 The \textit{metis} proof method has the syntax
   703 
   704 \prew
   705 \textbf{\textit{metis}}~(\qty{options})${}^?$~\qty{facts}${}^?$
   706 \postw
   707 
   708 where \qty{facts} is a list of arbitrary facts and \qty{options} is a
   709 comma-separated list consisting of at most one $\lambda$ translation scheme
   710 specification with the same semantics as Sledgehammer's \textit{lam\_trans}
   711 option (\S\ref{problem-encoding}) and at most one type encoding specification
   712 with the same semantics as Sledgehammer's \textit{type\_enc} option
   713 (\S\ref{problem-encoding}).
   714 %
   715 The supported $\lambda$ translation schemes are \textit{hide\_lams},
   716 \textit{lifting}, and \textit{combs} (the default).
   717 %
   718 All the untyped type encodings listed in \S\ref{problem-encoding} are supported.
   719 For convenience, the following aliases are provided:
   720 \begin{enum}
   721 \item[\labelitemi] \textbf{\textit{full\_types}:} Synonym for \textit{poly\_guards\_query}.
   722 \item[\labelitemi] \textbf{\textit{partial\_types}:} Synonym for \textit{poly\_args}.
   723 \item[\labelitemi] \textbf{\textit{no\_types}:} Synonym for \textit{erased}.
   724 \end{enum}
   725 
   726 \section{Option Reference}
   727 \label{option-reference}
   728 
   729 \def\defl{\{}
   730 \def\defr{\}}
   731 
   732 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
   733 \def\optrueonly#1{\flushitem{\textit{#1} $\bigl[$= \textit{true}$\bigr]$\enskip}\nopagebreak\\[\parskip]}
   734 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   735 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   736 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   737 \def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   738 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
   739 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}
   740 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}
   741 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
   742 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
   743 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
   744 
   745 Sledgehammer's options are categorized as follows:\ mode of operation
   746 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
   747 relevance filter (\S\ref{relevance-filter}), output format
   748 (\S\ref{output-format}), authentication (\S\ref{authentication}), and timeouts
   749 (\S\ref{timeouts}).
   750 
   751 The descriptions below refer to the following syntactic quantities:
   752 
   753 \begin{enum}
   754 \item[\labelitemi] \qtybf{string}: A string.
   755 \item[\labelitemi] \qtybf{bool\/}: \textit{true} or \textit{false}.
   756 \item[\labelitemi] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or
   757 \textit{smart}.
   758 \item[\labelitemi] \qtybf{int\/}: An integer.
   759 %\item[\labelitemi] \qtybf{float\/}: A floating-point number (e.g., 2.5).
   760 \item[\labelitemi] \qtybf{float\_pair\/}: A pair of floating-point numbers
   761 (e.g., 0.6 0.95).
   762 \item[\labelitemi] \qtybf{smart\_int\/}: An integer or \textit{smart}.
   763 \item[\labelitemi] \qtybf{float\_or\_none\/}: A floating-point number (e.g., 60 or
   764 0.5) expressing a number of seconds, or the keyword \textit{none} ($\infty$
   765 seconds).
   766 \end{enum}
   767 
   768 Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options
   769 have a negated counterpart (e.g., \textit{blocking} vs.\
   770 \textit{non\_blocking}). When setting them, ``= \textit{true\/}'' may be omitted.
   771 
   772 \subsection{Mode of Operation}
   773 \label{mode-of-operation}
   774 
   775 \begin{enum}
   776 \opnodefaultbrk{provers}{string}
   777 Specifies the automatic provers to use as a space-separated list (e.g.,
   778 ``\textit{e}~\textit{spass}~\textit{remote\_vampire\/}'').
   779 Provers can be run locally or remotely; see \S\ref{installation} for
   780 installation instructions.
   781 
   782 The following local provers are supported:
   783 
   784 \begin{enum}
   785 \item[\labelitemi] \textbf{\textit{alt\_ergo}:} Alt-Ergo is a polymorphic
   786 SMT solver developed by Bobot et al.\ \cite{alt-ergo}.
   787 It supports the TPTP polymorphic typed first-order format (TFF1) via Why3
   788 \cite{why3}. It is included for experimental purposes. To use Alt-Ergo, set the
   789 environment variable \texttt{WHY3\_HOME} to the directory that contains the
   790 \texttt{why3} executable. Sledgehammer has been tested with Alt-Ergo 0.93 and an
   791 unidentified development version of Why3.
   792 
   793 \item[\labelitemi] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by
   794 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,
   795 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the
   796 executable, including the file name, or install the prebuilt CVC3 package from
   797 \download. Sledgehammer has been tested with version 2.2.
   798 
   799 \item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover
   800 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment
   801 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
   802 executable and \texttt{E\_VERSION} to the version number (e.g., ``1.4''), or
   803 install the prebuilt E package from \download. Sledgehammer has been tested with
   804 versions 1.0 to 1.4.
   805 
   806 \item[\labelitemi] \textbf{\textit{leo2}:} LEO-II is an automatic
   807 higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},
   808 with support for the TPTP typed higher-order syntax (THF0). To use LEO-II, set
   809 the environment variable \texttt{LEO2\_HOME} to the directory that contains the
   810 \texttt{leo} executable. Sledgehammer requires version 1.2.9 or above.
   811 
   812 \item[\labelitemi] \textbf{\textit{metis}:} Although it is much less powerful than
   813 the external provers, Metis itself can be used for proof search.
   814 
   815 \item[\labelitemi] \textbf{\textit{satallax}:} Satallax is an automatic
   816 higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with
   817 support for the TPTP typed higher-order syntax (THF0). To use Satallax, set the
   818 environment variable \texttt{SATALLAX\_HOME} to the directory that contains the
   819 \texttt{satallax} executable. Sledgehammer requires version 2.2 or above.
   820 
   821 \item[\labelitemi] \textbf{\textit{smt}:} The \textit{smt} proof method with the
   822 current settings (usually:\ Z3 with proof reconstruction).
   823 
   824 \item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution
   825 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
   826 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
   827 that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the
   828 version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from
   829 \download. Sledgehammer requires version 3.5 or above.
   830 
   831 \item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order resolution
   832 prover developed by Andrei Voronkov and his colleagues
   833 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
   834 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
   835 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., ``1.8'').
   836 Sledgehammer has been tested with versions 0.6, 1.0, and 1.8.
   837 Versions above 1.8 support the TPTP typed first-order format (TFF0).
   838 
   839 \item[\labelitemi] \textbf{\textit{yices}:} Yices is an SMT solver developed at
   840 SRI \cite{yices}. To use Yices, set the environment variable
   841 \texttt{YICES\_SOLVER} to the complete path of the executable, including the
   842 file name. Sledgehammer has been tested with version 1.0.28.
   843 
   844 \item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
   845 Microsoft Research \cite{z3}. To use Z3, set the environment variable
   846 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file
   847 name, and set \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
   848 noncommercial user. Sledgehammer has been tested with versions 3.0 to 3.2.
   849 
   850 \item[\labelitemi] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to be
   851 an ATP, exploiting Z3's support for the TPTP untyped and typed first-order
   852 formats (FOF and TFF0). It is included for experimental purposes. It
   853 requires version 3.0 or above. To use it, set the environment variable
   854 \texttt{Z3\_HOME} to the directory that contains the \texttt{z3}
   855 executable.
   856 \end{enum}
   857 
   858 The following remote provers are supported:
   859 
   860 \begin{enum}
   861 \item[\labelitemi] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs
   862 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
   863 point).
   864 
   865 \item[\labelitemi] \textbf{\textit{remote\_e}:} The remote version of E runs
   866 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   867 
   868 \item[\labelitemi] \textbf{\textit{remote\_e\_sine}:} E-SInE is a metaprover
   869 developed by Kry\v stof Hoder \cite{sine} based on E. It runs on Geoff
   870 Sutcliffe's Miami servers.
   871 
   872 \item[\labelitemi] \textbf{\textit{remote\_e\_tofof}:} E-ToFoF is a metaprover
   873 developed by Geoff Sutcliffe \cite{tofof} based on E running on his Miami
   874 servers. This ATP supports the TPTP typed first-order format (TFF0). The
   875 remote version of E-ToFoF runs on Geoff Sutcliffe's Miami servers.
   876 
   877 \item[\labelitemi] \textbf{\textit{remote\_iprover}:} iProver is a pure
   878 instantiation-based prover developed by Konstantin Korovin \cite{korovin-2009}. The
   879 remote version of iProver runs on Geoff Sutcliffe's Miami servers
   880 \cite{sutcliffe-2000}.
   881 
   882 \item[\labelitemi] \textbf{\textit{remote\_iprover\_eq}:} iProver-Eq is an
   883 instantiation-based prover with native support for equality developed by
   884 Konstantin Korovin and Christoph Sticksel \cite{korovin-sticksel-2010}. The
   885 remote version of iProver-Eq runs on Geoff Sutcliffe's Miami servers
   886 \cite{sutcliffe-2000}.
   887 
   888 \item[\labelitemi] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II
   889 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   890 
   891 \item[\labelitemi] \textbf{\textit{remote\_satallax}:} The remote version of
   892 Satallax runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   893 
   894 \item[\labelitemi] \textbf{\textit{remote\_snark}:} SNARK is a first-order
   895 resolution prover developed by Stickel et al.\ \cite{snark}. It supports the
   896 TPTP typed first-order format (TFF0). The remote version of SNARK runs on
   897 Geoff Sutcliffe's Miami servers.
   898 
   899 \item[\labelitemi] \textbf{\textit{remote\_vampire}:} The remote version of
   900 Vampire runs on Geoff Sutcliffe's Miami servers. Version 1.8 is used.
   901 
   902 \item[\labelitemi] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit
   903 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be
   904 used to prove universally quantified equations using unconditional equations,
   905 corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister
   906 runs on Geoff Sutcliffe's Miami servers.
   907 
   908 \item[\labelitemi] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on
   909 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
   910 point).
   911 
   912 \item[\labelitemi] \textbf{\textit{remote\_z3\_tptp}:} The remote version of ``Z3
   913 with TPTP syntax'' runs on Geoff Sutcliffe's Miami servers.
   914 \end{enum}
   915 
   916 By default, Sledgehammer runs E, E-SInE, SPASS, Vampire, Z3 (or whatever
   917 the SMT module's \textit{smt\_solver} configuration option is set to), and (if
   918 appropriate) Waldmeister in parallel---either locally or remotely, depending on
   919 the number of processor cores available. For historical reasons, the default
   920 value of this option can be overridden using the option ``Sledgehammer:
   921 Provers'' in Proof General's ``Isabelle'' menu.
   922 
   923 It is generally a good idea to run several provers in parallel. Running E,
   924 SPASS, and Vampire for 5~seconds yields a similar success rate to running the
   925 most effective of these for 120~seconds \cite{boehme-nipkow-2010}.
   926 
   927 For the \textit{min} subcommand, the default prover is \textit{metis}. If
   928 several provers are set, the first one is used.
   929 
   930 \opnodefault{prover}{string}
   931 Alias for \textit{provers}.
   932 
   933 \opfalse{blocking}{non\_blocking}
   934 Specifies whether the \textbf{sledgehammer} command should operate
   935 synchronously. The asynchronous (non-blocking) mode lets the user start proving
   936 the putative theorem manually while Sledgehammer looks for a proof, but it can
   937 also be more confusing. Irrespective of the value of this option, Sledgehammer
   938 is always run synchronously for the new jEdit-based user interface or if
   939 \textit{debug} (\S\ref{output-format}) is enabled.
   940 
   941 \optrue{slice}{dont\_slice}
   942 Specifies whether the time allocated to a prover should be sliced into several
   943 segments, each of which has its own set of possibly prover-dependent options.
   944 For SPASS and Vampire, the first slice tries the fast but incomplete
   945 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,
   946 up to three slices are tried, with different weighted search strategies and
   947 number of facts. For SMT solvers, several slices are tried with the same options
   948 each time but fewer and fewer facts. According to benchmarks with a timeout of
   949 30 seconds, slicing is a valuable optimization, and you should probably leave it
   950 enabled unless you are conducting experiments. This option is implicitly
   951 disabled for (short) automatic runs.
   952 
   953 \nopagebreak
   954 {\small See also \textit{verbose} (\S\ref{output-format}).}
   955 
   956 \opsmart{minimize}{dont\_minimize}
   957 Specifies whether the minimization tool should be invoked automatically after
   958 proof search. By default, automatic minimization takes place only if
   959 it can be done in a reasonable amount of time (as determined by
   960 the number of facts in the original proof and the time it took to find or
   961 preplay it) or the proof involves an unreasonably large number of facts.
   962 
   963 \nopagebreak
   964 {\small See also \textit{preplay\_timeout} (\S\ref{timeouts})
   965 and \textit{dont\_preplay} (\S\ref{timeouts}).}
   966 
   967 \opfalse{overlord}{no\_overlord}
   968 Specifies whether Sledgehammer should put its temporary files in
   969 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
   970 debugging Sledgehammer but also unsafe if several instances of the tool are run
   971 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
   972 safely remove them after Sledgehammer has run.
   973 
   974 \nopagebreak
   975 {\small See also \textit{debug} (\S\ref{output-format}).}
   976 \end{enum}
   977 
   978 \subsection{Problem Encoding}
   979 \label{problem-encoding}
   980 
   981 \newcommand\comb[1]{\const{#1}}
   982 
   983 \begin{enum}
   984 \opdefault{lam\_trans}{string}{smart}
   985 Specifies the $\lambda$ translation scheme to use in ATP problems. The supported
   986 translation schemes are listed below:
   987 
   988 \begin{enum}
   989 \item[\labelitemi] \textbf{\textit{hide\_lams}:} Hide the $\lambda$-abstractions
   990 by replacing them by unspecified fresh constants, effectively disabling all
   991 reasoning under $\lambda$-abstractions.
   992 
   993 \item[\labelitemi] \textbf{\textit{lifting}:} Introduce a new
   994 supercombinator \const{c} for each cluster of $n$~$\lambda$-abstractions,
   995 defined using an equation $\const{c}~x_1~\ldots~x_n = t$ ($\lambda$-lifting).
   996 
   997 \item[\labelitemi] \textbf{\textit{combs}:} Rewrite lambdas to the Curry
   998 combinators (\comb{I}, \comb{K}, \comb{S}, \comb{B}, \comb{C}). Combinators
   999 enable the ATPs to synthesize $\lambda$-terms but tend to yield bulkier formulas
  1000 than $\lambda$-lifting: The translation is quadratic in the worst case, and the
  1001 equational definitions of the combinators are very prolific in the context of
  1002 resolution.
  1003 
  1004 \item[\labelitemi] \textbf{\textit{combs\_and\_lifting}:} Introduce a new
  1005 supercombinator \const{c} for each cluster of $\lambda$-abstractions and characterize it both using a
  1006 lifted equation $\const{c}~x_1~\ldots~x_n = t$ and via Curry combinators.
  1007 
  1008 \item[\labelitemi] \textbf{\textit{combs\_or\_lifting}:} For each cluster of
  1009 $\lambda$-abstractions, heuristically choose between $\lambda$-lifting and Curry
  1010 combinators.
  1011 
  1012 \item[\labelitemi] \textbf{\textit{keep\_lams}:}
  1013 Keep the $\lambda$-abstractions in the generated problems. This is available
  1014 only with provers that support the THF0 syntax.
  1015 
  1016 \item[\labelitemi] \textbf{\textit{smart}:} The actual translation scheme used
  1017 depends on the ATP and should be the most efficient scheme for that ATP.
  1018 \end{enum}
  1019 
  1020 For SMT solvers, the $\lambda$ translation scheme is always \textit{lifting},
  1021 irrespective of the value of this option.
  1022 
  1023 \opsmartx{uncurried\_aliases}{no\_uncurried\_aliases}
  1024 Specifies whether fresh function symbols should be generated as aliases for
  1025 applications of curried functions in ATP problems.
  1026 
  1027 \opdefault{type\_enc}{string}{smart}
  1028 Specifies the type encoding to use in ATP problems. Some of the type encodings
  1029 are unsound, meaning that they can give rise to spurious proofs
  1030 (unreconstructible using \textit{metis}). The supported type encodings are
  1031 listed below, with an indication of their soundness in parentheses.
  1032 An asterisk (*) means that the encoding is slightly incomplete for
  1033 reconstruction with \textit{metis}, unless the \emph{strict} option (described
  1034 below) is enabled.
  1035 
  1036 \begin{enum}
  1037 \item[\labelitemi] \textbf{\textit{erased} (very unsound):} No type information is
  1038 supplied to the ATP, not even to resolve overloading. Types are simply erased.
  1039 
  1040 \item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using
  1041 a predicate \const{g}$(\tau, t)$ that guards bound
  1042 variables. Constants are annotated with their types, supplied as additional
  1043 arguments, to resolve overloading.
  1044 
  1045 \item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is
  1046 tagged with its type using a function $\const{t\/}(\tau, t)$.
  1047 
  1048 \item[\labelitemi] \textbf{\textit{poly\_args} (unsound):}
  1049 Like for \textit{poly\_guards} constants are annotated with their types to
  1050 resolve overloading, but otherwise no type information is encoded. This
  1051 coincides with the default encoding used by the \textit{metis} command.
  1052 
  1053 \item[\labelitemi]
  1054 \textbf{%
  1055 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\
  1056 \textit{raw\_mono\_args} (unsound):} \\
  1057 Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},
  1058 respectively, but the problem is additionally monomorphized, meaning that type
  1059 variables are instantiated with heuristically chosen ground types.
  1060 Monomorphization can simplify reasoning but also leads to larger fact bases,
  1061 which can slow down the ATPs.
  1062 
  1063 \item[\labelitemi]
  1064 \textbf{%
  1065 \textit{mono\_guards}, \textit{mono\_tags} (sound);
  1066 \textit{mono\_args} (unsound):} \\
  1067 Similar to
  1068 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and
  1069 \textit{raw\_mono\_args}, respectively but types are mangled in constant names
  1070 instead of being supplied as ground term arguments. The binary predicate
  1071 $\const{g}(\tau, t)$ becomes a unary predicate
  1072 $\const{g\_}\tau(t)$, and the binary function
  1073 $\const{t}(\tau, t)$ becomes a unary function
  1074 $\const{t\_}\tau(t)$.
  1075 
  1076 \item[\labelitemi] \textbf{\textit{mono\_native} (sound):} Exploits native
  1077 first-order types if the prover supports the TFF0, TFF1, or THF0 syntax;
  1078 otherwise, falls back on \textit{mono\_guards}. The problem is monomorphized.
  1079 
  1080 \item[\labelitemi] \textbf{\textit{mono\_native\_higher} (sound):} Exploits
  1081 native higher-order types if the prover supports the THF0 syntax; otherwise,
  1082 falls back on \textit{mono\_native} or \textit{mono\_guards}. The problem is
  1083 monomorphized.
  1084 
  1085 \item[\labelitemi] \textbf{\textit{poly\_native} (sound):} Exploits native
  1086 polymorphic first-order types if the prover supports the TFF1 syntax; otherwise,
  1087 falls back on \textit{mono\_native}.
  1088 
  1089 \item[\labelitemi]
  1090 \textbf{%
  1091 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\
  1092 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\
  1093 \textit{mono\_native}? (sound*):} \\
  1094 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
  1095 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
  1096 \textit{mono\_tags}, and \textit{mono\_native} are fully typed and sound. For
  1097 each of these, Sledgehammer also provides a lighter variant identified by a
  1098 question mark (`\hbox{?}')\ that detects and erases monotonic types, notably
  1099 infinite types. (For \textit{mono\_native}, the types are not actually erased
  1100 but rather replaced by a shared uniform type of individuals.) As argument to the
  1101 \textit{metis} proof method, the question mark is replaced by a
  1102 \hbox{``\textit{\_query\/}''} suffix.
  1103 
  1104 \item[\labelitemi]
  1105 \textbf{%
  1106 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\
  1107 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\
  1108 (sound*):} \\
  1109 Even lighter versions of the `\hbox{?}' encodings. As argument to the
  1110 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by
  1111 \hbox{``\textit{\_query\_query\/}''}.
  1112 
  1113 \item[\labelitemi]
  1114 \textbf{%
  1115 \textit{poly\_guards}@?, \textit{raw\_mono\_guards}@? (sound*):} \\
  1116 Alternative versions of the `\hbox{??}' encodings. As argument to the
  1117 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by
  1118 \hbox{``\textit{\_at\_query\/}''}.
  1119 
  1120 \item[\labelitemi]
  1121 \textbf{%
  1122 \textit{poly\_guards}!, \textit{poly\_tags}!, \textit{raw\_mono\_guards}!, \\
  1123 \textit{raw\_mono\_tags}!, \textit{mono\_guards}!, \textit{mono\_tags}!, \\
  1124 \textit{mono\_native}!, \textit{mono\_native\_higher}! (mildly unsound):} \\
  1125 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
  1126 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
  1127 \textit{mono\_tags}, \textit{mono\_native}, and \textit{mono\_native\_higher}
  1128 also admit a mildly unsound (but very efficient) variant identified by an
  1129 exclamation mark (`\hbox{!}') that detects and erases erases all types except
  1130 those that are clearly finite (e.g., \textit{bool}). (For \textit{mono\_native}
  1131 and \textit{mono\_native\_higher}, the types are not actually erased but rather
  1132 replaced by a shared uniform type of individuals.) As argument to the
  1133 \textit{metis} proof method, the exclamation mark is replaced by the suffix
  1134 \hbox{``\textit{\_bang\/}''}.
  1135 
  1136 \item[\labelitemi]
  1137 \textbf{%
  1138 \textit{poly\_guards}!!, \textit{poly\_tags}!!, \textit{raw\_mono\_guards}!!, \\
  1139 \textit{raw\_mono\_tags}!!, \textit{mono\_guards}!!, \textit{mono\_tags}!! \\
  1140 (mildly unsound):} \\
  1141 Even lighter versions of the `\hbox{!}' encodings. As argument to the
  1142 \textit{metis} proof method, the `\hbox{!!}' suffix is replaced by
  1143 \hbox{``\textit{\_bang\_bang\/}''}.
  1144 
  1145 \item[\labelitemi]
  1146 \textbf{%
  1147 \textit{poly\_guards}@!, \textit{raw\_mono\_guards}@! (mildly unsound):} \\
  1148 Alternative versions of the `\hbox{!!}' encodings. As argument to the
  1149 \textit{metis} proof method, the `\hbox{@!}' suffix is replaced by
  1150 \hbox{``\textit{\_at\_bang\/}''}.
  1151 
  1152 \item[\labelitemi] \textbf{\textit{smart}:} The actual encoding used depends on
  1153 the ATP and should be the most efficient sound encoding for that ATP.
  1154 \end{enum}
  1155 
  1156 For SMT solvers, the type encoding is always \textit{mono\_native}, irrespective
  1157 of the value of this option.
  1158 
  1159 \nopagebreak
  1160 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})
  1161 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}
  1162 
  1163 \opfalse{strict}{non\_strict}
  1164 Specifies whether Sledgehammer should run in its strict mode. In that mode,
  1165 sound type encodings marked with an asterisk (*) above are made complete
  1166 for reconstruction with \textit{metis}, at the cost of some clutter in the
  1167 generated problems. This option has no effect if \textit{type\_enc} is
  1168 deliberately set to an unsound encoding.
  1169 \end{enum}
  1170 
  1171 \subsection{Relevance Filter}
  1172 \label{relevance-filter}
  1173 
  1174 \begin{enum}
  1175 \opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}
  1176 Specifies the thresholds above which facts are considered relevant by the
  1177 relevance filter. The first threshold is used for the first iteration of the
  1178 relevance filter and the second threshold is used for the last iteration (if it
  1179 is reached). The effective threshold is quadratically interpolated for the other
  1180 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
  1181 are relevant and 1 only theorems that refer to previously seen constants.
  1182 
  1183 \opdefault{max\_relevant}{smart\_int}{smart}
  1184 Specifies the maximum number of facts that may be returned by the relevance
  1185 filter. If the option is set to \textit{smart}, it is set to a value that was
  1186 empirically found to be appropriate for the prover. A typical value would be
  1187 250.
  1188 
  1189 \opdefault{max\_new\_mono\_instances}{int}{\upshape 200}
  1190 Specifies the maximum number of monomorphic instances to generate beyond
  1191 \textit{max\_relevant}. The higher this limit is, the more monomorphic instances
  1192 are potentially generated. Whether monomorphization takes place depends on the
  1193 type encoding used.
  1194 
  1195 \nopagebreak
  1196 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
  1197 
  1198 \opdefault{max\_mono\_iters}{int}{\upshape 3}
  1199 Specifies the maximum number of iterations for the monomorphization fixpoint
  1200 construction. The higher this limit is, the more monomorphic instances are
  1201 potentially generated. Whether monomorphization takes place depends on the
  1202 type encoding used.
  1203 
  1204 \nopagebreak
  1205 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
  1206 \end{enum}
  1207 
  1208 \subsection{Output Format}
  1209 \label{output-format}
  1210 
  1211 \begin{enum}
  1212 
  1213 \opfalse{verbose}{quiet}
  1214 Specifies whether the \textbf{sledgehammer} command should explain what it does.
  1215 This option is implicitly disabled for automatic runs.
  1216 
  1217 \opfalse{debug}{no\_debug}
  1218 Specifies whether Sledgehammer should display additional debugging information
  1219 beyond what \textit{verbose} already displays. Enabling \textit{debug} also
  1220 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})
  1221 behind the scenes. The \textit{debug} option is implicitly disabled for
  1222 automatic runs.
  1223 
  1224 \nopagebreak
  1225 {\small See also \textit{overlord} (\S\ref{mode-of-operation}).}
  1226 
  1227 \opfalse{isar\_proof}{no\_isar\_proof}
  1228 Specifies whether Isar proofs should be output in addition to one-liner
  1229 \textit{metis} proofs. Isar proof construction is still experimental and often
  1230 fails; however, they are usually faster and sometimes more robust than
  1231 \textit{metis} proofs.
  1232 
  1233 \opdefault{isar\_shrink\_factor}{int}{\upshape 1}
  1234 Specifies the granularity of the Isar proof. A value of $n$ indicates that each
  1235 Isar proof step should correspond to a group of up to $n$ consecutive proof
  1236 steps in the ATP proof.
  1237 \end{enum}
  1238 
  1239 \subsection{Authentication}
  1240 \label{authentication}
  1241 
  1242 \begin{enum}
  1243 \opnodefault{expect}{string}
  1244 Specifies the expected outcome, which must be one of the following:
  1245 
  1246 \begin{enum}
  1247 \item[\labelitemi] \textbf{\textit{some}:} Sledgehammer found a proof.
  1248 \item[\labelitemi] \textbf{\textit{none}:} Sledgehammer found no proof.
  1249 \item[\labelitemi] \textbf{\textit{timeout}:} Sledgehammer timed out.
  1250 \item[\labelitemi] \textbf{\textit{unknown}:} Sledgehammer encountered some
  1251 problem.
  1252 \end{enum}
  1253 
  1254 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
  1255 (otherwise) if the actual outcome differs from the expected outcome. This option
  1256 is useful for regression testing.
  1257 
  1258 \nopagebreak
  1259 {\small See also \textit{blocking} (\S\ref{mode-of-operation}) and
  1260 \textit{timeout} (\S\ref{timeouts}).}
  1261 \end{enum}
  1262 
  1263 \subsection{Timeouts}
  1264 \label{timeouts}
  1265 
  1266 \begin{enum}
  1267 \opdefault{timeout}{float\_or\_none}{\upshape 30}
  1268 Specifies the maximum number of seconds that the automatic provers should spend
  1269 searching for a proof. This excludes problem preparation and is a soft limit.
  1270 For historical reasons, the default value of this option can be overridden using
  1271 the option ``Sledgehammer: Time Limit'' in Proof General's ``Isabelle'' menu.
  1272 
  1273 \opdefault{preplay\_timeout}{float\_or\_none}{\upshape 3}
  1274 Specifies the maximum number of seconds that \textit{metis} or \textit{smt}
  1275 should spend trying to ``preplay'' the found proof. If this option is set to 0,
  1276 no preplaying takes place, and no timing information is displayed next to the
  1277 suggested \textit{metis} calls.
  1278 
  1279 \nopagebreak
  1280 {\small See also \textit{minimize} (\S\ref{mode-of-operation}).}
  1281 
  1282 \optrueonly{dont\_preplay}
  1283 Alias for ``\textit{preplay\_timeout} = 0''.
  1284 
  1285 \end{enum}
  1286 
  1287 \let\em=\sl
  1288 \bibliography{../manual}{}
  1289 \bibliographystyle{abbrv}
  1290 
  1291 \end{document}