doc-src/Sledgehammer/sledgehammer.tex
author blanchet
Thu Sep 22 19:42:06 2011 +0200 (2011-09-22)
changeset 45048 59ca831deef4
take out remote E-SInE -- it's broken and Geoff says it might take quite a while before he gets to it, plus it's fairly obsolete in the meantime
     1 \documentclass[a4paper,12pt]{article}
     2 \usepackage[T1]{fontenc}
     3 \usepackage{amsmath}
     4 \usepackage{amssymb}
     5 \usepackage[english,french]{babel}
     6 \usepackage{color}
     7 \usepackage{footmisc}
     8 \usepackage{graphicx}
     9 %\usepackage{mathpazo}
    10 \usepackage{multicol}
    11 \usepackage{stmaryrd}
    12 %\usepackage[scaled=.85]{beramono}
    13 \usepackage{../../lib/texinputs/isabelle,../iman,../pdfsetup}
    14 
    15 \def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}
    16 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}
    17 
    18 %\oddsidemargin=4.6mm
    19 %\evensidemargin=4.6mm
    20 %\textwidth=150mm
    21 %\topmargin=4.6mm
    22 %\headheight=0mm
    23 %\headsep=0mm
    24 %\textheight=234mm
    25 
    26 \def\Colon{\mathord{:\mkern-1.5mu:}}
    27 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
    28 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
    29 \def\lparr{\mathopen{(\mkern-4mu\mid}}
    30 \def\rparr{\mathclose{\mid\mkern-4mu)}}
    31 
    32 \def\unk{{?}}
    33 \def\undef{(\lambda x.\; \unk)}
    34 %\def\unr{\textit{others}}
    35 \def\unr{\ldots}
    36 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
    37 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
    38 
    39 \urlstyle{tt}
    40 
    41 \begin{document}
    42 
    43 \selectlanguage{english}
    44 
    45 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
    46 Hammering Away \\[\smallskipamount]
    47 \Large A User's Guide to Sledgehammer for Isabelle/HOL}
    48 \author{\hbox{} \\
    49 Jasmin Christian Blanchette \\
    50 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]
    51 {\normalsize with contributions from} \\[4\smallskipamount]
    52 Lawrence C. Paulson \\
    53 {\normalsize Computer Laboratory, University of Cambridge} \\
    54 \hbox{}}
    55 
    56 \maketitle
    57 
    58 \tableofcontents
    59 
    60 \setlength{\parskip}{.7em plus .2em minus .1em}
    61 \setlength{\parindent}{0pt}
    62 \setlength{\abovedisplayskip}{\parskip}
    63 \setlength{\abovedisplayshortskip}{.9\parskip}
    64 \setlength{\belowdisplayskip}{\parskip}
    65 \setlength{\belowdisplayshortskip}{.9\parskip}
    66 
    67 % General-purpose enum environment with correct spacing
    68 \newenvironment{enum}%
    69     {\begin{list}{}{%
    70         \setlength{\topsep}{.1\parskip}%
    71         \setlength{\partopsep}{.1\parskip}%
    72         \setlength{\itemsep}{\parskip}%
    73         \advance\itemsep by-\parsep}}
    74     {\end{list}}
    75 
    76 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
    77 \advance\rightskip by\leftmargin}
    78 \def\post{\vskip0pt plus1ex\endgroup}
    79 
    80 \def\prew{\pre\advance\rightskip by-\leftmargin}
    81 \def\postw{\post}
    82 
    83 \section{Introduction}
    84 \label{introduction}
    85 
    86 Sledgehammer is a tool that applies automatic theorem provers (ATPs)
    87 and satisfiability-modulo-theories (SMT) solvers on the current goal. The
    88 supported ATPs are E \cite{schulz-2002}, E-SInE \cite{sine}, E-ToFoF
    89 \cite{tofof}, LEO-II \cite{leo2}, Satallax \cite{satallax}, SNARK \cite{snark},
    90 SPASS \cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002}, and
    91 Waldmeister \cite{waldmeister}. The ATPs are run either locally or remotely via
    92 the System\-On\-TPTP web service \cite{sutcliffe-2000}. In addition to the ATPs,
    93 the SMT solver Z3 \cite{z3} is used by default, and you can tell Sledgehammer
    94 to try CVC3 \cite{cvc3} and Yices \cite{yices} as well; these are run either
    95 locally or on a server at the TU M\"unchen.
    96 
    97 The problem passed to the automatic provers consists of your current goal
    98 together with a heuristic selection of hundreds of facts (theorems) from the
    99 current theory context, filtered by relevance. Because jobs are run in the
   100 background, you can continue to work on your proof by other means. Provers can
   101 be run in parallel. Any reply (which may arrive half a minute later) will appear
   102 in the Proof General response buffer.
   103 
   104 The result of a successful proof search is some source text that usually (but
   105 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
   106 proof relies on the general-purpose Metis prover, which is fully integrated into
   107 Isabelle/HOL, with explicit inferences going through the kernel. Thus its
   108 results are correct by construction.
   109 
   110 In this manual, we will explicitly invoke the \textbf{sledgehammer} command.
   111 Sledgehammer also provides an automatic mode that can be enabled via the ``Auto
   112 Sledgehammer'' option in Proof General's ``Isabelle'' menu. In this mode,
   113 Sledgehammer is run on every newly entered theorem. The time limit for Auto
   114 Sledgehammer and other automatic tools can be set using the ``Auto Tools Time
   115 Limit'' option.
   116 
   117 \newbox\boxA
   118 \setbox\boxA=\hbox{\texttt{nospam}}
   119 
   120 \newcommand\authoremail{\texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak
   121 in.\allowbreak tum.\allowbreak de}}
   122 
   123 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
   124 imported---this is rarely a problem in practice since it is part of
   125 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
   126 \texttt{src/HOL/Metis\_Examples} directory.
   127 Comments and bug reports concerning Sledgehammer or this manual should be
   128 directed to the author at \authoremail.
   129 
   130 \vskip2.5\smallskipamount
   131 
   132 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
   133 %suggesting several textual improvements.
   134 
   135 \section{Installation}
   136 \label{installation}
   137 
   138 Sledgehammer is part of Isabelle, so you don't need to install it. However, it
   139 relies on third-party automatic theorem provers (ATPs) and SMT solvers.
   140 
   141 \subsection{Installing ATPs}
   142 
   143 Currently, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in
   144 addition, E, E-SInE, E-ToFoF, LEO-II, Satallax, SNARK, Waldmeister, and Vampire
   145 are available remotely via System\-On\-TPTP \cite{sutcliffe-2000}. If you want
   146 better performance, you should at least install E and SPASS locally.
   147 
   148 There are three main ways to install ATPs on your machine:
   149 
   150 \begin{enum}
   151 \item[$\bullet$] If you installed an official Isabelle package with everything
   152 inside, it should already include properly set up executables for E and SPASS,
   153 ready to use.%
   154 \footnote{Vampire's license prevents us from doing the same for this otherwise
   155 wonderful tool.}
   156 
   157 \item[$\bullet$] Alternatively, you can download the Isabelle-aware E and SPASS
   158 binary packages from Isabelle's download page. Extract the archives, then add a
   159 line to your \texttt{\$ISABELLE\_HOME\_USER/etc/components}%
   160 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
   161 startup. Its value can be retrieved by invoking \texttt{isabelle}
   162 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
   163 file with the absolute
   164 path to E or SPASS. For example, if the \texttt{components} file does not exist yet
   165 and you extracted SPASS to \texttt{/usr/local/spass-3.7}, create the
   166 \texttt{components} file with the single line
   167 
   168 \prew
   169 \texttt{/usr/local/spass-3.7}
   170 \postw
   171 
   172 in it.
   173 
   174 \item[$\bullet$] If you prefer to build E or SPASS yourself, or obtained a
   175 Vampire executable from somewhere (e.g., \url{http://www.vprover.org/}),
   176 set the environment variable \texttt{E\_HOME}, \texttt{SPASS\_HOME}, or
   177 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},
   178 \texttt{SPASS}, or \texttt{vampire} executable. Sledgehammer has been tested
   179 with E 1.0 to 1.4, SPASS 3.5 and 3.7, and Vampire 0.6, 1.0, and 1.8%
   180 \footnote{Following the rewrite of Vampire, the counter for version numbers was
   181 reset to 0; hence the (new) Vampire versions 0.6, 1.0, and 1.8 are more recent
   182 than, say, Vampire 9.0 or 11.5.}%
   183 . Since the ATPs' output formats are neither documented nor stable, other
   184 versions of the ATPs might or might not work well with Sledgehammer. Ideally,
   185 also set \texttt{E\_VERSION}, \texttt{SPASS\_VERSION}, or
   186 \texttt{VAMPIRE\_VERSION} to the ATP's version number (e.g., ``1.4'').
   187 \end{enum}
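
As an illustration of the third approach, here is a minimal sketch of the lines
one might add to \texttt{\$ISABELLE\_HOME\_USER/etc/settings} for a self-built
E~1.4. The directory \texttt{/usr/local/eprover-1.4} is a made-up example and is
assumed to contain the \texttt{eproof} executable; the same variables can also
be set in the environment in which Isabelle is launched.

\prew
\texttt{E\_HOME=/usr/local/eprover-1.4} \\
\texttt{E\_VERSION=1.4}
\postw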
   188 
   189 To check whether E and SPASS are successfully installed, follow the example in
   190 \S\ref{first-steps}. If the remote versions of E and SPASS are used (identified
   191 by the prefix ``\emph{remote\_}''), or if the local versions fail to solve the
   192 easy goal presented there, this is a sign that something is wrong with your
   193 installation.
   194 
   195 Remote ATP invocation via the SystemOnTPTP web service requires Perl with the
   196 World Wide Web Library (\texttt{libwww-perl}) installed. If you must use a proxy
   197 server to access the Internet, set the \texttt{http\_proxy} environment variable
   198 to the proxy, either in the environment in which Isabelle is launched or in your
   199 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few examples:
   200 
   201 \prew
   202 \texttt{http\_proxy=http://proxy.example.org} \\
   203 \texttt{http\_proxy=http://proxy.example.org:8080} \\
   204 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
   205 \postw
   206 
   207 \subsection{Installing SMT Solvers}
   208 
   209 CVC3, Yices, and Z3 can be run locally or (for CVC3 and Z3) remotely on a TU
   210 M\"unchen server. If you want better performance and the ability to replay
   211 proofs that rely on the \emph{smt} proof method, you should at least install Z3
   212 locally.
   213 
   214 There are two main ways of installing SMT solvers locally:
   215 
   216 \begin{enum}
   217 \item[$\bullet$] If you installed an official Isabelle package with everything
   218 inside, it should already include properly set up executables for CVC3 and Z3,
   219 ready to use.%
   220 \footnote{Yices's license prevents us from doing the same for this otherwise
   221 wonderful tool.}
   222 For Z3, you additionally need to set the environment variable
   223 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a noncommercial
   224 user.
   225 
   226 \item[$\bullet$] Otherwise, follow the instructions documented in the \emph{SMT}
   227 theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}).
   228 \end{enum}
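
For instance, assuming you qualify as a noncommercial user, the confirmation
required for Z3 could take the form of the following line in
\texttt{\$ISABELLE\_HOME\_USER/etc/settings}. Setting the variable in the
environment in which Isabelle is launched should work as well.

\prew
\texttt{Z3\_NON\_COMMERCIAL=yes}
\postw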
   229 
   230 \section{First Steps}
   231 \label{first-steps}
   232 
   233 To illustrate Sledgehammer in context, let us start a theory file and
   234 attempt to prove a simple lemma:
   235 
   236 \prew
   237 \textbf{theory}~\textit{Scratch} \\
   238 \textbf{imports}~\textit{Main} \\
   239 \textbf{begin} \\[2\smallskipamount]
   240 %
   241 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\
   242 \textbf{sledgehammer}
   243 \postw
   244 
   245 Instead of issuing the \textbf{sledgehammer} command, you can also find
   246 Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof
   247 General or press the Emacs key sequence C-c C-a C-s.
   248 Either way, Sledgehammer produces the following output after a few seconds:
   249 
   250 \prew
   251 \slshape
   252 Sledgehammer: ``\textit{e}'' on goal \\
   253 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   254 Try this: \textbf{by} (\textit{metis last\_ConsL}) (64 ms). \\[3\smallskipamount]
   255 %
   256 Sledgehammer: ``\textit{vampire}'' on goal \\
   257 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   258 Try this: \textbf{by} (\textit{metis hd.simps}) (14 ms). \\[3\smallskipamount]
   259 %
   260 Sledgehammer: ``\textit{spass}'' on goal \\
   261 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   262 Try this: \textbf{by} (\textit{metis list.inject}) (17 ms). \\[3\smallskipamount]
   263 %
   264 Sledgehammer: ``\textit{remote\_waldmeister}'' on goal \\
   265 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   266 Try this: \textbf{by} (\textit{metis hd.simps}) (15 ms). \\[3\smallskipamount]
   267 %
   268 Sledgehammer: ``\textit{remote\_z3}'' on goal \\
   269 $[a] = [b] \,\Longrightarrow\, a = b$ \\
   270 Try this: \textbf{by} (\textit{metis list.inject}) (20 ms).
   271 \postw
   272 
   273 Sledgehammer ran E, SPASS, Vampire, Waldmeister, and Z3 in parallel.
   274 Depending on which provers are installed and how many processor cores are
   275 available, some of the provers might be missing or present with a
   276 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,
   277 where the goal's conclusion is a (universally quantified) equation.
   278 
   279 For each successful prover, Sledgehammer gives a one-liner proof that uses Metis
   280 or the \textit{smt} proof method. For Metis, approximate timings are shown in
   281 parentheses, indicating how fast the call is. You can click the proof to insert
   282 it into the theory text.
   283 
   284 In addition, you can ask Sledgehammer for an Isar text proof by passing the
   285 \textit{isar\_proof} option (\S\ref{output-format}):
   286 
   287 \prew
   288 \textbf{sledgehammer} [\textit{isar\_proof}]
   289 \postw
   290 
   291 When Isar proof construction is successful, it can yield proofs that are more
   292 readable and also faster than the Metis one-liners. This feature is experimental
   293 and is only available for ATPs.
   294 
   295 \section{Hints}
   296 \label{hints}
   297 
   298 This section presents a few hints that should help you get the most out of
   299 Sledgehammer and Metis. Frequently (and infrequently) asked questions are
   300 answered in \S\ref{frequently-asked-questions}.
   301 
   302 \newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}
   303 
   304 \point{Presimplify the goal}
   305 
   306 For best results, first simplify your problem by calling \textit{auto} or at
   307 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide
   308 arithmetic decision procedures, but the ATPs typically do not (or if they do,
   309 Sledgehammer does not use them yet). Apart from Waldmeister, they are not
   310 especially good at heavy rewriting, but because they regard equations as
   311 undirected, they often prove theorems that require the reverse orientation of a
   312 \textit{simp} rule. Higher-order problems can be tackled, but the success rate
   313 is better for first-order problems. Hence, you may get better results if you
   314 first simplify the problem to remove higher-order features.
   315 
   316 \point{Make sure at least E, SPASS, Vampire, and Z3 are installed}
   317 
   318 Locally installed provers are faster and more reliable than those running on
   319 servers. See \S\ref{installation} for details on how to install them.
   320 
   321 \point{Familiarize yourself with the most important options}
   322 
   323 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of
   324 the options are very specialized, but serious users of the tool should at least
   325 familiarize themselves with the following options:
   326 
   327 \begin{enum}
   328 \item[$\bullet$] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
   329 the automatic provers (ATPs and SMT solvers) that should be run whenever
   330 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass
   331 remote\_vampire}''). For convenience, you can omit ``\textit{provers}~=''
   332 and simply write the prover names as a space-separated list (e.g., ``\textit{e
   333 spass remote\_vampire}'').
   334 
   335 \item[$\bullet$] \textbf{\textit{max\_relevant}} (\S\ref{relevance-filter})
   336 specifies the maximum number of facts that should be passed to the provers. By
   337 default, the value is prover-dependent but varies between about 150 and 1000. If
   338 the provers time out, you can try lowering this value to, say, 100 or 50 and see
   339 if that helps.
   340 
   341 \item[$\bullet$] \textbf{\textit{isar\_proof}} (\S\ref{output-format}) specifies
   342 that Isar proofs should be generated, instead of one-liner Metis proofs. The
   343 length of the Isar proofs can be controlled by setting
   344 \textit{isar\_shrink\_factor} (\S\ref{output-format}).
   345 
   346 \item[$\bullet$] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the
   347 provers' time limit. It is set to 30 seconds by default, but since Sledgehammer runs
   348 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds
   349 if you are the kind of user who can think clearly while ATPs are active.
   350 \end{enum}
   351 
   352 Options can be set globally using \textbf{sledgehammer\_params}
   353 (\S\ref{command-syntax}). The command also prints the list of all available
   354 options with their current value. Fact selection can be influenced by specifying
   355 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call
   356 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''
   357 to force Sledgehammer to run only with $\textit{my\_facts}$.
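
To make these options concrete, here is a sketch of a single invocation that
combines several of them. The prover selection and the numeric values are
arbitrary illustrations rather than recommendations, and \textit{my\_facts}
stands for facts of your own.

\prew
\textbf{sledgehammer} [\textit{provers} = \textit{e spass}, \,\textit{max\_relevant} = 100, \,\textit{timeout} = 60] (\textit{add}: \textit{my\_facts})
\postw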
   358 
   359 \section{Frequently Asked Questions}
   360 \label{frequently-asked-questions}
   361 
   362 This section answers frequently (and infrequently) asked questions about
   363 Sledgehammer. It is a good idea to skim over it now even if you don't have any
   364 questions at this stage. And if you have any further questions not listed here,
   365 send them to the author at \authoremail.
   366 
   367 \point{Why does Metis fail to reconstruct the proof?}
   368 
   369 There are many reasons. If Metis runs seemingly forever, that is a sign that the
   370 proof is too difficult for it. Metis's search is complete, so it should
   371 eventually find it, but that's little consolation. There are several possible
   372 solutions:
   373 
   374 \begin{enum}
   375 \item[$\bullet$] Try the \textit{isar\_proof} option (\S\ref{output-format}) to
   376 obtain a step-by-step Isar proof where each step is justified by Metis. Since
   377 the steps are fairly small, Metis is more likely to be able to replay them.
   378 
   379 \item[$\bullet$] Try the \textit{smt} proof method instead of Metis. It is
   380 usually stronger, but you need to have Z3 available to replay the proofs, trust
   381 the SMT solver, or use certificates. See the documentation in the \emph{SMT}
   382 theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.
   383 
   384 \item[$\bullet$] Try the \textit{blast} or \textit{auto} proof methods, passing
   385 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},
   386 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.
   387 \end{enum}
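
The third suggestion might, for example, take one of the following shapes, where
\textit{my\_facts} is a placeholder for the facts listed in the suggested Metis
call. Which variant succeeds, if any, depends on the goal, so these are
templates to try rather than guaranteed replacements.

\prew
\textbf{using} \textit{my\_facts} \textbf{by} \textit{blast} \\
\textbf{by} (\textit{auto} \textit{simp}: \textit{my\_facts}) \\
\textbf{by} (\textit{blast} \textit{intro}: \textit{my\_facts})
\postw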
   388 
   389 In some rare cases, Metis fails fairly quickly, and you get the error message
   390 
   391 \prew
   392 \slshape
   393 Proof reconstruction failed.
   394 \postw
   395 
   396 This message usually indicates that Sledgehammer found a type-incorrect proof.
   397 This was a frequent issue with older versions of Sledgehammer, which did not
   398 supply enough typing information to the ATPs by default. If you notice many
   399 unsound proofs and are not using \textit{type\_enc} (\S\ref{problem-encoding}),
   400 contact the author at \authoremail.
   401 
   402 \point{How can I tell whether a generated proof is sound?}
   403 
   404 First, if Metis can reconstruct it, the proof is sound (assuming Isabelle's
   405 inference kernel is sound). If it fails or runs seemingly forever, you can try
   406 
   407 \prew
   408 \textbf{apply}~\textbf{--} \\
   409 \textbf{sledgehammer} [\textit{sound}] (\textit{metis\_facts})
   410 \postw
   411 
   412 where \textit{metis\_facts} is the list of facts appearing in the suggested
   413 Metis call. The automatic provers should be able to re-find the proof quickly if
   414 it is sound, and the \textit{sound} option (\S\ref{problem-encoding}) ensures
   415 that no unsound proofs are found.
   416 
   417 \point{Which facts are passed to the automatic provers?}
   418 
   419 The relevance filter assigns a score to every available fact (lemma, theorem,
   420 definition, or axiom)\ based upon how many constants that fact shares with the
   421 conjecture. This process iterates to include facts relevant to those just
   422 accepted, but with a decay factor to ensure termination. The constants are
   423 weighted to give unusual ones greater significance. The relevance filter copes
   424 best when the conjecture contains some unusual constants; if all the constants
   425 are common, it is unable to discriminate among the hundreds of facts that are
   426 picked up. The relevance filter is also memoryless: It has no information about
   427 how many times a particular fact has been used in a proof, and it cannot learn.
   428 
   429 The number of facts included in a problem varies from prover to prover, since
   430 some provers get overwhelmed more easily than others. You can show the number of
   431 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
   432 actual facts using \textit{debug} (\S\ref{output-format}).
   433 
   434 Sledgehammer is good at finding short proofs combining a handful of existing
   435 lemmas. If you are looking for longer proofs, you must typically restrict the
   436 number of facts, by setting the \textit{max\_relevant} option
   437 (\S\ref{relevance-filter}) to, say, 25 or 50.
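
As a small sketch, the two previous suggestions can be combined in one call that
reports how many facts each prover receives while restricting the selection. The
value 25 is only an example.

\prew
\textbf{sledgehammer} [\textit{verbose}, \,\textit{max\_relevant} = 25]
\postw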
   438 
   439 You can also influence which facts are actually selected in a number of ways. If
   440 you simply want to ensure that a fact is included, you can specify it using the
   441 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:
   442 %
   443 \prew
   444 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})
   445 \postw
   446 %
   447 The specified facts then replace the least relevant facts that would otherwise be
   448 included; the other selected facts remain the same.
   449 If you want to direct the selection in a particular direction, you can specify
   450 the facts via \textbf{using}:
   451 %
   452 \prew
   453 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
   454 \textbf{sledgehammer}
   455 \postw
   456 %
   457 The facts are then more likely to be selected than otherwise, and if they are
   458 selected at iteration $j$ they also influence which facts are selected at
   459 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try
   460 %
   461 \prew
   462 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
   463 \textbf{apply}~\textbf{--} \\
   464 \textbf{sledgehammer}
   465 \postw
   466 
   467 \point{Why are the generated Isar proofs so ugly/detailed/broken?}
   468 
   469 The current implementation is experimental and explodes exponentially in the
   470 worst case. Work on a new implementation has begun. There is a large body of
   471 research into transforming resolution proofs into natural deduction proofs (such
   472 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to
   473 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger
   474 value or to try several provers and keep the nicest-looking proof.
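
For example, the workaround could look as follows. The factor 2 is an arbitrary
choice for illustration; a suitable value depends on the proof at hand.

\prew
\textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{isar\_shrink\_factor} = 2]
\postw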
   475 
   476 \point{What are the \textit{full\_types} and \textit{no\_types} arguments to
   477 Metis?}
   478 
   479 The \textit{metis}~(\textit{full\_types}) proof method is the fully-typed
   480 version of Metis. It is somewhat slower than \textit{metis}, but the proof
   481 search is fully typed, and it also includes more powerful rules such as the
   482 axiom ``$x = \mathit{True} \mathrel{\lor} x = \mathit{False}$'' for reasoning in
   483 higher-order places (e.g., in set comprehensions). The method kicks in
   484 automatically as a fallback when \textit{metis} fails, and it is sometimes
   485 generated by Sledgehammer instead of \textit{metis} if the proof obviously
   486 requires type information or if \textit{metis} failed when Sledgehammer
   487 preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with
   488 various options for up to 4 seconds to ensure that the generated one-line proofs
   489 actually work and to display timing information. This can be configured using
   490 the \textit{preplay\_timeout} option (\S\ref{timeouts}).)
   491 
   492 At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})
   493 uses no type information at all during the proof search, which is more efficient
   494 but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally
   495 generated by Sledgehammer.
   496 
   497 Incidentally, if you see the warning
   498 
   499 \prew
   500 \slshape
   501 Metis: Falling back on ``\textit{metis} (\textit{full\_types})''.
   502 \postw
   503 
   504 for a successful Metis proof, you can advantageously pass the
   505 \textit{full\_types} option to \textit{metis} directly.
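
For instance, supposing the warning appeared for the one-liner
\textbf{by}~(\textit{metis list.inject}) from \S\ref{first-steps} (a
hypothetical situation, chosen only to show the syntax), the call could be
rewritten as

\prew
\textbf{by} (\textit{metis} (\textit{full\_types}) \textit{list.inject})
\postw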
   506 
   507 \point{Are generated proofs minimal?}
   508 
   509 Automatic provers frequently use many more facts than are necessary.
   510 Sledgehammer includes a minimization tool that takes a set of facts returned by a
   511 given prover and repeatedly calls the same prover or Metis with subsets of those
   512 axioms in order to find a minimal set. Reducing the number of axioms typically
   513 improves Metis's speed and success rate, while also removing superfluous clutter
   514 from the proof scripts.
   515 
   516 In earlier versions of Sledgehammer, generated proofs were systematically
   517 accompanied by a suggestion to invoke the minimization tool. This step is now
   518 performed implicitly if it can be done in a reasonable amount of time (something
   519 that can be guessed from the number of facts in the original proof and the time
   520 it took to find it or replay it).
   521 
   522 In addition, some provers (notably CVC3, Satallax, and Yices) do not provide
   523 proofs or sometimes produce incomplete proofs. The minimizer is then invoked to
   524 find out which facts are actually needed from the (large) set of facts that was
   525 initially given to the prover. Finally, if a prover returns a proof with lots
   526 of facts, the minimizer is invoked automatically since Metis would be unlikely
   527 to re-find the proof.
   528 
   529 \point{A strange error occurred---what should I do?}
   530 
   531 Sledgehammer tries to give informative error messages. Please report any strange
   532 error to the author at \authoremail. This applies doubly if you get the message
   533 
   534 \prew
   535 \slshape
   536 The prover found a type-unsound proof involving ``\textit{foo}'',
   537 ``\textit{bar}'', and ``\textit{baz}'' even though a supposedly type-sound
   538 encoding was used (or, less likely, your axioms are inconsistent). You might
   539 want to report this to the Isabelle developers.
   540 \postw
   541 
   542 \point{Auto can solve it---why not Sledgehammer?}
   543 
   544 Problems can be easy for \textit{auto} and difficult for automatic provers, but
   545 the reverse is also true, so don't be discouraged if your first attempts fail.
   546 Because the system refers to all theorems known to Isabelle, it is particularly
   547 suitable when your goal has a short proof from lemmas that you don't know about.
   548 
   549 \point{Why are there so many options?}
   550 
   551 Sledgehammer's philosophy is that it should work out of the box, without user guidance.
   552 Many of the options are meant to be used mostly by the Sledgehammer developers
   553 for experimentation purposes. Of course, feel free to experiment with them if
   554 you are so inclined.
   555 
   556 \section{Command Syntax}
   557 \label{command-syntax}
   558 
   559 Sledgehammer can be invoked at any point when there is an open goal by entering
   560 the \textbf{sledgehammer} command in the theory file. Its general syntax is as
   561 follows:
   562 
   563 \prew
   564 \textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$
   565 \postw
   566 
   567 For convenience, Sledgehammer is also available in the ``Commands'' submenu of
   568 the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c
   569 C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with
   570 no arguments in the theory text.
   571 
   572 In the general syntax, the \qty{subcommand} may be any of the following:
   573 
   574 \begin{enum}
   575 \item[$\bullet$] \textbf{\textit{run} (the default):} Runs Sledgehammer on
   576 subgoal number \qty{num} (1 by default), with the given options and facts.
   577 
   578 \item[$\bullet$] \textbf{\textit{min}:} Attempts to minimize the facts
   579 specified in the \qty{facts\_override} argument to obtain a simpler proof
   580 involving fewer facts. The options and goal number are as for \textit{run}.
   581 
   582 \item[$\bullet$] \textbf{\textit{messages}:} Redisplays recent messages issued
   583 by Sledgehammer. This allows you to examine results that might have been lost
   584 due to Sledgehammer's asynchronous nature. The \qty{num} argument specifies a
   585 limit on the number of messages to display (5 by default).
   586 
   587 \item[$\bullet$] \textbf{\textit{supported\_provers}:} Prints the list of
   588 automatic provers supported by Sledgehammer. See \S\ref{installation} and
   589 \S\ref{mode-of-operation} for more information on how to install automatic
   590 provers.
   591 
   592 \item[$\bullet$] \textbf{\textit{running\_provers}:} Prints information about
   593 currently running automatic provers, including elapsed runtime and remaining
   594 time until timeout.
   595 
   596 \item[$\bullet$] \textbf{\textit{kill\_provers}:} Terminates all running
   597 automatic provers.
   598 
   599 \item[$\bullet$] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
   600 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
   601 \end{enum}
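
The following invocations sketch how some of these subcommands fit the general
syntax. The message limit, the prover, and the fact names are illustrative
choices only.

\prew
\textbf{sledgehammer} \textit{supported\_provers} \\
\textbf{sledgehammer} \textit{running\_provers} \\
\textbf{sledgehammer} \textit{messages} 10 \\
\textbf{sledgehammer} \textit{min} [\textit{e}] (\textit{hd.simps} \textit{list.inject})
\postw

The third line asks for up to 10 recent messages instead of the default 5, and
the last line asks E to minimize the two given facts for the first subgoal.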
   602 
   603 Sledgehammer's behavior can be influenced by various \qty{options}, which can be
   604 specified in brackets after the \textbf{sledgehammer} command. The
   605 \qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
   606 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true}'' is optional. For
   607 example:
   608 
   609 \prew
   610 \textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120]
   611 \postw
   612 
   613 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
   614 
   615 \prew
   616 \textbf{sledgehammer\_params} \qty{options}
   617 \postw
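
For example, the following declaration would raise the time limit and request
Isar proofs for all subsequent invocations. The value 60 is merely an
illustration.

\prew
\textbf{sledgehammer\_params} [\textit{timeout} = 60, \,\textit{isar\_proof}]
\postw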
   618 
   619 The supported options are described in \S\ref{option-reference}.
   620 
   621 The \qty{facts\_override} argument lets you alter the set of facts that go
   622 through the relevance filter. It may be of the form ``(\qty{facts})'', where
   623 \qty{facts} is a space-separated list of Isabelle facts (theorems, local
   624 assumptions, etc.), in which case the relevance filter is bypassed and the given
   625 facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',
   626 ``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\
   627 \textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to
   628 proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}
   629 highly relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
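
As a sketch, assume that \textit{append\_assoc} and \textit{rev\_append} are
facts available in the current context (the names are chosen purely for
illustration). The first invocation below bypasses the relevance filter and
uses exactly these two facts, whereas the second one keeps the filter but
promotes one fact and excludes the other:

\prew
\textbf{sledgehammer} (\textit{append\_assoc}~\textit{rev\_append}) \\
\textbf{sledgehammer} (\textit{add}:\ \textit{append\_assoc}\ \textit{del}:\ \textit{rev\_append})
\postw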
   630 
   631 You can instruct Sledgehammer to run automatically on newly entered theorems by
   632 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu.
   633 For automatic runs, only the first prover set using \textit{provers}
   634 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover,
   635 \textit{slicing} (\S\ref{mode-of-operation}) is disabled, \textit{sound}
   636 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format})
   637 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout}
   638 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof
   639 General's ``Isabelle'' menu. Sledgehammer's output is also more concise.
   640 
   641 The \textit{metis} proof method has the syntax
   642 
   643 \prew
   644 \textbf{\textit{metis}}~(\qty{type\_enc})${}^?$~\qty{facts}${}^?$
   645 \postw
   646 
   647 where \qty{type\_enc} is a type encoding specification with the same semantics
   648 as Sledgehammer's \textit{type\_enc} option (\S\ref{problem-encoding}) and
   649 \qty{facts} is a list of arbitrary facts. In addition to the values listed in
   650 \S\ref{problem-encoding}, \qty{type\_enc} may also be \textit{full\_types}, in
   651 which case an appropriate type-sound encoding is chosen, \textit{partial\_types}
   652 (the default type-unsound encoding), or \textit{no\_types}, a synonym for
   653 \textit{erased}.
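
For example, with placeholder fact names \textit{append\_assoc} and
\textit{rev\_append} (the goal is left unspecified, since only the syntax
matters here), a fully typed call and an untyped call to \textit{metis} might
look as follows:

\prew
\textbf{by}~(\textit{metis}~(\textit{full\_types})~\textit{append\_assoc}~\textit{rev\_append}) \\
\textbf{by}~(\textit{metis}~(\textit{no\_types})~\textit{append\_assoc})
\postw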
   654 
   655 \section{Option Reference}
   656 \label{option-reference}
   657 
   658 \def\defl{\{}
   659 \def\defr{\}}
   660 
   661 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
   662 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   663 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   664 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
   665 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
   666 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}
   667 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}
   668 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
   669 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
   670 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
   671 
   672 Sledgehammer's options are categorized as follows:\ mode of operation
   673 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
   674 relevance filter (\S\ref{relevance-filter}), output format
   675 (\S\ref{output-format}), authentication (\S\ref{authentication}), and timeouts
   676 (\S\ref{timeouts}).
   677 
   678 The descriptions below refer to the following syntactic quantities:
   679 
   680 \begin{enum}
   681 \item[$\bullet$] \qtybf{string}: A string.
   682 \item[$\bullet$] \qtybf{bool\/}: \textit{true} or \textit{false}.
   683 \item[$\bullet$] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or
   684 \textit{smart}.
   685 \item[$\bullet$] \qtybf{int\/}: An integer.
   686 %\item[$\bullet$] \qtybf{float\/}: A floating-point number (e.g., 2.5).
   687 \item[$\bullet$] \qtybf{float\_pair\/}: A pair of floating-point numbers
   688 (e.g., 0.6 0.95).
   689 \item[$\bullet$] \qtybf{smart\_int\/}: An integer or \textit{smart}.
   690 \item[$\bullet$] \qtybf{float\_or\_none\/}: A floating-point number (e.g., 60 or
   691 0.5) expressing a number of seconds, or the keyword \textit{none} ($\infty$
   692 seconds).
   693 \end{enum}
   694 
   695 Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options
   696 have a negated counterpart (e.g., \textit{blocking} vs.\
   697 \textit{non\_blocking}). When setting them, ``= \textit{true}'' may be omitted.
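
To illustrate these conventions, the first two invocations below should be
equivalent to each other, as should the last two:

\prew
\textbf{sledgehammer} [\textit{blocking}] \\
\textbf{sledgehammer} [\textit{blocking} = \textit{true}] \\
\textbf{sledgehammer} [\textit{non\_blocking}] \\
\textbf{sledgehammer} [\textit{blocking} = \textit{false}]
\postw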
   698 
   699 \subsection{Mode of Operation}
   700 \label{mode-of-operation}
   701 
   702 \begin{enum}
   703 \opnodefaultbrk{provers}{string}
   704 Specifies the automatic provers to use as a space-separated list (e.g.,
   705 ``\textit{e}~\textit{spass}~\textit{remote\_vampire}''). The following local
   706 provers are supported:
   707 
   708 \begin{enum}
   709 \item[$\bullet$] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by
   710 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,
   711 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the
   712 executable, including the file name. Sledgehammer has been tested with version
   713 2.2.
   714 
   715 \item[$\bullet$] \textbf{\textit{e}:} E is a first-order resolution prover
   716 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment
   717 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
   718 executable, or install the prebuilt E package from Isabelle's download page. See
   719 \S\ref{installation} for details.
   720 
   721 \item[$\bullet$] \textbf{\textit{leo2}:} LEO-II is an automatic
   722 higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},
   723 with support for the TPTP many-typed higher-order syntax (THF0).
   724 
   725 \item[$\bullet$] \textbf{\textit{metis}:} Although it is much less powerful than
   726 the external provers, Metis itself can be used for proof search.
   727 
   728 \item[$\bullet$] \textbf{\textit{metis\_full\_types}:} Fully typed version of
   729 Metis, corresponding to \textit{metis} (\textit{full\_types}).
   730 
   731 \item[$\bullet$] \textbf{\textit{metis\_no\_types}:} Untyped version of Metis,
   732 corresponding to \textit{metis} (\textit{no\_types}).
   733 
   734 \item[$\bullet$] \textbf{\textit{satallax}:} Satallax is an automatic
   735 higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with
   736 support for the TPTP many-typed higher-order syntax (THF0).
   737 
   738 \item[$\bullet$] \textbf{\textit{spass}:} SPASS is a first-order resolution
   739 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
   740 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
   741 that contains the \texttt{SPASS} executable, or install the prebuilt SPASS
   742 package from Isabelle's download page. Sledgehammer requires version 3.5 or
   743 above. See \S\ref{installation} for details.
   744 
   745 \item[$\bullet$] \textbf{\textit{vampire}:} Vampire is a first-order resolution
   746 prover developed by Andrei Voronkov and his colleagues
   747 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
   748 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
   749 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., ``1.8'').
   750 Sledgehammer has been tested with versions 0.6, 1.0, and 1.8. Vampire 1.8
   751 supports the TPTP many-typed first-order format (TFF0).
   752 
   753 \item[$\bullet$] \textbf{\textit{yices}:} Yices is an SMT solver developed at
   754 SRI \cite{yices}. To use Yices, set the environment variable
   755 \texttt{YICES\_SOLVER} to the complete path of the executable, including the
   756 file name. Sledgehammer has been tested with version 1.0.
   757 
   758 \item[$\bullet$] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
   759 Microsoft Research \cite{z3}. To use Z3, set the environment variable
   760 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file
   761 name, and set \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
   762 noncommercial user. Sledgehammer has been tested with versions 2.7 to 2.18.
   763 
   764 \item[$\bullet$] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to be
   765 an ATP, exploiting Z3's support for the TPTP untyped and many-typed first-order
   766 formats (FOF and TFF0). It is included for experimental purposes. It requires
   767 version 3.0 or above.
   768 \end{enum}
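
As a minimal sketch, assuming a Bourne-style shell and hypothetical installation
directories under \texttt{/opt} (adapt the paths and the version number to your
system), the environment variables for E, SPASS, Vampire, and Z3 could be set as
follows before Isabelle is started:

\prew
\texttt{export E\_HOME=/opt/eprover} \\
\texttt{export SPASS\_HOME=/opt/spass/bin} \\
\texttt{export VAMPIRE\_HOME=/opt/vampire} \\
\texttt{export VAMPIRE\_VERSION=1.8} \\
\texttt{export Z3\_SOLVER=/opt/z3/bin/z3} \\
\texttt{export Z3\_NON\_COMMERCIAL=yes}
\postw

Note that \texttt{E\_HOME}, \texttt{SPASS\_HOME}, and \texttt{VAMPIRE\_HOME}
name directories, whereas \texttt{Z3\_SOLVER} names the executable itself.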
   769 
   770 In addition, the following remote provers are supported:
   771 
   772 \begin{enum}
   773 \item[$\bullet$] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs
   774 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
   775 point).
   776 
   777 \item[$\bullet$] \textbf{\textit{remote\_e}:} The remote version of E runs
   778 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   779 
   780 \item[$\bullet$] \textbf{\textit{remote\_e\_sine}:} E-SInE is a metaprover
   781 developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of
   782 E-SInE runs on Geoff Sutcliffe's Miami servers.
   783 
   784 \item[$\bullet$] \textbf{\textit{remote\_e\_tofof}:} E-ToFoF is a metaprover
   785 developed by Geoff Sutcliffe \cite{tofof} based on E. This ATP supports the
   786 TPTP many-typed first-order format (TFF0). The remote version of E-ToFoF runs
   787 on Geoff Sutcliffe's Miami servers.
   788 
   789 \item[$\bullet$] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II
   790 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   791 
   792 \item[$\bullet$] \textbf{\textit{remote\_satallax}:} The remote version of
   793 Satallax runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
   794 
   795 \item[$\bullet$] \textbf{\textit{remote\_snark}:} SNARK is a first-order
   796 resolution prover developed by Stickel et al.\ \cite{snark}. It supports the
   797 TPTP many-typed first-order format (TFF0). The remote version of SNARK runs on
   798 Geoff Sutcliffe's Miami servers.
   799 
   800 \item[$\bullet$] \textbf{\textit{remote\_vampire}:} The remote version of
   801 Vampire runs on Geoff Sutcliffe's Miami servers. Version 1.8 is used.
   802 
   803 \item[$\bullet$] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit
   804 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be
   805 used to prove universally quantified equations using unconditional equations,
   806 corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister
   807 runs on Geoff Sutcliffe's Miami servers.
   808 
   809 \item[$\bullet$] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on
   810 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
   811 point).
   812 
   813 \item[$\bullet$] \textbf{\textit{remote\_z3\_tptp}:} The remote version of ``Z3
   814 with TPTP syntax'' runs on Geoff Sutcliffe's Miami servers.
   815 \end{enum}
   816 
   817 By default, Sledgehammer runs E, SPASS, Vampire, Z3 (or whatever
   818 the SMT module's \textit{smt\_solver} configuration option is set to), and (if
   819 appropriate) Waldmeister in parallel---either locally or remotely, depending on
   820 the number of processor cores available. For historical reasons, the default
   821 value of this option can be overridden using the option ``Sledgehammer:
   822 Provers'' in Proof General's ``Isabelle'' menu.
   823 
   824 It is generally a good idea to run several provers in parallel. Running E,
   825 SPASS, and Vampire for 5~seconds yields a similar success rate to running the
   826 most effective of these for 120~seconds \cite{boehme-nipkow-2010}.
   827 
   828 For the \textit{min} subcommand, the default prover is \textit{metis}. If
   829 several provers are set, the first one is used.
   830 
   831 \opnodefault{prover}{string}
   832 Alias for \textit{provers}.
   833 
   834 %\opnodefault{atps}{string}
   835 %Legacy alias for \textit{provers}.
   836 
   837 %\opnodefault{atp}{string}
   838 %Legacy alias for \textit{provers}.
   839 
   840 \opfalse{blocking}{non\_blocking}
   841 Specifies whether the \textbf{sledgehammer} command should operate
   842 synchronously. The asynchronous (non-blocking) mode lets the user start proving
   843 the putative theorem manually while Sledgehammer looks for a proof, but it can
   844 also be more confusing. Irrespective of the value of this option, Sledgehammer
   845 is always run synchronously for the new jEdit-based user interface or if
   846 \textit{debug} (\S\ref{output-format}) is enabled.
   847 
   848 \optrue{slicing}{no\_slicing}
   849 Specifies whether the time allocated to a prover should be sliced into several
   850 segments, each of which has its own set of possibly prover-dependent options.
   851 For SPASS and Vampire, the first slice tries the fast but incomplete
   852 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,
   853 up to three slices are tried, with different weighted search strategies and
   854 numbers of facts. For SMT solvers, several slices are tried with the same options
   855 each time but fewer and fewer facts. According to benchmarks with a timeout of
   856 30 seconds, slicing is a valuable optimization, and you should probably leave it
   857 enabled unless you are conducting experiments. This option is implicitly
   858 disabled for (short) automatic runs.
   859 
   860 \nopagebreak
   861 {\small See also \textit{verbose} (\S\ref{output-format}).}
   862 
   863 \opfalse{overlord}{no\_overlord}
   864 Specifies whether Sledgehammer should put its temporary files in
   865 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
   866 debugging Sledgehammer but also unsafe if several instances of the tool are run
   867 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
   868 safely remove them after Sledgehammer has run.
   869 
   870 \nopagebreak
   871 {\small See also \textit{debug} (\S\ref{output-format}).}
   872 \end{enum}
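
The following sketch assumes that E and SPASS are installed locally and that
the remote Vampire is reachable. The first line restricts Sledgehammer to these
three provers, disables slicing, and asks for more detailed output. The second
line invokes the \textit{min} subcommand with an explicit prover; the fact names
are placeholders chosen for illustration:

\prew
\textbf{sledgehammer} [\textit{provers} = \textit{e}~\textit{spass}~\textit{remote\_vampire}, \,\textit{no\_slicing}, \,\textit{verbose}] \\
\textbf{sledgehammer} \textit{min} [\textit{prover} = \textit{e}] (\textit{append\_assoc}~\textit{rev\_append})
\postw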
   873 
   874 \subsection{Problem Encoding}
   875 \label{problem-encoding}
   876 
   877 \begin{enum}
   878 \opdefault{type\_enc}{string}{smart}
   879 Specifies the type encoding to use in ATP problems. Some of the type encodings
   880 are unsound, meaning that they can give rise to spurious proofs
   881 (unreconstructible using Metis). The supported type encodings are listed below,
   882 with an indication of their soundness in parentheses:
   883 
   884 \begin{enum}
   885 \item[$\bullet$] \textbf{\textit{erased} (very unsound):} No type information is
   886 supplied to the ATP. Types are simply erased.
   887 
   888 \item[$\bullet$] \textbf{\textit{poly\_guards} (sound):} Types are encoded using
   889 a predicate \textit{has\_\allowbreak type\/}$(\tau, t)$ that guards bound
   890 variables. Constants are annotated with their types, supplied as additional
   891 arguments, to resolve overloading.
   892 
   893 \item[$\bullet$] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is
   894 tagged with its type using a function $\mathit{type\/}(\tau, t)$.
   895 
   896 \item[$\bullet$] \textbf{\textit{poly\_args} (unsound):}
   897 As with \textit{poly\_guards}, constants are annotated with their types to
   898 resolve overloading, but otherwise no type information is encoded. This
   899 coincides with the default encoding used by the \textit{metis} command.
   900 
   901 \item[$\bullet$]
   902 \textbf{%
   903 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\
   904 \textit{raw\_mono\_args} (unsound):} \\
   905 Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},
   906 respectively, but the problem is additionally monomorphized, meaning that type
   907 variables are instantiated with heuristically chosen ground types.
   908 Monomorphization can simplify reasoning but also leads to larger fact bases,
   909 which can slow down the ATPs.
   910 
   911 \item[$\bullet$]
   912 \textbf{%
   913 \textit{mono\_guards}, \textit{mono\_tags} (sound);
   914 \textit{mono\_args} (unsound):} \\
   915 Similar to
   916 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and
   917 \textit{raw\_mono\_args}, respectively, but types are mangled in constant names
   918 instead of being supplied as ground term arguments. The binary predicate
   919 $\mathit{has\_type\/}(\tau, t)$ becomes a unary predicate
   920 $\mathit{has\_type\_}\tau(t)$, and the binary function
   921 $\mathit{type\/}(\tau, t)$ becomes a unary function
   922 $\mathit{type\_}\tau(t)$.
   923 
   924 \item[$\bullet$] \textbf{\textit{mono\_simple} (sound):} Exploits simple
   925 first-order types if the prover supports the TFF0 or THF0 syntax; otherwise,
   926 falls back on \textit{mono\_guards}. The problem is monomorphized.
   927 
   928 \item[$\bullet$] \textbf{\textit{mono\_simple\_higher} (sound):} Exploits simple
   929 higher-order types if the prover supports the THF0 syntax; otherwise, falls back
   930 on \textit{mono\_simple} or \textit{mono\_guards}. The problem is monomorphized.
   931 
   932 \item[$\bullet$]
   933 \textbf{%
   934 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\
   935 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\
   936 \textit{mono\_simple}? (quasi-sound):} \\
   937 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
   938 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
   939 \textit{mono\_tags}, and \textit{mono\_simple} are fully
   940 typed and sound. For each of these, Sledgehammer also provides a lighter,
   941 virtually sound variant identified by a question mark (`\hbox{?}')\ that detects
   942 and erases monotonic types, notably infinite types. (For \textit{mono\_simple},
   943 the types are not actually erased but rather replaced by a shared uniform type
   944 of individuals.) As argument to the \textit{metis} proof method, the question
   945 mark is replaced by a \hbox{``\textit{\_query}''} suffix. If the \textit{sound}
   946 option is enabled, these encodings are fully sound.
   947 
   948 \item[$\bullet$]
   949 \textbf{%
   950 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\
   951 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\
   952 (quasi-sound):} \\
   953 Even lighter versions of the `\hbox{?}' encodings. As argument to the
   954 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by
   955 \hbox{``\textit{\_query\_query}''}.
   956 
   957 \item[$\bullet$]
   958 \textbf{%
   959 \textit{poly\_guards}@?, \textit{poly\_tags}@?, \textit{raw\_mono\_guards}@?, \\
   960 \textit{raw\_mono\_tags}@? (quasi-sound):} \\
   961 Alternative versions of the `\hbox{??}' encodings. As argument to the
   962 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by
   963 \hbox{``\textit{\_at\_query}''}.
   964 
   965 \item[$\bullet$]
   966 \textbf{%
   967 \textit{poly\_guards}!, \textit{poly\_tags}!, \textit{raw\_mono\_guards}!, \\
   968 \textit{raw\_mono\_tags}!, \textit{mono\_guards}!, \textit{mono\_tags}!, \\
   969 \textit{mono\_simple}!, \textit{mono\_simple\_higher}! (mildly unsound):} \\
   970 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
   971 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
   972 \textit{mono\_tags}, \textit{mono\_simple}, and \textit{mono\_simple\_higher}
   973 also admit a mildly unsound (but very efficient) variant identified by an
   974 exclamation mark (`\hbox{!}') that detects and erases all types except
   975 those that are clearly finite (e.g., \textit{bool}). (For \textit{mono\_simple}
   976 and \textit{mono\_simple\_higher}, the types are not actually erased but rather
   977 replaced by a shared uniform type of individuals.) As argument to the
   978 \textit{metis} proof method, the exclamation mark is replaced by the suffix
   979 \hbox{``\textit{\_bang}''}.
   980 
   981 \item[$\bullet$]
   982 \textbf{%
   983 \textit{poly\_guards}!!, \textit{poly\_tags}!!, \textit{raw\_mono\_guards}!!, \\
   984 \textit{raw\_mono\_tags}!!, \textit{mono\_guards}!!, \textit{mono\_tags}!! \\
   985 (mildly unsound):} \\
   986 Even lighter versions of the `\hbox{!}' encodings. As argument to the
   987 \textit{metis} proof method, the `\hbox{!!}' suffix is replaced by
   988 \hbox{``\textit{\_bang\_bang}''}.
   989 
   990 \item[$\bullet$]
   991 \textbf{%
   992 \textit{poly\_guards}@!, \textit{poly\_tags}@!, \textit{raw\_mono\_guards}@!, \\
   993 \textit{raw\_mono\_tags}@! (mildly unsound):} \\
   994 Alternative versions of the `\hbox{!!}' encodings. As argument to the
   995 \textit{metis} proof method, the `\hbox{@!}' suffix is replaced by
   996 \hbox{``\textit{\_at\_bang}''}.
   997 
   998 \item[$\bullet$] \textbf{\textit{smart}:} The actual encoding used depends on
   999 the ATP and should be the most efficient virtually sound encoding for that ATP.
  1000 \end{enum}
  1001 
  1002 For SMT solvers, the type encoding is always \textit{mono\_simple}, irrespective
  1003 of the value of this option.
  1004 
  1005 \nopagebreak
  1006 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})
  1007 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}
  1008 
  1009 \opfalse{sound}{unsound}
  1010 Specifies whether Sledgehammer should run in its fully sound mode. In that mode,
  1011 quasi-sound type encodings (which are the default) are made fully sound, at the
  1012 cost of some clutter in the generated problems. This option is ignored if
  1013 \textit{type\_enc} is explicitly set to an unsound encoding.
  1014 \end{enum}
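
As an illustration, assuming that E is installed, the first invocation below
forces the sound \textit{poly\_tags} encoding for E, and the second keeps the
default encoding but requests the fully sound mode. The third line sketches how
the `\hbox{?}' variant of \textit{poly\_guards} would be spelled as an argument
to \textit{metis}; the fact name is a placeholder:

\prew
\textbf{sledgehammer} [\textit{provers} = \textit{e}, \,\textit{type\_enc} = \textit{poly\_tags}] \\
\textbf{sledgehammer} [\textit{sound}] \\
\textbf{by}~(\textit{metis}~(\textit{poly\_guards\_query})~\textit{append\_assoc})
\postw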
  1015 
  1016 \subsection{Relevance Filter}
  1017 \label{relevance-filter}
  1018 
  1019 \begin{enum}
  1020 \opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}
  1021 Specifies the thresholds above which facts are considered relevant by the
  1022 relevance filter. The first threshold is used for the first iteration of the
  1023 relevance filter and the second threshold is used for the last iteration (if it
  1024 is reached). The effective threshold is quadratically interpolated for the other
  1025 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
  1026 are relevant and 1 means that only theorems referring to previously seen constants are.
  1027 
  1028 \opdefault{max\_relevant}{smart\_int}{smart}
  1029 Specifies the maximum number of facts that may be returned by the relevance
  1030 filter. If the option is set to \textit{smart}, it is set to a value that was
  1031 empirically found to be appropriate for the prover. A typical value would be
  1032 250.
  1033 
  1034 \opdefault{max\_new\_mono\_instances}{int}{\upshape 200}
  1035 Specifies the maximum number of monomorphic instances to generate beyond
  1036 \textit{max\_relevant}. The higher this limit is, the more monomorphic instances
  1037 are potentially generated. Whether monomorphization takes place depends on the
  1038 type encoding used.
  1039 
  1040 \nopagebreak
  1041 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
  1042 
  1043 \opdefault{max\_mono\_iters}{int}{\upshape 3}
  1044 Specifies the maximum number of iterations for the monomorphization fixpoint
  1045 construction. The higher this limit is, the more monomorphic instances are
  1046 potentially generated. Whether monomorphization takes place depends on the
  1047 type encoding used.
  1048 
  1049 \nopagebreak
  1050 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
  1051 \end{enum}
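
For example, the following sketch raises both thresholds, so that fewer facts
pass the filter, and caps the number of returned facts at 150. The values are
chosen for illustration only and are not recommendations:

\prew
\textbf{sledgehammer} [\textit{relevance\_thresholds} = 0.6~0.95, \,\textit{max\_relevant} = 150]
\postw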
  1052 
  1053 \subsection{Output Format}
  1054 \label{output-format}
  1055 
  1056 \begin{enum}
  1057 
  1058 \opfalse{verbose}{quiet}
  1059 Specifies whether the \textbf{sledgehammer} command should explain what it does.
  1060 This option is implicitly disabled for automatic runs.
  1061 
  1062 \opfalse{debug}{no\_debug}
  1063 Specifies whether Sledgehammer should display additional debugging information
  1064 beyond what \textit{verbose} already displays. Enabling \textit{debug} also
  1065 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})
  1066 behind the scenes. The \textit{debug} option is implicitly disabled for
  1067 automatic runs.
  1068 
  1069 \nopagebreak
  1070 {\small See also \textit{overlord} (\S\ref{mode-of-operation}).}
  1071 
  1072 \opfalse{isar\_proof}{no\_isar\_proof}
  1073 Specifies whether Isar proofs should be output in addition to one-liner
  1074 \textit{metis} proofs. Isar proof construction is still experimental and often
  1075 fails; however, Isar proofs are usually faster and sometimes more robust than
  1076 \textit{metis} proofs.
  1077 
  1078 \opdefault{isar\_shrink\_factor}{int}{\upshape 1}
  1079 Specifies the granularity of the Isar proof. A value of $n$ indicates that each
  1080 Isar proof step should correspond to a group of up to $n$ consecutive proof
  1081 steps in the ATP proof.
  1082 \end{enum}
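
As a small example, the following invocation requests Isar proofs in which each
step may group up to two consecutive steps of the ATP proof (the factor 2 is an
arbitrary illustrative choice):

\prew
\textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{isar\_shrink\_factor} = 2]
\postw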
  1083 
  1084 \subsection{Authentication}
  1085 \label{authentication}
  1086 
  1087 \begin{enum}
  1088 \opnodefault{expect}{string}
  1089 Specifies the expected outcome, which must be one of the following:
  1090 
  1091 \begin{enum}
  1092 \item[$\bullet$] \textbf{\textit{some}:} Sledgehammer found a (potentially
  1093 unsound) proof.
  1094 \item[$\bullet$] \textbf{\textit{none}:} Sledgehammer found no proof.
  1095 \item[$\bullet$] \textbf{\textit{timeout}:} Sledgehammer timed out.
  1096 \item[$\bullet$] \textbf{\textit{unknown}:} Sledgehammer encountered some
  1097 problem.
  1098 \end{enum}
  1099 
  1100 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
  1101 (otherwise) if the actual outcome differs from the expected outcome. This option
  1102 is useful for regression testing.
  1103 
  1104 \nopagebreak
  1105 {\small See also \textit{blocking} (\S\ref{mode-of-operation}) and
  1106 \textit{timeout} (\S\ref{timeouts}).}
  1107 \end{enum}
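
In a regression test, one might write something along the following lines (the
time limit is an arbitrary illustrative value). Because \textit{blocking} is
enabled, an error is emitted if the outcome is anything other than a found
proof:

\prew
\textbf{sledgehammer} [\textit{expect} = \textit{some}, \,\textit{blocking}, \,\textit{timeout} = 10]
\postw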
  1108 
  1109 \subsection{Timeouts}
  1110 \label{timeouts}
  1111 
  1112 \begin{enum}
  1113 \opdefault{timeout}{float\_or\_none}{\upshape 30}
  1114 Specifies the maximum number of seconds that the automatic provers should spend
  1115 searching for a proof. This excludes problem preparation and is a soft limit.
  1116 For historical reasons, the default value of this option can be overridden using
  1117 the option ``Sledgehammer: Time Limit'' in Proof General's ``Isabelle'' menu.
  1118 
  1119 \opdefault{preplay\_timeout}{float\_or\_none}{\upshape 4}
  1120 Specifies the maximum number of seconds that Metis should spend trying to
  1121 ``preplay'' the found proof. If this option is set to 0, no preplaying takes
  1122 place, and no timing information is displayed next to the suggested Metis calls.
  1123 \end{enum}
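
By way of illustration, the first invocation below gives the provers 60 seconds
and turns off preplaying, whereas the second removes the time limit altogether
(the values are examples only):

\prew
\textbf{sledgehammer} [\textit{timeout} = 60, \,\textit{preplay\_timeout} = 0] \\
\textbf{sledgehammer} [\textit{timeout} = \textit{none}]
\postw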
  1124 
  1125 \let\em=\sl
  1126 \bibliography{../manual}{}
  1127 \bibliographystyle{abbrv}
  1128 
  1129 \end{document}