378 This sections answers frequently (and infrequently) asked questions about |
378 This sections answers frequently (and infrequently) asked questions about |
379 Sledgehammer. It is a good idea to skim over it now even if you don't have any |
379 Sledgehammer. It is a good idea to skim over it now even if you don't have any |
380 questions at this stage. And if you have any further questions not listed here, |
380 questions at this stage. And if you have any further questions not listed here, |
381 send them to the author at \authoremail. |
381 send them to the author at \authoremail. |
382 |
382 |
383 \point{Why does Metis fail to reconstruct the proof?} |
|
384 |
|
385 There are many reasons. If Metis runs seemingly forever, that is a sign that the |
|
386 proof is too difficult for it. Metis's search is complete, so it should |
|
387 eventually find it, but that's little consolation. There are several possible |
|
388 solutions: |
|
389 |
|
390 \begin{enum} |
|
391 \item[\labelitemi] Try the \textit{isar\_proof} option (\S\ref{output-format}) to |
|
392 obtain a step-by-step Isar proof where each step is justified by \textit{metis}. |
|
393 Since the steps are fairly small, \textit{metis} is more likely to be able to |
|
394 replay them. |
|
395 |
|
396 \item[\labelitemi] Try the \textit{smt} proof method instead of \textit{metis}. It |
|
397 is usually stronger, but you need to either have Z3 available to replay the |
|
398 proofs, trust the SMT solver, or use certificates. See the documentation in the |
|
399 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details. |
|
400 |
|
401 \item[\labelitemi] Try the \textit{blast} or \textit{auto} proof methods, passing |
|
402 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:}, |
|
403 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate. |
|
404 \end{enum} |
|
405 |
|
406 In some rare cases, \textit{metis} fails fairly quickly, and you get the error |
|
407 message |
|
408 |
|
409 \prew |
|
410 \slshape |
|
411 Proof reconstruction failed. |
|
412 \postw |
|
413 |
|
414 This message usually indicates that Sledgehammer found a type-incorrect proof. |
|
415 This was a frequent issue with older versions of Sledgehammer, which did not |
|
416 supply enough typing information to the ATPs by default. If you notice many |
|
417 unsound proofs and are not using \textit{type\_enc} (\S\ref{problem-encoding}), |
|
418 contact the author at \authoremail. |
|
419 |
|
420 \point{How can I tell whether a generated proof is sound?} |
|
421 |
|
422 First, if \textit{metis} can reconstruct it, the proof is sound (assuming |
|
423 Isabelle's inference kernel is sound). If it fails or runs seemingly forever, |
|
424 you can try |
|
425 |
|
426 \prew |
|
427 \textbf{apply}~\textbf{--} \\ |
|
428 \textbf{sledgehammer} [\textit{sound}] (\textit{metis\_facts}) |
|
429 \postw |
|
430 |
|
431 where \textit{metis\_facts} is the list of facts appearing in the suggested |
|
432 \textit{metis} call. The automatic provers should be able to re-find the proof |
|
433 quickly if it is sound, and the \textit{sound} option (\S\ref{problem-encoding}) |
|
434 ensures that no unsound proofs are found. |
|
435 |
|
436 \point{Which facts are passed to the automatic provers?} |
383 \point{Which facts are passed to the automatic provers?} |
437 |
384 |
438 The relevance filter assigns a score to every available fact (lemma, theorem, |
385 The relevance filter assigns a score to every available fact (lemma, theorem, |
439 definition, or axiom)\ based upon how many constants that fact shares with the |
386 definition, or axiom) based upon how many constants that fact shares with the |
440 conjecture. This process iterates to include facts relevant to those just |
387 conjecture. This process iterates to include facts relevant to those just |
441 accepted, but with a decay factor to ensure termination. The constants are |
388 accepted, but with a decay factor to ensure termination. The constants are |
442 weighted to give unusual ones greater significance. The relevance filter copes |
389 weighted to give unusual ones greater significance. The relevance filter copes |
443 best when the conjecture contains some unusual constants; if all the constants |
390 best when the conjecture contains some unusual constants; if all the constants |
444 are common, it is unable to discriminate among the hundreds of facts that are |
391 are common, it is unable to discriminate among the hundreds of facts that are |
481 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\ |
428 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\ |
482 \textbf{apply}~\textbf{--} \\ |
429 \textbf{apply}~\textbf{--} \\ |
483 \textbf{sledgehammer} |
430 \textbf{sledgehammer} |
484 \postw |
431 \postw |
485 |
432 |
|
433 \point{Why does Metis fail to reconstruct the proof?} |
|
434 |
|
435 There are many reasons. If Metis runs seemingly forever, that is a sign that the |
|
436 proof is too difficult for it. Metis's search is complete, so it should |
|
437 eventually find it, but that's little consolation. There are several possible |
|
438 solutions: |
|
439 |
|
440 \begin{enum} |
|
441 \item[\labelitemi] Try the \textit{isar\_proof} option (\S\ref{output-format}) to |
|
442 obtain a step-by-step Isar proof where each step is justified by \textit{metis}. |
|
443 Since the steps are fairly small, \textit{metis} is more likely to be able to |
|
444 replay them. |
|
445 |
|
446 \item[\labelitemi] Try the \textit{smt} proof method instead of \textit{metis}. It |
|
447 is usually stronger, but you need to either have Z3 available to replay the |
|
448 proofs, trust the SMT solver, or use certificates. See the documentation in the |
|
449 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details. |
|
450 |
|
451 \item[\labelitemi] Try the \textit{blast} or \textit{auto} proof methods, passing |
|
452 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:}, |
|
453 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate. |
|
454 \end{enum} |
|
455 |
|
456 In some rare cases, \textit{metis} fails fairly quickly, and you get the error |
|
457 message |
|
458 |
|
459 \prew |
|
460 \slshape |
|
461 One-line proof reconstruction failed. |
|
462 \postw |
|
463 |
|
464 This message indicates that Sledgehammer determined that the goal is provable, |
|
465 but the proof is, for technical reasons, beyond \textit{metis}'s power. You can |
|
466 then try again with the \textit{strict} option (\S\ref{problem-encoding}). |
|
467 |
|
468 If the goal is actually unprovable are you did not specify an unsound encoding |
|
469 using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are |
|
470 strongly encouraged to report this to the author at \authoremail. |
|
471 |
486 \point{Why are the generated Isar proofs so ugly/broken?} |
472 \point{Why are the generated Isar proofs so ugly/broken?} |
487 |
473 |
488 The current implementation is experimental and explodes exponentially in the |
474 The current implementation of the Isar proof feature, |
489 worst case. Work on a new implementation has begun. There is a large body of |
475 enabled by the \textit{isar\_proof} option (\S\ref{output-format}), |
|
476 is highly experimental. Work on a new implementation has begun. There is a large body of |
490 research into transforming resolution proofs into natural deduction proofs (such |
477 research into transforming resolution proofs into natural deduction proofs (such |
491 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to |
478 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to |
492 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger |
479 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger |
493 value or to try several provers and keep the nicest-looking proof. |
480 value or to try several provers and keep the nicest-looking proof. |
|
481 |
|
482 \point{How can I tell whether a suggested proof is sound?} |
|
483 |
|
484 Earlier versions of Sledgehammer often suggested unsound proofs---either proofs |
|
485 of nontheorems or simply proofs that rely on type-unsound inferences. This |
|
486 is a thing of the pass, unless you explicitly specify an unsound encoding |
|
487 using \textit{type\_enc} (\S\ref{problem-encoding}). |
|
488 % |
|
489 Officially, the only form of ``unsoundness'' that lurks in the sound |
|
490 encodings is related to missing characteristic theorems of datatypes. For |
|
491 example, |
|
492 |
|
493 \prew |
|
494 \textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\ |
|
495 \textbf{sledgehammer} () |
|
496 \postw |
|
497 |
|
498 suggests an argumentless \textit{metis} call that fails. However, the conjecture |
|
499 does actually hold, and the \textit{metis} call can be repaired by adding |
|
500 \textit{list.distinct}. |
|
501 % |
|
502 We hope to address this problem in a future version of Isabelle. In the |
|
503 meantime, you can avoid it by passing the \textit{strict} option |
|
504 (\S\ref{problem-encoding}). |
494 |
505 |
495 \point{What are the \textit{full\_types}, \textit{no\_types}, and |
506 \point{What are the \textit{full\_types}, \textit{no\_types}, and |
496 \textit{mono\_tags} arguments to Metis?} |
507 \textit{mono\_tags} arguments to Metis?} |
497 |
508 |
498 The \textit{metis}~(\textit{full\_types}) proof method |
509 The \textit{metis}~(\textit{full\_types}) proof method |
667 |
678 |
668 You can instruct Sledgehammer to run automatically on newly entered theorems by |
679 You can instruct Sledgehammer to run automatically on newly entered theorems by |
669 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu. |
680 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu. |
670 For automatic runs, only the first prover set using \textit{provers} |
681 For automatic runs, only the first prover set using \textit{provers} |
671 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover, |
682 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover, |
672 \textit{slice} (\S\ref{mode-of-operation}) is disabled, \textit{sound} |
683 \textit{slice} (\S\ref{mode-of-operation}) is disabled, \textit{strict} |
673 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format}) |
684 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format}) |
674 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout} |
685 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout} |
675 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof |
686 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof |
676 General's ``Isabelle'' menu. Sledgehammer's output is also more concise. |
687 General's ``Isabelle'' menu. Sledgehammer's output is also more concise. |
677 |
688 |
694 \textit{lam\_lifting}, and \textit{combinators} (the default). |
705 \textit{lam\_lifting}, and \textit{combinators} (the default). |
695 % |
706 % |
696 All the untyped type encodings listed in \S\ref{problem-encoding} are supported. |
707 All the untyped type encodings listed in \S\ref{problem-encoding} are supported. |
697 For convenience, the following aliases are provided: |
708 For convenience, the following aliases are provided: |
698 \begin{enum} |
709 \begin{enum} |
699 \item[\labelitemi] \textbf{\textit{full\_types}:} Synonym for An appropriate type-sound encoding. |
710 \item[\labelitemi] \textbf{\textit{full\_types}:} Synonym for \textit{poly\_guards\_query}. |
700 \item[\labelitemi] \textbf{\textit{partial\_types}:} Synonym for \textit{poly\_args}. |
711 \item[\labelitemi] \textbf{\textit{partial\_types}:} Synonym for \textit{poly\_args}. |
701 \item[\labelitemi] \textbf{\textit{no\_types}:} Synonym for \textit{erased}. |
712 \item[\labelitemi] \textbf{\textit{no\_types}:} Synonym for \textit{erased}. |
702 \end{enum} |
713 \end{enum} |
703 |
714 |
704 \section{Option Reference} |
715 \section{Option Reference} |
992 |
1003 |
993 \opdefault{type\_enc}{string}{smart} |
1004 \opdefault{type\_enc}{string}{smart} |
994 Specifies the type encoding to use in ATP problems. Some of the type encodings |
1005 Specifies the type encoding to use in ATP problems. Some of the type encodings |
995 are unsound, meaning that they can give rise to spurious proofs |
1006 are unsound, meaning that they can give rise to spurious proofs |
996 (unreconstructible using \textit{metis}). The supported type encodings are |
1007 (unreconstructible using \textit{metis}). The supported type encodings are |
997 listed below, with an indication of their soundness in parentheses: |
1008 listed below, with an indication of their soundness in parentheses. |
|
1009 An asterisk (*) indicates that the encodings sometimes lead to reconstruction |
|
1010 failures in \textit{metis}, unless the \emph{strict} option (described below) is |
|
1011 enabled. |
998 |
1012 |
999 \begin{enum} |
1013 \begin{enum} |
1000 \item[\labelitemi] \textbf{\textit{erased} (very unsound):} No type information is |
1014 \item[\labelitemi] \textbf{\textit{erased} (very unsound):} No type information is |
1001 supplied to the ATP. Types are simply erased. |
1015 supplied to the ATP, not even to resolve overloading. Types are simply erased. |
1002 |
1016 |
1003 \item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using |
1017 \item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using |
1004 a predicate \textit{has\_\allowbreak type\/}$(\tau, t)$ that guards bound |
1018 a predicate \const{g}$(\tau, t)$ that guards bound |
1005 variables. Constants are annotated with their types, supplied as additional |
1019 variables. Constants are annotated with their types, supplied as additional |
1006 arguments, to resolve overloading. |
1020 arguments, to resolve overloading. |
1007 |
1021 |
1008 \item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is |
1022 \item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is |
1009 tagged with its type using a function $\const{type\/}(\tau, t)$. |
1023 tagged with its type using a function $\const{t\/}(\tau, t)$. |
1010 |
1024 |
1011 \item[\labelitemi] \textbf{\textit{poly\_args} (unsound):} |
1025 \item[\labelitemi] \textbf{\textit{poly\_args} (unsound):} |
1012 Like for \textit{poly\_guards} constants are annotated with their types to |
1026 Like for \textit{poly\_guards} constants are annotated with their types to |
1013 resolve overloading, but otherwise no type information is encoded. This |
1027 resolve overloading, but otherwise no type information is encoded. This |
1014 coincides with the default encoding used by the \textit{metis} command. |
1028 coincides with the default encoding used by the \textit{metis} command. |
1029 \textit{mono\_args} (unsound):} \\ |
1043 \textit{mono\_args} (unsound):} \\ |
1030 Similar to |
1044 Similar to |
1031 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and |
1045 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and |
1032 \textit{raw\_mono\_args}, respectively but types are mangled in constant names |
1046 \textit{raw\_mono\_args}, respectively but types are mangled in constant names |
1033 instead of being supplied as ground term arguments. The binary predicate |
1047 instead of being supplied as ground term arguments. The binary predicate |
1034 $\const{has\_type\/}(\tau, t)$ becomes a unary predicate |
1048 $\const{g}(\tau, t)$ becomes a unary predicate |
1035 $\const{has\_type\_}\tau(t)$, and the binary function |
1049 $\const{g\_}\tau(t)$, and the binary function |
1036 $\const{type\/}(\tau, t)$ becomes a unary function |
1050 $\const{t}(\tau, t)$ becomes a unary function |
1037 $\const{type\_}\tau(t)$. |
1051 $\const{t\_}\tau(t)$. |
1038 |
1052 |
1039 \item[\labelitemi] \textbf{\textit{mono\_simple} (sound):} Exploits simple |
1053 \item[\labelitemi] \textbf{\textit{mono\_simple} (sound):} Exploits simple |
1040 first-order types if the prover supports the TFF0 or THF0 syntax; otherwise, |
1054 first-order types if the prover supports the TFF0 or THF0 syntax; otherwise, |
1041 falls back on \textit{mono\_guards}. The problem is monomorphized. |
1055 falls back on \textit{mono\_guards}. The problem is monomorphized. |
1042 |
1056 |
1046 |
1060 |
1047 \item[\labelitemi] |
1061 \item[\labelitemi] |
1048 \textbf{% |
1062 \textbf{% |
1049 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\ |
1063 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\ |
1050 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\ |
1064 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\ |
1051 \textit{mono\_simple}? (quasi-sound):} \\ |
1065 \textit{mono\_simple}? (sound*):} \\ |
1052 The type encodings \textit{poly\_guards}, \textit{poly\_tags}, |
1066 The type encodings \textit{poly\_guards}, \textit{poly\_tags}, |
1053 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards}, |
1067 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards}, |
1054 \textit{mono\_tags}, and \textit{mono\_simple} are fully |
1068 \textit{mono\_tags}, and \textit{mono\_simple} are fully |
1055 typed and sound. For each of these, Sledgehammer also provides a lighter, |
1069 typed and sound. For each of these, Sledgehammer also provides a lighter, |
1056 virtually sound variant identified by a question mark (`\hbox{?}')\ that detects |
1070 virtually sound variant identified by a question mark (`\hbox{?}')\ that detects |
1057 and erases monotonic types, notably infinite types. (For \textit{mono\_simple}, |
1071 and erases monotonic types, notably infinite types. (For \textit{mono\_simple}, |
1058 the types are not actually erased but rather replaced by a shared uniform type |
1072 the types are not actually erased but rather replaced by a shared uniform type |
1059 of individuals.) As argument to the \textit{metis} proof method, the question |
1073 of individuals.) As argument to the \textit{metis} proof method, the question |
1060 mark is replaced by a \hbox{``\textit{\_query\/}''} suffix. If the \emph{sound} |
1074 mark is replaced by a \hbox{``\textit{\_query\/}''} suffix. |
1061 option is enabled, these encodings are fully sound. |
|
1062 |
1075 |
1063 \item[\labelitemi] |
1076 \item[\labelitemi] |
1064 \textbf{% |
1077 \textbf{% |
1065 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\ |
1078 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\ |
1066 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\ |
1079 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\ |
1067 (quasi-sound):} \\ |
1080 (sound*):} \\ |
1068 Even lighter versions of the `\hbox{?}' encodings. As argument to the |
1081 Even lighter versions of the `\hbox{?}' encodings. As argument to the |
1069 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by |
1082 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by |
1070 \hbox{``\textit{\_query\_query\/}''}. |
1083 \hbox{``\textit{\_query\_query\/}''}. |
1071 |
1084 |
1072 \item[\labelitemi] |
1085 \item[\labelitemi] |
1073 \textbf{% |
1086 \textbf{% |
1074 \textit{poly\_guards}@?, \textit{raw\_mono\_guards}@? (quasi-sound):} \\ |
1087 \textit{poly\_guards}@?, \textit{raw\_mono\_guards}@? (sound*):} \\ |
1075 Alternative versions of the `\hbox{??}' encodings. As argument to the |
1088 Alternative versions of the `\hbox{??}' encodings. As argument to the |
1076 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by |
1089 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by |
1077 \hbox{``\textit{\_at\_query\/}''}. |
1090 \hbox{``\textit{\_at\_query\/}''}. |
1078 |
1091 |
1079 \item[\labelitemi] |
1092 \item[\labelitemi] |
1117 |
1130 |
1118 \nopagebreak |
1131 \nopagebreak |
1119 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter}) |
1132 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter}) |
1120 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).} |
1133 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).} |
1121 |
1134 |
1122 \opfalse{sound}{unsound} |
1135 \opfalse{strict}{non\_stric} |
1123 Specifies whether Sledgehammer should run in its fully sound mode. In that mode, |
1136 Specifies whether Sledgehammer should run in its strict mode. In that mode, |
1124 quasi-sound type encodings (which are the default) are made fully sound, at the |
1137 sound type encodings marked with an asterisk (*) above are made more reliable |
1125 cost of some clutter in the generated problems. |
1138 for reconstruction with \textit{metis}, at the cost of some clutter in the |
1126 This option is ignored if \textit{type\_enc} is deliberately set to an encoding |
1139 generated problems. This option has no effect if \textit{type\_enc} is |
1127 that is not sound or quasi-sound. |
1140 deliberately set to an unsound encoding. |
1128 \end{enum} |
1141 \end{enum} |
1129 |
1142 |
1130 \subsection{Relevance Filter} |
1143 \subsection{Relevance Filter} |
1131 \label{relevance-filter} |
1144 \label{relevance-filter} |
1132 |
1145 |