author | wenzelm |
Mon, 11 Sep 2023 19:30:48 +0200 | |
changeset 78659 | b5f3d1051b13 |
parent 76945 | fcd1df8f48fc |
child 78726 | 810eca753919 |
permissions | -rw-r--r-- |
76183 | 1 |
(* Title: HOL/Tools/Mirabelle/mirabelle_sledgehammer.ML |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
2 |
Author: Jasmin Blanchette, TU Munich |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
3 |
Author: Sascha Boehme, TU Munich |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
4 |
Author: Tobias Nipkow, TU Munich |
73697 | 5 |
Author: Makarius |
74897
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
6 |
Author: Martin Desharnais, UniBw Munich, MPI-INF Saarbruecken |
73691
2f9877db82a1
reimplemented Mirabelle as Isabelle/ML presentation hook + Isabelle/Scala tool, but sledgehammer is still inactive;
wenzelm
parents:
73292
diff
changeset
|
7 |
|
2f9877db82a1
reimplemented Mirabelle as Isabelle/ML presentation hook + Isabelle/Scala tool, but sledgehammer is still inactive;
wenzelm
parents:
73292
diff
changeset
|
8 |
Mirabelle action: "sledgehammer". |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
9 |
*) |
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
10 |
|
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
11 |
structure Mirabelle_Sledgehammer: MIRABELLE_ACTION = |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
12 |
struct |
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
13 |
|
47480
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
14 |
(*To facilitate synching the description of Mirabelle Sledgehammer parameters |
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
15 |
(in ../lib/Tools/mirabelle) with the parameters actually used by this |
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
16 |
interface, the former extracts PARAMETER and DESCRIPTION from code below which |
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
17 |
has this pattern (provided it appears in a single line): |
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
18 |
val .*K = "PARAMETER" (*DESCRIPTION*) |
24d8c9e9dae4
Mirabelle's 'usage' description relating to Sledgehammer now synchs with the ML code of the Mirabelle-Sledgehammer action
sultana
parents:
47477
diff
changeset
|
19 |
*) |
72342
4195e75a92ef
[mirabelle] add initial documentation in Sledgehammer's doc
desharna
parents:
72173
diff
changeset
|
20 |
(* NOTE: Do not forget to update the Sledgehammer documentation to reflect changes here. *) |
72173 | 21 |
|
22 |
val check_trivialK = "check_trivial" (*=BOOL: check if goals are "trivial"*) |
|
75372 | 23 |
val exhaustive_preplayK = "exhaustive_preplay" (*=BOOL: show exhaustive preplay data*) |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
24 |
val keep_probsK = "keep_probs" (*=BOOL: keep temporary problem files created by sledgehammer*) |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
25 |
val keep_proofsK = "keep_proofs" (*=BOOL: keep temporary proof files created by ATPs*) |
73849 | 26 |
val proof_methodK = "proof_method" (*=STRING: how to reconstruct proofs (e.g. using metis)*) |
44448
04bd6a9307c6
don't perform a triviality check if the goal is skipped anyway
blanchet
parents:
44447
diff
changeset
|
27 |
|
47481
b5873d4ff1e2
gathered mirabelle_sledgehammer's hardcoded defaults
sultana
parents:
47480
diff
changeset
|
28 |
(*defaults used in this Mirabelle action*) |
73292
f84a93f1de2f
tuned Mirabelle to parse option check_trivial only once
desharna
parents:
73289
diff
changeset
|
29 |
val check_trivial_default = false |
75372 | 30 |
val exhaustive_preplay_default = false |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
31 |
val keep_probs_default = false |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
32 |
val keep_proofs_default = false |
46826
4c80c4034f1d
added max_new_mono_instances, max_mono_iters, to Mirabelle-Sledgehammer; changed sh_minimize to avoid setting Mirabelle-level defaults;
sultana
parents:
46825
diff
changeset
|
33 |
|
32549 | 34 |
datatype sh_data = ShData of { |
35 |
calls: int, |
|
36 |
success: int, |
|
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
37 |
nontriv_calls: int, |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
38 |
nontriv_success: int, |
32585 | 39 |
lemmas: int, |
32818 | 40 |
max_lems: int, |
32549 | 41 |
time_isa: int, |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
42 |
time_prover: int} |
32549 | 43 |
|
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
44 |
datatype re_data = ReData of { |
32549 | 45 |
calls: int, |
46 |
success: int, |
|
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
47 |
nontriv_calls: int, |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
48 |
nontriv_success: int, |
32676 | 49 |
proofs: int, |
32549 | 50 |
time: int, |
32550 | 51 |
timeout: int, |
32990 | 52 |
lemmas: int * int * int, |
39341
d2b981a0429a
indicate triviality in the list of proved things
blanchet
parents:
39340
diff
changeset
|
53 |
posns: (Position.T * bool) list |
32550 | 54 |
} |
32549 | 55 |
|
32818 | 56 |
fun make_sh_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
57 |
(calls,success,nontriv_calls,nontriv_success,lemmas,max_lems,time_isa, |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
58 |
time_prover) = |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
59 |
ShData{calls=calls, success=success, nontriv_calls=nontriv_calls, |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
60 |
nontriv_success=nontriv_success, lemmas=lemmas, max_lems=max_lems, |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
61 |
time_isa=time_isa, time_prover=time_prover} |
32521 | 62 |
|
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
63 |
fun make_re_data (calls,success,nontriv_calls,nontriv_success,proofs,time, |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
64 |
timeout,lemmas,posns) = |
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
65 |
ReData{calls=calls, success=success, nontriv_calls=nontriv_calls, |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
66 |
nontriv_success=nontriv_success, proofs=proofs, time=time, |
32990 | 67 |
timeout=timeout, lemmas=lemmas, posns=posns} |
32549 | 68 |
|
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
69 |
val empty_sh_data = make_sh_data (0, 0, 0, 0, 0, 0, 0, 0) |
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
70 |
val empty_re_data = make_re_data (0, 0, 0, 0, 0, 0, 0, (0,0,0), []) |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
71 |
|
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
72 |
fun tuple_of_sh_data (ShData {calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
73 |
time_isa, time_prover}) = |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
74 |
(calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover) |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
75 |
|
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
76 |
fun tuple_of_re_data (ReData {calls, success, nontriv_calls, nontriv_success, |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
77 |
proofs, time, timeout, lemmas, posns}) = (calls, success, nontriv_calls, |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
78 |
nontriv_success, proofs, time, timeout, lemmas, posns) |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
79 |
|
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
80 |
datatype data = Data of { |
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
81 |
sh: sh_data, |
57753 | 82 |
re_u: re_data (* proof method with unminimized set of lemmas *) |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
83 |
} |
32521 | 84 |
|
73697 | 85 |
type change_data = (data -> data) -> unit |
86 |
||
57753 | 87 |
fun make_data (sh, re_u) = Data {sh=sh, re_u=re_u} |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
88 |
|
57753 | 89 |
val empty_data = make_data (empty_sh_data, empty_re_data) |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
90 |
|
57753 | 91 |
fun map_sh_data f (Data {sh, re_u}) = |
57752 | 92 |
let val sh' = make_sh_data (f (tuple_of_sh_data sh)) |
57753 | 93 |
in make_data (sh', re_u) end |
32521 | 94 |
|
57753 | 95 |
fun map_re_data f (Data {sh, re_u}) = |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
96 |
let |
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
97 |
val f' = make_re_data o f o tuple_of_re_data |
57753 | 98 |
val re_u' = f' re_u |
99 |
in make_data (sh, re_u') end |
|
32990 | 100 |
|
101 |
fun inc_max (n:int) (s,sos,m) = (s+n, sos + n*n, Int.max(m,n)); |
|
32521 | 102 |
|
32818 | 103 |
val inc_sh_calls = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
104 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas,max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
105 |
(calls + 1, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover)) |
32549 | 106 |
|
32818 | 107 |
val inc_sh_success = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
108 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas,max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
109 |
(calls, success + 1, nontriv_calls, nontriv_success, lemmas,max_lems, time_isa, time_prover)) |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
110 |
|
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
111 |
val inc_sh_nontriv_calls = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
112 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas,max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
113 |
(calls, success, nontriv_calls + 1, nontriv_success, lemmas, max_lems, time_isa, time_prover)) |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
114 |
|
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
115 |
val inc_sh_nontriv_success = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
116 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas,max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
117 |
(calls, success, nontriv_calls, nontriv_success + 1, lemmas,max_lems, time_isa, time_prover)) |
32585 | 118 |
|
32818 | 119 |
fun inc_sh_lemmas n = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
120 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
121 |
(calls, success, nontriv_calls, nontriv_success, lemmas+n, max_lems, time_isa, time_prover)) |
32521 | 122 |
|
32818 | 123 |
fun inc_sh_max_lems n = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
124 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
125 |
(calls, success,nontriv_calls, nontriv_success, lemmas, Int.max (max_lems, n), time_isa, |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
126 |
time_prover)) |
32549 | 127 |
|
32818 | 128 |
fun inc_sh_time_isa t = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
129 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
130 |
(calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa + t, time_prover)) |
32818 | 131 |
|
40062 | 132 |
fun inc_sh_time_prover t = map_sh_data |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
133 |
(fn (calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover) => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
134 |
(calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, time_prover + t)) |
32571
d4bb776874b8
count number of iterations required for minimization (and fixed bug: minimization was always called)
nipkow
parents:
32567
diff
changeset
|
135 |
|
55285 | 136 |
val inc_proof_method_calls = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
137 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
138 |
=> (calls + 1, success, nontriv_calls, nontriv_success, proofs, time, timeout, lemmas,posns)) |
32533 | 139 |
|
55285 | 140 |
val inc_proof_method_success = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
141 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
142 |
=> (calls, success + 1, nontriv_calls, nontriv_success, proofs, time, timeout, lemmas,posns)) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
143 |
|
55285 | 144 |
val inc_proof_method_nontriv_calls = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
145 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
146 |
=> (calls, success, nontriv_calls + 1, nontriv_success, proofs, time, timeout, lemmas,posns)) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
147 |
|
55285 | 148 |
val inc_proof_method_nontriv_success = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
149 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
150 |
=> (calls, success, nontriv_calls, nontriv_success + 1, proofs, time, timeout, lemmas,posns)) |
32676 | 151 |
|
55285 | 152 |
val inc_proof_method_proofs = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
153 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
154 |
=> (calls, success, nontriv_calls, nontriv_success, proofs + 1, time, timeout, lemmas,posns)) |
32551
421323205efd
position information is now passed to all actions;
nipkow
parents:
32550
diff
changeset
|
155 |
|
57753 | 156 |
fun inc_proof_method_time t = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
157 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
158 |
=> (calls, success, nontriv_calls, nontriv_success, proofs, time + t, timeout, lemmas,posns)) |
32536 | 159 |
|
55285 | 160 |
val inc_proof_method_timeout = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
161 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
162 |
=> (calls, success, nontriv_calls, nontriv_success, proofs, time, timeout + 1, lemmas,posns)) |
32549 | 163 |
|
57753 | 164 |
fun inc_proof_method_lemmas n = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
165 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
57753 | 166 |
=> (calls, success, nontriv_calls, nontriv_success, proofs, time, timeout, inc_max n lemmas, posns)) |
32551
421323205efd
position information is now passed to all actions;
nipkow
parents:
32550
diff
changeset
|
167 |
|
57753 | 168 |
fun inc_proof_method_posns pos = map_re_data |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
169 |
(fn (calls,success,nontriv_calls, nontriv_success, proofs,time,timeout,lemmas,posns) |
57753 | 170 |
=> (calls, success, nontriv_calls, nontriv_success, proofs, time, timeout, lemmas, pos::posns)) |
32521 | 171 |
|
44090 | 172 |
val str0 = string_of_int o the_default 0 |
173 |
||
32521 | 174 |
local |
175 |
||
176 |
val str = string_of_int |
|
177 |
val str3 = Real.fmt (StringCvt.FIX (SOME 3)) |
|
178 |
fun percentage a b = string_of_int (a * 100 div b) |
|
73697 | 179 |
fun ms t = Real.fromInt t / 1000.0 |
32521 | 180 |
fun avg_time t n = |
181 |
if n > 0 then (Real.fromInt t / 1000.0) / Real.fromInt n else 0.0 |
|
182 |
||
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
183 |
fun log_sh_data (ShData {calls, success, nontriv_calls, nontriv_success, lemmas, max_lems, time_isa, |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
184 |
time_prover}) = |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
185 |
"\nTotal number of sledgehammer calls: " ^ str calls ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
186 |
"\nNumber of successful sledgehammer calls: " ^ str success ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
187 |
"\nNumber of sledgehammer lemmas: " ^ str lemmas ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
188 |
"\nMax number of sledgehammer lemmas: " ^ str max_lems ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
189 |
"\nSuccess rate: " ^ percentage success calls ^ "%" ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
190 |
"\nTotal number of nontrivial sledgehammer calls: " ^ str nontriv_calls ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
191 |
"\nNumber of successful nontrivial sledgehammer calls: " ^ str nontriv_success ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
192 |
"\nTotal time for sledgehammer calls (Isabelle): " ^ str3 (ms time_isa) ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
193 |
"\nTotal time for successful sledgehammer calls (ATP): " ^ str3 (ms time_prover) ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
194 |
"\nAverage time for sledgehammer calls (Isabelle): " ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
195 |
str3 (avg_time time_isa calls) ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
196 |
"\nAverage time for successful sledgehammer calls (ATP): " ^ |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
197 |
str3 (avg_time time_prover success) |
32521 | 198 |
|
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
199 |
fun log_re_data sh_calls (ReData {calls, success, nontriv_calls, nontriv_success, proofs, time, |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
200 |
timeout, lemmas = (lemmas, lems_sos, lems_max), posns}) = |
73697 | 201 |
let |
202 |
val proved = |
|
203 |
posns |> map (fn (pos, triv) => |
|
204 |
str0 (Position.line_of pos) ^ ":" ^ str0 (Position.offset_of pos) ^ |
|
205 |
(if triv then "[T]" else "")) |
|
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
206 |
in |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
207 |
"\nTotal number of proof method calls: " ^ str calls ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
208 |
"\nNumber of successful proof method calls: " ^ str success ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
209 |
" (proof: " ^ str proofs ^ ")" ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
210 |
"\nNumber of proof method timeouts: " ^ str timeout ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
211 |
"\nSuccess rate: " ^ percentage success sh_calls ^ "%" ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
212 |
"\nTotal number of nontrivial proof method calls: " ^ str nontriv_calls ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
213 |
"\nNumber of successful nontrivial proof method calls: " ^ str nontriv_success ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
214 |
" (proof: " ^ str proofs ^ ")" ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
215 |
"\nNumber of successful proof method lemmas: " ^ str lemmas ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
216 |
"\nSOS of successful proof method lemmas: " ^ str lems_sos ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
217 |
"\nMax number of successful proof method lemmas: " ^ str lems_max ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
218 |
"\nTotal time for successful proof method calls: " ^ str3 (ms time) ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
219 |
"\nAverage time for successful proof method calls: " ^ str3 (avg_time time success) ^ |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
220 |
"\nProved: " ^ space_implode " " proved |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
221 |
end |
32571
d4bb776874b8
count number of iterations required for minimization (and fixed bug: minimization was always called)
nipkow
parents:
32567
diff
changeset
|
222 |
|
32521 | 223 |
in |
224 |
||
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
225 |
fun log_data (Data {sh, re_u}) = |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
226 |
let |
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
227 |
val ShData {calls=sh_calls, ...} = sh |
73697 | 228 |
val ReData {calls=re_calls, ...} = re_u |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
229 |
in |
73697 | 230 |
if sh_calls > 0 then |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
231 |
let val text1 = log_sh_data sh in |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
232 |
if re_calls > 0 then text1 ^ "\n" ^ log_re_data sh_calls re_u else text1 |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
233 |
end |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
234 |
else |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
235 |
"" |
34035
08d34921b7dd
also consider the fully-typed version of metis for Mirabelle measurements
boehmes
parents:
33316
diff
changeset
|
236 |
end |
32521 | 237 |
|
238 |
end |
|
239 |
||
46340 | 240 |
type stature = ATP_Problem_Generate.stature |
38752
6628adcae4a7
consider "locality" when assigning weights to facts
blanchet
parents:
38700
diff
changeset
|
241 |
|
58473 | 242 |
fun is_good_line s = |
53081
2a62d848a56a
improved ad hoc success detection in Mirabelle -- if the metis call fails and the structured proof succeeds, remember only the success
blanchet
parents:
52555
diff
changeset
|
243 |
(String.isSubstring " ms)" s orelse String.isSubstring " s)" s) |
2a62d848a56a
improved ad hoc success detection in Mirabelle -- if the metis call fails and the structured proof succeeds, remember only the success
blanchet
parents:
52555
diff
changeset
|
244 |
andalso not (String.isSubstring "(> " s) |
2a62d848a56a
improved ad hoc success detection in Mirabelle -- if the metis call fails and the structured proof succeeds, remember only the success
blanchet
parents:
52555
diff
changeset
|
245 |
andalso not (String.isSubstring ", > " s) |
2a62d848a56a
improved ad hoc success detection in Mirabelle -- if the metis call fails and the structured proof succeeds, remember only the success
blanchet
parents:
52555
diff
changeset
|
246 |
andalso not (String.isSubstring "may fail" s) |
2a62d848a56a
improved ad hoc success detection in Mirabelle -- if the metis call fails and the structured proof succeeds, remember only the success
blanchet
parents:
52555
diff
changeset
|
247 |
|
51206 | 248 |
(* Fragile hack *) |
55285 | 249 |
fun proof_method_from_msg args msg = |
250 |
(case AList.lookup (op =) args proof_methodK of |
|
58473 | 251 |
SOME name => |
252 |
if name = "smart" then |
|
253 |
if exists is_good_line (split_lines msg) then |
|
254 |
"none" |
|
255 |
else |
|
256 |
"fail" |
|
257 |
else |
|
258 |
name |
|
41357
ae76960d86a2
added "sledgehammer_tac" as possible reconstructor in Mirabelle
blanchet
parents:
41338
diff
changeset
|
259 |
| NONE => |
58473 | 260 |
if exists is_good_line (split_lines msg) then |
51203 | 261 |
"none" (* trust the preplayed proof *) |
262 |
else if String.isSubstring "metis (" msg then |
|
45519
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
263 |
msg |> Substring.full |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
264 |
|> Substring.position "metis (" |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
265 |
|> snd |> Substring.position ")" |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
266 |
|> fst |> Substring.string |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
267 |
|> suffix ")" |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
268 |
else if String.isSubstring "metis" msg then |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
269 |
"metis" |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
270 |
else |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
271 |
"smt") |
40667
b8579f24ce67
make Mirabelle take into consideration whether the SMT solver proof should be reconstructed with Metis or SMT
blanchet
parents:
40627
diff
changeset
|
272 |
|
32521 | 273 |
local |
274 |
||
75343 | 275 |
fun run_sh params keep pos state = |
32521 | 276 |
let |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
277 |
fun set_file_name (SOME (dir, keep_probs, keep_proofs)) = |
74892 | 278 |
let |
279 |
val filename = "prob_" ^ |
|
74472 | 280 |
StringCvt.padLeft #"0" 5 (str0 (Position.line_of pos)) ^ "_" ^ |
74892 | 281 |
StringCvt.padLeft #"0" 6 (str0 (Position.offset_of pos)) |
282 |
in |
|
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
283 |
Config.put Sledgehammer_Prover_ATP.atp_problem_prefix (filename ^ "__") |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
284 |
#> (keep_probs ? Config.put Sledgehammer_Prover_ATP.atp_problem_dest_dir dir) |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
285 |
#> (keep_proofs ? Config.put Sledgehammer_Prover_ATP.atp_proof_dest_dir dir) |
74892 | 286 |
#> Config.put SMT_Config.debug_files (dir ^ "/" ^ filename ^ "__" ^ serial_string ()) |
287 |
end |
|
44090 | 288 |
| set_file_name NONE = I |
75343 | 289 |
val state' = state |
290 |
|> Proof.map_context (set_file_name keep) |
|
74952
ae2185967e67
exported Sledgehammer.launch_prover and use it in Mirabelle
desharna
parents:
74951
diff
changeset
|
291 |
|
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
292 |
val ((_, (sledgehammer_outcome, msg)), cpu_time) = Mirabelle.cpu_time (fn () => |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
293 |
Sledgehammer.run_sledgehammer params Sledgehammer_Prover.Normal NONE 1 |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
294 |
Sledgehammer_Fact.no_fact_override state') () |
32521 | 295 |
in |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
296 |
(sledgehammer_outcome, msg, cpu_time) |
32521 | 297 |
end |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
298 |
handle |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
299 |
ERROR msg => (Sledgehammer.SH_Unknown, " error: " ^ msg, 0) |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
300 |
| _ => (Sledgehammer.SH_Unknown, " error: unexpected error", 0) |
32521 | 301 |
|
32498
1132c7c13f36
Mirabelle: actions are responsible for handling exceptions,
boehmes
parents:
32496
diff
changeset
|
302 |
in |
1132c7c13f36
Mirabelle: actions are responsible for handling exceptions,
boehmes
parents:
32496
diff
changeset
|
303 |
|
75343 | 304 |
fun run_sledgehammer (params as {provers, ...}) output_dir keep_probs keep_proofs |
75372 | 305 |
exhaustive_preplay proof_method_from_msg thy_index trivial pos st = |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
306 |
let |
55205 | 307 |
val thy = Proof.theory_of st |
76945
fcd1df8f48fc
added session to mirabelle output directory structure
desharna
parents:
76525
diff
changeset
|
308 |
val thy_long_name = Context.theory_long_name thy |
fcd1df8f48fc
added session to mirabelle output directory structure
desharna
parents:
76525
diff
changeset
|
309 |
val session_name = Long_Name.qualifier thy_long_name |
fcd1df8f48fc
added session to mirabelle output directory structure
desharna
parents:
76525
diff
changeset
|
310 |
val thy_name = Long_Name.base_name thy_long_name |
39340 | 311 |
val triv_str = if trivial then "[T] " else "" |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
312 |
val keep = |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
313 |
if keep_probs orelse keep_proofs then |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
314 |
let val subdir = StringCvt.padLeft #"0" 4 (string_of_int thy_index) ^ "_" ^ thy_name in |
76945
fcd1df8f48fc
added session to mirabelle output directory structure
desharna
parents:
76525
diff
changeset
|
315 |
Path.append (Path.append output_dir (Path.basic session_name)) (Path.basic subdir) |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
316 |
|> Isabelle_System.make_directory |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
317 |
|> Path.implode |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
318 |
|> (fn dir => SOME (dir, keep_probs, keep_proofs)) |
74078
a2cbe81e1e32
changed Mirabelle_Sledgehammer keep option from path to boolean
desharna
parents:
73849
diff
changeset
|
319 |
end |
a2cbe81e1e32
changed Mirabelle_Sledgehammer keep option from path to boolean
desharna
parents:
73849
diff
changeset
|
320 |
else |
a2cbe81e1e32
changed Mirabelle_Sledgehammer keep option from path to boolean
desharna
parents:
73849
diff
changeset
|
321 |
NONE |
74897
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
322 |
val prover_name = hd provers |
75343 | 323 |
val (sledgehamer_outcome, msg, cpu_time) = run_sh params keep pos st |
75372 | 324 |
val (time_prover, change_data, proof_method_and_used_thms, exhaustive_preplay_msg) = |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
325 |
(case sledgehamer_outcome of |
75372 | 326 |
Sledgehammer.SH_Some ({used_facts, run_time, ...}, preplay_results) => |
38700 | 327 |
let |
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
328 |
val num_used_facts = length used_facts |
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
329 |
val time_prover = Time.toMilliseconds run_time |
46340 | 330 |
fun get_thms (name, stature) = |
50267
1da2e67242d6
moved thms_of_name to Sledgehammer_Util and removed copies, updated references
smolkas
parents:
49916
diff
changeset
|
331 |
try (Sledgehammer_Util.thms_of_name (Proof.context_of st)) |
49916
412346127bfa
fixed theorem lookup code in Isar proof reconstruction
blanchet
parents:
49914
diff
changeset
|
332 |
name |
47154
2c357e2b8436
more robustness in case a theorem cannot be retrieved (which typically happens with backtick facts)
blanchet
parents:
47060
diff
changeset
|
333 |
|> Option.map (pair (name, stature)) |
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
334 |
val change_data = |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
335 |
inc_sh_success |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
336 |
#> not trivial ? inc_sh_nontriv_success |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
337 |
#> inc_sh_lemmas num_used_facts |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
338 |
#> inc_sh_max_lems num_used_facts |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
339 |
#> inc_sh_time_prover time_prover |
75372 | 340 |
|
341 |
val exhaustive_preplay_msg = |
|
342 |
if exhaustive_preplay then |
|
343 |
preplay_results |
|
344 |
|> map |
|
76525 | 345 |
(fn (meth, (play_outcome, used_facts)) => |
75372 | 346 |
"Preplay: " ^ |
347 |
Sledgehammer_Proof_Methods.string_of_proof_method (map fst used_facts) meth ^ |
|
348 |
" (" ^ Sledgehammer_Proof_Methods.string_of_play_outcome play_outcome ^ ")") |
|
349 |
|> cat_lines |
|
350 |
else |
|
351 |
"" |
|
32525 | 352 |
in |
75001 | 353 |
(SOME time_prover, change_data, |
75372 | 354 |
SOME (proof_method_from_msg msg, map_filter get_thms used_facts), |
355 |
exhaustive_preplay_msg) |
|
32525 | 356 |
end |
75372 | 357 |
| _ => (NONE, I, NONE, "")) |
75001 | 358 |
val outcome_msg = |
359 |
"(SH " ^ string_of_int cpu_time ^ "ms" ^ |
|
360 |
(case time_prover of NONE => "" | SOME ms => ", ATP " ^ string_of_int ms ^ "ms") ^ |
|
361 |
") [" ^ prover_name ^ "]: " |
|
74953
aade20a03edb
tuned run_sledgehammer and called it directly from Mirabelle
desharna
parents:
74952
diff
changeset
|
362 |
in |
75372 | 363 |
(sledgehamer_outcome, triv_str ^ outcome_msg ^ msg ^ |
364 |
(if exhaustive_preplay_msg = "" then "" else ("\n" ^ exhaustive_preplay_msg)), |
|
365 |
change_data #> inc_sh_time_isa cpu_time, |
|
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
366 |
proof_method_and_used_thms) |
32525 | 367 |
end |
368 |
||
369 |
end |
|
370 |
||
44542 | 371 |
fun override_params prover type_enc timeout = |
372 |
[("provers", prover), |
|
48293 | 373 |
("max_facts", "0"), |
44542 | 374 |
("type_enc", type_enc), |
46386 | 375 |
("strict", "true"), |
45706 | 376 |
("slice", "false"), |
44461 | 377 |
("timeout", timeout |> Time.toSeconds |> string_of_int)] |
44430 | 378 |
|
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
379 |
fun run_proof_method trivial full name meth named_thms timeout pos st = |
32525 | 380 |
let |
55285 | 381 |
fun do_method named_thms ctxt = |
44462
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
382 |
let |
59083 | 383 |
val ref_of_str = (* FIXME proper wrapper for parser combinators *) |
384 |
suffix ";" #> Token.explode (Thy_Header.get_keywords' ctxt) Position.none |
|
62969 | 385 |
#> Parse.thm #> fst |
44462
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
386 |
val thms = named_thms |> maps snd |
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
387 |
val facts = named_thms |> map (ref_of_str o fst o fst) |
48292 | 388 |
val fact_override = {add = facts, del = [], only = true} |
44566
bf8331161ad9
split timeout among ATPs in and add Metis to the mix as backup
blanchet
parents:
44542
diff
changeset
|
389 |
fun my_timeout time_slice = |
62826 | 390 |
timeout |> Time.toReal |> curry (op *) time_slice |> Time.fromReal |
44566
bf8331161ad9
split timeout among ATPs in and add Metis to the mix as backup
blanchet
parents:
44542
diff
changeset
|
391 |
fun sledge_tac time_slice prover type_enc = |
44542 | 392 |
Sledgehammer_Tactics.sledgehammer_as_oracle_tac ctxt |
57814 | 393 |
(override_params prover type_enc (my_timeout time_slice)) fact_override [] |
44462
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
394 |
in |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
395 |
if meth = "sledgehammer_tac" then |
75038 | 396 |
sledge_tac 0.25 ATP_Proof.vampireN "mono_native" |
397 |
ORELSE' sledge_tac 0.25 ATP_Proof.eN "poly_guards??" |
|
398 |
ORELSE' sledge_tac 0.25 ATP_Proof.spassN "mono_native" |
|
58061 | 399 |
ORELSE' SMT_Solver.smt_tac ctxt thms |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
400 |
else if meth = "smt" then |
58061 | 401 |
SMT_Solver.smt_tac ctxt thms |
45519
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
402 |
else if full then |
46320 | 403 |
Metis_Tactic.metis_tac [ATP_Proof_Reconstruct.full_typesN] |
54500 | 404 |
ATP_Proof_Reconstruct.default_metis_lam_trans ctxt thms |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
405 |
else if String.isPrefix "metis (" meth then |
45519
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
406 |
let |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
407 |
val (type_encs, lam_trans) = |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
408 |
meth |
59083 | 409 |
|> Token.explode (Thy_Header.get_keywords' ctxt) Position.start |
45519
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
410 |
|> filter Token.is_proper |> tl |
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
411 |
|> Metis_Tactic.parse_metis_options |> fst |
46320 | 412 |
|>> the_default [ATP_Proof_Reconstruct.partial_typesN] |
54500 | 413 |
||> the_default ATP_Proof_Reconstruct.default_metis_lam_trans |
45519
cd6e78cb6ee8
make metis reconstruction handling more flexible
blanchet
parents:
45514
diff
changeset
|
414 |
in Metis_Tactic.metis_tac type_encs lam_trans ctxt thms end |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
415 |
else if meth = "metis" then |
55211 | 416 |
Metis_Tactic.metis_tac [] ATP_Proof_Reconstruct.default_metis_lam_trans ctxt thms |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
417 |
else if meth = "none" then |
58473 | 418 |
K all_tac |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
419 |
else if meth = "fail" then |
58473 | 420 |
K no_tac |
44462
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
421 |
else |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
422 |
(warning ("Unknown method " ^ quote meth); K no_tac) |
44462
d9a657c44380
more reliable "sledgehammer\_tac" reconstruction, by avoiding "insert_tac"
blanchet
parents:
44461
diff
changeset
|
423 |
end |
55285 | 424 |
fun apply_method named_thms = |
425 |
Mirabelle.can_apply timeout (do_method named_thms) st |
|
32521 | 426 |
|
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
427 |
fun with_time (false, t) = ("failed (" ^ string_of_int t ^ ")", I) |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
428 |
| with_time (true, t) = |
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
429 |
("succeeded (" ^ string_of_int t ^ ")", |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
430 |
inc_proof_method_success |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
431 |
#> not trivial ? inc_proof_method_nontriv_success |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
432 |
#> inc_proof_method_lemmas (length named_thms) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
433 |
#> inc_proof_method_time t |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
434 |
#> inc_proof_method_posns (pos, trivial) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
435 |
#> name = "proof" ? inc_proof_method_proofs) |
55285 | 436 |
fun timed_method named_thms = |
73697 | 437 |
with_time (Mirabelle.cpu_time apply_method named_thms) |
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
438 |
handle Timeout.TIMEOUT _ => ("timeout", inc_proof_method_timeout) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
439 |
| ERROR msg => ("error: " ^ msg, I) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
440 |
in |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
441 |
timed_method named_thms |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
442 |
|> apsnd (fn change_data => change_data |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
443 |
#> inc_proof_method_calls |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
444 |
#> not trivial ? inc_proof_method_nontriv_calls) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
445 |
end |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
446 |
|
75004 | 447 |
val try0 = Try0.try0 (SOME (Time.fromSeconds 5)) ([], [], [], []) |
39337
ffa577c0bbfa
keep track of trivial vs. nontrivial calls using "try" for 30 seconds
blanchet
parents:
39321
diff
changeset
|
448 |
|
74078
a2cbe81e1e32
changed Mirabelle_Sledgehammer keep option from path to boolean
desharna
parents:
73849
diff
changeset
|
449 |
fun make_action ({arguments, timeout, output_dir, ...} : Mirabelle.action_context) = |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
450 |
let |
74897
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
451 |
(* Parse Mirabelle-specific parameters *) |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
452 |
val check_trivial = |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
453 |
Mirabelle.get_bool_argument arguments (check_trivialK, check_trivial_default) |
74981
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
454 |
val keep_probs = Mirabelle.get_bool_argument arguments (keep_probsK, keep_probs_default) |
10df7a627ab6
split option "sledgehammer_atp_dest_dir" into "sledgehammer_atp_prob_dest_dir" and "sledgehammer_atp_proof_dest_dir"
desharna
parents:
74967
diff
changeset
|
455 |
val keep_proofs = Mirabelle.get_bool_argument arguments (keep_proofsK, keep_proofs_default) |
75372 | 456 |
val exhaustive_preplay = |
457 |
Mirabelle.get_bool_argument arguments (exhaustive_preplayK, exhaustive_preplay_default) |
|
74897
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
458 |
val proof_method_from_msg = proof_method_from_msg arguments |
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
459 |
|
8b1ab558e3ee
reused Sledgehammer code to parse parameters of sledgehammer action in Mirabelle
desharna
parents:
74892
diff
changeset
|
460 |
val params = Sledgehammer_Commands.default_params \<^theory> arguments |
44431
3e0ce0ae1022
added "max_calls" option to get a fixed number of Sledgehammer calls per theory
blanchet
parents:
44430
diff
changeset
|
461 |
|
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
462 |
val data = Synchronized.var "Mirabelle_Sledgehammer.data" empty_data |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
463 |
|
74948
15ce207f69c8
added support for initialization messages to Mirabelle
desharna
parents:
74897
diff
changeset
|
464 |
val init_msg = "Params for sledgehammer: " ^ Sledgehammer_Prover.string_of_params params |
15ce207f69c8
added support for initialization messages to Mirabelle
desharna
parents:
74897
diff
changeset
|
465 |
|
75003
f21e7e6172a0
renamed run_action to run in Mirabelle.action record
desharna
parents:
75001
diff
changeset
|
466 |
fun run ({theory_index, name, pos, pre, ...} : Mirabelle.command) = |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
467 |
let val goal = Thm.major_prem_of (#goal (Proof.goal pre)) in |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
468 |
if can Logic.dest_conjunction goal orelse can Logic.dest_equals goal then |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
469 |
"" |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
470 |
else |
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
471 |
let |
75004 | 472 |
val trivial = check_trivial andalso try0 pre handle Timeout.TIMEOUT _ => false |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
473 |
val (outcome, log1, change_data1, proof_method_and_used_thms) = |
75372 | 474 |
run_sledgehammer params output_dir keep_probs keep_proofs exhaustive_preplay |
475 |
proof_method_from_msg theory_index trivial pos pre |
|
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
476 |
val (log2, change_data2) = |
74997
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
477 |
(case proof_method_and_used_thms of |
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
478 |
SOME (proof_method, used_thms) => |
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
479 |
run_proof_method trivial false name proof_method used_thms timeout pos pre |
d4a52993a81e
removed unsynchronized references in mirabelle_sledgehammer
desharna
parents:
74996
diff
changeset
|
480 |
|> apfst (prefix (proof_method ^ " (sledgehammer): ")) |
74996
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
481 |
| NONE => ("", I)) |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
482 |
val () = Synchronized.change data |
1f4c39ffb116
tuned mirabelle_sledgehammer to have a single call to Synchronized.change per run
desharna
parents:
74981
diff
changeset
|
483 |
(change_data1 #> change_data2 #> inc_sh_calls #> not trivial ? inc_sh_nontriv_calls) |
74967
3f55c5feca58
prefixed all mirabelle_sledgehammer output lines with sledgehammer output
desharna
parents:
74953
diff
changeset
|
484 |
in |
3f55c5feca58
prefixed all mirabelle_sledgehammer output lines with sledgehammer output
desharna
parents:
74953
diff
changeset
|
485 |
log1 ^ "\n" ^ log2 |
3f55c5feca58
prefixed all mirabelle_sledgehammer output lines with sledgehammer output
desharna
parents:
74953
diff
changeset
|
486 |
|> Symbol.trim_blanks |
3f55c5feca58
prefixed all mirabelle_sledgehammer output lines with sledgehammer output
desharna
parents:
74953
diff
changeset
|
487 |
|> prefix_lines (Sledgehammer.short_string_of_sledgehammer_outcome outcome ^ " ") |
3f55c5feca58
prefixed all mirabelle_sledgehammer output lines with sledgehammer output
desharna
parents:
74953
diff
changeset
|
488 |
end |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
489 |
end |
73697 | 490 |
|
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
491 |
fun finalize () = log_data (Synchronized.value data) |
75003
f21e7e6172a0
renamed run_action to run in Mirabelle.action record
desharna
parents:
75001
diff
changeset
|
492 |
in (init_msg, {run = run, finalize = finalize}) end |
73847
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
493 |
|
58f6b41efe88
refactored Mirabelle to produce output in real time
desharna
parents:
73797
diff
changeset
|
494 |
val () = Mirabelle.register_action "sledgehammer" make_action |
32385
594890623c46
split actions from Mirabelle core (Mirabelle may thus be applied to basic theories in HOL)
boehmes
parents:
diff
changeset
|
495 |
|
74951
0b6f795d3b78
proper filtering inf induction rules in Mirabelle
desharna
parents:
74950
diff
changeset
|
496 |
end |