updated CASC instructions + tuning
authorblanchet
Thu, 15 Dec 2016 15:05:35 +0100
changeset 64561 a7664ca9ffc5
parent 64560 c48becd96398
child 64575 d44f0b714e13
updated CASC instructions + tuning
src/HOL/TPTP/ATP_Problem_Import.thy
src/HOL/TPTP/CASC/ReadMe
src/HOL/TPTP/CASC/SysDesc_Isabelle.html
src/HOL/TPTP/CASC/SysDesc_Nitpick.html
src/HOL/TPTP/CASC/SysDesc_Refute.html
src/HOL/TPTP/atp_problem_import.ML
src/HOL/TPTP/lib/Tools/tptp_isabelle
src/HOL/TPTP/lib/Tools/tptp_isabelle_hot
src/HOL/TPTP/lib/Tools/tptp_nitpick
src/HOL/TPTP/lib/Tools/tptp_refute
src/HOL/TPTP/lib/Tools/tptp_sledgehammer
src/HOL/TPTP/lib/Tools/tptp_translate
--- a/src/HOL/TPTP/ATP_Problem_Import.thy	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/ATP_Problem_Import.thy	Thu Dec 15 15:05:35 2016 +0100
@@ -22,6 +22,7 @@
 
 (*
 ML {* ATP_Problem_Import.isabelle_tptp_file @{theory} 50 "$TPTP/Problems/PUZ/PUZ107^5.p" *}
+ML {* ATP_Problem_Import.isabelle_hot_tptp_file @{theory} 20 "$TPTP/Problems/SYO/SYO304^5.p" *}
 *)
 
 end
--- a/src/HOL/TPTP/CASC/ReadMe	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/CASC/ReadMe	Thu Dec 15 15:05:35 2016 +0100
@@ -1,4 +1,4 @@
-Notes from Geoff:
+Notes from Geoff Sutcliffe:
 
 I added a few lines to the top of bin/isabelle ...
 
@@ -48,10 +48,11 @@
 		STAREXEC_WALLCLOCK_LIMIT=300 ./bin/starexec_run_isabelle_hot $TPTP/Problems/CSR/CSR150^3.p
 		STAREXEC_WALLCLOCK_LIMIT=300 ./bin/starexec_run_isabelle_hot $TPTP/Problems/SYO/SYO304^5.p
 
-  The first problem is unprovable; the second one is proved by Satallax.
+  The first problem is unprovable; the second one is proved by Satallax (after
+  some delay).
 
-  All the tools accept CNF, FOF, TFF0, or THF0 problems and output SZS
-  statuses of the form
+  All the tools accept CNF, FOF, TFF0, TFF1, THF0, or THF1 problems and output
+  SZS statuses of the form
 
   	% SZS status XXX
 
@@ -60,43 +61,32 @@
     {Unknown, TimedOut, Unsatisfiable, Theorem, Satisfiable, CounterSatisfiable}
 
   Nitpick also output a model within "% SZS begin" and "% SZS end" tags, in
-  its idiosyncratic syntax.
-
-  In 2011, there were some problems with Java (needed for Nitpick), because it
-  required so much memory at startup. I doubt there will be any problems this
-  year, because Isabelle now includes its own version of Java, but the solution
-  back then was to replace
-
-  	exec isabelle java
-
-  in the last line of the "contrib/kodkodi-1.5.2/bin/kodkodi" script with
-
-  	/usr/lib64/jvm/java-1.5.0-gcj-4.5-1.5.0.0/jre/bin/java
-
-  See the emails we exchanged on 18 July 2011, with the subject "No problem on
-  my Linux 64-bit".
+  its idiosyncratic syntax. For TFF0 and THF0, phantom type arguments are not
+  supported, and type quantifiers are only allowed at the outermost position
+  in a formula, as "forall".
 
   Enjoy!
 
 
 Notes to myself:
 
-  I downloaded the official Isabelle2015 Linux package from
+  I downloaded the official Isabelle2016-1 Linux package from
 
-    http://isabelle.in.tum.de/dist/Isabelle2015_linux.tar.gz
+    http://isabelle.in.tum.de/dist/Isabelle2016-1_linux.tar.gz
 
-  on "macbroy21" and renamed the directory "Isabelle2015-CASC". I modified
+  on "macbroy21" and renamed the directory "Isabelle2016-1-CASC". I modified
 
-    src/HOL/TPTP/atp_problem_import.ML
+    src/HOL/TPTP
 
   to include changes backported from the development version of Isabelle. I
-  then built a "HOL-TPTP" image:
+  also modified "bin/isabelle" as suggested by Geoff above. I then built a
+  "HOL-TPTP" image:
 
     ./bin/isabelle build -b HOL-TPTP
 
-  I copied the heaps over to "./heaps":
+  I moved the heaps over to "./heaps":
 
-    mv ~/.isabelle/Isabelle2015/heaps .
+    mv ~/.isabelle/Isabelle2016-1/heaps .
 
   I created some wrapper scripts in "./bin":
 
@@ -117,7 +107,7 @@
       http://page.mi.fu-berlin.de/cbenzmueller/leo/leo2_v1.6.2.tgz
 
     I did "make opt". I copied "bin/leo.opt" to
-    "~/Isabelle2015-CASC/contrib/leo".
+    "~/Isabelle2016-1-CASC/contrib/leo".
 
     I added this line to "etc/settings":
 
@@ -141,19 +131,19 @@
 
       SATALLAX_HOME=$ISABELLE_HOME/contrib
 
-  Vampire (2.6):
+  Vampire 4.0 (commit 2fedff6)
 
-    I copied the file "vampire", which I probably got from the 2013 CASC
-    archive and moved it to "~/Isabelle2013-CASC/contrib/vampire".
+    I copied the file "vampire", which I got from Giles Reger on 23 September
+    2015.
 
     I added these lines to "etc/settings":
 
       VAMPIRE_HOME=$ISABELLE_HOME/contrib
-      VAMPIRE_VERSION=3.0
+      VAMPIRE_VERSION=4.0
 
   Z3 TPTP (4.3.2.0 postrelease):
 
-    I cloned out the git repository:
+    For Isabelle2015, I cloned out the git repository:
 
       git clone https://git01.codeplex.com/z3
 
@@ -173,9 +163,11 @@
   "/tmp/T.thy" with the following content:
 
     theory T imports Main begin
+
     lemma "a = b ==> [b] = [a]"
-    sledgehammer [cvc4 e leo2 satallax spass vampire z3 z3_tptp] ()
-    oops
+      sledgehammer [cvc4 e leo2 satallax spass vampire z3 (*z3_tptp*)] ()
+      oops
+
     end
 
   Then I ran
@@ -196,4 +188,4 @@
 
 
                 Jasmin Blanchette
-                10 June 2015
+                15 December 2016
--- a/src/HOL/TPTP/CASC/SysDesc_Isabelle.html	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/CASC/SysDesc_Isabelle.html	Thu Dec 15 15:05:35 2016 +0100
@@ -1,5 +1,5 @@
 <HR><!------------------------------------------------------------------------>
-<H2>Isabelle/HOL 2015</H2>
+<H2>Isabelle/HOL 2016-1</H2>
 Jasmin C. Blanchette<sup>1</sup>, Lawrence C. Paulson<sup>2</sup>,
 Tobias Nipkow<sup>1</sup>, Makarius Wenzel<sup>3</sup> <BR>
 <sup>1</sup>Technische Universit&auml;t M&uuml;nchen, Germany <BR>
@@ -8,9 +8,9 @@
 
 <H3>Architecture</H3>
 
-Isabelle/HOL 2015 [<A HREF="#References">NPW13</A>] is the higher-order 
-logic incarnation of the generic proof assistant 
-<A HREF="http://www.cl.cam.ac.uk/research/hvg/Isabelle/">Isabelle2015</A>.
+Isabelle/HOL 2016-1 [<A HREF="#References">NPW13</A>] is the higher-order
+logic incarnation of the generic proof assistant
+<A HREF="http://www.cl.cam.ac.uk/research/hvg/Isabelle/">Isabelle2016-1</A>.
 Isabelle/HOL provides several automatic proof tactics, notably an equational
 reasoner [<A HREF="#References">Nip89</A>], a classical reasoner [<A
 HREF="#References">Pau94</A>], and a tableau prover [<A
--- a/src/HOL/TPTP/CASC/SysDesc_Nitpick.html	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/CASC/SysDesc_Nitpick.html	Thu Dec 15 15:05:35 2016 +0100
@@ -1,11 +1,11 @@
 <HR><!------------------------------------------------------------------------>
-<H2>Nitpick 2015</H2>
+<H2>Nitpick 2016-1</H2>
 Jasmin C. Blanchette<BR>
 Technische Universit&auml;t M&uuml;nchen, Germany <BR>
 
 <H3>Architecture</H3>
 
-Nitpick [<A HREF="#References">BN10</A>] is an open source counterexample 
+Nitpick [<A HREF="#References">BN10</A>] is an open source counterexample
 generator for Isabelle/HOL [<A HREF="#References">NPW13</A>]. It builds on
 Kodkod [<A HREF="#References">TJ07</A>], a highly optimized first-order
 relational model finder based on SAT. The name Nitpick is appropriated from a
--- a/src/HOL/TPTP/CASC/SysDesc_Refute.html	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/CASC/SysDesc_Refute.html	Thu Dec 15 15:05:35 2016 +0100
@@ -1,12 +1,12 @@
 <HR><!------------------------------------------------------------------------>
-<H2>Refute 2015</H2>
+<H2>Refute 2016-1</H2>
 Jasmin C. Blanchette<sup>1</sup>, Tjark Weber<sup>2</sup><BR>
 <sup>1</sup>Technische Universit&auml;t M&uuml;nchen, Germany <BR>
 <sup>2</sup>Uppsala Universitet, Sweden <BR>
 
 <H3>Architecture</H3>
 
-Refute [<A HREF="#References">Web08</A>] is an open source counterexample 
+Refute [<A HREF="#References">Web08</A>] is an open source counterexample
 generator for Isabelle/HOL [<A HREF="#References">NPW13</A>] based on a
 SAT solver, and Nitpick's [<A HREF="#References">BN10</A>] precursor.
 
--- a/src/HOL/TPTP/atp_problem_import.ML	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/atp_problem_import.ML	Thu Dec 15 15:05:35 2016 +0100
@@ -11,7 +11,7 @@
     'a list * ('a list * 'a list) * Proof.context
   val nitpick_tptp_file : theory -> int -> string -> unit
   val refute_tptp_file : theory -> int -> string -> unit
-  val can_tac : Proof.context -> tactic -> term -> bool
+  val can_tac : Proof.context -> (Proof.context -> tactic) -> term -> bool
   val SOLVE_TIMEOUT :  int -> string -> tactic -> tactic
   val atp_tac : Proof.context -> int -> (string * string) list -> int -> term list -> string ->
     int -> tactic
@@ -131,14 +131,15 @@
 
 (** Sledgehammer and Isabelle (combination of provers) **)
 
-fun can_tac ctxt tactic conj = can (Goal.prove ctxt [] [] conj) (K tactic)
+fun can_tac ctxt tactic conj =
+  can (Goal.prove_internal ctxt [] (Thm.cterm_of ctxt conj)) (fn [] => tactic ctxt)
 
 fun SOLVE_TIMEOUT seconds name tac st =
   let
     val _ = writeln ("running " ^ name ^ " for " ^ string_of_int seconds ^ " s")
     val result =
       Timeout.apply (Time.fromSeconds seconds) (fn () => SINGLE (SOLVE tac) st) ()
-        handle Timeout.TIMEOUT _ => NONE | ERROR _ => NONE
+      handle Timeout.TIMEOUT _ => NONE | ERROR _ => NONE
   in
     (case result of
       NONE => (writeln ("FAILURE: " ^ name); Seq.empty)
@@ -274,7 +275,7 @@
     val conj = make_conj ([], []) conjs
     val assms = op @ assms
   in
-    can_tac ctxt (sledgehammer_tac true ctxt timeout assms 1) conj
+    can_tac ctxt (fn ctxt => sledgehammer_tac true ctxt timeout assms 1) conj
     |> print_szs_of_success conjs
   end
 
@@ -287,9 +288,9 @@
     val (last_hope_atp, last_hope_completeness) =
       if demo then (ATP_Proof.satallaxN, 0) else (ATP_Proof.vampireN, 2)
   in
-    (can_tac ctxt (auto_etc_tac ctxt (timeout div 2) assms 1) full_conj orelse
-     can_tac ctxt (sledgehammer_tac demo ctxt (timeout div 2) assms 1) conj orelse
-     can_tac ctxt (SOLVE_TIMEOUT timeout (last_hope_atp ^ "(*)")
+    (can_tac ctxt (fn ctxt => auto_etc_tac ctxt (timeout div 2) assms 1) full_conj orelse
+     can_tac ctxt (fn ctxt => sledgehammer_tac demo ctxt (timeout div 2) assms 1) conj orelse
+     can_tac ctxt (fn ctxt => SOLVE_TIMEOUT timeout (last_hope_atp ^ "(*)")
        (atp_tac ctxt last_hope_completeness [] timeout assms last_hope_atp 1)) full_conj)
     |> print_szs_of_success conjs
   end
--- a/src/HOL/TPTP/lib/Tools/tptp_isabelle	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_isabelle	Thu Dec 15 15:05:35 2016 +0100
@@ -24,12 +24,12 @@
 TIMEOUT=$1
 shift
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 for FILE in "$@"
 do
   echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.isabelle_tptp_file @{theory} ($TIMEOUT) \"$FILE\" *} end" \
     > /tmp/$SCRATCH.thy
-  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"
 done
--- a/src/HOL/TPTP/lib/Tools/tptp_isabelle_hot	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_isabelle_hot	Thu Dec 15 15:05:35 2016 +0100
@@ -24,12 +24,12 @@
 TIMEOUT=$1
 shift
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 for FILE in "$@"
 do
   echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.isabelle_hot_tptp_file @{theory} ($TIMEOUT) \"$FILE\" *} end" \
     > /tmp/$SCRATCH.thy
-  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"
 done
--- a/src/HOL/TPTP/lib/Tools/tptp_nitpick	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_nitpick	Thu Dec 15 15:05:35 2016 +0100
@@ -24,12 +24,12 @@
 TIMEOUT=$1
 shift
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 for FILE in "$@"
 do
   echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.nitpick_tptp_file @{theory} ($TIMEOUT) \"$FILE\" *} end" \
     > /tmp/$SCRATCH.thy
-  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"
 done
--- a/src/HOL/TPTP/lib/Tools/tptp_refute	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_refute	Thu Dec 15 15:05:35 2016 +0100
@@ -23,12 +23,12 @@
 TIMEOUT=$1
 shift
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 for FILE in "$@"
 do
   echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.refute_tptp_file @{theory} ($TIMEOUT) \"$FILE\" *} end" \
     > /tmp/$SCRATCH.thy
-  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"
 done
--- a/src/HOL/TPTP/lib/Tools/tptp_sledgehammer	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_sledgehammer	Thu Dec 15 15:05:35 2016 +0100
@@ -24,12 +24,12 @@
 TIMEOUT=$1
 shift
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 for FILE in "$@"
 do
   echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.sledgehammer_tptp_file @{theory} ($TIMEOUT) \"$FILE\" *} end" \
     > /tmp/$SCRATCH.thy
-  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+  isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"
 done
--- a/src/HOL/TPTP/lib/Tools/tptp_translate	Wed Dec 14 18:52:17 2016 +0100
+++ b/src/HOL/TPTP/lib/Tools/tptp_translate	Thu Dec 15 15:05:35 2016 +0100
@@ -23,9 +23,9 @@
 
 args=("$@")
 
-isabelle build -b HOL-TPTP
+isabelle build -b HOL-TPTP 2>&1 | grep --line-buffered -v "elapsed time$"
 
 echo "theory $SCRATCH imports \"$TPTP_HOME/ATP_Problem_Import\" begin \
 ML {* ATP_Problem_Import.translate_tptp_file @{theory} \"${args[0]}\" \"${args[1]}\" *} end" \
   > /tmp/$SCRATCH.thy
-isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^Warning-The type of\|^   monotype.$"
+isabelle process -e "use_thy \"/tmp/$SCRATCH\"; exit 1;" -l HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far.  Searching to depth\|^val \|^Loading theory\|^poly.*warning: The type of\|^   monotype.$"