# HG changeset patch # User blanchet # Date 1369126874 -7200 # Node ID 6c38df1d294af3a7d16451f9a330295f0c357551 # Parent f353fe3c2b92cb883bdd9cc4ebaabf1936a56133 added CASC-related files, to keep a public record of the Isabelle submission at the competition diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/ReadMe --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/ReadMe Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,168 @@ +Isabelle/HOL 2013 at CASC-24 + +Notes to Geoff: + + Once you have open the archive, Isabelle and its tool are ready to go. The + various tools are invoked as follows: + + Isabelle, competition version: + ./bin/isabelle tptp_isabelle %d %s + + Isabelle, demo version: + ./bin/isabelle tptp_isabelle_hot %d %s + + Nitpick and Nitrox: + ./bin/isabelle tptp_nitpick %d %s + + Refute: + ./bin/isabelle tptp_refute %d %s + + Here's an example: + + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/SET/SET014^4.p + + The output should look as follows: + + > val it = (): unit + val commit = fn: unit -> bool + Loading theory "Scratch_tptp_isabelle_hot_29414_2568" + running nitpick for 7 s + FAILURE: nitpick + running simp for 15 s + SUCCESS: simp + % SZS status Theorem + + Additional sanity tests: + + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/CSR/CSR150^3.p + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/SYO/SYO304^5.p + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/PUZ/PUZ087^1.p + + The first problem is unprovable; the second one is proved by Satallax; the + third one is proved by LEO-II. + + All the tools accept CNF, FOF, TFF0, or THF0 problems and output SZS statuses + of the form + + % SZS status XXX + + where XXX is in the set + + {Unknown, TimedOut, Unsatisfiable, Theorem, Satisfiable, CounterSatisfiable} + + Nitpick and Nitrox also output a model within "% SZS begin" and "% SZS end" + tags. + + In 2011, there were some problems with Java (needed for Nitpick), because it + required so much memory at startup. I doubt there will be any problems this + year, because Isabelle now includes its own version of Java, but the solution + back then was to replace + + exec "$ISABELLE_TOOL" java + + in the last line of the "contrib/kodkodi-1.5.2/bin/kodkodi" script with + + /usr/lib64/jvm/java-1.5.0-gcj-4.5-1.5.0.0/jre/bin/java + + See the emails we exchanged on July 18, 2011, with the subject "No problem on + my Linux 64-bit". + + Enjoy! + + +Notes to myself: + + I downloaded the official Isabelle2013 Linux package from + + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle2013_linux.tar.gz + + on a "macbroy" machine and renamed the directory "Isabelle2013-CASC". I built + a "HOL-TPTP" image: + + ./bin/isabelle build -b HOL-TPTP + + I copied the heaps over to "./heaps": + + mv ~/.isabelle/Isabelle2013/heaps . + + To use this image and suppress some scary output, I added + + HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far. Searching to depth\|^val it = (): unit" + + to the next-to-last lines of "src/HOL/TPTP/lib/Tools/tptp_[inrs]*". + + At this point I tested the "SYN044^4" mentioned above. + + I renamed "README" to "README.orig" and copied this "ReadMe" over. + + Next, I installed and enabled ATPs. + + LEO-II (1.4.3): + + I logged to a 32-bit Linux ("lxlabbroy") machine. I retrieved LEO-II from + + http://www.ags.uni-sb.de/~leo/leo2_v1.4.3.tgz + + I did "make opt". I copied + "bin/leo.opt" to "~/Isabelle2013-CASC/contrib/leo". + + I added this line to "etc/settings": + + LEO2_HOME=$ISABELLE_HOME/contrib + + Satallax (2.7): + + I logged to a 32-bit Linux ("lxlabbroy") machine. I retrieved Satallax from + + http://www.ps.uni-saarland.de/~cebrown/satallax/downloads/satallax-2.7.tar.gz + + I added E to the path so that it gets detected by Satallax's configure + script: + + export PATH=$PATH:~/Isabelle2013-CASC/contrib/e-1.6-2/x86-linux + + I followed the instructions in "satallax-2.7/INSTALL". I copied + "bin/satallax.opt" to "~/Isabelle2013-CASC/contrib/satallax". + + I added this line to "etc/settings": + + SATALLAX_HOME=$ISABELLE_HOME/contrib + + Vampire (2.6): + + I copied the file "vampire_rel.linux64" from the 2012 CASC archive to + "~/Isabelle2013-CASC/contrib/vampire". + + I added these lines to "etc/settings": + + VAMPIRE_HOME=$ISABELLE_HOME/contrib + VAMPIRE_VERSION=2.6 + + Z3 (3.2): + + I uncommented the following line in "contrib/z3-3.2/etc/settings": + + # Z3_NON_COMMERCIAL="yes" + + To test that the examples actually worked, I did + + ./bin/isabelle tty + theory T imports Main begin; + lemma "a = b ==> [b] = [a]"; + sledgehammer [e leo2 satallax spass vampire z3 z3_tptp] (); + + and I performed the aforementioned sanity tests. + + Ideas for next year: + + * Unfold definitions, esp. if it makes the problem more first-order (cf. + "SEU466^1"). + * Detect and remove needless definitions. + * Expand "p b" to "(b & p True) | (~ b & p False)" (cf. "CSR148^3"). + * Select subset of axioms (cf. "CSR148^3"). + + That's it. + + + Jasmin Blanchette + 21 May 2013 diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/SysDesc_Isabelle.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/SysDesc_Isabelle.html Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,183 @@ +
+Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0, due +to Nik Sultana. It also includes TPTP versions of its popular tools, invokable +on the command line as isabelle tptp_tool max_secs +file.p. For example: + +
+ ++isabelle tptp_isabelle_hot 100 SEU/SEU824^3.p +
+Two versions of Isabelle participate this year. The demo (or HOT) version +includes its competitors LEO-II [BPTF08] and Satallax +[Bro12] as Sledgehammer backends, whereas the +competition version leaves them out. + +
+The implementation of Isabelle relies on a small LCF-style kernel, meaning that +inferences are implemented as operations on an abstract theorem +datatype. Assuming the kernel is correct, all values of type theorem +are correct by construction. +
+Most of the code for Isabelle was written by the Isabelle teams at the +University of Cambridge and the Technische Universität München. +Isabelle/HOL is available for all major platforms under a BSD-style license +from +
+ http://www.cl.cam.ac.uk/research/hvg/Isabelle+ +
+Isabelle won last year by a thin margin. This year: first or second place. + +
+ +
+Nitpick employs Kodkod to find a finite model of the negated conjecture. The +translation from HOL to Kodkod's first-order relational logic (FORL) is +parameterized by the cardinalities of the atomic types occurring in it. Nitpick +enumerates the possible cardinalities for each atomic type, exploiting +monotonicity to prune the search space [BK11]. If a +formula has a finite counterexample, the tool eventually finds it, unless it +runs out of resources. + +
+SAT solvers are particularly sensitive to the encoding of problems, so special +care is needed when translating HOL formulas. +As a rule, HOL scalars are mapped to FORL singletons and functions are mapped to +FORL relations accompanied by a constraint. More specifically, +an n-ary first-order function (curried or not) can be coded as an +(n + 1)-ary relation accompanied by a constraint. However, if the return +type is the type of Booleans, the function is more efficiently coded as an +unconstrained n-ary relation. +Higher-order quantification and functions bring complications of their own. A +function from σ to τ cannot be directly passed as an argument in FORL; +Nitpick's workaround is to pass |σ| arguments of type τ that encode a +function table. + +
+Nitpick, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Nitpick does not +certify its results and must be trusted. +
+Nitpick is available as part of Isabelle/HOL for all major platforms under a +BSD-style license from +
+ http://www.cl.cam.ac.uk/research/hvg/Isabelle+ +
+ +
+Nitrox employs Kodkod to find a finite model of the negated conjecture. It +performs a few transformations on the input, such as pushing quantifiers inside, +but 99% of the solving logic is in Kodkod and the underlying SAT solver. + +
+The translation from HOL to Kodkod's first-order relational logic (FORL) is +parameterized by the cardinalities of the atomic types occurring in it. Nitrox +enumerates the possible cardinalities for the universe. If a formula has a +finite counterexample, the tool eventually finds it, unless it runs out of +resources. + +
+Nitpick is optimized to work with higher-order logic (HOL) and its definitional +principles (e.g., (co)inductive predicates, (co)inductive datatypes, +(co)recursive functions). When invoked on untyped first-order problem, few of +its optimizations come into play, and the problem handed to Kodkod is +essentially a first-order relational logic (FORL) rendering of the TPTP FOF +problem. There are two main exceptions: +
+Nitrox, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Nitrox does not +certify its results and must be trusted. Kodkod is written in Java. MiniSat 1.14 +is used as the SAT solver. + +
+ +
+Refute employs a SAT solver to find a finite model of the negated conjecture. +The translation from HOL to propositional logic is parameterized by the +cardinalities of the atomic types occurring in the conjecture. Refute enumerates +the possible cardinalities for each atomic type. If a formula has a finite +counterexample, the tool eventually finds it, unless it runs out of resources. + +
+Refute, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Refute does not +certify its results and must be trusted. +
+Refute is available as part of Isabelle/HOL for all major platforms under a +BSD-style license from +
+ http://www.cl.cam.ac.uk/research/hvg/Isabelle+ +
+ +