# HG changeset patch # User blanchet # Date 1369126874 -7200 # Node ID 6c38df1d294af3a7d16451f9a330295f0c357551 # Parent f353fe3c2b92cb883bdd9cc4ebaabf1936a56133 added CASC-related files, to keep a public record of the Isabelle submission at the competition diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/ReadMe --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/ReadMe Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,168 @@ +Isabelle/HOL 2013 at CASC-24 + +Notes to Geoff: + + Once you have open the archive, Isabelle and its tool are ready to go. The + various tools are invoked as follows: + + Isabelle, competition version: + ./bin/isabelle tptp_isabelle %d %s + + Isabelle, demo version: + ./bin/isabelle tptp_isabelle_hot %d %s + + Nitpick and Nitrox: + ./bin/isabelle tptp_nitpick %d %s + + Refute: + ./bin/isabelle tptp_refute %d %s + + Here's an example: + + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/SET/SET014^4.p + + The output should look as follows: + + > val it = (): unit + val commit = fn: unit -> bool + Loading theory "Scratch_tptp_isabelle_hot_29414_2568" + running nitpick for 7 s + FAILURE: nitpick + running simp for 15 s + SUCCESS: simp + % SZS status Theorem + + Additional sanity tests: + + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/CSR/CSR150^3.p + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/SYO/SYO304^5.p + ./bin/isabelle tptp_isabelle_hot 300 $TPTP/Problems/PUZ/PUZ087^1.p + + The first problem is unprovable; the second one is proved by Satallax; the + third one is proved by LEO-II. + + All the tools accept CNF, FOF, TFF0, or THF0 problems and output SZS statuses + of the form + + % SZS status XXX + + where XXX is in the set + + {Unknown, TimedOut, Unsatisfiable, Theorem, Satisfiable, CounterSatisfiable} + + Nitpick and Nitrox also output a model within "% SZS begin" and "% SZS end" + tags. + + In 2011, there were some problems with Java (needed for Nitpick), because it + required so much memory at startup. I doubt there will be any problems this + year, because Isabelle now includes its own version of Java, but the solution + back then was to replace + + exec "$ISABELLE_TOOL" java + + in the last line of the "contrib/kodkodi-1.5.2/bin/kodkodi" script with + + /usr/lib64/jvm/java-1.5.0-gcj-4.5-1.5.0.0/jre/bin/java + + See the emails we exchanged on July 18, 2011, with the subject "No problem on + my Linux 64-bit". + + Enjoy! + + +Notes to myself: + + I downloaded the official Isabelle2013 Linux package from + + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle2013_linux.tar.gz + + on a "macbroy" machine and renamed the directory "Isabelle2013-CASC". I built + a "HOL-TPTP" image: + + ./bin/isabelle build -b HOL-TPTP + + I copied the heaps over to "./heaps": + + mv ~/.isabelle/Isabelle2013/heaps . + + To use this image and suppress some scary output, I added + + HOL-TPTP | grep --line-buffered -v "^###\|^PROOF FAILED for depth\|^Failure node\|inferences so far. Searching to depth\|^val it = (): unit" + + to the next-to-last lines of "src/HOL/TPTP/lib/Tools/tptp_[inrs]*". + + At this point I tested the "SYN044^4" mentioned above. + + I renamed "README" to "README.orig" and copied this "ReadMe" over. + + Next, I installed and enabled ATPs. + + LEO-II (1.4.3): + + I logged to a 32-bit Linux ("lxlabbroy") machine. I retrieved LEO-II from + + http://www.ags.uni-sb.de/~leo/leo2_v1.4.3.tgz + + I did "make opt". I copied + "bin/leo.opt" to "~/Isabelle2013-CASC/contrib/leo". + + I added this line to "etc/settings": + + LEO2_HOME=$ISABELLE_HOME/contrib + + Satallax (2.7): + + I logged to a 32-bit Linux ("lxlabbroy") machine. I retrieved Satallax from + + http://www.ps.uni-saarland.de/~cebrown/satallax/downloads/satallax-2.7.tar.gz + + I added E to the path so that it gets detected by Satallax's configure + script: + + export PATH=$PATH:~/Isabelle2013-CASC/contrib/e-1.6-2/x86-linux + + I followed the instructions in "satallax-2.7/INSTALL". I copied + "bin/satallax.opt" to "~/Isabelle2013-CASC/contrib/satallax". + + I added this line to "etc/settings": + + SATALLAX_HOME=$ISABELLE_HOME/contrib + + Vampire (2.6): + + I copied the file "vampire_rel.linux64" from the 2012 CASC archive to + "~/Isabelle2013-CASC/contrib/vampire". + + I added these lines to "etc/settings": + + VAMPIRE_HOME=$ISABELLE_HOME/contrib + VAMPIRE_VERSION=2.6 + + Z3 (3.2): + + I uncommented the following line in "contrib/z3-3.2/etc/settings": + + # Z3_NON_COMMERCIAL="yes" + + To test that the examples actually worked, I did + + ./bin/isabelle tty + theory T imports Main begin; + lemma "a = b ==> [b] = [a]"; + sledgehammer [e leo2 satallax spass vampire z3 z3_tptp] (); + + and I performed the aforementioned sanity tests. + + Ideas for next year: + + * Unfold definitions, esp. if it makes the problem more first-order (cf. + "SEU466^1"). + * Detect and remove needless definitions. + * Expand "p b" to "(b & p True) | (~ b & p False)" (cf. "CSR148^3"). + * Select subset of axioms (cf. "CSR148^3"). + + That's it. + + + Jasmin Blanchette + 21 May 2013 diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/SysDesc_Isabelle.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/SysDesc_Isabelle.html Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,183 @@ +
+

Isabelle/HOL 2013

+Jasmin C. Blanchette1, Lawrence C. Paulson2, +Tobias Nipkow1, Makarius Wenzel3
+1Technische Universität München, Germany
+2University of Cambridge, United Kingdom
+3Université Paris Sud, France
+ +

Architecture

+ +Isabelle/HOL 2013 [NPW13] is the higher-order +logic incarnation of the generic proof assistant +Isabelle2013. +Isabelle/HOL provides several automatic proof tactics, notably an equational +reasoner [Nip89], a classical reasoner [Pau94], and a tableau prover [Pau99]. It also integrates external first- and +higher-order provers via its subsystem Sledgehammer [PB10, BBP11]. + +

+Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0, due +to Nik Sultana. It also includes TPTP versions of its popular tools, invokable +on the command line as isabelle tptp_tool max_secs +file.p. For example: + +

+isabelle tptp_isabelle_hot 100 SEU/SEU824^3.p
+
+ +

+Two versions of Isabelle participate this year. The demo (or HOT) version +includes its competitors LEO-II [BPTF08] and Satallax +[Bro12] as Sledgehammer backends, whereas the +competition version leaves them out. + +

Strategies

+ +The Isabelle tactic submitted to the competition simply tries the +following tactics sequentially: + +
+
sledgehammer +
Invokes the following sequence of provers as oracles via Sledgehammer: + +
nitpick +
For problems involving only the type $o of Booleans, checks + whether a finite model exists using Nitpick [BN10]. +
simp +
Performs equational reasoning using rewrite rules [Nip89]. +
blast +
Searches for a proof using a fast untyped tableau prover and then + attempts to reconstruct the proof using Isabelle tactics + [Pau99]. +
auto+spass +
Combines simplification and classical reasoning + [Pau94] under one roof; then invoke Sledgehammer + with SPASS on any subgoals that emerge. +
z3 +
Invokes the SMT solver Z3 3.2 [dMB08]. +
cvc3 +
Invokes the SMT solver CVC3 2.2 [BT07]. +
fast +
Searches for a proof using sequent-style reasoning, performing a + depth-first search [Pau94]. Unlike + blast, it construct proofs directly in Isabelle. That makes it + slower but enables it to work in the presence of the more unusual features + of HOL, such as type classes and function unknowns. +
best +
Similar to fast, except that it performs a best-first search. +
force +
Similar to auto, but more exhaustive. +
meson +
Implements Loveland's MESON procedure [Lov78]. +Constructs proofs directly in Isabelle. +
fastforce +
Combines fast and force. +
+ +

Implementation

+ +Isabelle is a generic theorem prover written in Standard ML. Its meta-logic, +Isabelle/Pure, provides an intuitionistic fragment of higher-order logic. The +HOL object logic extends pure with a more elaborate version of higher-order +logic, complete with the familiar connectives and quantifiers. Other object +logics are available, notably FOL (first-order logic) and ZF +(Zermelo–Fraenkel set theory). +

+The implementation of Isabelle relies on a small LCF-style kernel, meaning that +inferences are implemented as operations on an abstract theorem +datatype. Assuming the kernel is correct, all values of type theorem +are correct by construction. +

+Most of the code for Isabelle was written by the Isabelle teams at the +University of Cambridge and the Technische Universität München. +Isabelle/HOL is available for all major platforms under a BSD-style license +from +

+    http://www.cl.cam.ac.uk/research/hvg/Isabelle
+ +

Expected Competition Performance

+ +

+Isabelle won last year by a thin margin. This year: first or second place. + +

References

+
+ +
BBP11 +
Blanchette J. C., Böhme S., Paulson L. C. (2011), + Extending Sledgehammer with SMT Solvers, + CADE-23, LNAI 6803, pp. 116–130, Springer. +
BN10 +
Blanchette J. C., Nipkow T. (2010), + Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, + ITP 2010, LNCS 6172, pp. 131–146, Springer. +
BPTF08 +
Benzmüller C., Paulson L. C., Theiss F., Fietzke A. (2008), + LEO-II—A Cooperative Automatic Theorem Prover for Higher-Order Logic, + IJCAR 2008, LNAI 5195, pp. 162–170, Springer. +
BPWW12 +
Blanchette J. C., Popescu A., Wand D., Weidenbach C. (2012), + More SPASS with Isabelle, + ITP 2012, Springer. +
Bro12 +
Brown C. (2012), + Satallax: An Automated Higher-Order Prover (System Description), + IJCAR 2012, Springer. +
BT07 +
Barrett C., Tinelli C. (2007), + CVC3 (System Description), + CAV 2007, LNCS 4590, pp. 298–302, Springer. +
dMB08 +
de Moura L. M., Bjørner N. (2008), + Z3: An Efficient SMT Solver, + TACAS 2008, LNCS 4963, pp. 337–340, Springer. +
Lov78 +
Loveland D. W. (1978), + Automated Theorem Proving: A Logical Basis, + North-Holland Publishing Co. +
Nip89 +
Nipkow T. (1989), + Equational Reasoning in Isabelle, + Sci. Comput. Program. 12(2), + pp. 123–149, + Elsevier. +
NPW13 +
Nipkow T., Paulson L. C., Wenzel M. (2013), + Isabelle/HOL: A Proof Assistant for Higher-Order Logic, + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf. +
Pau94 +
Paulson L. C. (1994), + Isabelle: A Generic Theorem Prover, + LNCS 828, + Springer. +
Pau99 +
Paulson L. C. (1999), + A Generic Tableau Prover and Its Integration with Isabelle, + J. Univ. Comp. Sci. 5, + pp. 73–87. +
PB10 +
Paulson L. C., Blanchette J. C. (2010), + Three Years of Experience with Sledgehammer, a Practical Link between Automatic and Interactive Theorem Provers, + IWIL-2010. +
RV02 +
Riazanov A., Voronkov A. (2002), + The Design and Implementation of Vampire, + AI Comm. 15(2-3), 91–110. +
Sch04 +
Schulz S. (2004), + System Description: E 0.81, + IJCAR 2004, LNAI 3097, pp. 223–228, Springer. +
+

+ +


diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/SysDesc_Nitpick.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/SysDesc_Nitpick.html Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,82 @@ +
+

Nitpick 2013

+Jasmin C. Blanchette
+Technische Universität München, Germany
+ +

Architecture

+ +Nitpick [BN10] is an open source counterexample +generator for Isabelle/HOL [NPW13]. It builds on +Kodkod [TJ07], a highly optimized first-order +relational model finder based on SAT. The name Nitpick is appropriated from a +now retired Alloy precursor. In a case study, it was applied successfully to a +formalization of the C++ memory model [BWB+11]. + +

Strategies

+ +

+Nitpick employs Kodkod to find a finite model of the negated conjecture. The +translation from HOL to Kodkod's first-order relational logic (FORL) is +parameterized by the cardinalities of the atomic types occurring in it. Nitpick +enumerates the possible cardinalities for each atomic type, exploiting +monotonicity to prune the search space [BK11]. If a +formula has a finite counterexample, the tool eventually finds it, unless it +runs out of resources. + +

+SAT solvers are particularly sensitive to the encoding of problems, so special +care is needed when translating HOL formulas. +As a rule, HOL scalars are mapped to FORL singletons and functions are mapped to +FORL relations accompanied by a constraint. More specifically, +an n-ary first-order function (curried or not) can be coded as an +(n + 1)-ary relation accompanied by a constraint. However, if the return +type is the type of Booleans, the function is more efficiently coded as an +unconstrained n-ary relation. +Higher-order quantification and functions bring complications of their own. A +function from σ to τ cannot be directly passed as an argument in FORL; +Nitpick's workaround is to pass |σ| arguments of type τ that encode a +function table. + +

Implementation

+ +

+Nitpick, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Nitpick does not +certify its results and must be trusted. +

+Nitpick is available as part of Isabelle/HOL for all major platforms under a +BSD-style license from +

+    http://www.cl.cam.ac.uk/research/hvg/Isabelle
+ +

Expected Competition Performance

+ +Thanks to Kodkod's amazing power, we expect that Nitpick will beat both Satallax +and Refute with its hands tied behind its back in the TNT category. + +

References

+
+
BK11 +
Blanchette J. C., Krauss A. (2011), + Monotonicity Inference for Higher-Order Formulas, + J. Autom. Reasoning 47(4), pp. 369–398, 2011. +
BN10 +
Blanchette J. C., Nipkow T. (2010), + Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, + ITP 2010, LNCS 6172, pp. 131–146, Springer. +
BWB+11 +
Blanchette J. C., Weber T., Batty M., Owens S., Sarkar S. (2011), + Nitpicking C++ Concurrency, + PPDP 2011, pp. 113–124, ACM Press. +
NPW13 +
Nipkow T., Paulson L. C., Wenzel M. (2013), + Isabelle/HOL: A Proof Assistant for Higher-Order Logic, + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf. +
TJ07 +
Torlak E., Jackson D. (2007), + Kodkod: A Relational Model Finder, TACAS 2007, + LNCS 4424, pp. 632–647, Springer. +
+

+ +


diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/SysDesc_Nitrox.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/SysDesc_Nitrox.html Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,78 @@ +
+

Nitrox 2013

+Jasmin C. Blanchette1, Emina Torlak2
+1Technische Universität München, Germany
+2University of California, Berkeley, USA
+ +

Architecture

+ +Nitrox is the first-order version of Nitpick [BN10], +an open source counterexample generator for Isabelle/HOL +[NPW13]. It builds on Kodkod +[TJ07], a highly optimized first-order relational +model finder based on SAT. The name Nitrox is a portmanteau of Nitpick +and Paradox (clever, eh?). + +

Strategies

+ +

+Nitrox employs Kodkod to find a finite model of the negated conjecture. It +performs a few transformations on the input, such as pushing quantifiers inside, +but 99% of the solving logic is in Kodkod and the underlying SAT solver. + +

+The translation from HOL to Kodkod's first-order relational logic (FORL) is +parameterized by the cardinalities of the atomic types occurring in it. Nitrox +enumerates the possible cardinalities for the universe. If a formula has a +finite counterexample, the tool eventually finds it, unless it runs out of +resources. + +

+Nitpick is optimized to work with higher-order logic (HOL) and its definitional +principles (e.g., (co)inductive predicates, (co)inductive datatypes, +(co)recursive functions). When invoked on untyped first-order problem, few of +its optimizations come into play, and the problem handed to Kodkod is +essentially a first-order relational logic (FORL) rendering of the TPTP FOF +problem. There are two main exceptions: +

+ +

Implementation

+ +

+Nitrox, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Nitrox does not +certify its results and must be trusted. Kodkod is written in Java. MiniSat 1.14 +is used as the SAT solver. + +

Expected Competition Performance

+ +Since Nitpick was designed for HOL, it doesn't have any type inference à +la Paradox. It also doesn't use the SAT solver incrementally, which penalizes it +a bit (but not as much as the missing type inference). Kodkod itself is known to +perform less well on FOF than Paradox, because it is designed and optimized for +a somewhat different logic, FORL. On the other hand, Kodkod's symmetry breaking +might be better calibrated than Paradox's. Hence, we expect Nitrox to end up in +second or third place in the FNT category. + +

References

+
+
BN10 +
Blanchette J. C., Nipkow T. (2010), + Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, + ITP 2010, LNCS 6172, pp. 131–146, Springer. +
NPW13 +
Nipkow T., Paulson L. C., Wenzel M. (2013), + Isabelle/HOL: A Proof Assistant for Higher-Order Logic, + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf. +
TJ07 +
Torlak E., Jackson D. (2007), + Kodkod: A Relational Model Finder, TACAS 2007, + LNCS 4424, pp. 632–647, Springer. +
+

+ +


diff -r f353fe3c2b92 -r 6c38df1d294a src/HOL/TPTP/CASC/SysDesc_Refute.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/HOL/TPTP/CASC/SysDesc_Refute.html Tue May 21 11:01:14 2013 +0200 @@ -0,0 +1,55 @@ +
+

Refute 2013

+Jasmin C. Blanchette1, Tjark Weber2
+1Technische Universität München, Germany
+2Uppsala Universitet, Sweden
+ +

Architecture

+ +Refute [Web08] is an open source counterexample +generator for Isabelle/HOL [NPW13] based on a +SAT solver, and Nitpick's [BN10] precursor. + +

Strategies

+ +

+Refute employs a SAT solver to find a finite model of the negated conjecture. +The translation from HOL to propositional logic is parameterized by the +cardinalities of the atomic types occurring in the conjecture. Refute enumerates +the possible cardinalities for each atomic type. If a formula has a finite +counterexample, the tool eventually finds it, unless it runs out of resources. + +

Implementation

+ +

+Refute, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle +itself, which adheres to the LCF small-kernel discipline, Refute does not +certify its results and must be trusted. +

+Refute is available as part of Isabelle/HOL for all major platforms under a +BSD-style license from +

+    http://www.cl.cam.ac.uk/research/hvg/Isabelle
+ +

Expected Competition Performance

+ +We expect that Refute will solve about 75% of the problems solved by Nitpick in +the TNT category, and perhaps a few problems that Nitpick cannot solve. + +

References

+
+
BN10 +
Blanchette J. C., Nipkow T. (2010), + Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, + ITP 2010, LNCS 6172, pp. 131–146, Springer. +
NPW13 +
Nipkow T., Paulson L. C., Wenzel M. (2013), + Isabelle/HOL: A Proof Assistant for Higher-Order Logic, + http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf. +
Web08 +
Weber T. (2008), + SAT-based Finite Model Generation for Higher-Order Logic, Ph.D. thesis. +
+

+ +