Automated Theorem Proving - Benchmarks and Competitions

Benchmarks and Competitions

The quality of implemented system has benefited from the existence of a large library of standard benchmark examples — the Thousands of Problems for Theorem Provers (TPTP) Problem Library — as well as from the CADE ATP System Competition (CASC), a yearly competition of first-order systems for many important classes of first-order problems.

Some important systems (all have won at least one CASC competition division) are listed below.

  • E is a high-performance prover for full first-order logic, but built on a purely equational calculus, developed primarily in the automated reasoning group of Technical University of Munich.
  • Otter, developed at the Argonne National Laboratory, is the first widely used high-performance theorem prover. It is based on first-order resolution and paramodulation. Otter has since been replaced by Prover9, which is paired with Mace4.
  • SETHEO is a high-performance system based on the goal-directed model elimination calculus. It is developed in the automated reasoning group of Technical University of Munich. E and SETHEO have been combined (with other systems) in the composite theorem prover E-SETHEO.
  • Vampire is developed and implemented at Manchester University by Andrei Voronkov and Krystof Hoder, formerly also by Alexandre Riazanov. It has won the CADE ATP System Competition in the most prestigious CNF (MIX) division for eleven years (1999, 2001–2010).
  • Waldmeister is a specialized system for unit-equational first-order logic. It has won the CASC UEQ division for the last fourteen years (1997–2010).
  • SPASS is a first order logic theorem prover with equality. This is developed by the research group Automation of Logic, Max Planck Institute for Computer Science.

Read more about this topic:  Automated Theorem Proving