1\documentclass[a4paper,12pt]{article}
2\usepackage{lmodern}
3\usepackage[T1]{fontenc}
4\usepackage{amsmath}
5\usepackage{amssymb}
6\usepackage[english]{babel}
7\usepackage{color}
8\usepackage{footmisc}
9\usepackage{graphicx}
10%\usepackage{mathpazo}
11\usepackage{multicol}
12\usepackage{stmaryrd}
13%\usepackage[scaled=.85]{beramono}
14\usepackage{isabelle,iman,pdfsetup}
15
16\newcommand\download{\url{https://isabelle.in.tum.de/components/}}
17
18\let\oldS=\S
19\def\S{\oldS\,}
20
21\def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}
22\def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}
23
24\newcommand\const[1]{\textsf{#1}}
25
26%\oddsidemargin=4.6mm
27%\evensidemargin=4.6mm
28%\textwidth=150mm
29%\topmargin=4.6mm
30%\headheight=0mm
31%\headsep=0mm
32%\textheight=234mm
33
34\def\Colon{\mathord{:\mkern-1.5mu:}}
35%\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
36%\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
37\def\lparr{\mathopen{(\mkern-4mu\mid}}
38\def\rparr{\mathclose{\mid\mkern-4mu)}}
39
40\def\unk{{?}}
41\def\undef{(\lambda x.\; \unk)}
42%\def\unr{\textit{others}}
43\def\unr{\ldots}
44\def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
45\def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
46
47\urlstyle{tt}
48
49\renewcommand\_{\hbox{\textunderscore\kern-.05ex}}
50
51\begin{document}
52
53%%% TYPESETTING
54%\renewcommand\labelitemi{$\bullet$}
55\renewcommand\labelitemi{\raise.065ex\hbox{\small\textbullet}}
56
57\title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
58Hammering Away \\[\smallskipamount]
59\Large A User's Guide to Sledgehammer for Isabelle/HOL}
60\author{\hbox{} \\
61Jasmin Blanchette \\
62{\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]
63{\normalsize with contributions from} \\[4\smallskipamount]
64Lawrence C. Paulson \\
65{\normalsize Computer Laboratory, University of Cambridge} \\
66\hbox{}}
67
68\maketitle
69
70\tableofcontents
71
72\setlength{\parskip}{.7em plus .2em minus .1em}
73\setlength{\parindent}{0pt}
74\setlength{\abovedisplayskip}{\parskip}
75\setlength{\abovedisplayshortskip}{.9\parskip}
76\setlength{\belowdisplayskip}{\parskip}
77\setlength{\belowdisplayshortskip}{.9\parskip}
78
79% general-purpose enum environment with correct spacing
80\newenvironment{enum}%
81    {\begin{list}{}{%
82        \setlength{\topsep}{.1\parskip}%
83        \setlength{\partopsep}{.1\parskip}%
84        \setlength{\itemsep}{\parskip}%
85        \advance\itemsep by-\parsep}}
86    {\end{list}}
87
88\def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
89\advance\rightskip by\leftmargin}
90\def\post{\vskip0pt plus1ex\endgroup}
91
92\def\prew{\pre\advance\rightskip by-\leftmargin}
93\def\postw{\post}
94
95
96\section{Introduction}
97\label{introduction}
98
99Sledgehammer is a tool that applies automatic theorem provers (ATPs)
100and satisfiability-modulo-theories (SMT) solvers on the current goal.%
101\footnote{The distinction between ATPs and SMT solvers is convenient but mostly
102historical.}
103%
104The supported ATPs are agsyHOL \cite{agsyHOL}, Alt-Ergo \cite{alt-ergo}, E
105\cite{schulz-2002}, iProver \cite{korovin-2009}, LEO-II \cite{leo2}, Leo-III
106\cite{leo3}, Satallax \cite{satallax}, SNARK \cite{snark}, SPASS
107\cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002},
108Waldmeister \cite{waldmeister}, and Zipperposition \cite{cruanes-2014}. The
109ATPs are run either locally or remotely via the System\-On\-TPTP web service
110\cite{sutcliffe-2000}. The supported SMT solvers are CVC3 \cite{cvc3}, CVC4
111\cite{cvc4}, veriT \cite{bouton-et-al-2009}, and Z3 \cite{z3}. These are
112always run locally.
113
114The problem passed to the external provers (or solvers) consists of your current
115goal together with a heuristic selection of hundreds of facts (theorems) from the
116current theory context, filtered by relevance.
117
118The result of a successful proof search is some source text that usually (but
119not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
120proof typically relies on the general-purpose \textit{metis} proof method, which
121integrates the Metis ATP in Isabelle/HOL with explicit inferences going through
122the kernel. Thus its results are correct by construction.
123
124For Isabelle/jEdit users, Sledgehammer provides an automatic mode that can be
125enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options >
126Isabelle > General.'' In this mode, a reduced version of Sledgehammer is run on
127every newly entered theorem for a few seconds.
128
129\newbox\boxA
130\setbox\boxA=\hbox{\texttt{NOSPAM}}
131
132\newcommand\authoremail{\texttt{blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak
133in.\allowbreak tum.\allowbreak de}}
134
135To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
136imported---this is rarely a problem in practice since it is part of
137\textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
138\texttt{src/HOL/Metis\_Examples} directory.
139Comments and bug reports concerning Sledgehammer or this manual should be
140directed to the author at \authoremail.
141
142
143\section{Installation}
144\label{installation}
145
146Sledgehammer is part of Isabelle, so you do not need to install it. However, it
147relies on third-party automatic provers (ATPs and SMT solvers).
148
149Among the ATPs, agsyHOL, Alt-Ergo, E, LEO-II, Leo-III, Satallax, SPASS,
150Vampire, and Zipperposition can be run locally; in addition, agsyHOL,
151Alt-Ergo, E, iProver, LEO-II, Leo-III, Satallax, SNARK, Vampire, Waldmeister,
152and Zipperposition are available remotely via System\-On\-TPTP
153\cite{sutcliffe-2000}. The SMT solvers CVC3, CVC4, veriT, and Z3 can be run
154locally.
155
156There are three main ways to install automatic provers on your machine:
157
158\begin{sloppy}
159\begin{enum}
160\item[\labelitemi] If you installed an official Isabelle package, it should
161already include properly setup executables for CVC4, E, SPASS, Vampire, and Z3,
162ready to use. To use Vampire, you must confirm that you are a noncommercial
163user, as indicated by the message that is displayed when Sledgehammer is
164invoked the first time.
165
166\item[\labelitemi] Alternatively, you can download the Isabelle-aware CVC3,
167CVC4, E, SPASS, Vampire, and Z3 binary packages from \download. Extract the
168archives, then add a line to your \texttt{\$ISABELLE\_HOME\_USER\slash etc\slash
169components}%
170\footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
171startup. Its value can be retrieved by executing \texttt{isabelle}
172\texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
173file with the absolute path to CVC3, CVC4, E, SPASS, Vampire, or Z3. For
174example, if the \texttt{components} file does not exist yet and you extracted
175SPASS to \texttt{/usr/local/spass-3.8ds}, create it with the single line
176
177\prew
178\texttt{/usr/local/spass-3.8ds}
179\postw
180
181in it.
182
183\item[\labelitemi] If you prefer to build agsyHOL, Alt-Ergo, E, LEO-II,
184Leo-III, or Satallax manually, set the environment variable
185\texttt{AGSYHOL\_HOME}, \texttt{E\_HOME}, \texttt{LEO2\_HOME},
186\texttt{LEO3\_HOME}, or \texttt{SATALLAX\_HOME}
187to the directory that contains the \texttt{agsyHOL},
188\texttt{eprover} (and/or \texttt{eproof} or \texttt{eproof\_ram}),
189\texttt{leo}, \texttt{leo3}, or \texttt{satallax} executable;
190for Alt-Ergo, set the environment variable \texttt{WHY3\_HOME} to the
191directory that contains the \texttt{why3} executable. Sledgehammer has been
192tested with agsyHOL 1.0, Alt-Ergo 0.95.2, E 1.6 to 2.0, LEO-II 1.3.4, Leo-III
1931.1, and Satallax 2.7. Since the ATPs' output formats are neither documented
194nor stable, other versions might not work well with Sledgehammer. Ideally, you
195should also set \texttt{E\_VERSION}, \texttt{LEO2\_VERSION},
196\texttt{LEO3\_VERSION}, or \texttt{SATALLAX\_VERSION} to the prover's version
197number (e.g., ``2.7''); this might help Sledgehammer invoke the prover
198optimally.
199
200Similarly, if you want to install CVC3, CVC4, veriT, or Z3, set the environment
201variable \texttt{CVC3\_\allowbreak SOLVER}, \texttt{CVC4\_\allowbreak SOLVER},
202\texttt{VERIT\_\allowbreak SOLVER}, or \texttt{Z3\_SOLVER} to the complete path
203of the executable, \emph{including the file name}. Sledgehammer has been tested
204with CVC3 2.2 and 2.4.1, CVC4 1.5-prerelease, veriT smtcomp2019, and Z3 4.3.2.
205Since Z3's output format is somewhat unstable, other versions of the solver
206might not work well with Sledgehammer. Ideally, also set
207\texttt{CVC3\_VERSION}, \texttt{CVC4\_VERSION}, \texttt{VERIT\_VERSION}, or
208\texttt{Z3\_VERSION} to the solver's version number (e.g., ``4.4.0'').
209\end{enum}
210\end{sloppy}
211
212To check whether the provers are successfully installed, try out the example
213in \S\ref{first-steps}. If the remote versions of any of these provers is used
214(identified by the prefix ``\textit{remote\_\/}''), or if the local versions
215fail to solve the easy goal presented there, something must be wrong with the
216installation.
217
218Remote prover invocation requires Perl with the World Wide Web Library
219(\texttt{libwww-perl}) installed. If you must use a proxy server to access the
220Internet, set the \texttt{http\_proxy} environment variable to the proxy, either
221in the environment in which Isabelle is launched or in your
222\texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few
223examples:
224
225\prew
226\texttt{http\_proxy=http://proxy.example.org} \\
227\texttt{http\_proxy=http://proxy.example.org:8080} \\
228\texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
229\postw
230
231
232\section{First Steps}
233\label{first-steps}
234
235To illustrate Sledgehammer in context, let us start a theory file and
236attempt to prove a simple lemma:
237
238\prew
239\textbf{theory}~\textit{Scratch} \\
240\textbf{imports}~\textit{Main} \\
241\textbf{begin} \\[2\smallskipamount]
242%
243\textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\
244\textbf{sledgehammer}
245\postw
246
247Instead of issuing the \textbf{sledgehammer} command, you can also use the
248Sledgehammer panel in Isabelle/jEdit. Sledgehammer produces the following output
249after a few seconds:
250
251\prew
252\slshape
253Proof found\ldots \\
254``\textit{e\/}'': Try this: \textbf{by} \textit{simp} (0.3 ms) \\
255%
256``\textit{cvc4\/}'': Try this: \textbf{by} \textit{simp} (0.4 ms) \\
257%
258``\textit{z3\/}'': Try this: \textbf{by} \textit{simp} (0.5 ms) \\
259%
260``\textit{spass\/}'': Try this: \textbf{by} \textit{simp} (0.3 ms)
261%
262\postw
263
264Sledgehammer ran CVC4, E, SPASS, and Z3 in parallel. Depending on which
265provers are installed and how many processor cores are available, some of the
266provers might be missing or present with a \textit{remote\_} prefix.
267
268For each successful prover, Sledgehammer gives a one-line \textit{metis} or
269\textit{smt} method call. Rough timings are shown in parentheses, indicating how
270fast the call is. You can click the proof to insert it into the theory text.
271
272In addition, you can ask Sledgehammer for an Isar text proof by enabling the
273\textit{isar\_proofs} option (\S\ref{output-format}):
274
275\prew
276\textbf{sledgehammer} [\textit{isar\_proofs}]
277\postw
278
279When Isar proof construction is successful, it can yield proofs that are more
280readable and also faster than the \textit{metis} or \textit{smt} one-line
281proofs. This feature is experimental and is only available for ATPs.
282
283
284\section{Hints}
285\label{hints}
286
287This section presents a few hints that should help you get the most out of
288Sledgehammer. Frequently asked questions are answered in
289\S\ref{frequently-asked-questions}.
290
291%\newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}
292\newcommand\point[1]{\subsection{\emph{#1}}}
293
294
295\point{Presimplify the goal}
296
297For best results, first simplify your problem by calling \textit{auto} or at
298least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide
299arithmetic decision procedures, but the ATPs typically do not (or if they do,
300Sledgehammer does not use it yet). Apart from Waldmeister, they are not
301particularly good at heavy rewriting, but because they regard equations as
302undirected, they often prove theorems that require the reverse orientation of a
303\textit{simp} rule. Higher-order problems can be tackled, but the success rate
304is better for first-order problems. Hence, you may get better results if you
305first simplify the problem to remove higher-order features.
306
307
308\point{Familiarize yourself with the main options}
309
310Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of
311the options are very specialized, but serious users of the tool should at least
312familiarize themselves with the following options:
313
314\begin{enum}
315\item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
316the automatic provers (ATPs and SMT solvers) that should be run whenever
317Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{cvc4 e spass
318vampire\/}''). For convenience, you can omit ``\textit{provers}~=''
319and simply write the prover names as a space-separated list (e.g., ``\textit{cvc4 e
320spass vampire\/}'').
321
322\item[\labelitemi] \textbf{\textit{max\_facts}} (\S\ref{relevance-filter})
323specifies the maximum number of facts that should be passed to the provers. By
324default, the value is prover-dependent but varies between about 50 and 1000. If
325the provers time out, you can try lowering this value to, say, 25 or 50 and see
326if that helps.
327
328\item[\labelitemi] \textbf{\textit{isar\_proofs}} (\S\ref{output-format}) specifies
329that Isar proofs should be generated, in addition to one-line \textit{metis} or
330\textit{smt} proofs. The length of the Isar proofs can be controlled by setting
331\textit{compress} (\S\ref{output-format}).
332
333\item[\labelitemi] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the
334provers' time limit. It is set to 30 seconds by default.
335\end{enum}
336
337Options can be set globally using \textbf{sledgehammer\_params}
338(\S\ref{command-syntax}). The command also prints the list of all available
339options with their current value. Fact selection can be influenced by specifying
340``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call
341to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''
342to force Sledgehammer to run only with $\textit{my\_facts}$ (and any facts
343chained into the goal).
344
345
346\section{Frequently Asked Questions}
347\label{frequently-asked-questions}
348
349This sections answers frequently (and infrequently) asked questions about
350Sledgehammer. It is a good idea to skim over it now even if you do not have any
351questions at this stage. And if you have any further questions not listed here,
352send them to the author at \authoremail.
353
354
355\point{Which facts are passed to the automatic provers?}
356
357Sledgehammer heuristically selects a few hundred relevant lemmas from the
358currently loaded libraries. The component that performs this selection is
359called \emph{relevance filter} (\S\ref{relevance-filter}).
360
361\begin{enum}
362\item[\labelitemi]
363The traditional relevance filter, called \emph{MePo}
364(\underline{Me}ng--\underline{Pau}lson), assigns a score to every available fact
365(lemma, theorem, definition, or axiom) based upon how many constants that fact
366shares with the conjecture. This process iterates to include facts relevant to
367those just accepted. The constants are weighted to give unusual ones greater
368significance. MePo copes best when the conjecture contains some unusual
369constants; if all the constants are common, it is unable to discriminate among
370the hundreds of facts that are picked up. The filter is also memoryless: It has
371no information about how many times a particular fact has been used in a proof,
372and it cannot learn.
373
374\item[\labelitemi]
375An alternative to MePo is \emph{MaSh} (\underline{Ma}chine Learner for
376\underline{S}ledge\underline{h}ammer). It applies machine learning to the
377problem of finding relevant facts.
378
379\item[\labelitemi] The \emph{MeSh} filter combines MePo and MaSh. This is
380the default.
381\end{enum}
382
383The number of facts included in a problem varies from prover to prover, since
384some provers get overwhelmed more easily than others. You can show the number of
385facts given using the \textit{verbose} option (\S\ref{output-format}) and the
386actual facts using \textit{debug} (\S\ref{output-format}).
387
388Sledgehammer is good at finding short proofs combining a handful of existing
389lemmas. If you are looking for longer proofs, you must typically restrict the
390number of facts, by setting the \textit{max\_facts} option
391(\S\ref{relevance-filter}) to, say, 25 or 50.
392
393You can also influence which facts are actually selected in a number of ways. If
394you simply want to ensure that a fact is included, you can specify it using the
395``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:
396%
397\prew
398\textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})
399\postw
400%
401The specified facts then replace the least relevant facts that would otherwise be
402included; the other selected facts remain the same.
403If you want to direct the selection in a particular direction, you can specify
404the facts via \textbf{using}:
405%
406\prew
407\textbf{using} \textit{hd.simps} \textit{tl.simps} \\
408\textbf{sledgehammer}
409\postw
410%
411The facts are then more likely to be selected than otherwise, and if they are
412selected at iteration $j$ they also influence which facts are selected at
413iterations $j + 1$, $j + 2$, etc. To give them even more weight, try
414%
415\prew
416\textbf{using} \textit{hd.simps} \textit{tl.simps} \\
417\textbf{apply}~\textbf{--} \\
418\textbf{sledgehammer}
419\postw
420
421
422\point{Why does Metis fail to reconstruct the proof?}
423
424There are many reasons. If Metis runs seemingly forever, that is a sign that the
425proof is too difficult for it. Metis's search is complete for first-order logic
426with equality, so if the proof was found by a superposition-based ATP such as
427E, SPASS, or Vampire, Metis should eventually find it, but that is little
428consolation.
429
430In some rare cases, \textit{metis} fails fairly quickly, and you get the error
431message ``One-line proof reconstruction failed.'' This indicates that
432Sledgehammer determined that the goal is provable, but the proof is, for
433technical reasons, beyond \textit{metis}'s power. You can then try again with
434the \textit{strict} option (\S\ref{problem-encoding}).
435
436If the goal is actually unprovable and you did not specify an unsound encoding
437using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are
438strongly encouraged to report this to the author at \authoremail.
439
440
441\point{How can I tell whether a suggested proof is sound?}
442
443Earlier versions of Sledgehammer often suggested unsound proofs---either proofs
444of nontheorems or simply proofs that rely on type-unsound inferences. This
445is a thing of the past, unless you explicitly specify an unsound encoding
446using \textit{type\_enc} (\S\ref{problem-encoding}).
447%
448Officially, the only form of ``unsoundness'' that lurks in the sound
449encodings is related to missing characteristic theorems of datatypes. For
450example,
451
452\prew
453\textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\
454\textbf{sledgehammer} ()
455\postw
456
457suggests an argumentless \textit{metis} call that fails. However, the conjecture
458does actually hold, and the \textit{metis} call can be repaired by adding
459\textit{list.distinct}.
460%
461We hope to address this problem in a future version of Isabelle. In the
462meantime, you can avoid it by passing the \textit{strict} option
463(\S\ref{problem-encoding}).
464
465
466\point{What are the \textit{full\_types}, \textit{no\_types}, and
467\textit{mono\_tags} arguments to Metis?}
468
469The \textit{metis}~(\textit{full\_types}) proof method
470and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed
471versions of Metis. It is somewhat slower than \textit{metis}, but the proof
472search is fully typed, and it also includes more powerful rules such as the
473axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in
474higher-order places (e.g., in set comprehensions). The method is automatically
475tried as a fallback when \textit{metis} fails, and it is sometimes
476generated by Sledgehammer instead of \textit{metis} if the proof obviously
477requires type information or if \textit{metis} failed when Sledgehammer
478preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with
479various sets of option for up to 1~second each time to ensure that the generated
480one-line proofs actually work and to display timing information. This can be
481configured using the \textit{preplay\_timeout} and \textit{dont\_preplay}
482options (\S\ref{timeouts}).)
483%
484At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})
485uses no type information at all during the proof search, which is more efficient
486but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally
487generated by Sledgehammer.
488%
489See the \textit{type\_enc} option (\S\ref{problem-encoding}) for details.
490
491Incidentally, if you ever see warnings such as
492
493\prew
494\slshape
495Metis: Falling back on ``\textit{metis} (\textit{full\_types})''
496\postw
497
498for a successful \textit{metis} proof, you can advantageously pass the
499\textit{full\_types} option to \textit{metis} directly.
500
501
502\point{And what are the \textit{lifting} and \textit{hide\_lams} arguments
503to Metis?}
504
505Orthogonally to the encoding of types, it is important to choose an appropriate
506translation of $\lambda$-abstractions. Metis supports three translation schemes,
507in decreasing order of power: Curry combinators (the default),
508$\lambda$-lifting, and a ``hiding'' scheme that disables all reasoning under
509$\lambda$-abstractions. The more powerful schemes also give the automatic
510provers more rope to hang themselves. See the \textit{lam\_trans} option (\S\ref{problem-encoding}) for details.
511
512
513\point{Are the generated proofs minimal?}
514
515Automatic provers frequently use many more facts than are necessary.
516Sledgehammer includes a minimization tool that takes a set of facts returned by
517a given prover and repeatedly calls a prover or proof method with subsets of
518those facts to find a minimal set. Reducing the number of facts typically helps
519reconstruction, while also removing superfluous clutter from the proof scripts.
520
521In earlier versions of Sledgehammer, generated proofs were systematically
522accompanied by a suggestion to invoke the minimization tool. This step is now
523performed by default but can be disabled using the \textit{minimize} option
524(\S\ref{mode-of-operation}).
525
526
527\point{A strange error occurred---what should I do?}
528
529Sledgehammer tries to give informative error messages. Please report any strange
530error to the author at \authoremail.
531
532
533\point{Auto can solve it---why not Sledgehammer?}
534
535Problems can be easy for \textit{auto} and difficult for automatic provers, but
536the reverse is also true, so do not be discouraged if your first attempts fail.
537Because the system refers to all theorems known to Isabelle, it is particularly
538suitable when your goal has a short proof but requires lemmas that you do not
539know about.
540
541
542\point{Why are there so many options?}
543
544Sledgehammer's philosophy should work out of the box, without user guidance.
545Many of the options are meant to be used mostly by the Sledgehammer developers
546for experiments. Of course, feel free to try them out if you are so inclined.
547
548
549\section{Command Syntax}
550\label{command-syntax}
551
552\subsection{Sledgehammer}
553\label{sledgehammer}
554
555Sledgehammer can be invoked at any point when there is an open goal by entering
556the \textbf{sledgehammer} command in the theory file. Its general syntax is as
557follows:
558
559\prew
560\textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$
561\postw
562
563In the general syntax, the \qty{subcommand} may be any of the following:
564
565\begin{enum}
566\item[\labelitemi] \textbf{\textit{run} (the default):} Runs Sledgehammer on
567subgoal number \qty{num} (1 by default), with the given options and facts.
568
569\item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of
570automatic provers supported by Sledgehammer. See \S\ref{installation} and
571\S\ref{mode-of-operation} for more information on how to install automatic
572provers.
573
574\item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
575ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
576\end{enum}
577
578In addition, the following subcommands provide finer control over machine
579learning with MaSh:
580
581\begin{enum}
582\item[\labelitemi] \textbf{\textit{unlearn}:} Resets MaSh, erasing any
583persistent state.
584
585\item[\labelitemi] \textbf{\textit{learn\_isar}:} Invokes MaSh on the current
586theory to process all the available facts, learning from their Isabelle/Isar
587proofs. This happens automatically at Sledgehammer invocations if the
588\textit{learn} option (\S\ref{relevance-filter}) is enabled.
589
590\item[\labelitemi] \textbf{\textit{learn\_prover}:} Invokes MaSh on the current
591theory to process all the available facts, learning from proofs generated by
592automatic provers. The prover to use and its timeout can be set using the
593\textit{prover} (\S\ref{mode-of-operation}) and \textit{timeout}
594(\S\ref{timeouts}) options. It is recommended to perform learning using a
595first-order ATP (such as E, SPASS, and Vampire) as opposed to a
596higher-order ATP or an SMT solver.
597
598\item[\labelitemi] \textbf{\textit{relearn\_isar}:} Same as \textit{unlearn}
599followed by \textit{learn\_isar}.
600
601\item[\labelitemi] \textbf{\textit{relearn\_prover}:} Same as \textit{unlearn}
602followed by \textit{learn\_prover}.
603\end{enum}
604
605Sledgehammer's behavior can be influenced by various \qty{options}, which can be
606specified in brackets after the \textbf{sledgehammer} command. The
607\qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
608\ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true\/}'' is optional.
609For example:
610
611\prew
612\textbf{sledgehammer} [\textit{isar\_proofs}, \,\textit{timeout} = 120]
613\postw
614
615Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
616
617\prew
618\textbf{sledgehammer\_params} \qty{options}
619\postw
620
621The supported options are described in \S\ref{option-reference}.
622
623The \qty{facts\_override} argument lets you alter the set of facts that go
624through the relevance filter. It may be of the form ``(\qty{facts})'', where
625\qty{facts} is a space-separated list of Isabelle facts (theorems, local
626assumptions, etc.), in which case the relevance filter is bypassed and the given
627facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',
628``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\
629\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to
630proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}
631highly-relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
632
633If you use Isabelle/jEdit, Sledgehammer also provides an automatic mode that can
634be enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options
635> Isabelle > General.'' For automatic runs, only the first prover set using
636\textit{provers} (\S\ref{mode-of-operation}) is considered (typically E),
637\textit{slice} (\S\ref{mode-of-operation}) is disabled,
638fewer facts are
639passed to the prover, \textit{fact\_filter} (\S\ref{relevance-filter}) is set to
640\textit{mepo}, \textit{strict} (\S\ref{problem-encoding}) is enabled,
641\textit{verbose} (\S\ref{output-format}) and \textit{debug}
642(\S\ref{output-format}) are disabled, and \textit{timeout} (\S\ref{timeouts}) is
643superseded by the ``Auto Time Limit'' option in jEdit. Sledgehammer's output is
644also more concise.
645
646
647\subsection{Metis}
648\label{metis}
649
650The \textit{metis} proof method has the syntax
651
652\prew
653\textbf{\textit{metis}}~(\qty{options})${}^?$~\qty{facts}${}^?$
654\postw
655
656where \qty{facts} is a list of arbitrary facts and \qty{options} is a
657comma-separated list consisting of at most one $\lambda$ translation scheme
658specification with the same semantics as Sledgehammer's \textit{lam\_trans}
659option (\S\ref{problem-encoding}) and at most one type encoding specification
660with the same semantics as Sledgehammer's \textit{type\_enc} option
661(\S\ref{problem-encoding}).
662%
663The supported $\lambda$ translation schemes are \textit{hide\_lams},
664\textit{lifting}, and \textit{combs} (the default).
665%
666All the untyped type encodings listed in \S\ref{problem-encoding} are supported.
667For convenience, the following aliases are provided:
668\begin{enum}
669\item[\labelitemi] \textbf{\textit{full\_types}:} Alias for \textit{poly\_guards\_query}.
670\item[\labelitemi] \textbf{\textit{partial\_types}:} Alias for \textit{poly\_args}.
671\item[\labelitemi] \textbf{\textit{no\_types}:} Alias for \textit{erased}.
672\end{enum}
673
674
675\section{Option Reference}
676\label{option-reference}
677
678\def\defl{\{}
679\def\defr{\}}
680
681\def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
682\def\optrueonly#1{\flushitem{\textit{#1} $\bigl[$= \textit{true}$\bigr]$\enskip}\nopagebreak\\[\parskip]}
683\def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
684\def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
685\def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
686\def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
687\def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
688\def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}
689\def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}
690\def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
691\def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
692\def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
693
694Sledgehammer's options are categorized as follows:\ mode of operation
695(\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
696relevance filter (\S\ref{relevance-filter}), output format
697(\S\ref{output-format}), regression testing (\S\ref{regression-testing}),
698and timeouts (\S\ref{timeouts}).
699
700The descriptions below refer to the following syntactic quantities:
701
702\begin{enum}
703\item[\labelitemi] \qtybf{string}: A string.
704\item[\labelitemi] \qtybf{bool\/}: \textit{true} or \textit{false}.
705\item[\labelitemi] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or
706\textit{smart}.
707\item[\labelitemi] \qtybf{int\/}: An integer.
708\item[\labelitemi] \qtybf{float}: A floating-point number (e.g., 2.5 or 60)
709expressing a number of seconds.
710\item[\labelitemi] \qtybf{float\_pair\/}: A pair of floating-point numbers
711(e.g., 0.6 0.95).
712\item[\labelitemi] \qtybf{smart\_int\/}: An integer or \textit{smart}.
713\end{enum}
714
715Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options
716have a negative counterpart (e.g., \textit{minimize} vs.\
717\textit{dont\_minimize}). When setting Boolean options or their negative
718counterparts, ``= \textit{true\/}'' may be omitted.
719
720
721\subsection{Mode of Operation}
722\label{mode-of-operation}
723
724\begin{enum}
725\opnodefaultbrk{provers}{string}
726Specifies the automatic provers to use as a space-separated list (e.g.,
727``\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}'').
728Provers can be run locally or remotely; see \S\ref{installation} for
729installation instructions.
730
731The following local provers are supported:
732
733\begin{sloppy}
734\begin{enum}
735\item[\labelitemi] \textbf{\textit{agsyhol}:} agsyHOL is an automatic
736higher-order prover developed by Fredrik Lindblad \cite{agsyHOL}. To use
737agsyHOL, set the environment variable \texttt{AGSYHOL\_HOME} to the directory
738that contains the \texttt{agsyHOL} executable. Sledgehammer has been tested
739with version 1.0.
740
741\item[\labelitemi] \textbf{\textit{alt\_ergo}:} Alt-Ergo is a polymorphic
742ATP developed by Bobot et al.\ \cite{alt-ergo}.
743It supports the TPTP polymorphic typed first-order format (TF1) via Why3
744\cite{why3}. To use Alt-Ergo, set the environment variable \texttt{WHY3\_HOME}
745to the directory that contains the \texttt{why3} executable. Sledgehammer
746requires Alt-Ergo 0.95.2 and Why3 0.83.
747
748\item[\labelitemi] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by
749Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,
750set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the
751executable, including the file name, or install the prebuilt CVC3 package from
752\download. Sledgehammer has been tested with versions 2.2 and 2.4.1.
753
754\item[\labelitemi] \textbf{\textit{cvc4}:} CVC4 \cite{cvc4} is the successor to
755CVC3. To use CVC4, set the environment variable \texttt{CVC4\_SOLVER} to the
756complete path of the executable, including the file name, or install the
757prebuilt CVC4 package from \download. Sledgehammer has been tested with version
7581.5-prerelease.
759
760\item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover
761developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment
762variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
763executable and \texttt{E\_VERSION} to the version number (e.g., ``1.8''), or
764install the prebuilt E package from \download. Sledgehammer has been tested with
765versions 1.6 to 1.8.
766
767\item[\labelitemi] \textbf{\textit{e\_par}:} E-Par is an experimental metaprover
768developed by Josef Urban that implements strategy scheduling on top of E. To use
769E-Par, set the environment variable \texttt{E\_HOME} to the directory that
770contains the \texttt{runepar.pl} script and the \texttt{eprover} and
771\texttt{epclextract} executables, or use the prebuilt E package from \download.
772Be aware that E-Par is experimental software. It has been known to generate
773zombie processes. Use at your own risks.
774
775\item[\labelitemi] \textbf{\textit{ehoh}:} Ehoh is an experimental version of
776E that supports a $\lambda$-free fragment of higher-order logic. Use at your
777own risks.
778
779\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a pure
780instantiation-based prover developed by Konstantin Korovin
781\cite{korovin-2009}. To use iProver, set the environment variable
782\texttt{IPROVER\_HOME} to the directory that contains the \texttt{iproveropt}
783executable. Sledgehammer has been tested with version 2.8. iProver depends on
784E to clausify problems, so make sure that E is installed as well.
785
786\item[\labelitemi] \textbf{\textit{leo2}:} LEO-II is an automatic
787higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},
788with support for the TPTP typed higher-order syntax (TH0). To use LEO-II, set
789the environment variable \texttt{LEO2\_HOME} to the directory that contains the
790\texttt{leo} executable. Sledgehammer requires version 1.3.4 or above.
791
792\item[\labelitemi] \textbf{\textit{leo3}:} Leo-III is an automatic
793higher-order prover developed by Alexander Steen, Max Wisniewski, Christoph
794Benzm\"uller et al.\ \cite{leo3},
795with support for the TPTP typed higher-order syntax (TH0). To use Leo-III, set
796the environment variable \texttt{LEO3\_HOME} to the directory that contains the
797\texttt{leo3} executable. Sledgehammer requires version 1.1 or above.
798
799\item[\labelitemi] \textbf{\textit{satallax}:} Satallax is an automatic
800higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with
801support for the TPTP typed higher-order syntax (TH0). To use Satallax, set the
802environment variable \texttt{SATALLAX\_HOME} to the directory that contains the
803\texttt{satallax} executable. Sledgehammer requires version 2.2 or above.
804
805\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution
806prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
807To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
808that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the
809version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from
810\download. Sledgehammer requires version 3.8ds or above.
811
812\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order
813resolution prover developed by Andrei Voronkov and his colleagues
814\cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
815\texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
816executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g.,
817``4.2.2''). Sledgehammer has been tested with versions 1.8 to 4.2.2 (in the
818post-2010 numbering scheme).
819
820\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is an
821SMT solver developed by David D\'eharbe, Pascal Fontaine, and their colleagues.
822It is specifically designed to produce detailed proofs for reconstruction in
823proof assistants. To use veriT, set the environment variable
824\texttt{VERIT\_SOLVER} to the complete path of the executable, including the
825file name. Sledgehammer has been tested with version smtcomp2019.
826
827\item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
828Microsoft Research \cite{z3}. To use Z3, set the environment variable
829\texttt{Z3\_SOLVER} to the complete path of the executable, including the
830file name. Sledgehammer has been tested with a pre-release version of 4.4.0.
831
832\item[\labelitemi] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to
833be an ATP, exploiting Z3's support for the TPTP typed first-order format
834(TF0). It is included for experimental purposes. It requires version 4.3.1 of
835Z3 or above. To use it, set the environment variable \texttt{Z3\_TPTP\_HOME}
836to the directory that contains the \texttt{z3\_tptp} executable.
837
838\item[\labelitemi] \textbf{\textit{zipperposition}:} Zipperposition
839\cite{cruanes-2014} is a first-order resolution prover developed by Simon
840Cruanes and colleagues. To use Zipperposition, set the environment variable
841\texttt{ZIPPERPOSITION\_HOME} to the directory that contains the
842\texttt{zipperposition} executable.
843\end{enum}
844
845\end{sloppy}
846
847Moreover, the following remote provers are supported:
848
849\begin{enum}
850\item[\labelitemi] \textbf{\textit{remote\_agsyhol}:} The remote version of
851agsyHOL runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
852
853\item[\labelitemi] \textbf{\textit{remote\_alt\_ergo}:} The remote version of
854Alt-Ergo runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
855
856\item[\labelitemi] \textbf{\textit{remote\_e}:} The remote version of E runs
857on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
858
859\item[\labelitemi] \textbf{\textit{remote\_iprover}:} The
860remote version of iProver runs on Geoff Sutcliffe's Miami servers
861\cite{sutcliffe-2000}.
862
863\item[\labelitemi] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II
864runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
865
866\item[\labelitemi] \textbf{\textit{remote\_leo3}:} The remote version of Leo-III
867runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
868
869\item[\labelitemi] \textbf{\textit{remote\_pirate}:} Pirate is a
870highly experimental first-order resolution prover developed by Daniel Wand.
871The remote version of Pirate run on a private server he generously set up.
872
873\item[\labelitemi] \textbf{\textit{remote\_snark}:} SNARK is a first-order
874resolution prover developed by Stickel et al.\ \cite{snark}. The remote
875version of SNARK runs on Geoff Sutcliffe's Miami servers.
876
877\item[\labelitemi] \textbf{\textit{remote\_vampire}:} The remote version of
878Vampire runs on Geoff Sutcliffe's Miami servers.
879
880\item[\labelitemi] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit
881equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be
882used to prove universally quantified equations using unconditional equations,
883corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister
884runs on Geoff Sutcliffe's Miami servers.
885
886\item[\labelitemi] \textbf{\textit{remote\_zipperposition}:} The remote
887version of Zipperposition runs on Geoff Sutcliffe's Miami servers.
888\end{enum}
889
890By default, Sledgehammer runs a subset of CVC4, E, SPASS, Vampire, veriT, and
891Z3 in parallel, either locally or remotely---depending on the number of
892processor cores available and on which provers are actually installed. It is
893generally desirable to run several provers in parallel.
894
895\opnodefault{prover}{string}
896Alias for \textit{provers}.
897
898\optrue{slice}{dont\_slice}
899Specifies whether the time allocated to a prover should be sliced into several
900segments, each of which has its own set of possibly prover-dependent options.
901For SPASS and Vampire, the first slice tries the fast but incomplete
902set-of-support (SOS) strategy, whereas the second slice runs without it. For E,
903up to three slices are tried, with different weighted search strategies and
904number of facts. For SMT solvers, several slices are tried with the same options
905each time but fewer and fewer facts. According to benchmarks with a timeout of
90630 seconds, slicing is a valuable optimization, and you should probably leave it
907enabled unless you are conducting experiments.
908
909\nopagebreak
910{\small See also \textit{verbose} (\S\ref{output-format}).}
911
912\optrue{minimize}{dont\_minimize}
913Specifies whether the minimization tool should be invoked automatically after
914proof search.
915
916\nopagebreak
917{\small See also \textit{preplay\_timeout} (\S\ref{timeouts})
918and \textit{dont\_preplay} (\S\ref{timeouts}).}
919
920\opfalse{spy}{dont\_spy}
921Specifies whether Sledgehammer should record statistics in
922\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER/\allowbreak spy\_\allowbreak sledgehammer}.
923These statistics can be useful to the developers of Sledgehammer. If you are willing to have your
924interactions recorded in the name of science, please enable this feature and send the statistics
925file every now and then to the author of this manual (\authoremail).
926To change the default value of this option globally, set the environment variable
927\texttt{SLEDGEHAMMER\_SPY} to \textit{yes}.
928
929\nopagebreak
930{\small See also \textit{debug} (\S\ref{output-format}).}
931
932\opfalse{overlord}{no\_overlord}
933Specifies whether Sledgehammer should put its temporary files in
934\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
935debugging Sledgehammer but also unsafe if several instances of the tool are run
936simultaneously. The files are identified by the prefixes \texttt{prob\_} and
937\texttt{mash\_}; you may safely remove them after Sledgehammer has run.
938
939\textbf{Warning:} This option is not thread-safe. Use at your own risks.
940
941\nopagebreak
942{\small See also \textit{debug} (\S\ref{output-format}).}
943\end{enum}
944
945
946\subsection{Relevance Filter}
947\label{relevance-filter}
948
949\begin{enum}
950\opdefault{fact\_filter}{string}{smart}
951Specifies the relevance filter to use. The following filters are available:
952
953\begin{enum}
954\item[\labelitemi] \textbf{\textit{mepo}:}
955The traditional memoryless MePo relevance filter.
956
957\item[\labelitemi] \textbf{\textit{mash}:}
958The MaSh machine learner. Three learning algorithms are provided:
959
960\begin{enum}
961\item[\labelitemi] \textbf{\textit{nb}} is an implementation of naive Bayes.
962
963\item[\labelitemi] \textbf{\textit{knn}} is an implementation of $k$-nearest
964neighbors.
965
966\item[\labelitemi] \textbf{\textit{nb\_knn}} (also called \textbf{\textit{yes}}
967and \textbf{\textit{sml}}) is a combination of naive Bayes and $k$-nearest
968neighbors.
969\end{enum}
970
971In addition, the special value \textit{none} is used to disable machine learning by
972default (cf.\ \textit{smart} below).
973
974The default algorithm is \textit{nb\_knn}. The algorithm can be selected by
975setting the ``MaSh'' option under ``Plugins > Plugin Options > Isabelle >
976General'' in Isabelle/jEdit. Persistent data for both algorithms is stored in
977the directory \texttt{\$ISABELLE\_\allowbreak HOME\_\allowbreak USER/\allowbreak
978mash}.
979
980\item[\labelitemi] \textbf{\textit{mesh}:} The MeSh filter, which combines the
981rankings from MePo and MaSh.
982
983\item[\labelitemi] \textbf{\textit{smart}:} A combination of MePo, MaSh, and
984MeSh. If the learning algorithm is set to be \textit{none}, \textit{smart}
985behaves like MePo.
986\end{enum}
987
988\opdefault{max\_facts}{smart\_int}{smart}
989Specifies the maximum number of facts that may be returned by the relevance
990filter. If the option is set to \textit{smart} (the default), it effectively
991takes a value that was empirically found to be appropriate for the prover.
992Typical values lie between 50 and 1000.
993
994\opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85}
995Specifies the thresholds above which facts are considered relevant by the
996relevance filter. The first threshold is used for the first iteration of the
997relevance filter and the second threshold is used for the last iteration (if it
998is reached). The effective threshold is quadratically interpolated for the other
999iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
1000are relevant and 1 only theorems that refer to previously seen constants.
1001
1002\optrue{learn}{dont\_learn}
1003Specifies whether MaSh should be run automatically by Sledgehammer to learn the
1004available theories (and hence provide more accurate results). Learning takes
1005place only if MaSh is enabled.
1006
1007\opdefault{max\_new\_mono\_instances}{int}{smart}
1008Specifies the maximum number of monomorphic instances to generate beyond
1009\textit{max\_facts}. The higher this limit is, the more monomorphic instances
1010are potentially generated. Whether monomorphization takes place depends on the
1011type encoding used. If the option is set to \textit{smart} (the default), it
1012takes a value that was empirically found to be appropriate for the prover. For
1013most provers, this value is 100.
1014
1015\nopagebreak
1016{\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1017
1018\opdefault{max\_mono\_iters}{int}{smart}
1019Specifies the maximum number of iterations for the monomorphization fixpoint
1020construction. The higher this limit is, the more monomorphic instances are
1021potentially generated. Whether monomorphization takes place depends on the
1022type encoding used. If the option is set to \textit{smart} (the default), it
1023takes a value that was empirically found to be appropriate for the prover.
1024For most provers, this value is 3.
1025
1026\nopagebreak
1027{\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1028\end{enum}
1029
1030
1031\subsection{Problem Encoding}
1032\label{problem-encoding}
1033
1034\newcommand\comb[1]{\const{#1}}
1035
1036\begin{enum}
1037\opdefault{lam\_trans}{string}{smart}
1038Specifies the $\lambda$ translation scheme to use in ATP problems. The supported
1039translation schemes are listed below:
1040
1041\begin{enum}
1042\item[\labelitemi] \textbf{\textit{hide\_lams}:} Hide the $\lambda$-abstractions
1043by replacing them by unspecified fresh constants, effectively disabling all
1044reasoning under $\lambda$-abstractions.
1045
1046\item[\labelitemi] \textbf{\textit{lifting}:} Introduce a new
1047supercombinator \const{c} for each cluster of $n$~$\lambda$-abstractions,
1048defined using an equation $\const{c}~x_1~\ldots~x_n = t$ ($\lambda$-lifting).
1049
1050\item[\labelitemi] \textbf{\textit{combs}:} Rewrite lambdas to the Curry
1051combinators (\comb{I}, \comb{K}, \comb{S}, \comb{B}, \comb{C}). Combinators
1052enable the ATPs to synthesize $\lambda$-terms but tend to yield bulkier formulas
1053than $\lambda$-lifting: The translation is quadratic in the worst case, and the
1054equational definitions of the combinators are very prolific in the context of
1055resolution.
1056
1057\item[\labelitemi] \textbf{\textit{combs\_and\_lifting}:} Introduce a new
1058supercombinator \const{c} for each cluster of $\lambda$-abstractions and characterize it both using a
1059lifted equation $\const{c}~x_1~\ldots~x_n = t$ and via Curry combinators.
1060
1061\item[\labelitemi] \textbf{\textit{combs\_or\_lifting}:} For each cluster of
1062$\lambda$-abstractions, heuristically choose between $\lambda$-lifting and Curry
1063combinators.
1064
1065\item[\labelitemi] \textbf{\textit{keep\_lams}:}
1066Keep the $\lambda$-abstractions in the generated problems. This is available
1067only with provers that support the TH0 syntax.
1068
1069\item[\labelitemi] \textbf{\textit{smart}:} The actual translation scheme used
1070depends on the ATP and should be the most efficient scheme for that ATP.
1071\end{enum}
1072
1073For SMT solvers, the $\lambda$ translation scheme is always \textit{lifting},
1074irrespective of the value of this option.
1075
1076\opsmartx{uncurried\_aliases}{no\_uncurried\_aliases}
1077Specifies whether fresh function symbols should be generated as aliases for
1078applications of curried functions in ATP problems.
1079
1080\opdefault{type\_enc}{string}{smart}
1081Specifies the type encoding to use in ATP problems. Some of the type encodings
1082are unsound, meaning that they can give rise to spurious proofs
1083(unreconstructible using \textit{metis}). The type encodings are
1084listed below, with an indication of their soundness in parentheses.
1085An asterisk (*) indicates that the encoding is slightly incomplete for
1086reconstruction with \textit{metis}, unless the \textit{strict} option (described
1087below) is enabled.
1088
1089\begin{enum}
1090\item[\labelitemi] \textbf{\textit{erased} (unsound):} No type information is
1091supplied to the ATP, not even to resolve overloading. Types are simply erased.
1092
1093\item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using
1094a predicate \const{g}$(\tau, t)$ that guards bound
1095variables. Constants are annotated with their types, supplied as extra
1096arguments, to resolve overloading.
1097
1098\item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is
1099tagged with its type using a function $\const{t\/}(\tau, t)$.
1100
1101\item[\labelitemi] \textbf{\textit{poly\_args} (unsound):}
1102Like for \textit{poly\_guards} constants are annotated with their types to
1103resolve overloading, but otherwise no type information is encoded. This
1104is the default encoding used by the \textit{metis} proof method.
1105
1106\item[\labelitemi]
1107\textbf{%
1108\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\
1109\textit{raw\_mono\_args} (unsound):} \\
1110Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},
1111respectively, but the problem is additionally monomorphized, meaning that type
1112variables are instantiated with heuristically chosen ground types.
1113Monomorphization can simplify reasoning but also leads to larger fact bases,
1114which can slow down the ATPs.
1115
1116\item[\labelitemi]
1117\textbf{%
1118\textit{mono\_guards}, \textit{mono\_tags} (sound);
1119\textit{mono\_args} (unsound):} \\
1120Similar to
1121\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and
1122\textit{raw\_mono\_args}, respectively but types are mangled in constant names
1123instead of being supplied as ground term arguments. The binary predicate
1124$\const{g}(\tau, t)$ becomes a unary predicate
1125$\const{g\_}\tau(t)$, and the binary function
1126$\const{t}(\tau, t)$ becomes a unary function
1127$\const{t\_}\tau(t)$.
1128
1129\item[\labelitemi] \textbf{\textit{mono\_native} (sound):} Exploits native
1130first-order types if the prover supports the TF0, TF1, TH0, or TH1 syntax;
1131otherwise, falls back on \textit{mono\_guards}. The problem is monomorphized.
1132
1133\item[\labelitemi] \textbf{\textit{mono\_native\_higher} (sound):} Exploits
1134native higher-order types if the prover supports the TH0 syntax; otherwise,
1135falls back on \textit{mono\_native} or \textit{mono\_guards}. The problem is
1136monomorphized.
1137
1138\item[\labelitemi] \textbf{\textit{poly\_native} (sound):} Exploits native
1139first-order polymorphic types if the prover supports the TF1 or TH1 syntax;
1140otherwise, falls back on \textit{mono\_native}.
1141
1142\item[\labelitemi]
1143\textbf{%
1144\textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\
1145\textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\
1146\textit{mono\_native}? (sound*):} \\
1147The type encodings \textit{poly\_guards}, \textit{poly\_tags},
1148\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
1149\textit{mono\_tags}, and \textit{mono\_native} are fully typed and sound. For
1150each of these, Sledgehammer also provides a lighter variant identified by a
1151question mark (`\hbox{?}')\ that detects and erases monotonic types, notably
1152infinite types. (For \textit{mono\_native}, the types are not actually erased
1153but rather replaced by a shared uniform type of individuals.) As argument to the
1154\textit{metis} proof method, the question mark is replaced by a
1155\hbox{``\textit{\_query\/}''} suffix.
1156
1157\item[\labelitemi]
1158\textbf{%
1159\textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\
1160\textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\
1161(sound*):} \\
1162Even lighter versions of the `\hbox{?}' encodings. As argument to the
1163\textit{metis} proof method, the `\hbox{??}' suffix is replaced by
1164\hbox{``\textit{\_query\_query\/}''}.
1165
1166\item[\labelitemi]
1167\textbf{%
1168\textit{poly\_guards}@, \textit{poly\_tags}@, \textit{raw\_mono\_guards}@, \\
1169\textit{raw\_mono\_tags}@ (sound*):} \\
1170Alternative versions of the `\hbox{??}' encodings. As argument to the
1171\textit{metis} proof method, the `\hbox{@}' suffix is replaced by
1172\hbox{``\textit{\_at\/}''}.
1173
1174\item[\labelitemi] \textbf{\textit{poly\_args}?, \textit{raw\_mono\_args}? (unsound):} \\
1175Lighter versions of \textit{poly\_args} and \textit{raw\_mono\_args}.
1176
1177\item[\labelitemi] \textbf{\textit{smart}:} The actual encoding used depends on
1178the ATP and should be the most efficient sound encoding for that ATP.
1179\end{enum}
1180
1181For SMT solvers, the type encoding is always \textit{mono\_native}, irrespective
1182of the value of this option.
1183
1184\nopagebreak
1185{\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})
1186and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}
1187
1188\opfalse{strict}{non\_strict}
1189Specifies whether Sledgehammer should run in its strict mode. In that mode,
1190sound type encodings marked with an asterisk (*) above are made complete
1191for reconstruction with \textit{metis}, at the cost of some clutter in the
1192generated problems. This option has no effect if \textit{type\_enc} is
1193deliberately set to an unsound encoding.
1194\end{enum}
1195
1196
1197\subsection{Output Format}
1198\label{output-format}
1199
1200\begin{enum}
1201
1202\opfalse{verbose}{quiet}
1203Specifies whether the \textbf{sledgehammer} command should explain what it does.
1204
1205\opfalse{debug}{no\_debug}
1206Specifies whether Sledgehammer should display additional debugging information
1207beyond what \textit{verbose} already displays. Enabling \textit{debug} also
1208enables \textit{verbose} behind the scenes.
1209
1210\nopagebreak
1211{\small See also \textit{spy} (\S\ref{mode-of-operation}) and
1212\textit{overlord} (\S\ref{mode-of-operation}).}
1213
1214\opsmart{isar\_proofs}{no\_isar\_proofs}
1215Specifies whether Isar proofs should be output in addition to one-line proofs.
1216The construction of Isar proof is still experimental and may sometimes fail;
1217however, when they succeed they are usually faster and more intelligible than
1218one-line proofs. If the option is set to \textit{smart} (the default), Isar
1219proofs are only generated when no working one-line proof is available.
1220
1221\opdefault{compress}{int}{smart}
1222Specifies the granularity of the generated Isar proofs if \textit{isar\_proofs}
1223is explicitly enabled. A value of $n$ indicates that each Isar proof step should
1224correspond to a group of up to $n$ consecutive proof steps in the ATP proof. If
1225the option is set to \textit{smart} (the default), the compression factor is 10
1226if the \textit{isar\_proofs} option is explicitly enabled; otherwise, it is
1227$\infty$.
1228
1229\optrueonly{dont\_compress}
1230Alias for ``\textit{compress} = 1''.
1231
1232\optrue{try0}{dont\_try0}
1233Specifies whether standard proof methods such as \textit{auto} and
1234\textit{blast} should be tried as alternatives to \textit{metis} in Isar proofs.
1235The collection of methods is roughly the same as for the \textbf{try0} command.
1236
1237\opsmart{smt\_proofs}{no\_smt\_proofs}
1238Specifies whether the \textit{smt} proof method should be tried in addition to
1239Isabelle's other proof methods. If the option is set to \textit{smart} (the
1240default), the \textit{smt} method is used for one-line proofs but not in Isar
1241proofs.
1242\end{enum}
1243
1244
1245\subsection{Regression Testing}
1246\label{regression-testing}
1247
1248\begin{enum}
1249\opnodefault{expect}{string}
1250Specifies the expected outcome, which must be one of the following:
1251
1252\begin{enum}
1253\item[\labelitemi] \textbf{\textit{some}:} Sledgehammer found a proof.
1254\item[\labelitemi] \textbf{\textit{none}:} Sledgehammer found no proof.
1255\item[\labelitemi] \textbf{\textit{timeout}:} Sledgehammer timed out.
1256\item[\labelitemi] \textbf{\textit{unknown}:} Sledgehammer encountered some
1257problem.
1258\end{enum}
1259
1260Sledgehammer emits an error if the actual outcome differs from the expected outcome. This option is
1261useful for regression testing.
1262
1263\nopagebreak
1264{\small See also \textit{timeout} (\S\ref{timeouts}).}
1265\end{enum}
1266
1267
1268\subsection{Timeouts}
1269\label{timeouts}
1270
1271\begin{enum}
1272\opdefault{timeout}{float}{\upshape 30}
1273Specifies the maximum number of seconds that the automatic provers should spend
1274searching for a proof. This excludes problem preparation and is a soft limit.
1275
1276\opdefault{preplay\_timeout}{float}{\upshape 1}
1277Specifies the maximum number of seconds that \textit{metis} or other proof
1278methods should spend trying to ``preplay'' the found proof. If this option
1279is set to 0, no preplaying takes place, and no timing information is displayed
1280next to the suggested proof method calls.
1281
1282\nopagebreak
1283{\small See also \textit{minimize} (\S\ref{mode-of-operation}).}
1284
1285\optrueonly{dont\_preplay}
1286Alias for ``\textit{preplay\_timeout} = 0''.
1287
1288\end{enum}
1289
1290\let\em=\sl
1291\bibliography{manual}{}
1292\bibliographystyle{abbrv}
1293
1294\end{document}
1295