1\documentclass[a4paper,12pt]{article} 2\usepackage{lmodern} 3\usepackage[T1]{fontenc} 4\usepackage{amsmath} 5\usepackage{amssymb} 6\usepackage[english]{babel} 7\usepackage{color} 8\usepackage{footmisc} 9\usepackage{graphicx} 10%\usepackage{mathpazo} 11\usepackage{multicol} 12\usepackage{stmaryrd} 13%\usepackage[scaled=.85]{beramono} 14\usepackage{isabelle,iman,pdfsetup} 15 16\newcommand\download{\url{https://isabelle.in.tum.de/components/}} 17 18\let\oldS=\S 19\def\S{\oldS\,} 20 21\def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}} 22\def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$} 23 24\newcommand\const[1]{\textsf{#1}} 25 26%\oddsidemargin=4.6mm 27%\evensidemargin=4.6mm 28%\textwidth=150mm 29%\topmargin=4.6mm 30%\headheight=0mm 31%\headsep=0mm 32%\textheight=234mm 33 34\def\Colon{\mathord{:\mkern-1.5mu:}} 35%\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}} 36%\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}} 37\def\lparr{\mathopen{(\mkern-4mu\mid}} 38\def\rparr{\mathclose{\mid\mkern-4mu)}} 39 40\def\unk{{?}} 41\def\undef{(\lambda x.\; \unk)} 42%\def\unr{\textit{others}} 43\def\unr{\ldots} 44\def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}} 45\def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}} 46 47\urlstyle{tt} 48 49\renewcommand\_{\hbox{\textunderscore\kern-.05ex}} 50 51\begin{document} 52 53%%% TYPESETTING 54%\renewcommand\labelitemi{$\bullet$} 55\renewcommand\labelitemi{\raise.065ex\hbox{\small\textbullet}} 56 57\title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex] 58Hammering Away \\[\smallskipamount] 59\Large A User's Guide to Sledgehammer for Isabelle/HOL} 60\author{\hbox{} \\ 61Jasmin Blanchette \\ 62{\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount] 63{\normalsize with contributions from} \\[4\smallskipamount] 64Lawrence C. Paulson \\ 65{\normalsize Computer Laboratory, University of Cambridge} \\ 66\hbox{}} 67 68\maketitle 69 70\tableofcontents 71 72\setlength{\parskip}{.7em plus .2em minus .1em} 73\setlength{\parindent}{0pt} 74\setlength{\abovedisplayskip}{\parskip} 75\setlength{\abovedisplayshortskip}{.9\parskip} 76\setlength{\belowdisplayskip}{\parskip} 77\setlength{\belowdisplayshortskip}{.9\parskip} 78 79% general-purpose enum environment with correct spacing 80\newenvironment{enum}% 81 {\begin{list}{}{% 82 \setlength{\topsep}{.1\parskip}% 83 \setlength{\partopsep}{.1\parskip}% 84 \setlength{\itemsep}{\parskip}% 85 \advance\itemsep by-\parsep}} 86 {\end{list}} 87 88\def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin 89\advance\rightskip by\leftmargin} 90\def\post{\vskip0pt plus1ex\endgroup} 91 92\def\prew{\pre\advance\rightskip by-\leftmargin} 93\def\postw{\post} 94 95 96\section{Introduction} 97\label{introduction} 98 99Sledgehammer is a tool that applies automatic theorem provers (ATPs) 100and satisfiability-modulo-theories (SMT) solvers on the current goal.% 101\footnote{The distinction between ATPs and SMT solvers is convenient but mostly 102historical.} 103% 104The supported ATPs are agsyHOL \cite{agsyHOL}, Alt-Ergo \cite{alt-ergo}, E 105\cite{schulz-2002}, iProver \cite{korovin-2009}, LEO-II \cite{leo2}, Leo-III 106\cite{leo3}, Satallax \cite{satallax}, SNARK \cite{snark}, SPASS 107\cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002}, 108Waldmeister \cite{waldmeister}, and Zipperposition \cite{cruanes-2014}. The 109ATPs are run either locally or remotely via the System\-On\-TPTP web service 110\cite{sutcliffe-2000}. The supported SMT solvers are CVC3 \cite{cvc3}, CVC4 111\cite{cvc4}, veriT \cite{bouton-et-al-2009}, and Z3 \cite{z3}. These are 112always run locally. 113 114The problem passed to the external provers (or solvers) consists of your current 115goal together with a heuristic selection of hundreds of facts (theorems) from the 116current theory context, filtered by relevance. 117 118The result of a successful proof search is some source text that usually (but 119not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed 120proof typically relies on the general-purpose \textit{metis} proof method, which 121integrates the Metis ATP in Isabelle/HOL with explicit inferences going through 122the kernel. Thus its results are correct by construction. 123 124For Isabelle/jEdit users, Sledgehammer provides an automatic mode that can be 125enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options > 126Isabelle > General.'' In this mode, a reduced version of Sledgehammer is run on 127every newly entered theorem for a few seconds. 128 129\newbox\boxA 130\setbox\boxA=\hbox{\texttt{NOSPAM}} 131 132\newcommand\authoremail{\texttt{blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak 133in.\allowbreak tum.\allowbreak de}} 134 135To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is 136imported---this is rarely a problem in practice since it is part of 137\textit{Main}. Examples of Sledgehammer use can be found in Isabelle's 138\texttt{src/HOL/Metis\_Examples} directory. 139Comments and bug reports concerning Sledgehammer or this manual should be 140directed to the author at \authoremail. 141 142 143\section{Installation} 144\label{installation} 145 146Sledgehammer is part of Isabelle, so you do not need to install it. However, it 147relies on third-party automatic provers (ATPs and SMT solvers). 148 149Among the ATPs, agsyHOL, Alt-Ergo, E, LEO-II, Leo-III, Satallax, SPASS, 150Vampire, and Zipperposition can be run locally; in addition, agsyHOL, 151Alt-Ergo, E, iProver, LEO-II, Leo-III, Satallax, SNARK, Vampire, Waldmeister, 152and Zipperposition are available remotely via System\-On\-TPTP 153\cite{sutcliffe-2000}. The SMT solvers CVC3, CVC4, veriT, and Z3 can be run 154locally. 155 156There are three main ways to install automatic provers on your machine: 157 158\begin{sloppy} 159\begin{enum} 160\item[\labelitemi] If you installed an official Isabelle package, it should 161already include properly setup executables for CVC4, E, SPASS, Vampire, and Z3, 162ready to use. To use Vampire, you must confirm that you are a noncommercial 163user, as indicated by the message that is displayed when Sledgehammer is 164invoked the first time. 165 166\item[\labelitemi] Alternatively, you can download the Isabelle-aware CVC3, 167CVC4, E, SPASS, Vampire, and Z3 binary packages from \download. Extract the 168archives, then add a line to your \texttt{\$ISABELLE\_HOME\_USER\slash etc\slash 169components}% 170\footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at 171startup. Its value can be retrieved by executing \texttt{isabelle} 172\texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.} 173file with the absolute path to CVC3, CVC4, E, SPASS, Vampire, or Z3. For 174example, if the \texttt{components} file does not exist yet and you extracted 175SPASS to \texttt{/usr/local/spass-3.8ds}, create it with the single line 176 177\prew 178\texttt{/usr/local/spass-3.8ds} 179\postw 180 181in it. 182 183\item[\labelitemi] If you prefer to build agsyHOL, Alt-Ergo, E, LEO-II, 184Leo-III, or Satallax manually, set the environment variable 185\texttt{AGSYHOL\_HOME}, \texttt{E\_HOME}, \texttt{LEO2\_HOME}, 186\texttt{LEO3\_HOME}, or \texttt{SATALLAX\_HOME} 187to the directory that contains the \texttt{agsyHOL}, 188\texttt{eprover} (and/or \texttt{eproof} or \texttt{eproof\_ram}), 189\texttt{leo}, \texttt{leo3}, or \texttt{satallax} executable; 190for Alt-Ergo, set the environment variable \texttt{WHY3\_HOME} to the 191directory that contains the \texttt{why3} executable. Sledgehammer has been 192tested with agsyHOL 1.0, Alt-Ergo 0.95.2, E 1.6 to 2.0, LEO-II 1.3.4, Leo-III 1931.1, and Satallax 2.7. Since the ATPs' output formats are neither documented 194nor stable, other versions might not work well with Sledgehammer. Ideally, you 195should also set \texttt{E\_VERSION}, \texttt{LEO2\_VERSION}, 196\texttt{LEO3\_VERSION}, or \texttt{SATALLAX\_VERSION} to the prover's version 197number (e.g., ``2.7''); this might help Sledgehammer invoke the prover 198optimally. 199 200Similarly, if you want to install CVC3, CVC4, veriT, or Z3, set the environment 201variable \texttt{CVC3\_\allowbreak SOLVER}, \texttt{CVC4\_\allowbreak SOLVER}, 202\texttt{VERIT\_\allowbreak SOLVER}, or \texttt{Z3\_SOLVER} to the complete path 203of the executable, \emph{including the file name}. Sledgehammer has been tested 204with CVC3 2.2 and 2.4.1, CVC4 1.5-prerelease, veriT smtcomp2019, and Z3 4.3.2. 205Since Z3's output format is somewhat unstable, other versions of the solver 206might not work well with Sledgehammer. Ideally, also set 207\texttt{CVC3\_VERSION}, \texttt{CVC4\_VERSION}, \texttt{VERIT\_VERSION}, or 208\texttt{Z3\_VERSION} to the solver's version number (e.g., ``4.4.0''). 209\end{enum} 210\end{sloppy} 211 212To check whether the provers are successfully installed, try out the example 213in \S\ref{first-steps}. If the remote versions of any of these provers is used 214(identified by the prefix ``\textit{remote\_\/}''), or if the local versions 215fail to solve the easy goal presented there, something must be wrong with the 216installation. 217 218Remote prover invocation requires Perl with the World Wide Web Library 219(\texttt{libwww-perl}) installed. If you must use a proxy server to access the 220Internet, set the \texttt{http\_proxy} environment variable to the proxy, either 221in the environment in which Isabelle is launched or in your 222\texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few 223examples: 224 225\prew 226\texttt{http\_proxy=http://proxy.example.org} \\ 227\texttt{http\_proxy=http://proxy.example.org:8080} \\ 228\texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org} 229\postw 230 231 232\section{First Steps} 233\label{first-steps} 234 235To illustrate Sledgehammer in context, let us start a theory file and 236attempt to prove a simple lemma: 237 238\prew 239\textbf{theory}~\textit{Scratch} \\ 240\textbf{imports}~\textit{Main} \\ 241\textbf{begin} \\[2\smallskipamount] 242% 243\textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\ 244\textbf{sledgehammer} 245\postw 246 247Instead of issuing the \textbf{sledgehammer} command, you can also use the 248Sledgehammer panel in Isabelle/jEdit. Sledgehammer produces the following output 249after a few seconds: 250 251\prew 252\slshape 253Proof found\ldots \\ 254``\textit{e\/}'': Try this: \textbf{by} \textit{simp} (0.3 ms) \\ 255% 256``\textit{cvc4\/}'': Try this: \textbf{by} \textit{simp} (0.4 ms) \\ 257% 258``\textit{z3\/}'': Try this: \textbf{by} \textit{simp} (0.5 ms) \\ 259% 260``\textit{spass\/}'': Try this: \textbf{by} \textit{simp} (0.3 ms) 261% 262\postw 263 264Sledgehammer ran CVC4, E, SPASS, and Z3 in parallel. Depending on which 265provers are installed and how many processor cores are available, some of the 266provers might be missing or present with a \textit{remote\_} prefix. 267 268For each successful prover, Sledgehammer gives a one-line \textit{metis} or 269\textit{smt} method call. Rough timings are shown in parentheses, indicating how 270fast the call is. You can click the proof to insert it into the theory text. 271 272In addition, you can ask Sledgehammer for an Isar text proof by enabling the 273\textit{isar\_proofs} option (\S\ref{output-format}): 274 275\prew 276\textbf{sledgehammer} [\textit{isar\_proofs}] 277\postw 278 279When Isar proof construction is successful, it can yield proofs that are more 280readable and also faster than the \textit{metis} or \textit{smt} one-line 281proofs. This feature is experimental and is only available for ATPs. 282 283 284\section{Hints} 285\label{hints} 286 287This section presents a few hints that should help you get the most out of 288Sledgehammer. Frequently asked questions are answered in 289\S\ref{frequently-asked-questions}. 290 291%\newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak} 292\newcommand\point[1]{\subsection{\emph{#1}}} 293 294 295\point{Presimplify the goal} 296 297For best results, first simplify your problem by calling \textit{auto} or at 298least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide 299arithmetic decision procedures, but the ATPs typically do not (or if they do, 300Sledgehammer does not use it yet). Apart from Waldmeister, they are not 301particularly good at heavy rewriting, but because they regard equations as 302undirected, they often prove theorems that require the reverse orientation of a 303\textit{simp} rule. Higher-order problems can be tackled, but the success rate 304is better for first-order problems. Hence, you may get better results if you 305first simplify the problem to remove higher-order features. 306 307 308\point{Familiarize yourself with the main options} 309 310Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of 311the options are very specialized, but serious users of the tool should at least 312familiarize themselves with the following options: 313 314\begin{enum} 315\item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies 316the automatic provers (ATPs and SMT solvers) that should be run whenever 317Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{cvc4 e spass 318vampire\/}''). For convenience, you can omit ``\textit{provers}~='' 319and simply write the prover names as a space-separated list (e.g., ``\textit{cvc4 e 320spass vampire\/}''). 321 322\item[\labelitemi] \textbf{\textit{max\_facts}} (\S\ref{relevance-filter}) 323specifies the maximum number of facts that should be passed to the provers. By 324default, the value is prover-dependent but varies between about 50 and 1000. If 325the provers time out, you can try lowering this value to, say, 25 or 50 and see 326if that helps. 327 328\item[\labelitemi] \textbf{\textit{isar\_proofs}} (\S\ref{output-format}) specifies 329that Isar proofs should be generated, in addition to one-line \textit{metis} or 330\textit{smt} proofs. The length of the Isar proofs can be controlled by setting 331\textit{compress} (\S\ref{output-format}). 332 333\item[\labelitemi] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the 334provers' time limit. It is set to 30 seconds by default. 335\end{enum} 336 337Options can be set globally using \textbf{sledgehammer\_params} 338(\S\ref{command-syntax}). The command also prints the list of all available 339options with their current value. Fact selection can be influenced by specifying 340``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call 341to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$'' 342to force Sledgehammer to run only with $\textit{my\_facts}$ (and any facts 343chained into the goal). 344 345 346\section{Frequently Asked Questions} 347\label{frequently-asked-questions} 348 349This sections answers frequently (and infrequently) asked questions about 350Sledgehammer. It is a good idea to skim over it now even if you do not have any 351questions at this stage. And if you have any further questions not listed here, 352send them to the author at \authoremail. 353 354 355\point{Which facts are passed to the automatic provers?} 356 357Sledgehammer heuristically selects a few hundred relevant lemmas from the 358currently loaded libraries. The component that performs this selection is 359called \emph{relevance filter} (\S\ref{relevance-filter}). 360 361\begin{enum} 362\item[\labelitemi] 363The traditional relevance filter, called \emph{MePo} 364(\underline{Me}ng--\underline{Pau}lson), assigns a score to every available fact 365(lemma, theorem, definition, or axiom) based upon how many constants that fact 366shares with the conjecture. This process iterates to include facts relevant to 367those just accepted. The constants are weighted to give unusual ones greater 368significance. MePo copes best when the conjecture contains some unusual 369constants; if all the constants are common, it is unable to discriminate among 370the hundreds of facts that are picked up. The filter is also memoryless: It has 371no information about how many times a particular fact has been used in a proof, 372and it cannot learn. 373 374\item[\labelitemi] 375An alternative to MePo is \emph{MaSh} (\underline{Ma}chine Learner for 376\underline{S}ledge\underline{h}ammer). It applies machine learning to the 377problem of finding relevant facts. 378 379\item[\labelitemi] The \emph{MeSh} filter combines MePo and MaSh. This is 380the default. 381\end{enum} 382 383The number of facts included in a problem varies from prover to prover, since 384some provers get overwhelmed more easily than others. You can show the number of 385facts given using the \textit{verbose} option (\S\ref{output-format}) and the 386actual facts using \textit{debug} (\S\ref{output-format}). 387 388Sledgehammer is good at finding short proofs combining a handful of existing 389lemmas. If you are looking for longer proofs, you must typically restrict the 390number of facts, by setting the \textit{max\_facts} option 391(\S\ref{relevance-filter}) to, say, 25 or 50. 392 393You can also influence which facts are actually selected in a number of ways. If 394you simply want to ensure that a fact is included, you can specify it using the 395``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example: 396% 397\prew 398\textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps}) 399\postw 400% 401The specified facts then replace the least relevant facts that would otherwise be 402included; the other selected facts remain the same. 403If you want to direct the selection in a particular direction, you can specify 404the facts via \textbf{using}: 405% 406\prew 407\textbf{using} \textit{hd.simps} \textit{tl.simps} \\ 408\textbf{sledgehammer} 409\postw 410% 411The facts are then more likely to be selected than otherwise, and if they are 412selected at iteration $j$ they also influence which facts are selected at 413iterations $j + 1$, $j + 2$, etc. To give them even more weight, try 414% 415\prew 416\textbf{using} \textit{hd.simps} \textit{tl.simps} \\ 417\textbf{apply}~\textbf{--} \\ 418\textbf{sledgehammer} 419\postw 420 421 422\point{Why does Metis fail to reconstruct the proof?} 423 424There are many reasons. If Metis runs seemingly forever, that is a sign that the 425proof is too difficult for it. Metis's search is complete for first-order logic 426with equality, so if the proof was found by a superposition-based ATP such as 427E, SPASS, or Vampire, Metis should eventually find it, but that is little 428consolation. 429 430In some rare cases, \textit{metis} fails fairly quickly, and you get the error 431message ``One-line proof reconstruction failed.'' This indicates that 432Sledgehammer determined that the goal is provable, but the proof is, for 433technical reasons, beyond \textit{metis}'s power. You can then try again with 434the \textit{strict} option (\S\ref{problem-encoding}). 435 436If the goal is actually unprovable and you did not specify an unsound encoding 437using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are 438strongly encouraged to report this to the author at \authoremail. 439 440 441\point{How can I tell whether a suggested proof is sound?} 442 443Earlier versions of Sledgehammer often suggested unsound proofs---either proofs 444of nontheorems or simply proofs that rely on type-unsound inferences. This 445is a thing of the past, unless you explicitly specify an unsound encoding 446using \textit{type\_enc} (\S\ref{problem-encoding}). 447% 448Officially, the only form of ``unsoundness'' that lurks in the sound 449encodings is related to missing characteristic theorems of datatypes. For 450example, 451 452\prew 453\textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\ 454\textbf{sledgehammer} () 455\postw 456 457suggests an argumentless \textit{metis} call that fails. However, the conjecture 458does actually hold, and the \textit{metis} call can be repaired by adding 459\textit{list.distinct}. 460% 461We hope to address this problem in a future version of Isabelle. In the 462meantime, you can avoid it by passing the \textit{strict} option 463(\S\ref{problem-encoding}). 464 465 466\point{What are the \textit{full\_types}, \textit{no\_types}, and 467\textit{mono\_tags} arguments to Metis?} 468 469The \textit{metis}~(\textit{full\_types}) proof method 470and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed 471versions of Metis. It is somewhat slower than \textit{metis}, but the proof 472search is fully typed, and it also includes more powerful rules such as the 473axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in 474higher-order places (e.g., in set comprehensions). The method is automatically 475tried as a fallback when \textit{metis} fails, and it is sometimes 476generated by Sledgehammer instead of \textit{metis} if the proof obviously 477requires type information or if \textit{metis} failed when Sledgehammer 478preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with 479various sets of option for up to 1~second each time to ensure that the generated 480one-line proofs actually work and to display timing information. This can be 481configured using the \textit{preplay\_timeout} and \textit{dont\_preplay} 482options (\S\ref{timeouts}).) 483% 484At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types}) 485uses no type information at all during the proof search, which is more efficient 486but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally 487generated by Sledgehammer. 488% 489See the \textit{type\_enc} option (\S\ref{problem-encoding}) for details. 490 491Incidentally, if you ever see warnings such as 492 493\prew 494\slshape 495Metis: Falling back on ``\textit{metis} (\textit{full\_types})'' 496\postw 497 498for a successful \textit{metis} proof, you can advantageously pass the 499\textit{full\_types} option to \textit{metis} directly. 500 501 502\point{And what are the \textit{lifting} and \textit{hide\_lams} arguments 503to Metis?} 504 505Orthogonally to the encoding of types, it is important to choose an appropriate 506translation of $\lambda$-abstractions. Metis supports three translation schemes, 507in decreasing order of power: Curry combinators (the default), 508$\lambda$-lifting, and a ``hiding'' scheme that disables all reasoning under 509$\lambda$-abstractions. The more powerful schemes also give the automatic 510provers more rope to hang themselves. See the \textit{lam\_trans} option (\S\ref{problem-encoding}) for details. 511 512 513\point{Are the generated proofs minimal?} 514 515Automatic provers frequently use many more facts than are necessary. 516Sledgehammer includes a minimization tool that takes a set of facts returned by 517a given prover and repeatedly calls a prover or proof method with subsets of 518those facts to find a minimal set. Reducing the number of facts typically helps 519reconstruction, while also removing superfluous clutter from the proof scripts. 520 521In earlier versions of Sledgehammer, generated proofs were systematically 522accompanied by a suggestion to invoke the minimization tool. This step is now 523performed by default but can be disabled using the \textit{minimize} option 524(\S\ref{mode-of-operation}). 525 526 527\point{A strange error occurred---what should I do?} 528 529Sledgehammer tries to give informative error messages. Please report any strange 530error to the author at \authoremail. 531 532 533\point{Auto can solve it---why not Sledgehammer?} 534 535Problems can be easy for \textit{auto} and difficult for automatic provers, but 536the reverse is also true, so do not be discouraged if your first attempts fail. 537Because the system refers to all theorems known to Isabelle, it is particularly 538suitable when your goal has a short proof but requires lemmas that you do not 539know about. 540 541 542\point{Why are there so many options?} 543 544Sledgehammer's philosophy should work out of the box, without user guidance. 545Many of the options are meant to be used mostly by the Sledgehammer developers 546for experiments. Of course, feel free to try them out if you are so inclined. 547 548 549\section{Command Syntax} 550\label{command-syntax} 551 552\subsection{Sledgehammer} 553\label{sledgehammer} 554 555Sledgehammer can be invoked at any point when there is an open goal by entering 556the \textbf{sledgehammer} command in the theory file. Its general syntax is as 557follows: 558 559\prew 560\textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$ 561\postw 562 563In the general syntax, the \qty{subcommand} may be any of the following: 564 565\begin{enum} 566\item[\labelitemi] \textbf{\textit{run} (the default):} Runs Sledgehammer on 567subgoal number \qty{num} (1 by default), with the given options and facts. 568 569\item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of 570automatic provers supported by Sledgehammer. See \S\ref{installation} and 571\S\ref{mode-of-operation} for more information on how to install automatic 572provers. 573 574\item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote 575ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}. 576\end{enum} 577 578In addition, the following subcommands provide finer control over machine 579learning with MaSh: 580 581\begin{enum} 582\item[\labelitemi] \textbf{\textit{unlearn}:} Resets MaSh, erasing any 583persistent state. 584 585\item[\labelitemi] \textbf{\textit{learn\_isar}:} Invokes MaSh on the current 586theory to process all the available facts, learning from their Isabelle/Isar 587proofs. This happens automatically at Sledgehammer invocations if the 588\textit{learn} option (\S\ref{relevance-filter}) is enabled. 589 590\item[\labelitemi] \textbf{\textit{learn\_prover}:} Invokes MaSh on the current 591theory to process all the available facts, learning from proofs generated by 592automatic provers. The prover to use and its timeout can be set using the 593\textit{prover} (\S\ref{mode-of-operation}) and \textit{timeout} 594(\S\ref{timeouts}) options. It is recommended to perform learning using a 595first-order ATP (such as E, SPASS, and Vampire) as opposed to a 596higher-order ATP or an SMT solver. 597 598\item[\labelitemi] \textbf{\textit{relearn\_isar}:} Same as \textit{unlearn} 599followed by \textit{learn\_isar}. 600 601\item[\labelitemi] \textbf{\textit{relearn\_prover}:} Same as \textit{unlearn} 602followed by \textit{learn\_prover}. 603\end{enum} 604 605Sledgehammer's behavior can be influenced by various \qty{options}, which can be 606specified in brackets after the \textbf{sledgehammer} command. The 607\qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1, 608\ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true\/}'' is optional. 609For example: 610 611\prew 612\textbf{sledgehammer} [\textit{isar\_proofs}, \,\textit{timeout} = 120] 613\postw 614 615Default values can be set using \textbf{sledgehammer\_\allowbreak params}: 616 617\prew 618\textbf{sledgehammer\_params} \qty{options} 619\postw 620 621The supported options are described in \S\ref{option-reference}. 622 623The \qty{facts\_override} argument lets you alter the set of facts that go 624through the relevance filter. It may be of the form ``(\qty{facts})'', where 625\qty{facts} is a space-separated list of Isabelle facts (theorems, local 626assumptions, etc.), in which case the relevance filter is bypassed and the given 627facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'', 628``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\ 629\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to 630proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}} 631highly-relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant. 632 633If you use Isabelle/jEdit, Sledgehammer also provides an automatic mode that can 634be enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options 635> Isabelle > General.'' For automatic runs, only the first prover set using 636\textit{provers} (\S\ref{mode-of-operation}) is considered (typically E), 637\textit{slice} (\S\ref{mode-of-operation}) is disabled, 638fewer facts are 639passed to the prover, \textit{fact\_filter} (\S\ref{relevance-filter}) is set to 640\textit{mepo}, \textit{strict} (\S\ref{problem-encoding}) is enabled, 641\textit{verbose} (\S\ref{output-format}) and \textit{debug} 642(\S\ref{output-format}) are disabled, and \textit{timeout} (\S\ref{timeouts}) is 643superseded by the ``Auto Time Limit'' option in jEdit. Sledgehammer's output is 644also more concise. 645 646 647\subsection{Metis} 648\label{metis} 649 650The \textit{metis} proof method has the syntax 651 652\prew 653\textbf{\textit{metis}}~(\qty{options})${}^?$~\qty{facts}${}^?$ 654\postw 655 656where \qty{facts} is a list of arbitrary facts and \qty{options} is a 657comma-separated list consisting of at most one $\lambda$ translation scheme 658specification with the same semantics as Sledgehammer's \textit{lam\_trans} 659option (\S\ref{problem-encoding}) and at most one type encoding specification 660with the same semantics as Sledgehammer's \textit{type\_enc} option 661(\S\ref{problem-encoding}). 662% 663The supported $\lambda$ translation schemes are \textit{hide\_lams}, 664\textit{lifting}, and \textit{combs} (the default). 665% 666All the untyped type encodings listed in \S\ref{problem-encoding} are supported. 667For convenience, the following aliases are provided: 668\begin{enum} 669\item[\labelitemi] \textbf{\textit{full\_types}:} Alias for \textit{poly\_guards\_query}. 670\item[\labelitemi] \textbf{\textit{partial\_types}:} Alias for \textit{poly\_args}. 671\item[\labelitemi] \textbf{\textit{no\_types}:} Alias for \textit{erased}. 672\end{enum} 673 674 675\section{Option Reference} 676\label{option-reference} 677 678\def\defl{\{} 679\def\defr{\}} 680 681\def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}} 682\def\optrueonly#1{\flushitem{\textit{#1} $\bigl[$= \textit{true}$\bigr]$\enskip}\nopagebreak\\[\parskip]} 683\def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]} 684\def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]} 685\def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]} 686\def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]} 687\def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]} 688\def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]} 689\def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]} 690\def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]} 691\def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]} 692\def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]} 693 694Sledgehammer's options are categorized as follows:\ mode of operation 695(\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}), 696relevance filter (\S\ref{relevance-filter}), output format 697(\S\ref{output-format}), regression testing (\S\ref{regression-testing}), 698and timeouts (\S\ref{timeouts}). 699 700The descriptions below refer to the following syntactic quantities: 701 702\begin{enum} 703\item[\labelitemi] \qtybf{string}: A string. 704\item[\labelitemi] \qtybf{bool\/}: \textit{true} or \textit{false}. 705\item[\labelitemi] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or 706\textit{smart}. 707\item[\labelitemi] \qtybf{int\/}: An integer. 708\item[\labelitemi] \qtybf{float}: A floating-point number (e.g., 2.5 or 60) 709expressing a number of seconds. 710\item[\labelitemi] \qtybf{float\_pair\/}: A pair of floating-point numbers 711(e.g., 0.6 0.95). 712\item[\labelitemi] \qtybf{smart\_int\/}: An integer or \textit{smart}. 713\end{enum} 714 715Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options 716have a negative counterpart (e.g., \textit{minimize} vs.\ 717\textit{dont\_minimize}). When setting Boolean options or their negative 718counterparts, ``= \textit{true\/}'' may be omitted. 719 720 721\subsection{Mode of Operation} 722\label{mode-of-operation} 723 724\begin{enum} 725\opnodefaultbrk{provers}{string} 726Specifies the automatic provers to use as a space-separated list (e.g., 727``\textit{cvc4}~\textit{e}~\textit{spass}~\textit{vampire\/}''). 728Provers can be run locally or remotely; see \S\ref{installation} for 729installation instructions. 730 731The following local provers are supported: 732 733\begin{sloppy} 734\begin{enum} 735\item[\labelitemi] \textbf{\textit{agsyhol}:} agsyHOL is an automatic 736higher-order prover developed by Fredrik Lindblad \cite{agsyHOL}. To use 737agsyHOL, set the environment variable \texttt{AGSYHOL\_HOME} to the directory 738that contains the \texttt{agsyHOL} executable. Sledgehammer has been tested 739with version 1.0. 740 741\item[\labelitemi] \textbf{\textit{alt\_ergo}:} Alt-Ergo is a polymorphic 742ATP developed by Bobot et al.\ \cite{alt-ergo}. 743It supports the TPTP polymorphic typed first-order format (TF1) via Why3 744\cite{why3}. To use Alt-Ergo, set the environment variable \texttt{WHY3\_HOME} 745to the directory that contains the \texttt{why3} executable. Sledgehammer 746requires Alt-Ergo 0.95.2 and Why3 0.83. 747 748\item[\labelitemi] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by 749Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3, 750set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the 751executable, including the file name, or install the prebuilt CVC3 package from 752\download. Sledgehammer has been tested with versions 2.2 and 2.4.1. 753 754\item[\labelitemi] \textbf{\textit{cvc4}:} CVC4 \cite{cvc4} is the successor to 755CVC3. To use CVC4, set the environment variable \texttt{CVC4\_SOLVER} to the 756complete path of the executable, including the file name, or install the 757prebuilt CVC4 package from \download. Sledgehammer has been tested with version 7581.5-prerelease. 759 760\item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover 761developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment 762variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof} 763executable and \texttt{E\_VERSION} to the version number (e.g., ``1.8''), or 764install the prebuilt E package from \download. Sledgehammer has been tested with 765versions 1.6 to 1.8. 766 767\item[\labelitemi] \textbf{\textit{e\_par}:} E-Par is an experimental metaprover 768developed by Josef Urban that implements strategy scheduling on top of E. To use 769E-Par, set the environment variable \texttt{E\_HOME} to the directory that 770contains the \texttt{runepar.pl} script and the \texttt{eprover} and 771\texttt{epclextract} executables, or use the prebuilt E package from \download. 772Be aware that E-Par is experimental software. It has been known to generate 773zombie processes. Use at your own risks. 774 775\item[\labelitemi] \textbf{\textit{ehoh}:} Ehoh is an experimental version of 776E that supports a $\lambda$-free fragment of higher-order logic. Use at your 777own risks. 778 779\item[\labelitemi] \textbf{\textit{iprover}:} iProver is a pure 780instantiation-based prover developed by Konstantin Korovin 781\cite{korovin-2009}. To use iProver, set the environment variable 782\texttt{IPROVER\_HOME} to the directory that contains the \texttt{iproveropt} 783executable. Sledgehammer has been tested with version 2.8. iProver depends on 784E to clausify problems, so make sure that E is installed as well. 785 786\item[\labelitemi] \textbf{\textit{leo2}:} LEO-II is an automatic 787higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2}, 788with support for the TPTP typed higher-order syntax (TH0). To use LEO-II, set 789the environment variable \texttt{LEO2\_HOME} to the directory that contains the 790\texttt{leo} executable. Sledgehammer requires version 1.3.4 or above. 791 792\item[\labelitemi] \textbf{\textit{leo3}:} Leo-III is an automatic 793higher-order prover developed by Alexander Steen, Max Wisniewski, Christoph 794Benzm\"uller et al.\ \cite{leo3}, 795with support for the TPTP typed higher-order syntax (TH0). To use Leo-III, set 796the environment variable \texttt{LEO3\_HOME} to the directory that contains the 797\texttt{leo3} executable. Sledgehammer requires version 1.1 or above. 798 799\item[\labelitemi] \textbf{\textit{satallax}:} Satallax is an automatic 800higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with 801support for the TPTP typed higher-order syntax (TH0). To use Satallax, set the 802environment variable \texttt{SATALLAX\_HOME} to the directory that contains the 803\texttt{satallax} executable. Sledgehammer requires version 2.2 or above. 804 805\item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution 806prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}. 807To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory 808that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the 809version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from 810\download. Sledgehammer requires version 3.8ds or above. 811 812\item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order 813resolution prover developed by Andrei Voronkov and his colleagues 814\cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable 815\texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire} 816executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., 817``4.2.2''). Sledgehammer has been tested with versions 1.8 to 4.2.2 (in the 818post-2010 numbering scheme). 819 820\item[\labelitemi] \textbf{\textit{verit}:} veriT \cite{bouton-et-al-2009} is an 821SMT solver developed by David D\'eharbe, Pascal Fontaine, and their colleagues. 822It is specifically designed to produce detailed proofs for reconstruction in 823proof assistants. To use veriT, set the environment variable 824\texttt{VERIT\_SOLVER} to the complete path of the executable, including the 825file name. Sledgehammer has been tested with version smtcomp2019. 826 827\item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at 828Microsoft Research \cite{z3}. To use Z3, set the environment variable 829\texttt{Z3\_SOLVER} to the complete path of the executable, including the 830file name. Sledgehammer has been tested with a pre-release version of 4.4.0. 831 832\item[\labelitemi] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to 833be an ATP, exploiting Z3's support for the TPTP typed first-order format 834(TF0). It is included for experimental purposes. It requires version 4.3.1 of 835Z3 or above. To use it, set the environment variable \texttt{Z3\_TPTP\_HOME} 836to the directory that contains the \texttt{z3\_tptp} executable. 837 838\item[\labelitemi] \textbf{\textit{zipperposition}:} Zipperposition 839\cite{cruanes-2014} is a first-order resolution prover developed by Simon 840Cruanes and colleagues. To use Zipperposition, set the environment variable 841\texttt{ZIPPERPOSITION\_HOME} to the directory that contains the 842\texttt{zipperposition} executable. 843\end{enum} 844 845\end{sloppy} 846 847Moreover, the following remote provers are supported: 848 849\begin{enum} 850\item[\labelitemi] \textbf{\textit{remote\_agsyhol}:} The remote version of 851agsyHOL runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. 852 853\item[\labelitemi] \textbf{\textit{remote\_alt\_ergo}:} The remote version of 854Alt-Ergo runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. 855 856\item[\labelitemi] \textbf{\textit{remote\_e}:} The remote version of E runs 857on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. 858 859\item[\labelitemi] \textbf{\textit{remote\_iprover}:} The 860remote version of iProver runs on Geoff Sutcliffe's Miami servers 861\cite{sutcliffe-2000}. 862 863\item[\labelitemi] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II 864runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. 865 866\item[\labelitemi] \textbf{\textit{remote\_leo3}:} The remote version of Leo-III 867runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}. 868 869\item[\labelitemi] \textbf{\textit{remote\_pirate}:} Pirate is a 870highly experimental first-order resolution prover developed by Daniel Wand. 871The remote version of Pirate run on a private server he generously set up. 872 873\item[\labelitemi] \textbf{\textit{remote\_snark}:} SNARK is a first-order 874resolution prover developed by Stickel et al.\ \cite{snark}. The remote 875version of SNARK runs on Geoff Sutcliffe's Miami servers. 876 877\item[\labelitemi] \textbf{\textit{remote\_vampire}:} The remote version of 878Vampire runs on Geoff Sutcliffe's Miami servers. 879 880\item[\labelitemi] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit 881equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be 882used to prove universally quantified equations using unconditional equations, 883corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister 884runs on Geoff Sutcliffe's Miami servers. 885 886\item[\labelitemi] \textbf{\textit{remote\_zipperposition}:} The remote 887version of Zipperposition runs on Geoff Sutcliffe's Miami servers. 888\end{enum} 889 890By default, Sledgehammer runs a subset of CVC4, E, SPASS, Vampire, veriT, and 891Z3 in parallel, either locally or remotely---depending on the number of 892processor cores available and on which provers are actually installed. It is 893generally desirable to run several provers in parallel. 894 895\opnodefault{prover}{string} 896Alias for \textit{provers}. 897 898\optrue{slice}{dont\_slice} 899Specifies whether the time allocated to a prover should be sliced into several 900segments, each of which has its own set of possibly prover-dependent options. 901For SPASS and Vampire, the first slice tries the fast but incomplete 902set-of-support (SOS) strategy, whereas the second slice runs without it. For E, 903up to three slices are tried, with different weighted search strategies and 904number of facts. For SMT solvers, several slices are tried with the same options 905each time but fewer and fewer facts. According to benchmarks with a timeout of 90630 seconds, slicing is a valuable optimization, and you should probably leave it 907enabled unless you are conducting experiments. 908 909\nopagebreak 910{\small See also \textit{verbose} (\S\ref{output-format}).} 911 912\optrue{minimize}{dont\_minimize} 913Specifies whether the minimization tool should be invoked automatically after 914proof search. 915 916\nopagebreak 917{\small See also \textit{preplay\_timeout} (\S\ref{timeouts}) 918and \textit{dont\_preplay} (\S\ref{timeouts}).} 919 920\opfalse{spy}{dont\_spy} 921Specifies whether Sledgehammer should record statistics in 922\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER/\allowbreak spy\_\allowbreak sledgehammer}. 923These statistics can be useful to the developers of Sledgehammer. If you are willing to have your 924interactions recorded in the name of science, please enable this feature and send the statistics 925file every now and then to the author of this manual (\authoremail). 926To change the default value of this option globally, set the environment variable 927\texttt{SLEDGEHAMMER\_SPY} to \textit{yes}. 928 929\nopagebreak 930{\small See also \textit{debug} (\S\ref{output-format}).} 931 932\opfalse{overlord}{no\_overlord} 933Specifies whether Sledgehammer should put its temporary files in 934\texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for 935debugging Sledgehammer but also unsafe if several instances of the tool are run 936simultaneously. The files are identified by the prefixes \texttt{prob\_} and 937\texttt{mash\_}; you may safely remove them after Sledgehammer has run. 938 939\textbf{Warning:} This option is not thread-safe. Use at your own risks. 940 941\nopagebreak 942{\small See also \textit{debug} (\S\ref{output-format}).} 943\end{enum} 944 945 946\subsection{Relevance Filter} 947\label{relevance-filter} 948 949\begin{enum} 950\opdefault{fact\_filter}{string}{smart} 951Specifies the relevance filter to use. The following filters are available: 952 953\begin{enum} 954\item[\labelitemi] \textbf{\textit{mepo}:} 955The traditional memoryless MePo relevance filter. 956 957\item[\labelitemi] \textbf{\textit{mash}:} 958The MaSh machine learner. Three learning algorithms are provided: 959 960\begin{enum} 961\item[\labelitemi] \textbf{\textit{nb}} is an implementation of naive Bayes. 962 963\item[\labelitemi] \textbf{\textit{knn}} is an implementation of $k$-nearest 964neighbors. 965 966\item[\labelitemi] \textbf{\textit{nb\_knn}} (also called \textbf{\textit{yes}} 967and \textbf{\textit{sml}}) is a combination of naive Bayes and $k$-nearest 968neighbors. 969\end{enum} 970 971In addition, the special value \textit{none} is used to disable machine learning by 972default (cf.\ \textit{smart} below). 973 974The default algorithm is \textit{nb\_knn}. The algorithm can be selected by 975setting the ``MaSh'' option under ``Plugins > Plugin Options > Isabelle > 976General'' in Isabelle/jEdit. Persistent data for both algorithms is stored in 977the directory \texttt{\$ISABELLE\_\allowbreak HOME\_\allowbreak USER/\allowbreak 978mash}. 979 980\item[\labelitemi] \textbf{\textit{mesh}:} The MeSh filter, which combines the 981rankings from MePo and MaSh. 982 983\item[\labelitemi] \textbf{\textit{smart}:} A combination of MePo, MaSh, and 984MeSh. If the learning algorithm is set to be \textit{none}, \textit{smart} 985behaves like MePo. 986\end{enum} 987 988\opdefault{max\_facts}{smart\_int}{smart} 989Specifies the maximum number of facts that may be returned by the relevance 990filter. If the option is set to \textit{smart} (the default), it effectively 991takes a value that was empirically found to be appropriate for the prover. 992Typical values lie between 50 and 1000. 993 994\opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85} 995Specifies the thresholds above which facts are considered relevant by the 996relevance filter. The first threshold is used for the first iteration of the 997relevance filter and the second threshold is used for the last iteration (if it 998is reached). The effective threshold is quadratically interpolated for the other 999iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems 1000are relevant and 1 only theorems that refer to previously seen constants. 1001 1002\optrue{learn}{dont\_learn} 1003Specifies whether MaSh should be run automatically by Sledgehammer to learn the 1004available theories (and hence provide more accurate results). Learning takes 1005place only if MaSh is enabled. 1006 1007\opdefault{max\_new\_mono\_instances}{int}{smart} 1008Specifies the maximum number of monomorphic instances to generate beyond 1009\textit{max\_facts}. The higher this limit is, the more monomorphic instances 1010are potentially generated. Whether monomorphization takes place depends on the 1011type encoding used. If the option is set to \textit{smart} (the default), it 1012takes a value that was empirically found to be appropriate for the prover. For 1013most provers, this value is 100. 1014 1015\nopagebreak 1016{\small See also \textit{type\_enc} (\S\ref{problem-encoding}).} 1017 1018\opdefault{max\_mono\_iters}{int}{smart} 1019Specifies the maximum number of iterations for the monomorphization fixpoint 1020construction. The higher this limit is, the more monomorphic instances are 1021potentially generated. Whether monomorphization takes place depends on the 1022type encoding used. If the option is set to \textit{smart} (the default), it 1023takes a value that was empirically found to be appropriate for the prover. 1024For most provers, this value is 3. 1025 1026\nopagebreak 1027{\small See also \textit{type\_enc} (\S\ref{problem-encoding}).} 1028\end{enum} 1029 1030 1031\subsection{Problem Encoding} 1032\label{problem-encoding} 1033 1034\newcommand\comb[1]{\const{#1}} 1035 1036\begin{enum} 1037\opdefault{lam\_trans}{string}{smart} 1038Specifies the $\lambda$ translation scheme to use in ATP problems. The supported 1039translation schemes are listed below: 1040 1041\begin{enum} 1042\item[\labelitemi] \textbf{\textit{hide\_lams}:} Hide the $\lambda$-abstractions 1043by replacing them by unspecified fresh constants, effectively disabling all 1044reasoning under $\lambda$-abstractions. 1045 1046\item[\labelitemi] \textbf{\textit{lifting}:} Introduce a new 1047supercombinator \const{c} for each cluster of $n$~$\lambda$-abstractions, 1048defined using an equation $\const{c}~x_1~\ldots~x_n = t$ ($\lambda$-lifting). 1049 1050\item[\labelitemi] \textbf{\textit{combs}:} Rewrite lambdas to the Curry 1051combinators (\comb{I}, \comb{K}, \comb{S}, \comb{B}, \comb{C}). Combinators 1052enable the ATPs to synthesize $\lambda$-terms but tend to yield bulkier formulas 1053than $\lambda$-lifting: The translation is quadratic in the worst case, and the 1054equational definitions of the combinators are very prolific in the context of 1055resolution. 1056 1057\item[\labelitemi] \textbf{\textit{combs\_and\_lifting}:} Introduce a new 1058supercombinator \const{c} for each cluster of $\lambda$-abstractions and characterize it both using a 1059lifted equation $\const{c}~x_1~\ldots~x_n = t$ and via Curry combinators. 1060 1061\item[\labelitemi] \textbf{\textit{combs\_or\_lifting}:} For each cluster of 1062$\lambda$-abstractions, heuristically choose between $\lambda$-lifting and Curry 1063combinators. 1064 1065\item[\labelitemi] \textbf{\textit{keep\_lams}:} 1066Keep the $\lambda$-abstractions in the generated problems. This is available 1067only with provers that support the TH0 syntax. 1068 1069\item[\labelitemi] \textbf{\textit{smart}:} The actual translation scheme used 1070depends on the ATP and should be the most efficient scheme for that ATP. 1071\end{enum} 1072 1073For SMT solvers, the $\lambda$ translation scheme is always \textit{lifting}, 1074irrespective of the value of this option. 1075 1076\opsmartx{uncurried\_aliases}{no\_uncurried\_aliases} 1077Specifies whether fresh function symbols should be generated as aliases for 1078applications of curried functions in ATP problems. 1079 1080\opdefault{type\_enc}{string}{smart} 1081Specifies the type encoding to use in ATP problems. Some of the type encodings 1082are unsound, meaning that they can give rise to spurious proofs 1083(unreconstructible using \textit{metis}). The type encodings are 1084listed below, with an indication of their soundness in parentheses. 1085An asterisk (*) indicates that the encoding is slightly incomplete for 1086reconstruction with \textit{metis}, unless the \textit{strict} option (described 1087below) is enabled. 1088 1089\begin{enum} 1090\item[\labelitemi] \textbf{\textit{erased} (unsound):} No type information is 1091supplied to the ATP, not even to resolve overloading. Types are simply erased. 1092 1093\item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using 1094a predicate \const{g}$(\tau, t)$ that guards bound 1095variables. Constants are annotated with their types, supplied as extra 1096arguments, to resolve overloading. 1097 1098\item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is 1099tagged with its type using a function $\const{t\/}(\tau, t)$. 1100 1101\item[\labelitemi] \textbf{\textit{poly\_args} (unsound):} 1102Like for \textit{poly\_guards} constants are annotated with their types to 1103resolve overloading, but otherwise no type information is encoded. This 1104is the default encoding used by the \textit{metis} proof method. 1105 1106\item[\labelitemi] 1107\textbf{% 1108\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\ 1109\textit{raw\_mono\_args} (unsound):} \\ 1110Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args}, 1111respectively, but the problem is additionally monomorphized, meaning that type 1112variables are instantiated with heuristically chosen ground types. 1113Monomorphization can simplify reasoning but also leads to larger fact bases, 1114which can slow down the ATPs. 1115 1116\item[\labelitemi] 1117\textbf{% 1118\textit{mono\_guards}, \textit{mono\_tags} (sound); 1119\textit{mono\_args} (unsound):} \\ 1120Similar to 1121\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and 1122\textit{raw\_mono\_args}, respectively but types are mangled in constant names 1123instead of being supplied as ground term arguments. The binary predicate 1124$\const{g}(\tau, t)$ becomes a unary predicate 1125$\const{g\_}\tau(t)$, and the binary function 1126$\const{t}(\tau, t)$ becomes a unary function 1127$\const{t\_}\tau(t)$. 1128 1129\item[\labelitemi] \textbf{\textit{mono\_native} (sound):} Exploits native 1130first-order types if the prover supports the TF0, TF1, TH0, or TH1 syntax; 1131otherwise, falls back on \textit{mono\_guards}. The problem is monomorphized. 1132 1133\item[\labelitemi] \textbf{\textit{mono\_native\_higher} (sound):} Exploits 1134native higher-order types if the prover supports the TH0 syntax; otherwise, 1135falls back on \textit{mono\_native} or \textit{mono\_guards}. The problem is 1136monomorphized. 1137 1138\item[\labelitemi] \textbf{\textit{poly\_native} (sound):} Exploits native 1139first-order polymorphic types if the prover supports the TF1 or TH1 syntax; 1140otherwise, falls back on \textit{mono\_native}. 1141 1142\item[\labelitemi] 1143\textbf{% 1144\textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\ 1145\textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\ 1146\textit{mono\_native}? (sound*):} \\ 1147The type encodings \textit{poly\_guards}, \textit{poly\_tags}, 1148\textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards}, 1149\textit{mono\_tags}, and \textit{mono\_native} are fully typed and sound. For 1150each of these, Sledgehammer also provides a lighter variant identified by a 1151question mark (`\hbox{?}')\ that detects and erases monotonic types, notably 1152infinite types. (For \textit{mono\_native}, the types are not actually erased 1153but rather replaced by a shared uniform type of individuals.) As argument to the 1154\textit{metis} proof method, the question mark is replaced by a 1155\hbox{``\textit{\_query\/}''} suffix. 1156 1157\item[\labelitemi] 1158\textbf{% 1159\textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\ 1160\textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\ 1161(sound*):} \\ 1162Even lighter versions of the `\hbox{?}' encodings. As argument to the 1163\textit{metis} proof method, the `\hbox{??}' suffix is replaced by 1164\hbox{``\textit{\_query\_query\/}''}. 1165 1166\item[\labelitemi] 1167\textbf{% 1168\textit{poly\_guards}@, \textit{poly\_tags}@, \textit{raw\_mono\_guards}@, \\ 1169\textit{raw\_mono\_tags}@ (sound*):} \\ 1170Alternative versions of the `\hbox{??}' encodings. As argument to the 1171\textit{metis} proof method, the `\hbox{@}' suffix is replaced by 1172\hbox{``\textit{\_at\/}''}. 1173 1174\item[\labelitemi] \textbf{\textit{poly\_args}?, \textit{raw\_mono\_args}? (unsound):} \\ 1175Lighter versions of \textit{poly\_args} and \textit{raw\_mono\_args}. 1176 1177\item[\labelitemi] \textbf{\textit{smart}:} The actual encoding used depends on 1178the ATP and should be the most efficient sound encoding for that ATP. 1179\end{enum} 1180 1181For SMT solvers, the type encoding is always \textit{mono\_native}, irrespective 1182of the value of this option. 1183 1184\nopagebreak 1185{\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter}) 1186and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).} 1187 1188\opfalse{strict}{non\_strict} 1189Specifies whether Sledgehammer should run in its strict mode. In that mode, 1190sound type encodings marked with an asterisk (*) above are made complete 1191for reconstruction with \textit{metis}, at the cost of some clutter in the 1192generated problems. This option has no effect if \textit{type\_enc} is 1193deliberately set to an unsound encoding. 1194\end{enum} 1195 1196 1197\subsection{Output Format} 1198\label{output-format} 1199 1200\begin{enum} 1201 1202\opfalse{verbose}{quiet} 1203Specifies whether the \textbf{sledgehammer} command should explain what it does. 1204 1205\opfalse{debug}{no\_debug} 1206Specifies whether Sledgehammer should display additional debugging information 1207beyond what \textit{verbose} already displays. Enabling \textit{debug} also 1208enables \textit{verbose} behind the scenes. 1209 1210\nopagebreak 1211{\small See also \textit{spy} (\S\ref{mode-of-operation}) and 1212\textit{overlord} (\S\ref{mode-of-operation}).} 1213 1214\opsmart{isar\_proofs}{no\_isar\_proofs} 1215Specifies whether Isar proofs should be output in addition to one-line proofs. 1216The construction of Isar proof is still experimental and may sometimes fail; 1217however, when they succeed they are usually faster and more intelligible than 1218one-line proofs. If the option is set to \textit{smart} (the default), Isar 1219proofs are only generated when no working one-line proof is available. 1220 1221\opdefault{compress}{int}{smart} 1222Specifies the granularity of the generated Isar proofs if \textit{isar\_proofs} 1223is explicitly enabled. A value of $n$ indicates that each Isar proof step should 1224correspond to a group of up to $n$ consecutive proof steps in the ATP proof. If 1225the option is set to \textit{smart} (the default), the compression factor is 10 1226if the \textit{isar\_proofs} option is explicitly enabled; otherwise, it is 1227$\infty$. 1228 1229\optrueonly{dont\_compress} 1230Alias for ``\textit{compress} = 1''. 1231 1232\optrue{try0}{dont\_try0} 1233Specifies whether standard proof methods such as \textit{auto} and 1234\textit{blast} should be tried as alternatives to \textit{metis} in Isar proofs. 1235The collection of methods is roughly the same as for the \textbf{try0} command. 1236 1237\opsmart{smt\_proofs}{no\_smt\_proofs} 1238Specifies whether the \textit{smt} proof method should be tried in addition to 1239Isabelle's other proof methods. If the option is set to \textit{smart} (the 1240default), the \textit{smt} method is used for one-line proofs but not in Isar 1241proofs. 1242\end{enum} 1243 1244 1245\subsection{Regression Testing} 1246\label{regression-testing} 1247 1248\begin{enum} 1249\opnodefault{expect}{string} 1250Specifies the expected outcome, which must be one of the following: 1251 1252\begin{enum} 1253\item[\labelitemi] \textbf{\textit{some}:} Sledgehammer found a proof. 1254\item[\labelitemi] \textbf{\textit{none}:} Sledgehammer found no proof. 1255\item[\labelitemi] \textbf{\textit{timeout}:} Sledgehammer timed out. 1256\item[\labelitemi] \textbf{\textit{unknown}:} Sledgehammer encountered some 1257problem. 1258\end{enum} 1259 1260Sledgehammer emits an error if the actual outcome differs from the expected outcome. This option is 1261useful for regression testing. 1262 1263\nopagebreak 1264{\small See also \textit{timeout} (\S\ref{timeouts}).} 1265\end{enum} 1266 1267 1268\subsection{Timeouts} 1269\label{timeouts} 1270 1271\begin{enum} 1272\opdefault{timeout}{float}{\upshape 30} 1273Specifies the maximum number of seconds that the automatic provers should spend 1274searching for a proof. This excludes problem preparation and is a soft limit. 1275 1276\opdefault{preplay\_timeout}{float}{\upshape 1} 1277Specifies the maximum number of seconds that \textit{metis} or other proof 1278methods should spend trying to ``preplay'' the found proof. If this option 1279is set to 0, no preplaying takes place, and no timing information is displayed 1280next to the suggested proof method calls. 1281 1282\nopagebreak 1283{\small See also \textit{minimize} (\S\ref{mode-of-operation}).} 1284 1285\optrueonly{dont\_preplay} 1286Alias for ``\textit{preplay\_timeout} = 0''. 1287 1288\end{enum} 1289 1290\let\em=\sl 1291\bibliography{manual}{} 1292\bibliographystyle{abbrv} 1293 1294\end{document} 1295