<html lang="en">
<head>
<title>Optimize Options - Using the GNU Compiler Collection (GCC)</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="Using the GNU Compiler Collection (GCC)">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Invoking-GCC.html#Invoking-GCC" title="Invoking GCC">
<link rel="prev" href="Debugging-Options.html#Debugging-Options" title="Debugging Options">
<link rel="next" href="Preprocessor-Options.html#Preprocessor-Options" title="Preprocessor Options">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997,
1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
2010 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.
Copies published by the Free Software Foundation raise
     funds for GNU development.-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="Optimize-Options"></a>
<p>
Next: <a rel="next" accesskey="n" href="Preprocessor-Options.html#Preprocessor-Options">Preprocessor Options</a>,
Previous: <a rel="previous" accesskey="p" href="Debugging-Options.html#Debugging-Options">Debugging Options</a>,
Up: <a rel="up" accesskey="u" href="Invoking-GCC.html#Invoking-GCC">Invoking GCC</a>
<hr>
</div>

<h3 class="section">3.10 Options That Control Optimization</h3>

<p><a name="index-optimize-options-679"></a><a name="index-options_002c-optimization-680"></a>
These options control various sorts of optimizations.

     <p>Without any optimization option, the compiler's goal is to reduce the
cost of compilation and to make debugging produce the expected
results.  Statements are independent: if you stop the program with a
breakpoint between statements, you can then assign a new value to any
variable or change the program counter to any other statement in the
function and get exactly the results you would expect from the source
code.

     <p>Turning on optimization flags makes the compiler attempt to improve
the performance and/or code size at the expense of compilation time
and possibly the ability to debug the program.
     <p>The compiler performs optimization based on the knowledge it has of the
program.  Compiling multiple files at once into a single output file allows
the compiler to use information gained from all of the files when compiling
each of them.

     <p>Not all optimizations are controlled directly by a flag.  Only
optimizations that have a flag are listed in this section.

     <p>Most optimizations are only enabled if an <samp><span class="option">-O</span></samp> level is set on
the command line.  Otherwise they are disabled, even if individual
optimization flags are specified.

     <p>Depending on the target and how GCC was configured, a slightly different
set of optimizations may be enabled at each <samp><span class="option">-O</span></samp> level than
those listed here.  You can invoke GCC with ‘<samp><span class="samp">-Q --help=optimizers</span></samp>’
to find out the exact set of optimizations that are enabled at each level.
See <a href="Overall-Options.html#Overall-Options">Overall Options</a>, for examples.

     <dl>
<dt><code>-O</code><dt><code>-O1</code><dd><a name="index-O-681"></a><a name="index-O1-682"></a>Optimize.  Optimizing compilation takes somewhat more time, and a lot
more memory for a large function.

     <p>With <samp><span class="option">-O</span></samp>, the compiler tries to reduce code size and execution
time, without performing any optimizations that take a great deal of
compilation time.
     <p><samp><span class="option">-O</span></samp> turns on the following optimization flags:
     <pre class="smallexample">          -fauto-inc-dec
          -fcompare-elim
          -fcprop-registers
          -fdce
          -fdefer-pop
          -fdelayed-branch
          -fdse
          -fguess-branch-probability
          -fif-conversion2
          -fif-conversion
          -fipa-pure-const
          -fipa-profile
          -fipa-reference
          -fmerge-constants
          -fsplit-wide-types
          -ftree-bit-ccp
          -ftree-builtin-call-dce
          -ftree-ccp
          -ftree-ch
          -ftree-copyrename
          -ftree-dce
          -ftree-dominator-opts
          -ftree-dse
          -ftree-forwprop
          -ftree-fre
          -ftree-phiprop
          -ftree-sra
          -ftree-pta
          -ftree-ter
          -funit-at-a-time
</pre>
     <p><samp><span class="option">-O</span></samp> also turns on <samp><span class="option">-fomit-frame-pointer</span></samp> on machines
where doing so does not interfere with debugging.

     <br><dt><code>-O2</code><dd><a name="index-O2-683"></a>Optimize even more.  GCC performs nearly all supported optimizations
that do not involve a space-speed tradeoff.
As compared to <samp><span class="option">-O</span></samp>, this option increases both compilation time
and the performance of the generated code.

     <p><samp><span class="option">-O2</span></samp> turns on all optimization flags specified by <samp><span class="option">-O</span></samp>.
It also turns on the following optimization flags:
     <pre class="smallexample">          -fthread-jumps
          -falign-functions  -falign-jumps
          -falign-loops  -falign-labels
          -fcaller-saves
          -fcrossjumping
          -fcse-follow-jumps  -fcse-skip-blocks
          -fdelete-null-pointer-checks
          -fdevirtualize
          -fexpensive-optimizations
          -fgcse  -fgcse-lm
          -finline-small-functions
          -findirect-inlining
          -fipa-sra
          -foptimize-sibling-calls
          -fpartial-inlining
          -fpeephole2
          -fregmove
          -freorder-blocks  -freorder-functions
          -frerun-cse-after-loop
          -fsched-interblock  -fsched-spec
          -fschedule-insns  -fschedule-insns2
          -fstrict-aliasing -fstrict-overflow
          -ftree-if-to-switch-conversion
          -ftree-switch-conversion
          -ftree-pre
          -ftree-vrp
</pre>
     <p>Please note the warning under <samp><span class="option">-fgcse</span></samp> about
invoking <samp><span class="option">-O2</span></samp> on programs that use computed gotos.

     <br><dt><code>-O3</code><dd><a name="index-O3-684"></a>Optimize yet more.  <samp><span class="option">-O3</span></samp> turns on all optimizations specified
by <samp><span class="option">-O2</span></samp> and also turns on the <samp><span class="option">-finline-functions</span></samp>,
<samp><span class="option">-funswitch-loops</span></samp>, <samp><span class="option">-fpredictive-commoning</span></samp>,
<samp><span class="option">-fgcse-after-reload</span></samp>, <samp><span class="option">-ftree-vectorize</span></samp> and
<samp><span class="option">-fipa-cp-clone</span></samp> options.

     <br><dt><code>-O0</code><dd><a name="index-O0-685"></a>Reduce compilation time and make debugging produce the expected
results.  This is the default.

     <br><dt><code>-Os</code><dd><a name="index-Os-686"></a>Optimize for size.  <samp><span class="option">-Os</span></samp> enables all <samp><span class="option">-O2</span></samp> optimizations that
do not typically increase code size.
It also performs further
optimizations designed to reduce code size.

     <p><samp><span class="option">-Os</span></samp> disables the following optimization flags:
     <pre class="smallexample">          -falign-functions  -falign-jumps  -falign-loops
          -falign-labels  -freorder-blocks  -freorder-blocks-and-partition
          -fprefetch-loop-arrays  -ftree-vect-loop-version
</pre>
     <br><dt><code>-Ofast</code><dd><a name="index-Ofast-687"></a>Disregard strict standards compliance.  <samp><span class="option">-Ofast</span></samp> enables all
<samp><span class="option">-O3</span></samp> optimizations.  It also enables optimizations that are not
valid for all standard-compliant programs.
It turns on <samp><span class="option">-ffast-math</span></samp>.

     <p>If you use multiple <samp><span class="option">-O</span></samp> options, with or without level numbers,
the last such option is the one that is effective.
</dl>

     <p>Options of the form <samp><span class="option">-f</span><var>flag</var></samp> specify machine-independent
flags.  Most flags have both positive and negative forms; the negative
form of <samp><span class="option">-ffoo</span></samp> would be <samp><span class="option">-fno-foo</span></samp>.  In the table
below, only one of the forms is listed: the one you typically will
use.  You can figure out the other form by either removing ‘<samp><span class="samp">no-</span></samp>’
or adding it.

     <p>The following options control specific optimizations.  They are either
activated by <samp><span class="option">-O</span></samp> options or are related to ones that are.  You
can use the following flags in the rare cases when you want to fine-tune
the optimizations to be performed.

     <dl>
<dt><code>-fno-default-inline</code><dd><a name="index-fno_002ddefault_002dinline-688"></a>Do not make member functions inline by default merely because they are
defined inside the class scope (C++ only).
Otherwise, when you specify
<samp><span class="option">-O</span></samp><!-- /@w -->, member functions defined inside class scope are compiled
inline by default; i.e., you don't need to add ‘<samp><span class="samp">inline</span></samp>’ in front of
the member function name.

     <br><dt><code>-fno-defer-pop</code><dd><a name="index-fno_002ddefer_002dpop-689"></a>Always pop the arguments to each function call as soon as that function
returns.  For machines which must pop arguments after a function call,
the compiler normally lets arguments accumulate on the stack for several
function calls and pops them all at once.

     <p>Disabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fforward-propagate</code><dd><a name="index-fforward_002dpropagate-690"></a>Perform a forward propagation pass on RTL.  The pass tries to combine two
instructions and checks if the result can be simplified.  If loop unrolling
is active, two passes are performed and the second is scheduled after
loop unrolling.

     <p>This option is enabled by default at optimization levels <samp><span class="option">-O</span></samp>,
<samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-ffp-contract=</code><var>style</var><dd><a name="index-ffp_002dcontract-691"></a><samp><span class="option">-ffp-contract=off</span></samp> disables floating-point expression contraction.
<samp><span class="option">-ffp-contract=fast</span></samp> enables floating-point expression contraction
such as forming of fused multiply-add operations if the target has
native support for them.
<samp><span class="option">-ffp-contract=on</span></samp> enables floating-point expression contraction
if allowed by the language standard.  This is currently not implemented
and is treated the same as <samp><span class="option">-ffp-contract=off</span></samp>.

     <p>The default is <samp><span class="option">-ffp-contract=fast</span></samp>.

     <br><dt><code>-fomit-frame-pointer</code><dd><a name="index-fomit_002dframe_002dpointer-692"></a>Don't keep the frame pointer in a register for functions that
don't need one.  This avoids the instructions to save, set up and
restore frame pointers; it also makes an extra register available
in many functions.  <strong>It also makes debugging impossible on
some machines.</strong>

     <p>On some machines, such as the VAX, this flag has no effect, because
the standard calling sequence automatically handles the frame pointer
and nothing is saved by pretending it doesn't exist.  The
machine-description macro <code>FRAME_POINTER_REQUIRED</code> controls
whether a target machine supports this flag.  See <a href="../gccint/Registers.html#Registers">Register Usage</a>.

     <p>Starting with GCC version 4.6, the default setting (when not optimizing for
size) for 32-bit Linux x86 and 32-bit Darwin x86 targets has been changed to
<samp><span class="option">-fomit-frame-pointer</span></samp>.  The default can be reverted to
<samp><span class="option">-fno-omit-frame-pointer</span></samp> by configuring GCC with the
<samp><span class="option">--enable-frame-pointer</span></samp> configure option.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-foptimize-sibling-calls</code><dd><a name="index-foptimize_002dsibling_002dcalls-693"></a>Optimize sibling and tail recursive calls.
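     <p>As an illustrative sketch only (the function name <code>gcd_tail</code> is hypothetical and not part of GCC), the recursive call below is in tail position, which is the kind of call this option targets. Whether the call is actually turned into a jump depends on the target and the optimization level in use.

```c
/* Hypothetical example: Euclid's algorithm written so that the
   recursive call is the last action of the function (a tail call). */
unsigned int gcd_tail(unsigned int a, unsigned int b)
{
    if (b == 0)
        return a;
    /* Nothing remains to be done after this call returns, so the
       compiler may be able to reuse the current stack frame instead
       of creating a new one. */
    return gcd_tail(b, a % b);
}
```

     <p>For instance, <code>gcd_tail(48, 18)</code> evaluates to 6 regardless of whether the call is optimized; only the stack usage differs.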
     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fno-inline</code><dd><a name="index-fno_002dinline-694"></a>Don't pay attention to the <code>inline</code> keyword.  Normally this option
is used to keep the compiler from expanding any functions inline.
Note that if you are not optimizing, no functions can be expanded inline.

     <br><dt><code>-finline-small-functions</code><dd><a name="index-finline_002dsmall_002dfunctions-695"></a>Integrate functions into their callers when their body is smaller than the expected
function call code (so the overall size of the program gets smaller).  The compiler
heuristically decides which functions are simple enough to be worth integrating
in this way.

     <p>Enabled at level <samp><span class="option">-O2</span></samp>.

     <br><dt><code>-findirect-inlining</code><dd><a name="index-findirect_002dinlining-696"></a>Also inline indirect calls that are discovered to be known at compile
time thanks to previous inlining.  This option has an effect only
when inlining itself is turned on by the <samp><span class="option">-finline-functions</span></samp>
or <samp><span class="option">-finline-small-functions</span></samp> options.

     <p>Enabled at level <samp><span class="option">-O2</span></samp>.

     <br><dt><code>-finline-functions</code><dd><a name="index-finline_002dfunctions-697"></a>Integrate all simple functions into their callers.  The compiler
heuristically decides which functions are simple enough to be worth
integrating in this way.

     <p>If all calls to a given function are integrated, and the function is
declared <code>static</code>, then the function is normally not output as
assembler code in its own right.

     <p>Enabled at level <samp><span class="option">-O3</span></samp>.
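     <p>A minimal sketch of the situation described for <samp><span class="option">-finline-functions</span></samp>, assuming two hypothetical functions <code>square</code> and <code>sum_of_squares</code>: the helper is small and <code>static</code>, so it is the sort of function the compiler may choose to integrate. The decision remains heuristic.

```c
/* Hypothetical helper: small and static, so it is a natural
   candidate for integration into its caller. */
static int square(int x)
{
    return x * x;
}

/* If every call to square() is integrated here, the compiler
   normally need not output square() as a function of its own. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```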
     <br><dt><code>-finline-functions-called-once</code><dd><a name="index-finline_002dfunctions_002dcalled_002donce-698"></a>Consider all <code>static</code> functions called once for inlining into their
caller even if they are not marked <code>inline</code>.  If a call to a given
function is integrated, then the function is not output as assembler code
in its own right.

     <p>Enabled at levels <samp><span class="option">-O1</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp> and <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fearly-inlining</code><dd><a name="index-fearly_002dinlining-699"></a>Inline functions marked by <code>always_inline</code> and functions whose body seems
smaller than the function call overhead early, before doing
<samp><span class="option">-fprofile-generate</span></samp> instrumentation and the real inlining pass.  Doing so
makes profiling significantly cheaper and usually makes inlining faster on programs
having large chains of nested wrapper functions.

     <p>Enabled by default.

     <br><dt><code>-fipa-sra</code><dd><a name="index-fipa_002dsra-700"></a>Perform interprocedural scalar replacement of aggregates, removal of
unused parameters and replacement of parameters passed by reference
by parameters passed by value.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp> and <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-finline-limit=</code><var>n</var><dd><a name="index-finline_002dlimit-701"></a>By default, GCC limits the size of functions that can be inlined.  This flag
allows coarse control of this limit.  <var>n</var> is the size of functions that
can be inlined in number of pseudo instructions.
     <p>Inlining is actually controlled by a number of parameters, which may be
specified individually by using <samp><span class="option">--param </span><var>name</var><span class="option">=</span><var>value</var></samp>.
The <samp><span class="option">-finline-limit=</span><var>n</var></samp> option sets some of these parameters
as follows:

          <dl>
<dt><code>max-inline-insns-single</code><dd>is set to <var>n</var>/2.
<br><dt><code>max-inline-insns-auto</code><dd>is set to <var>n</var>/2.
</dl>

     <p>See below for documentation of the individual
parameters controlling inlining and for the defaults of these parameters.

     <p><em>Note:</em> there may be no value to <samp><span class="option">-finline-limit</span></samp> that results
in default behavior.

     <p><em>Note:</em> a pseudo instruction represents, in this particular context, an
abstract measurement of a function's size.  In no way does it represent a count
of assembly instructions and as such its exact meaning might change from one
release to another.

     <br><dt><code>-fno-keep-inline-dllexport</code><dd><a name="index-g_t_002dfno_002dkeep_002dinline_002ddllexport-702"></a>This is a more fine-grained version of <samp><span class="option">-fkeep-inline-functions</span></samp>,
which applies only to functions that are declared using the <code>dllexport</code>
attribute or declspec (See <a href="Function-Attributes.html#Function-Attributes">Declaring Attributes of Functions</a>.)

     <br><dt><code>-fkeep-inline-functions</code><dd><a name="index-fkeep_002dinline_002dfunctions-703"></a>In C, emit <code>static</code> functions that are declared <code>inline</code>
into the object file, even if the function has been inlined into all
of its callers.  This switch does not affect functions using the
<code>extern inline</code> extension in GNU C90.  In C++, emit any and all
inline functions into the object file.
     <br><dt><code>-fkeep-static-consts</code><dd><a name="index-fkeep_002dstatic_002dconsts-704"></a>Emit variables declared <code>static const</code> when optimization isn't turned
on, even if the variables aren't referenced.

     <p>GCC enables this option by default.  If you want to force the compiler to
check if the variable was referenced, regardless of whether or not
optimization is turned on, use the <samp><span class="option">-fno-keep-static-consts</span></samp> option.

     <br><dt><code>-fmerge-constants</code><dd><a name="index-fmerge_002dconstants-705"></a>Attempt to merge identical constants (string constants and floating point
constants) across compilation units.

     <p>This option is the default for optimized compilation if the assembler and
linker support it.  Use <samp><span class="option">-fno-merge-constants</span></samp> to inhibit this
behavior.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fmerge-all-constants</code><dd><a name="index-fmerge_002dall_002dconstants-706"></a>Attempt to merge identical constants and identical variables.

     <p>This option implies <samp><span class="option">-fmerge-constants</span></samp>.  In addition to
<samp><span class="option">-fmerge-constants</span></samp> this considers e.g. even constant initialized
arrays or initialized constant variables with integral or floating point
types.  Languages like C or C++ require each variable, including multiple
instances of the same variable in recursive calls, to have distinct locations,
so using this option will result in non-conforming
behavior.

     <br><dt><code>-fmodulo-sched</code><dd><a name="index-fmodulo_002dsched-707"></a>Perform swing modulo scheduling immediately before the first scheduling
pass.
This pass looks at innermost loops and reorders their
instructions by overlapping different iterations.

     <br><dt><code>-fmodulo-sched-allow-regmoves</code><dd><a name="index-fmodulo_002dsched_002dallow_002dregmoves-708"></a>Perform more aggressive SMS based modulo scheduling with register moves
allowed.  By setting this flag certain anti-dependence edges will be
deleted, which will trigger the generation of reg-moves based on the
life-range analysis.  This option is effective only with
<samp><span class="option">-fmodulo-sched</span></samp> enabled.

     <br><dt><code>-fno-branch-count-reg</code><dd><a name="index-fno_002dbranch_002dcount_002dreg-709"></a>Do not use “decrement and branch” instructions on a count register,
but instead generate a sequence of instructions that decrement a
register, compare it against zero, then branch based upon the result.
This option is only meaningful on architectures that support such
instructions, which include x86, PowerPC, IA-64 and S/390.

     <p>The default is <samp><span class="option">-fbranch-count-reg</span></samp>.

     <br><dt><code>-fno-function-cse</code><dd><a name="index-fno_002dfunction_002dcse-710"></a>Do not put function addresses in registers; make each instruction that
calls a constant function contain the function's address explicitly.

     <p>This option results in less efficient code, but some strange hacks
that alter the assembler output may be confused by the optimizations
performed when this option is not used.

     <p>The default is <samp><span class="option">-ffunction-cse</span></samp>.

     <br><dt><code>-fno-zero-initialized-in-bss</code><dd><a name="index-fno_002dzero_002dinitialized_002din_002dbss-711"></a>If the target supports a BSS section, GCC by default puts variables that
are initialized to zero into BSS.  This can save space in the resulting
code.
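     <p>To make the default placement concrete, here is a small sketch. The variable and function names are hypothetical, and the section comments describe the default behavior stated above on a target that has a BSS section; they are not a guarantee for every target.

```c
/* Hypothetical variables illustrating default section placement. */
int zero_counter = 0;   /* initialized to zero: placed in BSS by default */
int seeded_value = 42;  /* nonzero initializer: stored in the data section */

/* The program's observable behavior is the same wherever the
   variables are placed; only the layout of the output differs. */
int sum_globals(void)
{
    return zero_counter + seeded_value;
}
```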
     <p>This option turns off this behavior because some programs explicitly
rely on variables going to the data section, for example so that the
resulting executable can find the beginning of that section and/or make
assumptions based on that.

     <p>The default is <samp><span class="option">-fzero-initialized-in-bss</span></samp>.

     <br><dt><code>-fmudflap -fmudflapth -fmudflapir</code><dd><a name="index-fmudflap-712"></a><a name="index-fmudflapth-713"></a><a name="index-fmudflapir-714"></a><a name="index-bounds-checking-715"></a><a name="index-mudflap-716"></a>For front-ends that support it (C and C++), instrument all risky
pointer/array dereferencing operations, some standard library
string/heap functions, and some other associated constructs with
range/validity tests.  Modules so instrumented should be immune to
buffer overflows, invalid heap use, and some other classes of C/C++
programming errors.  The instrumentation relies on a separate runtime
library (<samp><span class="file">libmudflap</span></samp>), which will be linked into a program if
<samp><span class="option">-fmudflap</span></samp> is given at link time.  Run-time behavior of the
instrumented program is controlled by the <samp><span class="env">MUDFLAP_OPTIONS</span></samp>
environment variable.  See <code>env MUDFLAP_OPTIONS=-help a.out</code>
for its options.

     <p>Use <samp><span class="option">-fmudflapth</span></samp> instead of <samp><span class="option">-fmudflap</span></samp> to compile and to
link if your program is multi-threaded.  Use <samp><span class="option">-fmudflapir</span></samp>, in
addition to <samp><span class="option">-fmudflap</span></samp> or <samp><span class="option">-fmudflapth</span></samp>, if
instrumentation should ignore pointer reads.
This produces less
instrumentation (and therefore faster execution) and still provides
some protection against outright memory corrupting writes, but allows
erroneously read data to propagate within a program.

     <br><dt><code>-fthread-jumps</code><dd><a name="index-fthread_002djumps-717"></a>Perform optimizations where we check to see if a jump branches to a
location where another comparison subsumed by the first is found.  If
so, the first branch is redirected to either the destination of the
second branch or a point immediately following it, depending on whether
the condition is known to be true or false.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fsplit-wide-types</code><dd><a name="index-fsplit_002dwide_002dtypes-718"></a>When using a type that occupies multiple registers, such as <code>long
long</code> on a 32-bit system, split the registers apart and allocate them
independently.  This normally generates better code for those types,
but may make debugging more difficult.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>,
<samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fcse-follow-jumps</code><dd><a name="index-fcse_002dfollow_002djumps-719"></a>In common subexpression elimination (CSE), scan through jump instructions
when the target of the jump is not reached by any other path.  For
example, when CSE encounters an <code>if</code> statement with an
<code>else</code> clause, CSE will follow the jump when the condition
tested is false.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.
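     <p>The <code>if</code>/<code>else</code> case mentioned for <samp><span class="option">-fcse-follow-jumps</span></samp> can be pictured with a sketch like the following. The function <code>scale</code> is hypothetical, and the comment describes what CSE may do, not what a particular GCC release is guaranteed to emit.

```c
/* Hypothetical example: a * b is computed before the if statement
   and again inside the else clause. */
int scale(int a, int b, int flag)
{
    int product = a * b;

    if (flag)
        return product + 1;
    else
        /* The else clause is reached only by the jump taken when
           flag is false, so CSE that follows this jump may reuse
           the earlier a * b instead of recomputing it. */
        return a * b - 1;
}
```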
     <br><dt><code>-fcse-skip-blocks</code><dd><a name="index-fcse_002dskip_002dblocks-720"></a>This is similar to <samp><span class="option">-fcse-follow-jumps</span></samp>, but causes CSE to
follow jumps which conditionally skip over blocks.  When CSE
encounters a simple <code>if</code> statement with no else clause,
<samp><span class="option">-fcse-skip-blocks</span></samp> causes CSE to follow the jump around the
body of the <code>if</code>.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-frerun-cse-after-loop</code><dd><a name="index-frerun_002dcse_002dafter_002dloop-721"></a>Re-run common subexpression elimination after loop optimizations have been
performed.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fgcse</code><dd><a name="index-fgcse-722"></a>Perform a global common subexpression elimination pass.
This pass also performs global constant and copy propagation.

     <p><em>Note:</em> When compiling a program using computed gotos, a GCC
extension, you may get better runtime performance if you disable
the global common subexpression elimination pass by adding
<samp><span class="option">-fno-gcse</span></samp> to the command line.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fgcse-lm</code><dd><a name="index-fgcse_002dlm-723"></a>When <samp><span class="option">-fgcse-lm</span></samp> is enabled, global common subexpression elimination will
attempt to move loads which are only killed by stores into themselves.
This allows a loop containing a load/store sequence to be changed to a load outside
the loop, and a copy/store within the loop.

     <p>Enabled by default when gcse is enabled.

     <br><dt><code>-fgcse-sm</code><dd><a name="index-fgcse_002dsm-724"></a>When <samp><span class="option">-fgcse-sm</span></samp> is enabled, a store motion pass is run after
global common subexpression elimination.  This pass will attempt to move
stores out of loops.  When used in conjunction with <samp><span class="option">-fgcse-lm</span></samp>,
loops containing a load/store sequence can be changed to a load before
the loop and a store after the loop.

     <p>Not enabled at any optimization level.

     <br><dt><code>-fgcse-las</code><dd><a name="index-fgcse_002dlas-725"></a>When <samp><span class="option">-fgcse-las</span></samp> is enabled, the global common subexpression
elimination pass eliminates redundant loads that come after stores to the
same memory location (both partial and full redundancies).

     <p>Not enabled at any optimization level.

     <br><dt><code>-fgcse-after-reload</code><dd><a name="index-fgcse_002dafter_002dreload-726"></a>When <samp><span class="option">-fgcse-after-reload</span></samp> is enabled, a redundant load elimination
pass is performed after reload.  The purpose of this pass is to clean up
redundant spilling.

     <br><dt><code>-funsafe-loop-optimizations</code><dd><a name="index-funsafe_002dloop_002doptimizations-727"></a>If given, the loop optimizer will assume that loop indices do not
overflow, and that the loops with nontrivial exit condition are not
infinite.  This enables a wider range of loop optimizations even if
the loop optimizer itself cannot prove that these assumptions are valid.
Using <samp><span class="option">-Wunsafe-loop-optimizations</span></samp>, the compiler will warn you
if it finds this kind of loop.
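     <p>As a hedged illustration of the kind of loop <samp><span class="option">-funsafe-loop-optimizations</span></samp> is concerned with, consider the hypothetical function <code>sum_up_to</code> below. The loop optimizer cannot prove that it terminates for every argument, because for the largest <code>unsigned int</code> value the index wraps around and the exit condition never becomes false.

```c
/* Hypothetical example of a loop with a nontrivial exit condition.
   If n is the largest unsigned int value, i <= n is always true and
   i wraps around to zero, so the loop never terminates. */
unsigned int sum_up_to(unsigned int n)
{
    unsigned int total = 0;
    unsigned int i;

    for (i = 0; i <= n; i++)
        total += i;
    return total;
}
```

     <p>With the option given, the loop optimizer assumes the index does not overflow and the loop is finite, which holds for ordinary arguments such as <code>sum_up_to(4)</code> but not for the extreme case noted in the comment.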
     <br><dt><code>-fcrossjumping</code><dd><a name="index-fcrossjumping-728"></a>Perform cross-jumping transformation.  This transformation unifies equivalent code and saves code size.  The
resulting code may or may not perform better than without cross-jumping.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fauto-inc-dec</code><dd><a name="index-fauto_002dinc_002ddec-729"></a>Combine increments or decrements of addresses with memory accesses.
This pass is always skipped on architectures that do not have
instructions to support this.  Enabled by default at <samp><span class="option">-O</span></samp> and
higher on architectures that support this.

     <br><dt><code>-fdce</code><dd><a name="index-fdce-730"></a>Perform dead code elimination (DCE) on RTL.
Enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-fdse</code><dd><a name="index-fdse-731"></a>Perform dead store elimination (DSE) on RTL.
Enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-fif-conversion</code><dd><a name="index-fif_002dconversion-732"></a>Attempt to transform conditional jumps into branch-less equivalents.  This
includes the use of conditional moves, min, max, set flags and abs instructions, and
some tricks doable by standard arithmetic.  The use of conditional execution
on chips where it is available is controlled by <code>if-conversion2</code>.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.
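     <p>The following sketch shows source patterns of the kind <samp><span class="option">-fif-conversion</span></samp> is described as handling. The function names are hypothetical, and whether a given target ends up with a conditional move, a min/max or an abs instruction is up to the compiler.

```c
/* Hypothetical helper: a conditional assignment that could be
   expressed as a min operation or a conditional move instead of
   a conditional jump. */
int clamp_max(int value, int limit)
{
    if (value > limit)
        value = limit;
    return value;
}

/* Hypothetical helper: the classic pattern for an abs instruction.
   Assumes x is not the most negative int, whose negation overflows. */
int magnitude(int x)
{
    if (x < 0)
        x = -x;
    return x;
}
```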

     <br><dt><code>-fif-conversion2</code><dd><a name="index-fif_002dconversion2-733"></a>Use conditional execution (where available) to transform conditional jumps into
branch-less equivalents.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fdelete-null-pointer-checks</code><dd><a name="index-fdelete_002dnull_002dpointer_002dchecks-734"></a>Assume that programs cannot safely dereference null pointers, and that
no code or data element resides there. This enables simple constant
folding optimizations at all optimization levels. In addition, other
optimization passes in GCC use this flag to control global dataflow
analyses that eliminate useless checks for null pointers; these assume
that if a pointer is checked after it has already been dereferenced,
it cannot be null.

     <p>Note however that in some environments this assumption is not true.
Use <samp><span class="option">-fno-delete-null-pointer-checks</span></samp> to disable this optimization
for programs which depend on that behavior.

     <p>Some targets, especially embedded ones, disable this option at all levels.
Otherwise it is enabled at all levels: <samp><span class="option">-O0</span></samp>, <samp><span class="option">-O1</span></samp>,
<samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>. Passes that use the information
are enabled independently at different optimization levels.

     <br><dt><code>-fdevirtualize</code><dd><a name="index-fdevirtualize-735"></a>Attempt to convert calls to virtual functions to direct calls.
This
is done both within a procedure and interprocedurally as part of
indirect inlining (<code>-findirect-inlining</code>) and interprocedural constant
propagation (<samp><span class="option">-fipa-cp</span></samp>).
Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fexpensive-optimizations</code><dd><a name="index-fexpensive_002doptimizations-736"></a>Perform a number of minor optimizations that are relatively expensive.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-foptimize-register-move</code><dt><code>-fregmove</code><dd><a name="index-foptimize_002dregister_002dmove-737"></a><a name="index-fregmove-738"></a>Attempt to reassign register numbers in move instructions and as
operands of other simple instructions in order to maximize the amount of
register tying. This is especially helpful on machines with two-operand
instructions.

     <p>Note that <samp><span class="option">-fregmove</span></samp> and <samp><span class="option">-foptimize-register-move</span></samp> are the same
optimization.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fira-algorithm=</code><var>algorithm</var><dd>Use the specified coloring algorithm for the integrated register
allocator. The <var>algorithm</var> argument should be <code>priority</code> or
<code>CB</code>. The first algorithm specifies Chow's priority coloring,
the second one specifies Chaitin-Briggs coloring. The second
algorithm may not be implemented for some architectures. Where it is
implemented, it is the default, because Chaitin-Briggs coloring as a
rule generates better code.

     <br><dt><code>-fira-region=</code><var>region</var><dd>Use the specified regions for the integrated register allocator. The
<var>region</var> argument should be one of <code>all</code>, <code>mixed</code>, or
<code>one</code>. The first value means using all loops as register
allocation regions; the second value, which is the default, means using
all loops except for loops with small register pressure as the
regions; and the third value means using the whole function as a single region.
The first value can give the best results for machines with a small and
irregular register set. The third value results in faster register
allocation, decent code, and the smallest code size. The default value usually
gives the best results in most cases and for most architectures.

     <br><dt><code>-fira-loop-pressure</code><dd><a name="index-fira_002dloop_002dpressure-739"></a>Use IRA to evaluate register pressure in loops when deciding whether to move
loop invariants. This option usually results in faster and smaller code
on machines with large register files (>= 32
registers), but it can slow the compiler down.

     <p>This option is enabled at level <samp><span class="option">-O3</span></samp> for some targets.

     <br><dt><code>-fno-ira-share-save-slots</code><dd><a name="index-fno_002dira_002dshare_002dsave_002dslots-740"></a>Switch off sharing of stack slots used for saving call-used hard
registers that live through a call. Each hard register will get a
separate stack slot, and as a result the function stack frame will be
bigger.

     <br><dt><code>-fno-ira-share-spill-slots</code><dd><a name="index-fno_002dira_002dshare_002dspill_002dslots-741"></a>Switch off sharing of stack slots allocated for pseudo-registers. Each
pseudo-register that did not get a hard register will get a separate
stack slot, and as a result the function stack frame will be bigger.

     <br><dt><code>-fira-verbose=</code><var>n</var><dd><a name="index-fira_002dverbose-742"></a>Control how verbose the dump file for the integrated register allocator
will be. The default value is 5. If the value is greater than or equal to 10,
the dump output is sent to stderr, with verbosity as if the value were <var>n</var> minus 10.

     <br><dt><code>-fdelayed-branch</code><dd><a name="index-fdelayed_002dbranch-743"></a>If supported for the target machine, attempt to reorder instructions
to exploit instruction slots available after delayed branch
instructions.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fschedule-insns</code><dd><a name="index-fschedule_002dinsns-744"></a>If supported for the target machine, attempt to reorder instructions to
eliminate execution stalls due to required data being unavailable. This
helps machines that have slow floating point or memory load instructions
by allowing other instructions to be issued until the result of the load
or floating point instruction is required.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-fschedule-insns2</code><dd><a name="index-fschedule_002dinsns2-745"></a>Similar to <samp><span class="option">-fschedule-insns</span></samp>, but requests an additional pass of
instruction scheduling after register allocation has been done. This is
especially useful on machines with a relatively small number of
registers and where memory load instructions take more than one cycle.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fno-sched-interblock</code><dd><a name="index-fno_002dsched_002dinterblock-746"></a>Don't schedule instructions across basic blocks. This is normally
enabled by default when scheduling before register allocation, i.e.
with <samp><span class="option">-fschedule-insns</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fno-sched-spec</code><dd><a name="index-fno_002dsched_002dspec-747"></a>Don't allow speculative motion of non-load instructions. This is normally
enabled by default when scheduling before register allocation, i.e.
with <samp><span class="option">-fschedule-insns</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-pressure</code><dd><a name="index-fsched_002dpressure-748"></a>Enable register-pressure-sensitive insn scheduling before register
allocation. This only makes sense when scheduling before register
allocation is enabled, i.e. with <samp><span class="option">-fschedule-insns</span></samp> or at
<samp><span class="option">-O2</span></samp> or higher. This option can improve the
generated code and decrease its size by preventing register pressure
from rising above the number of available hard registers and, as a
consequence, avoiding register spills during register allocation.

     <br><dt><code>-fsched-spec-load</code><dd><a name="index-fsched_002dspec_002dload-749"></a>Allow speculative motion of some load instructions. This only makes
sense when scheduling before register allocation, i.e. with
<samp><span class="option">-fschedule-insns</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-spec-load-dangerous</code><dd><a name="index-fsched_002dspec_002dload_002ddangerous-750"></a>Allow speculative motion of more load instructions. This only makes
sense when scheduling before register allocation, i.e.
with
<samp><span class="option">-fschedule-insns</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-stalled-insns</code><dt><code>-fsched-stalled-insns=</code><var>n</var><dd><a name="index-fsched_002dstalled_002dinsns-751"></a>Define how many insns (if any) can be moved prematurely from the queue
of stalled insns into the ready list, during the second scheduling pass.
<samp><span class="option">-fno-sched-stalled-insns</span></samp> means that no insns will be moved
prematurely, <samp><span class="option">-fsched-stalled-insns=0</span></samp> means there is no limit
on how many queued insns can be moved prematurely.
<samp><span class="option">-fsched-stalled-insns</span></samp> without a value is equivalent to
<samp><span class="option">-fsched-stalled-insns=1</span></samp>.

     <br><dt><code>-fsched-stalled-insns-dep</code><dt><code>-fsched-stalled-insns-dep=</code><var>n</var><dd><a name="index-fsched_002dstalled_002dinsns_002ddep-752"></a>Define how many insn groups (cycles) will be examined for a dependency
on a stalled insn that is a candidate for premature removal from the queue
of stalled insns. This has an effect only during the second scheduling pass,
and only if <samp><span class="option">-fsched-stalled-insns</span></samp> is used.
<samp><span class="option">-fno-sched-stalled-insns-dep</span></samp> is equivalent to
<samp><span class="option">-fsched-stalled-insns-dep=0</span></samp>.
<samp><span class="option">-fsched-stalled-insns-dep</span></samp> without a value is equivalent to
<samp><span class="option">-fsched-stalled-insns-dep=1</span></samp>.

     <br><dt><code>-fsched2-use-superblocks</code><dd><a name="index-fsched2_002duse_002dsuperblocks-753"></a>When scheduling after register allocation, use the superblock scheduling
algorithm. Superblock scheduling allows motion across basic block boundaries,
resulting in faster schedules.
This option is experimental, as not all machine
descriptions used by GCC model the CPU closely enough to avoid unreliable
results from the algorithm.

     <p>This only makes sense when scheduling after register allocation, i.e. with
<samp><span class="option">-fschedule-insns2</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-group-heuristic</code><dd><a name="index-fsched_002dgroup_002dheuristic-754"></a>Enable the group heuristic in the scheduler. This heuristic favors
the instruction that belongs to a schedule group. This is enabled
by default when scheduling is enabled, i.e. with <samp><span class="option">-fschedule-insns</span></samp>
or <samp><span class="option">-fschedule-insns2</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-critical-path-heuristic</code><dd><a name="index-fsched_002dcritical_002dpath_002dheuristic-755"></a>Enable the critical-path heuristic in the scheduler. This heuristic favors
instructions on the critical path. This is enabled by default when
scheduling is enabled, i.e. with <samp><span class="option">-fschedule-insns</span></samp>
or <samp><span class="option">-fschedule-insns2</span></samp> or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-spec-insn-heuristic</code><dd><a name="index-fsched_002dspec_002dinsn_002dheuristic-756"></a>Enable the speculative instruction heuristic in the scheduler. This
heuristic favors speculative instructions with greater dependency weakness.
This is enabled by default when scheduling is enabled, i.e.
with <samp><span class="option">-fschedule-insns</span></samp> or <samp><span class="option">-fschedule-insns2</span></samp>
or at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-rank-heuristic</code><dd><a name="index-fsched_002drank_002dheuristic-757"></a>Enable the rank heuristic in the scheduler. This heuristic favors
the instruction belonging to a basic block with greater size or frequency.
This is enabled by default when scheduling is enabled, i.e.
with <samp><span class="option">-fschedule-insns</span></samp> or <samp><span class="option">-fschedule-insns2</span></samp> or
at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-last-insn-heuristic</code><dd><a name="index-fsched_002dlast_002dinsn_002dheuristic-758"></a>Enable the last-instruction heuristic in the scheduler. This heuristic
favors the instruction that is less dependent on the last instruction
scheduled. This is enabled by default when scheduling is enabled,
i.e. with <samp><span class="option">-fschedule-insns</span></samp> or <samp><span class="option">-fschedule-insns2</span></samp> or
at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-fsched-dep-count-heuristic</code><dd><a name="index-fsched_002ddep_002dcount_002dheuristic-759"></a>Enable the dependent-count heuristic in the scheduler. This heuristic
favors the instruction that has more instructions depending on it.
This is enabled by default when scheduling is enabled, i.e.
with <samp><span class="option">-fschedule-insns</span></samp> or <samp><span class="option">-fschedule-insns2</span></samp> or
at <samp><span class="option">-O2</span></samp> or higher.

     <br><dt><code>-freschedule-modulo-scheduled-loops</code><dd><a name="index-freschedule_002dmodulo_002dscheduled_002dloops-760"></a>Modulo scheduling comes before the traditional scheduling passes. If a loop
was modulo scheduled, we may want to prevent the later scheduling passes
from changing its schedule; this option controls that.

     <br><dt><code>-fselective-scheduling</code><dd><a name="index-fselective_002dscheduling-761"></a>Schedule instructions using the selective scheduling algorithm. Selective
scheduling runs instead of the first scheduler pass.

     <br><dt><code>-fselective-scheduling2</code><dd><a name="index-fselective_002dscheduling2-762"></a>Schedule instructions using the selective scheduling algorithm. Selective
scheduling runs instead of the second scheduler pass.

     <br><dt><code>-fsel-sched-pipelining</code><dd><a name="index-fsel_002dsched_002dpipelining-763"></a>Enable software pipelining of innermost loops during selective scheduling.
This option has no effect unless one of <samp><span class="option">-fselective-scheduling</span></samp> or
<samp><span class="option">-fselective-scheduling2</span></samp> is turned on.

     <br><dt><code>-fsel-sched-pipelining-outer-loops</code><dd><a name="index-fsel_002dsched_002dpipelining_002douter_002dloops-764"></a>When pipelining loops during selective scheduling, also pipeline outer loops.
This option has no effect unless <samp><span class="option">-fsel-sched-pipelining</span></samp> is turned on.

     <br><dt><code>-fshrink-wrap</code><dd><a name="index-fshrink_002dwrap-765"></a>Emit function prologues only before parts of the function that need them,
rather than at the top of the function.

     <br><dt><code>-fcaller-saves</code><dd><a name="index-fcaller_002dsaves-766"></a>Enable values to be allocated in registers that will be clobbered by
function calls, by emitting extra instructions to save and restore the
registers around such calls. Such allocation is done only when it
seems to result in better code than would otherwise be produced.

     <p>This option is always enabled by default on certain machines, usually
those which have no call-preserved registers to use instead.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fcombine-stack-adjustments</code><dd><a name="index-fcombine_002dstack_002dadjustments-767"></a>Tracks stack adjustments (pushes and pops) and stack memory references
and then tries to find ways to combine them.

     <p>Enabled by default at <samp><span class="option">-O1</span></samp> and higher.

     <br><dt><code>-fconserve-stack</code><dd><a name="index-fconserve_002dstack-768"></a>Attempt to minimize stack usage. The compiler will attempt to use less
stack space, even if that makes the program slower. This option
implies setting the <samp><span class="option">large-stack-frame</span></samp> parameter to 100
and the <samp><span class="option">large-stack-frame-growth</span></samp> parameter to 400.

     <br><dt><code>-ftree-reassoc</code><dd><a name="index-ftree_002dreassoc-769"></a>Perform reassociation on trees. This flag is enabled by default
at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-pre</code><dd><a name="index-ftree_002dpre-770"></a>Perform partial redundancy elimination (PRE) on trees. This flag is
enabled by default at <samp><span class="option">-O2</span></samp> and <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-ftree-forwprop</code><dd><a name="index-ftree_002dforwprop-771"></a>Perform forward propagation on trees. This flag is enabled by default
at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-fre</code><dd><a name="index-ftree_002dfre-772"></a>Perform full redundancy elimination (FRE) on trees. The difference
between FRE and PRE is that FRE only considers expressions
that are computed on all paths leading to the redundant computation.
This analysis is faster than PRE, though it exposes fewer redundancies.
This flag is enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-phiprop</code><dd><a name="index-ftree_002dphiprop-773"></a>Perform hoisting of loads from conditional pointers on trees. This
pass is enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-copy-prop</code><dd><a name="index-ftree_002dcopy_002dprop-774"></a>Perform copy propagation on trees. This pass eliminates unnecessary
copy operations. This flag is enabled by default at <samp><span class="option">-O</span></samp> and
higher.

     <br><dt><code>-fipa-pure-const</code><dd><a name="index-fipa_002dpure_002dconst-775"></a>Discover which functions are pure or constant.
Enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-fipa-reference</code><dd><a name="index-fipa_002dreference-776"></a>Discover which static variables do not escape the
compilation unit.
Enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-fipa-struct-reorg</code><dd><a name="index-fipa_002dstruct_002dreorg-777"></a>Perform structure reorganization optimization, which changes the layout of
C-like structures in order to better utilize spatial locality. This transformation is
effective for programs containing arrays of structures. Available in two
compilation modes: profile-based (enabled with <samp><span class="option">-fprofile-generate</span></samp>)
or static (which uses built-in heuristics). It works only in whole program
mode, so it requires <samp><span class="option">-fwhole-program</span></samp> to be
enabled. Structures considered ‘<samp><span class="samp">cold</span></samp>’ by this transformation are not
affected (see <samp><span class="option">--param struct-reorg-cold-struct-ratio=</span><var>value</var></samp>).

     <p>With this flag, the program debug info reflects a new structure layout.

     <br><dt><code>-fipa-pta</code><dd><a name="index-fipa_002dpta-778"></a>Perform interprocedural pointer analysis and interprocedural modification
and reference analysis. This option can cause excessive memory and
compile-time usage on large compilation units. It is not enabled by
default at any optimization level.

     <br><dt><code>-fipa-profile</code><dd><a name="index-fipa_002dprofile-779"></a>Perform interprocedural profile propagation. The functions called only from
cold functions are marked as cold. Also functions executed once (such as
<code>cold</code>, <code>noreturn</code>, static constructors or destructors) are identified. Cold
functions and loopless parts of functions executed once are then optimized for
size.
Enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-fipa-cp</code><dd><a name="index-fipa_002dcp-780"></a>Perform interprocedural constant propagation.
This optimization analyzes the program to determine when values passed
to functions are constants and then optimizes accordingly.
This optimization can substantially increase performance
if the application has constants passed to functions.
This flag is enabled by default at <samp><span class="option">-O2</span></samp>, <samp><span class="option">-Os</span></samp> and <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-fipa-cp-clone</code><dd><a name="index-fipa_002dcp_002dclone-781"></a>Perform function cloning to make interprocedural constant propagation stronger.
When enabled, interprocedural constant propagation will perform function cloning
when an externally visible function can be called with constant arguments.
Because this optimization can create multiple copies of functions,
it may significantly increase code size
(see <samp><span class="option">--param ipcp-unit-growth=</span><var>value</var></samp>).
This flag is enabled by default at <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-fipa-matrix-reorg</code><dd><a name="index-fipa_002dmatrix_002dreorg-782"></a>Perform matrix flattening and transposing.
Matrix flattening tries to replace an m-dimensional matrix
with its equivalent n-dimensional matrix, where n < m.
This reduces the level of indirection needed for accessing the elements
of the matrix. The second optimization is matrix transposing that
attempts to change the order of the matrix's dimensions in order to
improve cache locality.
Both optimizations need the <samp><span class="option">-fwhole-program</span></samp> flag.
Transposing is enabled only if profiling information is available.

     <br><dt><code>-ftree-sink</code><dd><a name="index-ftree_002dsink-783"></a>Perform forward store motion on trees. This flag is
enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-bit-ccp</code><dd><a name="index-ftree_002dbit_002dccp-784"></a>Perform sparse conditional bit constant propagation on trees and propagate
pointer alignment information.
This pass only operates on local scalar variables and is enabled by default
at <samp><span class="option">-O</span></samp> and higher. It requires that <samp><span class="option">-ftree-ccp</span></samp> is enabled.

     <br><dt><code>-ftree-ccp</code><dd><a name="index-ftree_002dccp-785"></a>Perform sparse conditional constant propagation (CCP) on trees. This
pass only operates on local scalar variables and is enabled by default
at <samp><span class="option">-O</span></samp> and higher.
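
     <p>As a small illustration of what conditional constant propagation can
discover (the function <code>scaled_limit</code> is hypothetical, and whether the
compiler folds it completely depends on the options in effect), consider
local scalars whose values are known at compile time:

<pre class="smallexample">          /* debug is known to be 0, so the condition is known to be
             false and the assignment inside the if is never
             executed.  limit is therefore still 16 at the return.  */
          int
          scaled_limit (void)
          {
            int debug = 0;
            int limit = 16;

            if (debug)
              limit = limit * 2;
            return limit + 1;
          }
</pre>
     <p>The "conditional" part of the analysis is what allows the branch that
cannot execute to be ignored, so that the whole function may reduce to
returning a constant.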

     <br><dt><code>-ftree-switch-conversion</code><dd>Perform conversion of simple initializations in a switch to
initializations from a scalar array. This flag is enabled by default
at <samp><span class="option">-O2</span></samp> and higher.

     <br><dt><code>-ftree-if-to-switch-conversion</code><dd>Perform conversion of chains of ifs into switches. This flag is enabled by
default at <samp><span class="option">-O2</span></samp> and higher.

     <br><dt><code>-ftree-dce</code><dd><a name="index-ftree_002ddce-786"></a>Perform dead code elimination (DCE) on trees. This flag is enabled by
default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-builtin-call-dce</code><dd><a name="index-ftree_002dbuiltin_002dcall_002ddce-787"></a>Perform conditional dead code elimination (DCE) for calls to builtin functions
that may set <code>errno</code> but are otherwise side-effect free. This flag is
enabled by default at <samp><span class="option">-O2</span></samp> and higher if <samp><span class="option">-Os</span></samp> is not also
specified.

     <br><dt><code>-ftree-dominator-opts</code><dd><a name="index-ftree_002ddominator_002dopts-788"></a>Perform a variety of simple scalar cleanups (constant/copy
propagation, redundancy elimination, range propagation and expression
simplification) based on a dominator tree traversal. This also
performs jump threading (to reduce jumps to jumps). This flag is
enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-dse</code><dd><a name="index-ftree_002ddse-789"></a>Perform dead store elimination (DSE) on trees. A dead store is a store into
a memory location which will later be overwritten by another store without
any intervening loads. In this case the earlier store can be deleted. This
flag is enabled by default at <samp><span class="option">-O</span></samp> and higher.
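
     <p>The definition of a dead store can be made concrete with a short sketch.
The structure and function names here are hypothetical and serve only as
an illustration:

<pre class="smallexample">          struct point { int x; int y; };

          void
          reset_then_set (struct point *p, int x)
          {
            p->x = 0;   /* dead: overwritten below, never loaded */
            p->y = 0;
            p->x = x;   /* this later store makes the first one dead */
          }
</pre>
     <p>Because nothing reads <code>p->x</code> between the two stores to it, the first
store is a candidate for deletion; the observable result of the function
is the same either way.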

     <br><dt><code>-ftree-ch</code><dd><a name="index-ftree_002dch-790"></a>Perform loop header copying on trees. This is beneficial since it increases
the effectiveness of code motion optimizations. It also saves one jump. This flag
is enabled by default at <samp><span class="option">-O</span></samp> and higher. It is not enabled
for <samp><span class="option">-Os</span></samp>, since it usually increases code size.

     <br><dt><code>-ftree-loop-optimize</code><dd><a name="index-ftree_002dloop_002doptimize-791"></a>Perform loop optimizations on trees. This flag is enabled by default
at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-loop-linear</code><dd><a name="index-ftree_002dloop_002dlinear-792"></a>Perform loop interchange transformations on trees. Same as
<samp><span class="option">-floop-interchange</span></samp>. To use this code transformation, GCC has
to be configured with <samp><span class="option">--with-ppl</span></samp> and <samp><span class="option">--with-cloog</span></samp> to
enable the Graphite loop transformation infrastructure.

     <br><dt><code>-floop-interchange</code><dd><a name="index-floop_002dinterchange-793"></a>Perform loop interchange transformations on loops. Interchanging two
nested loops switches the inner and outer loops. For example, given a
loop like:
<pre class="smallexample">          DO J = 1, M
            DO I = 1, N
              A(J, I) = A(J, I) * C
            ENDDO
          ENDDO
</pre>
     <p>loop interchange will transform the loop as if the user had written:
<pre class="smallexample">          DO I = 1, N
            DO J = 1, M
              A(J, I) = A(J, I) * C
            ENDDO
          ENDDO
</pre>
     <p>which can be beneficial when <code>N</code> is larger than the caches,
because in Fortran, the elements of an array are stored in memory
contiguously by column, and the original loop iterates over rows,
potentially causing a cache miss at each access.
This optimization
applies to all the languages supported by GCC and is not limited to
Fortran. To use this code transformation, GCC has to be configured
with <samp><span class="option">--with-ppl</span></samp> and <samp><span class="option">--with-cloog</span></samp> to enable the
Graphite loop transformation infrastructure.

     <br><dt><code>-floop-strip-mine</code><dd><a name="index-floop_002dstrip_002dmine-794"></a>Perform loop strip mining transformations on loops. Strip mining
splits a loop into two nested loops. The outer loop has strides
equal to the strip size and the inner loop has strides of the
original loop within a strip. The strip length can be changed
using the <samp><span class="option">loop-block-tile-size</span></samp> parameter. For example,
given a loop like:
<pre class="smallexample">          DO I = 1, N
            A(I) = A(I) + C
          ENDDO
</pre>
     <p>loop strip mining will transform the loop as if the user had written:
<pre class="smallexample">          DO II = 1, N, 51
            DO I = II, min (II + 50, N)
              A(I) = A(I) + C
            ENDDO
          ENDDO
</pre>
     <p>This optimization applies to all the languages supported by GCC and is
not limited to Fortran. To use this code transformation, GCC has to
be configured with <samp><span class="option">--with-ppl</span></samp> and <samp><span class="option">--with-cloog</span></samp> to
enable the Graphite loop transformation infrastructure.

     <br><dt><code>-floop-block</code><dd><a name="index-floop_002dblock-795"></a>Perform loop blocking transformations on loops. Blocking strip mines
each loop in the loop nest such that the memory accesses of the
element loops fit inside caches. The strip length can be changed
using the <samp><span class="option">loop-block-tile-size</span></samp> parameter.
For example, given
a loop like:
<pre class="smallexample">          DO I = 1, N
            DO J = 1, M
              A(J, I) = B(I) + C(J)
            ENDDO
          ENDDO
</pre>
     <p>loop blocking will transform the loop as if the user had written:
<pre class="smallexample">          DO II = 1, N, 51
            DO JJ = 1, M, 51
              DO I = II, min (II + 50, N)
                DO J = JJ, min (JJ + 50, M)
                  A(J, I) = B(I) + C(J)
                ENDDO
              ENDDO
            ENDDO
          ENDDO
</pre>
     <p>which can be beneficial when <code>M</code> is larger than the caches,
because the innermost loop will iterate over a smaller amount of data
that can be kept in the caches. This optimization applies to all the
languages supported by GCC and is not limited to Fortran. To use this
code transformation, GCC has to be configured with <samp><span class="option">--with-ppl</span></samp>
and <samp><span class="option">--with-cloog</span></samp> to enable the Graphite loop transformation
infrastructure.

     <br><dt><code>-fgraphite-identity</code><dd><a name="index-fgraphite_002didentity-796"></a>Enable the identity transformation for Graphite. For every SCoP we generate
the polyhedral representation and transform it back to GIMPLE. Using
<samp><span class="option">-fgraphite-identity</span></samp> we can check the costs or benefits of the
GIMPLE -> GRAPHITE -> GIMPLE transformation. Some minimal optimizations
are also performed by the code generator CLooG, like index splitting and
dead code elimination in loops.

     <br><dt><code>-floop-flatten</code><dd><a name="index-floop_002dflatten-797"></a>Removes the loop nesting structure: transforms the loop nest into a
single loop. This transformation can be useful to vectorize all the
levels of the loop nest.

     <br><dt><code>-floop-parallelize-all</code><dd><a name="index-floop_002dparallelize_002dall-798"></a>Use the Graphite data dependence analysis to identify loops that can
be parallelized.
Parallelize every loop that the analysis shows to be free of
loop-carried dependences, without checking whether it is
profitable to parallelize the loop.

     <br><dt><code>-fcheck-data-deps</code><dd><a name="index-fcheck_002ddata_002ddeps-799"></a>Compare the results of several data dependence analyzers. This option
is used for debugging the data dependence analyzers.

     <br><dt><code>-ftree-loop-if-convert</code><dd>Attempt to transform conditional jumps in the innermost loops to
branch-less equivalents. The intent is to remove control flow from
the innermost loops in order to improve the ability of the
vectorization pass to handle these loops. This is enabled by default
if vectorization is enabled.

     <br><dt><code>-ftree-loop-if-convert-stores</code><dd>Attempt to also if-convert conditional jumps containing memory writes.
This transformation can be unsafe for multi-threaded programs as it
transforms conditional memory writes into unconditional memory writes.
For example,
     <pre class="smallexample">          for (i = 0; i < N; i++)
            if (cond)
              A[i] = expr;
</pre>
     <p>would be transformed to
     <pre class="smallexample">          for (i = 0; i < N; i++)
            A[i] = cond ? expr : A[i];
</pre>
     <p>potentially producing data races.

     <br><dt><code>-ftree-loop-distribution</code><dd>Perform loop distribution. This flag can improve cache performance on
big loop bodies and allow further loop optimizations, like
parallelization or vectorization, to take place. For example, the loop
     <pre class="smallexample">          DO I = 1, N
            A(I) = B(I) + C
            D(I) = E(I) * F
          ENDDO
</pre>
     <p>is transformed to
     <pre class="smallexample">          DO I = 1, N
             A(I) = B(I) + C
          ENDDO
          DO I = 1, N
             D(I) = E(I) * F
          ENDDO
</pre>
     <br><dt><code>-ftree-loop-distribute-patterns</code><dd>Perform loop distribution of patterns that can be code generated with
calls to a library.
This flag is enabled by default at <samp><span class="option">-O3</span></samp>.

     <p>This pass distributes the initialization loops and generates a call to
<code>memset</code> that zeroes the array. For example, the loop
     <pre class="smallexample">          DO I = 1, N
            A(I) = 0
            B(I) = A(I) + I
          ENDDO
</pre>
     <p>is transformed to
     <pre class="smallexample">          DO I = 1, N
             A(I) = 0
          ENDDO
          DO I = 1, N
             B(I) = A(I) + I
          ENDDO
</pre>
     <p>and the initialization loop is transformed into a call to
<code>memset</code> that zeroes the array.

     <br><dt><code>-ftree-loop-im</code><dd><a name="index-ftree_002dloop_002dim-800"></a>Perform loop invariant motion on trees. This pass moves only invariants that
would be hard to handle at RTL level (function calls, operations that expand to
nontrivial sequences of insns). With <samp><span class="option">-funswitch-loops</span></samp> it also moves
operands of conditions that are invariant out of the loop, so that we can use
just trivial invariantness analysis in loop unswitching. The pass also includes
store motion.

     <br><dt><code>-ftree-loop-ivcanon</code><dd><a name="index-ftree_002dloop_002divcanon-801"></a>Create a canonical counter for the number of iterations in loops for which
determining the number of iterations requires complicated analysis. Later
optimizations then may determine the number easily. Useful especially
in connection with unrolling.

     <br><dt><code>-fivopts</code><dd><a name="index-fivopts-802"></a>Perform induction variable optimizations (strength reduction, induction
variable merging and induction variable elimination) on trees.

     <br><dt><code>-ftree-parallelize-loops=n</code><dd><a name="index-ftree_002dparallelize_002dloops-803"></a>Parallelize loops, i.e., split their iteration space to run in n threads.
This is only possible for loops whose iterations are independent
and can be arbitrarily reordered.
The optimization is only
profitable on multiprocessor machines, for loops that are CPU-intensive,
rather than constrained e.g. by memory bandwidth. This option
implies <samp><span class="option">-pthread</span></samp>, and thus is only supported on targets
that have support for <samp><span class="option">-pthread</span></samp>.

     <br><dt><code>-ftree-pta</code><dd><a name="index-ftree_002dpta-804"></a>Perform function-local points-to analysis on trees. This flag is
enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-sra</code><dd><a name="index-ftree_002dsra-805"></a>Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
early. This flag is enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-copyrename</code><dd><a name="index-ftree_002dcopyrename-806"></a>Perform copy renaming on trees. This pass attempts to rename compiler
temporaries to other variables at copy locations, usually resulting in
variable names which more closely resemble the original variables. This flag
is enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-ter</code><dd><a name="index-ftree_002dter-807"></a>Perform temporary expression replacement during the SSA->normal phase. Single
use/single def temporaries are replaced at their use location with their
defining expression. This results in non-GIMPLE code, but gives the expanders
much more complex trees to work on resulting in better RTL generation. This is
enabled by default at <samp><span class="option">-O</span></samp> and higher.

     <br><dt><code>-ftree-vectorize</code><dd><a name="index-ftree_002dvectorize-808"></a>Perform loop vectorization on trees.
This flag is enabled by default at
<samp><span class="option">-O3</span></samp>.

     <br><dt><code>-ftree-slp-vectorize</code><dd><a name="index-ftree_002dslp_002dvectorize-809"></a>Perform basic block vectorization on trees. This flag is enabled by default at
<samp><span class="option">-O3</span></samp> and when <samp><span class="option">-ftree-vectorize</span></samp> is enabled.

     <br><dt><code>-ftree-vect-loop-version</code><dd><a name="index-ftree_002dvect_002dloop_002dversion-810"></a>Perform loop versioning when doing loop vectorization on trees. When a loop
appears to be vectorizable except that data alignment or data dependence cannot
be determined at compile time, then vectorized and non-vectorized versions of
the loop are generated along with runtime checks for alignment or dependence
to control which version is executed. This option is enabled by default
except at level <samp><span class="option">-Os</span></samp> where it is disabled.

     <br><dt><code>-fvect-cost-model</code><dd><a name="index-fvect_002dcost_002dmodel-811"></a>Enable cost model for vectorization.

     <br><dt><code>-ftree-vrp</code><dd><a name="index-ftree_002dvrp-812"></a>Perform Value Range Propagation on trees. This is similar to the
constant propagation pass, but instead of values, ranges of values are
propagated. This allows the optimizers to remove unnecessary range
checks like array bound checks and null pointer checks. This is
enabled by default at <samp><span class="option">-O2</span></samp> and higher. Null pointer check
elimination is only done if <samp><span class="option">-fdelete-null-pointer-checks</span></samp> is
enabled.

     <br><dt><code>-ftracer</code><dd><a name="index-ftracer-813"></a>Perform tail duplication to enlarge superblock size. This transformation
simplifies the control flow of the function, allowing other optimizations to do
a better job.

     <br><dt><code>-funroll-loops</code><dd><a name="index-funroll_002dloops-814"></a>Unroll loops whose number of iterations can be determined at compile
time or upon entry to the loop. <samp><span class="option">-funroll-loops</span></samp> implies
<samp><span class="option">-frerun-cse-after-loop</span></samp>. This option makes code larger,
and may or may not make it run faster.

     <br><dt><code>-funroll-all-loops</code><dd><a name="index-funroll_002dall_002dloops-815"></a>Unroll all loops, even if their number of iterations is uncertain when
the loop is entered. This usually makes programs run more slowly.
<samp><span class="option">-funroll-all-loops</span></samp> implies the same options as
<samp><span class="option">-funroll-loops</span></samp>.

     <br><dt><code>-fsplit-ivs-in-unroller</code><dd><a name="index-fsplit_002divs_002din_002dunroller-816"></a>Express the values of induction variables in later iterations
of the unrolled loop in terms of the value in the first iteration. This breaks
long dependency chains, thus improving efficiency of the scheduling passes.

     <p>A combination of <samp><span class="option">-fweb</span></samp> and CSE is often sufficient to obtain the
same effect. However, in cases where the loop body is more complicated than
a single basic block, this is not reliable. It also does not work at all
on some architectures due to restrictions in the CSE pass.

     <p>This optimization is enabled by default.

     <br><dt><code>-fvariable-expansion-in-unroller</code><dd><a name="index-fvariable_002dexpansion_002din_002dunroller-817"></a>With this option, the compiler will create multiple copies of some
local variables when unrolling a loop, which can result in superior code.

     <br><dt><code>-fpartial-inlining</code><dd><a name="index-fpartial_002dinlining-818"></a>Inline parts of functions.
This option has an effect only
when inlining itself is turned on by the <samp><span class="option">-finline-functions</span></samp>
or <samp><span class="option">-finline-small-functions</span></samp> options.

     <p>Enabled at level <samp><span class="option">-O2</span></samp>.

     <br><dt><code>-fpredictive-commoning</code><dd><a name="index-fpredictive_002dcommoning-819"></a>Perform predictive commoning optimization, i.e., reusing computations
(especially memory loads and stores) performed in previous
iterations of loops.

     <p>This option is enabled at level <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-fprefetch-loop-arrays</code><dd><a name="index-fprefetch_002dloop_002darrays-820"></a>If supported by the target machine, generate instructions to prefetch
memory to improve the performance of loops that access large arrays.

     <p>This option may generate better or worse code; results are highly
dependent on the structure of loops within the source code.

     <p>Disabled at level <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fno-peephole</code><dt><code>-fno-peephole2</code><dd><a name="index-fno_002dpeephole-821"></a><a name="index-fno_002dpeephole2-822"></a>Disable any machine-specific peephole optimizations. The difference
between <samp><span class="option">-fno-peephole</span></samp> and <samp><span class="option">-fno-peephole2</span></samp> is in how they
are implemented in the compiler; some targets use one, some use the
other, a few use both.

     <p><samp><span class="option">-fpeephole</span></samp> is enabled by default.
<samp><span class="option">-fpeephole2</span></samp> is enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fno-guess-branch-probability</code><dd><a name="index-fno_002dguess_002dbranch_002dprobability-823"></a>Do not guess branch probabilities using heuristics.

     <p>GCC will use heuristics to guess branch probabilities if they are
not provided by profiling feedback (<samp><span class="option">-fprofile-arcs</span></samp>). These
heuristics are based on the control flow graph. If some branch probabilities
are specified by ‘<samp><span class="samp">__builtin_expect</span></samp>’, then the heuristics will be
used to guess branch probabilities for the rest of the control flow graph,
taking the ‘<samp><span class="samp">__builtin_expect</span></samp>’ info into account. The interactions
between the heuristics and ‘<samp><span class="samp">__builtin_expect</span></samp>’ can be complex, and in
some cases, it may be useful to disable the heuristics so that the effects
of ‘<samp><span class="samp">__builtin_expect</span></samp>’ are easier to understand.

     <p>The default is <samp><span class="option">-fguess-branch-probability</span></samp> at levels
<samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-freorder-blocks</code><dd><a name="index-freorder_002dblocks-824"></a>Reorder basic blocks in the compiled function in order to reduce the number of
taken branches and improve code locality.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.
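     <p>Block reordering works from the same branch probabilities that are
discussed under <samp><span class="option">-fno-guess-branch-probability</span></samp>, so a hint given with
‘<samp><span class="samp">__builtin_expect</span></samp>’ may influence where a block is placed. The following
is a minimal sketch, not code from GCC: the function <code>sum_valid</code> and
the macro <code>unlikely</code> are hypothetical names chosen for illustration.
The hint marks the error path as rarely taken; the value computed is the
same with or without it.

```c
#include <stddef.h>

/* Hypothetical helper macro: tells GCC the condition is expected
   to be false most of the time.  */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum the non-negative entries of v.  Negative entries are treated
   as rare errors and are counted in *errors instead.  */
long sum_valid(const int *v, size_t n, size_t *errors)
{
    long sum = 0;
    *errors = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] < 0)) {
            (*errors)++;        /* cold path: expected to be rare */
            continue;
        }
        sum += v[i];            /* hot path */
    }
    return sum;
}
```

     <p>Whether the cold block is actually moved depends on the optimization
level and the target; comparing the assembly produced with and without
<samp><span class="option">-fno-guess-branch-probability</span></samp> is one way to see the effect.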

     <br><dt><code>-freorder-blocks-and-partition</code><dd><a name="index-freorder_002dblocks_002dand_002dpartition-825"></a>In addition to reordering basic blocks in the compiled function in order
to reduce the number of taken branches, partition hot and cold basic blocks
into separate sections of the assembly and .o files, to improve
paging and cache locality performance.

     <p>This optimization is automatically turned off in the presence of
exception handling, for linkonce sections, for functions with a user-defined
section attribute and on any architecture that does not support named
sections.

     <br><dt><code>-freorder-functions</code><dd><a name="index-freorder_002dfunctions-826"></a>Reorder functions in the object file in order to
improve code locality. This is implemented by using special
subsections <code>.text.hot</code> for the most frequently executed functions and
<code>.text.unlikely</code> for functions unlikely to be executed. Reordering is done by
the linker, so the object file format must support named sections and the linker must
place them in a reasonable way.

     <p>Profile feedback must also be available to make this option effective. See
<samp><span class="option">-fprofile-arcs</span></samp> for details.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fstrict-aliasing</code><dd><a name="index-fstrict_002daliasing-827"></a>Allow the compiler to assume the strictest aliasing rules applicable to
the language being compiled. For C (and C++), this activates
optimizations based on the type of expressions. In particular, an
object of one type is assumed never to reside at the same address as an
object of a different type, unless the types are almost the same.
For
example, an <code>unsigned int</code> can alias an <code>int</code>, but not a
<code>void*</code> or a <code>double</code>. A character type may alias any other
type.

     <p><a name="Type_002dpunning"></a>Pay special attention to code like this:
     <pre class="smallexample">          union a_union {
            int i;
            double d;
          };

          int f() {
            union a_union t;
            t.d = 3.0;
            return t.i;
          }
</pre>
     <p>The practice of reading from a different union member than the one most
recently written to (called “type-punning”) is common. Even with
<samp><span class="option">-fstrict-aliasing</span></samp>, type-punning is allowed, provided the memory
is accessed through the union type. So, the code above will work as
expected. See <a href="Structures-unions-enumerations-and-bit_002dfields-implementation.html#Structures-unions-enumerations-and-bit_002dfields-implementation">Structures unions enumerations and bit-fields implementation</a>. However, this code might not:
     <pre class="smallexample">          int f() {
            union a_union t;
            int* ip;
            t.d = 3.0;
            ip = &t.i;
            return *ip;
          }
</pre>
     <p>Similarly, access by taking the address, casting the resulting pointer
and dereferencing the result has undefined behavior, even if the cast
uses a union type, e.g.:
     <pre class="smallexample">          int f() {
            double d = 3.0;
            return ((union a_union *) &d)->i;
          }
</pre>
     <p>The <samp><span class="option">-fstrict-aliasing</span></samp> option is enabled at levels
<samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fstrict-overflow</code><dd><a name="index-fstrict_002doverflow-828"></a>Allow the compiler to assume strict signed overflow rules, depending
on the language being compiled.
For C (and C++) this means that
overflow when doing arithmetic with signed numbers is undefined, which
means that the compiler may assume that it will not happen. This
permits various optimizations. For example, the compiler will assume
that an expression like <code>i + 10 > i</code> will always be true for
signed <code>i</code>. This assumption is only valid if signed overflow is
undefined, as the expression is false if <code>i + 10</code> overflows when
using two's complement arithmetic. When this option is in effect any
attempt to determine whether an operation on signed numbers will
overflow must be written carefully to not actually involve overflow.

     <p>This option also allows the compiler to assume strict pointer
semantics: given a pointer to an object, if adding an offset to that
pointer does not produce a pointer to the same object, the addition is
undefined. This permits the compiler to conclude that <code>p + u >
p</code> is always true for a pointer <code>p</code> and unsigned integer
<code>u</code>. This assumption is only valid because pointer wraparound is
undefined, as the expression is false if <code>p + u</code> overflows using
two's complement arithmetic.

     <p>See also the <samp><span class="option">-fwrapv</span></samp> option. Using <samp><span class="option">-fwrapv</span></samp> means
that signed integer overflow is fully defined: it wraps. When
<samp><span class="option">-fwrapv</span></samp> is used, there is no difference between
<samp><span class="option">-fstrict-overflow</span></samp> and <samp><span class="option">-fno-strict-overflow</span></samp> for
integers. With <samp><span class="option">-fwrapv</span></samp> certain types of overflow are
permitted.
For example, if the compiler gets an overflow when doing
arithmetic on constants, the overflowed value can still be used with
<samp><span class="option">-fwrapv</span></samp>, but not otherwise.

     <p>The <samp><span class="option">-fstrict-overflow</span></samp> option is enabled at levels
<samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-falign-arrays</code><dd><a name="index-falign_002darrays-829"></a>Set the minimum alignment for array variables to be the largest power
of two less than or equal to their total storage size, or the biggest
alignment used on the machine, whichever is smaller. This option may be
helpful when compiling legacy code that uses type punning on arrays that
does not strictly conform to the C standard.

     <br><dt><code>-falign-functions</code><dt><code>-falign-functions=</code><var>n</var><dd><a name="index-falign_002dfunctions-830"></a>Align the start of functions to the next power-of-two greater than
or equal to <var>n</var>, skipping up to <var>n</var>-1 bytes. For instance,
<samp><span class="option">-falign-functions=32</span></samp> aligns functions to the next 32-byte
boundary, but <samp><span class="option">-falign-functions=24</span></samp> would align to the next
32-byte boundary only if this can be done by skipping 23 bytes or less.

     <p><samp><span class="option">-fno-align-functions</span></samp> and <samp><span class="option">-falign-functions=1</span></samp> are
equivalent and mean that functions will not be aligned.

     <p>Some assemblers only support this flag when <var>n</var> is a power of two;
in that case, it is rounded up.

     <p>If <var>n</var> is not specified or is zero, use a machine-dependent default.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.
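     <p>The arithmetic behind the <samp><span class="option">-falign-functions=32</span></samp> and
<samp><span class="option">-falign-functions=24</span></samp> cases above can be sketched in C. This is only a
model of the rule as the examples describe it, not GCC's implementation,
and the helper names <code>boundary_for</code> and <code>padding_for</code> are
hypothetical. It assumes that padding is inserted only when fewer than
<var>n</var> bytes would be skipped.

```c
#include <stddef.h>

/* Smallest power of two that is greater than or equal to n (n >= 1).  */
size_t boundary_for(size_t n)
{
    size_t b = 1;
    while (b < n)
        b <<= 1;
    return b;
}

/* Padding bytes placed before a function that would otherwise start
   at offset addr, or 0 when reaching the boundary would mean skipping
   more than n - 1 bytes (in which case no alignment is done).  */
size_t padding_for(size_t addr, size_t n)
{
    size_t b = boundary_for(n);
    size_t pad = (b - addr % b) % b;
    return pad <= n - 1 ? pad : 0;
}
```

     <p>Under this model, a function at offset 40 gets 24 bytes of padding with
<var>n</var> = 32, but none with <var>n</var> = 24, because 24 bytes exceeds the
23-byte limit.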

     <br><dt><code>-falign-labels</code><dt><code>-falign-labels=</code><var>n</var><dd><a name="index-falign_002dlabels-831"></a>Align all branch targets to a power-of-two boundary, skipping up to
<var>n</var> bytes like <samp><span class="option">-falign-functions</span></samp>. This option can easily
make code slower, because it must insert dummy operations for when the
branch target is reached in the usual flow of the code.

     <p><samp><span class="option">-fno-align-labels</span></samp> and <samp><span class="option">-falign-labels=1</span></samp> are
equivalent and mean that labels will not be aligned.

     <p>If <samp><span class="option">-falign-loops</span></samp> or <samp><span class="option">-falign-jumps</span></samp> are applicable and
are greater than this value, then their values are used instead.

     <p>If <var>n</var> is not specified or is zero, use a machine-dependent default
which is very likely to be ‘<samp><span class="samp">1</span></samp>’, meaning no alignment.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-falign-loops</code><dt><code>-falign-loops=</code><var>n</var><dd><a name="index-falign_002dloops-832"></a>Align loops to a power-of-two boundary, skipping up to <var>n</var> bytes
like <samp><span class="option">-falign-functions</span></samp>. The hope is that the loop will be
executed many times, which will make up for any execution of the dummy
operations.

     <p><samp><span class="option">-fno-align-loops</span></samp> and <samp><span class="option">-falign-loops=1</span></samp> are
equivalent and mean that loops will not be aligned.

     <p>If <var>n</var> is not specified or is zero, use a machine-dependent default.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-falign-jumps</code><dt><code>-falign-jumps=</code><var>n</var><dd><a name="index-falign_002djumps-833"></a>Align branch targets to a power-of-two boundary, for branch targets
where the targets can only be reached by jumping, skipping up to <var>n</var>
bytes like <samp><span class="option">-falign-functions</span></samp>. In this case, no dummy operations
need be executed.

     <p><samp><span class="option">-fno-align-jumps</span></samp> and <samp><span class="option">-falign-jumps=1</span></samp> are
equivalent and mean that these branch targets will not be aligned.

     <p>If <var>n</var> is not specified or is zero, use a machine-dependent default.

     <p>Enabled at levels <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>.

     <br><dt><code>-funit-at-a-time</code><dd><a name="index-funit_002dat_002da_002dtime-834"></a>This option is left for compatibility reasons. <samp><span class="option">-funit-at-a-time</span></samp>
has no effect, while <samp><span class="option">-fno-unit-at-a-time</span></samp> implies
<samp><span class="option">-fno-toplevel-reorder</span></samp> and <samp><span class="option">-fno-section-anchors</span></samp>.

     <p>Enabled by default.

     <br><dt><code>-fno-toplevel-reorder</code><dd><a name="index-fno_002dtoplevel_002dreorder-835"></a>Do not reorder top-level functions, variables, and <code>asm</code>
statements. Output them in the same order that they appear in the
input file. When this option is used, unreferenced static variables
will not be removed. This option is intended to support existing code
which relies on a particular ordering. For new code, it is better to
use attributes.

     <p>Enabled at level <samp><span class="option">-O0</span></samp>.
When disabled explicitly, it also implies
<samp><span class="option">-fno-section-anchors</span></samp>, which is otherwise enabled at <samp><span class="option">-O0</span></samp> on some
targets.

     <br><dt><code>-fweb</code><dd><a name="index-fweb-836"></a>Construct webs as commonly used for register allocation purposes and assign
each web an individual pseudo register. This allows the register allocation pass
to operate on pseudos directly, but also strengthens several other optimization
passes, such as CSE, loop optimizer and trivial dead code remover. It can,
however, make debugging impossible, since variables will no longer stay in a
“home register”.

     <p>Enabled by default with <samp><span class="option">-funroll-loops</span></samp>.

     <br><dt><code>-fwhole-program</code><dd><a name="index-fwhole_002dprogram-837"></a>Assume that the current compilation unit represents the whole program being
compiled. All public functions and variables with the exception of <code>main</code>
and those merged by attribute <code>externally_visible</code> become static functions
and in effect are optimized more aggressively by interprocedural optimizers. If <samp><span class="command">gold</span></samp> is used as the linker plugin, <code>externally_visible</code> attributes are automatically added to functions (not variables yet, due to a current <samp><span class="command">gold</span></samp> issue) that are accessed outside of LTO objects according to the resolution file produced by <samp><span class="command">gold</span></samp>. For other linkers that cannot generate a resolution file, explicit <code>externally_visible</code> attributes are still necessary.
While this option is equivalent to proper use of the <code>static</code> keyword for
programs consisting of a single file, in combination with option
<samp><span class="option">-flto</span></samp> this flag can be used to
compile many smaller scale programs since the functions and variables become
local for the whole combined compilation unit, not for the single source file
itself.

     <p>This option implies <samp><span class="option">-fwhole-file</span></samp> for Fortran programs.

     <br><dt><code>-flto[=</code><var>n</var><code>]</code><dd><a name="index-flto-838"></a>This option runs the standard link-time optimizer. When invoked
with source code, it generates GIMPLE (one of GCC's internal
representations) and writes it to special ELF sections in the object
file. When the object files are linked together, all the function
bodies are read from these ELF sections and instantiated as if they
had been part of the same translation unit.

     <p>To use the link-time optimizer, <samp><span class="option">-flto</span></samp> needs to be specified at
compile time and during the final link. For example,

     <pre class="smallexample">          gcc -c -O2 -flto foo.c
          gcc -c -O2 -flto bar.c
          gcc -o myprog -flto -O2 foo.o bar.o
</pre>
     <p>The first two invocations of GCC will save a bytecode representation
of GIMPLE into special ELF sections inside <samp><span class="file">foo.o</span></samp> and
<samp><span class="file">bar.o</span></samp>. The final invocation will read the GIMPLE bytecode from
<samp><span class="file">foo.o</span></samp> and <samp><span class="file">bar.o</span></samp>, merge the two files into a single
internal image, and compile the result as usual.
Since both
<samp><span class="file">foo.o</span></samp> and <samp><span class="file">bar.o</span></samp> are merged into a single image, this
causes all the inter-procedural analyses and optimizations in GCC to
work across the two files as if they were a single one. This means,
for example, that the inliner will be able to inline functions in
<samp><span class="file">bar.o</span></samp> into functions in <samp><span class="file">foo.o</span></samp> and vice-versa.

     <p>Another (simpler) way to enable link-time optimization is:

     <pre class="smallexample">          gcc -o myprog -flto -O2 foo.c bar.c
</pre>
     <p>The above will generate bytecode for <samp><span class="file">foo.c</span></samp> and <samp><span class="file">bar.c</span></samp>,
merge them together into a single GIMPLE representation and optimize
them as usual to produce <samp><span class="file">myprog</span></samp>.

     <p>The only important thing to keep in mind is that to enable link-time
optimizations the <samp><span class="option">-flto</span></samp> flag needs to be passed to both the
compile and the link commands.

     <p>To make whole program optimization effective, it is necessary to make
certain whole program assumptions. The compiler needs to know
what functions and variables can be accessed by libraries and runtime
outside of the link time optimized unit. When supported by the linker,
the linker plugin (see <samp><span class="option">-fuse-linker-plugin</span></samp>) passes to the
compiler information about used and externally visible symbols. When
the linker plugin is not available, <samp><span class="option">-fwhole-program</span></samp> should be
used to allow the compiler to make these assumptions, which will lead
to more aggressive optimization decisions.
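     <p>When <samp><span class="option">-fwhole-program</span></samp> is used without the linker plugin, a symbol
that code outside the optimized unit still needs can be marked with the
<code>externally_visible</code> attribute described under
<samp><span class="option">-fwhole-program</span></samp>. The following is a minimal sketch under that
assumption; the function names <code>helper</code> and <code>plugin_entry</code> are
hypothetical and stand for an internal function and an entry point that
a library is assumed to call.

```c
/* Used only inside the program.  With -fwhole-program GCC may treat
   it as if it had been declared static, and inline or remove it.  */
int helper(int x)
{
    return x * 2 + 1;
}

/* Hypothetical entry point that code outside the link-time optimized
   unit is assumed to call.  The attribute asks GCC to keep the symbol
   visible even under -fwhole-program.  */
__attribute__((externally_visible))
int plugin_entry(int x)
{
    return helper(x) + helper(x + 1);
}
```

     <p>The attribute does not change what the functions compute; it only
affects which symbols the compiler is allowed to localize.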

     <p>Note that when a file is compiled with <samp><span class="option">-flto</span></samp>, the generated
object file will be larger than a regular object file because it will
contain GIMPLE bytecodes and the usual final code. This means that
object files with LTO information can be linked as a normal object
file. So, in the previous example, if the final link is done with

     <pre class="smallexample">          gcc -o myprog foo.o bar.o
</pre>
     <p>the only difference will be that no inter-procedural optimizations
will be applied to produce <samp><span class="file">myprog</span></samp>. The two object files
<samp><span class="file">foo.o</span></samp> and <samp><span class="file">bar.o</span></samp> will simply be sent to the regular
linker.

     <p>Additionally, the optimization flags used to compile individual files
are not necessarily related to those used at link-time. For instance,

     <pre class="smallexample">          gcc -c -O0 -flto foo.c
          gcc -c -O0 -flto bar.c
          gcc -o myprog -flto -O3 foo.o bar.o
</pre>
     <p>This will produce individual object files with unoptimized assembler
code, but the resulting binary <samp><span class="file">myprog</span></samp> will be optimized at
<samp><span class="option">-O3</span></samp>. Now, if the final binary is generated without
<samp><span class="option">-flto</span></samp>, then <samp><span class="file">myprog</span></samp> will not be optimized.

     <p>When producing the final binary with <samp><span class="option">-flto</span></samp>, GCC will only
apply link-time optimizations to those files that contain bytecode.
Therefore, you can mix and match object files and libraries with
GIMPLE bytecodes and final object code. GCC will automatically select
which files to optimize in LTO mode and which files to link without
further processing.

     <p>There are some code generation flags that GCC will preserve when
generating bytecodes, as they need to be used during the final link
stage. Currently, the following options are saved into the GIMPLE
bytecode files: <samp><span class="option">-fPIC</span></samp>, <samp><span class="option">-fcommon</span></samp> and all the
<samp><span class="option">-m</span></samp> target flags.

     <p>At link time, these options are read in and reapplied. Note that the
current implementation makes no attempt at recognizing conflicting
values for these options. If two or more files have a conflicting
value (e.g., one file is compiled with <samp><span class="option">-fPIC</span></samp> and another
isn't), the compiler will simply use the last value read from the
bytecode files. It is recommended, then, that all the files
participating in the same link be compiled with the same options.

     <p>Another feature of LTO is that it is possible to apply interprocedural
optimizations on files written in different languages. This requires
some support in the language front end. Currently, the C, C++ and
Fortran front ends are capable of emitting GIMPLE bytecodes, so
something like this should work:

     <pre class="smallexample">          gcc -c -flto foo.c
          g++ -c -flto bar.cc
          gfortran -c -flto baz.f90
          g++ -o myprog -flto -O3 foo.o bar.o baz.o -lgfortran
</pre>
     <p>Notice that the final link is done with <samp><span class="command">g++</span></samp> to get the C++
runtime libraries and <samp><span class="option">-lgfortran</span></samp> is added to get the Fortran
runtime libraries. In general, when mixing languages in LTO mode, you
should use the same link command used when mixing languages in a
regular (non-LTO) compilation.
This means that if your build process
was mixing languages before, all you need to add is <samp><span class="option">-flto</span></samp> to
all the compile and link commands.

     <p>If LTO encounters objects with C linkage declared with incompatible
types in separate translation units to be linked together (undefined
behavior according to ISO C99 6.2.7), a non-fatal diagnostic may be
issued. The behavior is still undefined at runtime.

     <p>If object files containing GIMPLE bytecode are stored in a library archive, say
<samp><span class="file">libfoo.a</span></samp>, it is possible to extract and use them in an LTO link if you
are using a linker with linker plugin support. To enable this feature, use
the flag <samp><span class="option">-fuse-linker-plugin</span></samp> at link-time:

     <pre class="smallexample">          gcc -o myprog -O2 -flto -fuse-linker-plugin a.o b.o -lfoo
</pre>
     <p>With the linker plugin enabled, the linker will extract the needed
GIMPLE files from <samp><span class="file">libfoo.a</span></samp> and pass them on to the running GCC
to make them part of the aggregated GIMPLE image to be optimized.

     <p>If you are not using a linker with linker plugin support and/or do not
enable the linker plugin, then the objects inside <samp><span class="file">libfoo.a</span></samp>
will be extracted and linked as usual, but they will not participate
in the LTO optimization process.

     <p>Link time optimizations do not require the presence of the whole program to
operate. If the program does not require any symbols to be exported, it is
possible to combine <samp><span class="option">-flto</span></samp> with <samp><span class="option">-fwhole-program</span></samp> to allow
the interprocedural optimizers to use more aggressive assumptions which may
lead to improved optimization opportunities.
Use of <samp><span class="option">-fwhole-program</span></samp> is not needed when the linker plugin is
active (see <samp><span class="option">-fuse-linker-plugin</span></samp>).

     <p>Regarding portability: the current implementation of LTO makes no
attempt at generating bytecode that can be ported between different
types of hosts. The bytecode files are versioned and there is a
strict version check, so bytecode files generated in one version of
GCC will not work with an older/newer version of GCC.

     <p>Link time optimization does not play well with generating debugging
information. Combining <samp><span class="option">-flto</span></samp> with
<samp><span class="option">-g</span></samp> is currently experimental and expected to produce wrong
results.

     <p>If you specify the optional <var>n</var>, the optimization and code
generation done at link time is executed in parallel using <var>n</var>
parallel jobs by utilizing an installed <samp><span class="command">make</span></samp> program. The
environment variable <samp><span class="env">MAKE</span></samp> may be used to override the program
used. The default value for <var>n</var> is 1.

     <p>You can also specify <samp><span class="option">-flto=jobserver</span></samp> to use GNU make's
job server mode to determine the number of parallel jobs. This
is useful when the Makefile calling GCC is already executing in parallel.
The parent Makefile will need a ‘<samp><span class="samp">+</span></samp>’ prepended to the command recipe
for this to work. This will likely only work if <samp><span class="env">MAKE</span></samp> is
GNU make.

     <p>This option is disabled by default.

     <br><dt><code>-flto-partition=</code><var>alg</var><dd><a name="index-flto_002dpartition-839"></a>Specify the partitioning algorithm used by the link time optimizer.
The value is either <code>1to1</code> to specify a partitioning mirroring
the original source files or <code>balanced</code> to specify partitioning
into equally sized chunks (whenever possible). Specifying <code>none</code>
as an algorithm disables partitioning and streaming completely. The
default value is <code>balanced</code>.

     <br><dt><code>-flto-compression-level=</code><var>n</var><dd>This option specifies the level of compression used for intermediate
language written to LTO object files, and is only meaningful in
conjunction with LTO mode (<samp><span class="option">-flto</span></samp>). Valid
values are 0 (no compression) to 9 (maximum compression). Values
outside this range are clamped to either 0 or 9. If the option is not
given, a default balanced compression setting is used.

     <br><dt><code>-flto-report</code><dd>Prints a report with internal details on the workings of the link-time
optimizer. The contents of this report vary from version to version;
it is meant to be useful to GCC developers when processing object
files in LTO mode (via <samp><span class="option">-flto</span></samp>).

     <p>Disabled by default.

     <br><dt><code>-fuse-linker-plugin</code><dd>Enables the use of a linker plugin during link time optimization. This option
relies on linker plugin support, which is available in gold
and in GNU ld 2.21 or newer.

     <p>This option enables the extraction of object files with GIMPLE bytecode out of
library archives. This improves the quality of optimization by exposing more
code to the link time optimizer. The plugin also passes information that specifies
which symbols can be accessed externally (by non-LTO objects or during dynamic linking).
The resulting code quality improvements on binaries (and shared libraries that do
use hidden visibility) are similar to those of <code>-fwhole-program</code>.
See
<samp><span class="option">-flto</span></samp> for a description of the effect of this flag and how to use it.

     <p>Enabled by default when LTO support in GCC is enabled and GCC was compiled
with a linker supporting plugins (GNU ld 2.21 or newer or gold).

     <br><dt><code>-fcompare-elim</code><dd><a name="index-fcompare_002delim-840"></a>After register allocation and post-register allocation instruction splitting,
identify arithmetic instructions that compute processor flags similar to a
comparison operation based on that arithmetic. If possible, eliminate the
explicit comparison operation.

     <p>This pass only applies to certain targets that cannot explicitly represent
the comparison operation before register allocation is complete.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fcprop-registers</code><dd><a name="index-fcprop_002dregisters-841"></a>After register allocation and post-register allocation instruction splitting,
we perform a copy-propagation pass to try to reduce scheduling dependencies
and occasionally eliminate the copy.

     <p>Enabled at levels <samp><span class="option">-O</span></samp>, <samp><span class="option">-O2</span></samp>, <samp><span class="option">-O3</span></samp>, <samp><span class="option">-Os</span></samp>.

     <br><dt><code>-fprofile-correction</code><dd><a name="index-fprofile_002dcorrection-842"></a>Profiles collected using an instrumented binary for multi-threaded programs may
be inconsistent due to missed counter updates. When this option is specified,
GCC will use heuristics to correct or smooth out such inconsistencies. By
default, GCC will emit an error message when an inconsistent profile is detected.

     <br><dt><code>-fprofile-dir=</code><var>path</var><dd><a name="index-fprofile_002ddir-843"></a>
Set the directory to search for the profile data files to <var>path</var>.
This option affects only the profile data generated by
<samp><span class="option">-fprofile-generate</span></samp>, <samp><span class="option">-ftest-coverage</span></samp>, <samp><span class="option">-fprofile-arcs</span></samp>
and used by <samp><span class="option">-fprofile-use</span></samp> and <samp><span class="option">-fbranch-probabilities</span></samp>
and its related options.
By default, GCC will use the current directory as <var>path</var>, thus the
profile data file will appear in the same directory as the object file.

     <br><dt><code>-fprofile-generate</code><dt><code>-fprofile-generate=</code><var>path</var><dd><a name="index-fprofile_002dgenerate-844"></a>
Enable options usually used for instrumenting an application to produce
a profile useful for later recompilation with profile feedback based
optimization. You must use <samp><span class="option">-fprofile-generate</span></samp> both when
compiling and when linking your program.

     <p>The following options are enabled: <code>-fprofile-arcs</code>, <code>-fprofile-values</code>, <code>-fvpt</code>.

     <p>If <var>path</var> is specified, GCC will look at the <var>path</var> to find
the profile feedback data files. See <samp><span class="option">-fprofile-dir</span></samp>.

     <br><dt><code>-fprofile-use</code><dt><code>-fprofile-use=</code><var>path</var><dd><a name="index-fprofile_002duse-845"></a>Enable profile feedback directed optimizations, and optimizations
generally profitable only with profile feedback available.

     <p>The following options are enabled: <code>-fbranch-probabilities</code>, <code>-fvpt</code>,
<code>-funroll-loops</code>, <code>-fpeel-loops</code>, <code>-ftracer</code>.

     <p>By default, GCC emits an error message if the feedback profiles do not
match the source code. This error can be turned into a warning by using
<samp><span class="option">-Wcoverage-mismatch</span></samp>. Note this may result in poorly optimized
code.

     <p>If <var>path</var> is specified, GCC will look at the <var>path</var> to find
the profile feedback data files. See <samp><span class="option">-fprofile-dir</span></samp>.
</dl>

     <p>The following options control compiler behavior regarding floating
point arithmetic. These options trade off between speed and
correctness. All must be specifically enabled.

     <dl>
<dt><code>-ffloat-store</code><dd><a name="index-ffloat_002dstore-846"></a>Do not store floating point variables in registers, and inhibit other
options that might change whether a floating point value is taken from a
register or memory.

     <p><a name="index-floating-point-precision-847"></a>This option prevents undesirable excess precision on machines such as
the 68000 where the floating registers (of the 68881) keep more
precision than a <code>double</code> is supposed to have. Similarly for the
x86 architecture. For most programs, the excess precision does only
good, but a few programs rely on the precise definition of IEEE floating
point. Use <samp><span class="option">-ffloat-store</span></samp> for such programs, after modifying
them to store all pertinent intermediate computations into variables.

     <br><dt><code>-fexcess-precision=</code><var>style</var><dd><a name="index-fexcess_002dprecision-848"></a>This option allows further control over excess precision on machines
where floating-point registers have more precision than the IEEE
<code>float</code> and <code>double</code> types and the processor does not
support operations rounding to those types. By default,
<samp><span class="option">-fexcess-precision=fast</span></samp> is in effect; this means that
operations are carried out in the precision of the registers and that
it is unpredictable when rounding to the types specified in the source
code takes place. When compiling C, if
<samp><span class="option">-fexcess-precision=standard</span></samp> is specified then excess
precision will follow the rules specified in ISO C99; in particular,
both casts and assignments cause values to be rounded to their
semantic types (whereas <samp><span class="option">-ffloat-store</span></samp> only affects
assignments). This option is enabled by default for C if a strict
conformance option such as <samp><span class="option">-std=c99</span></samp> is used.

     <p><a name="index-mfpmath-849"></a><samp><span class="option">-fexcess-precision=standard</span></samp> is not implemented for languages
other than C, and has no effect if
<samp><span class="option">-funsafe-math-optimizations</span></samp> or <samp><span class="option">-ffast-math</span></samp> is
specified. On the x86, it also has no effect if <samp><span class="option">-mfpmath=sse</span></samp>
or <samp><span class="option">-mfpmath=sse+387</span></samp> is specified; in the former case, IEEE
semantics apply without excess precision, and in the latter, rounding
is unpredictable.
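
     <p>The difference between casts and assignments mentioned above can be
sketched as follows. The function names are hypothetical. On targets
without excess precision both functions are expected to behave identically,
so this only illustrates where rounding to <code>double</code> is requested.

     <pre class="smallexample">          /* The product a * b may be held in a register with excess
             precision.  Under -fexcess-precision=standard the cast rounds
             it to double before the addition.  -ffloat-store does not
             affect casts.  */
          double mul_add_cast (double a, double b, double c)
          {
            return (double) (a * b) + c;
          }

          /* Here the product is assigned to a variable.  Both
             -fexcess-precision=standard and -ffloat-store round the value
             to double at the assignment.  */
          double mul_add_assign (double a, double b, double c)
          {
            double product = a * b;
            return product + c;
          }
</pre>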

     <br><dt><code>-ffast-math</code><dd><a name="index-ffast_002dmath-850"></a>Sets <samp><span class="option">-fno-math-errno</span></samp>, <samp><span class="option">-funsafe-math-optimizations</span></samp>,
<samp><span class="option">-ffinite-math-only</span></samp>, <samp><span class="option">-fno-rounding-math</span></samp>,
<samp><span class="option">-fno-signaling-nans</span></samp> and <samp><span class="option">-fcx-limited-range</span></samp>.

     <p>This option causes the preprocessor macro <code>__FAST_MATH__</code> to be defined.

     <p>This option is not turned on by any <samp><span class="option">-O</span></samp> option besides
<samp><span class="option">-Ofast</span></samp> since it can result in incorrect output for programs
which depend on an exact implementation of IEEE or ISO rules/specifications
for math functions. It may, however, yield faster code for programs
that do not require the guarantees of these specifications.

     <br><dt><code>-fno-math-errno</code><dd><a name="index-fno_002dmath_002derrno-851"></a>Do not set <code>errno</code> after calling math functions that are executed
with a single instruction, e.g., <code>sqrt</code>. A program that relies on
IEEE exceptions for math error handling may want to use this flag
for speed while maintaining IEEE arithmetic compatibility.

     <p>This option is not turned on by any <samp><span class="option">-O</span></samp> option since
it can result in incorrect output for programs which depend on
an exact implementation of IEEE or ISO rules/specifications for
math functions. It may, however, yield faster code for programs
that do not require the guarantees of these specifications.

     <p>The default is <samp><span class="option">-fmath-errno</span></samp>.

     <p>On Darwin systems, the math library never sets <code>errno</code>.
There is
therefore no reason for the compiler to consider the possibility that
it might, and <samp><span class="option">-fno-math-errno</span></samp> is the default.

     <br><dt><code>-funsafe-math-optimizations</code><dd><a name="index-funsafe_002dmath_002doptimizations-852"></a>
Allow optimizations for floating-point arithmetic that (a) assume
that arguments and results are valid and (b) may violate IEEE or
ANSI standards. When used at link-time, it may include libraries
or startup files that change the default FPU control word or other
similar optimizations.

     <p>This option is not turned on by any <samp><span class="option">-O</span></samp> option since
it can result in incorrect output for programs which depend on
an exact implementation of IEEE or ISO rules/specifications for
math functions. It may, however, yield faster code for programs
that do not require the guarantees of these specifications.
Enables <samp><span class="option">-fno-signed-zeros</span></samp>, <samp><span class="option">-fno-trapping-math</span></samp>,
<samp><span class="option">-fassociative-math</span></samp> and <samp><span class="option">-freciprocal-math</span></samp>.

     <p>The default is <samp><span class="option">-fno-unsafe-math-optimizations</span></samp>.

     <br><dt><code>-fassociative-math</code><dd><a name="index-fassociative_002dmath-853"></a>
Allow re-association of operands in series of floating-point operations.
This violates the ISO C and C++ language standard by possibly changing
computation result. NOTE: re-ordering may change the sign of zero as
well as ignore NaNs and inhibit or create underflow or overflow (and
thus cannot be used on code which relies on rounding behavior like
<code>(x + 2**52) - 2**52</code>). May also reorder floating-point comparisons
and thus may not be used when ordered comparisons are required.
This option requires that both <samp><span class="option">-fno-signed-zeros</span></samp> and
<samp><span class="option">-fno-trapping-math</span></samp> be in effect. Moreover, it doesn't make
much sense with <samp><span class="option">-frounding-math</span></samp>. For Fortran the option
is automatically enabled when both <samp><span class="option">-fno-signed-zeros</span></samp> and
<samp><span class="option">-fno-trapping-math</span></samp> are in effect.

     <p>The default is <samp><span class="option">-fno-associative-math</span></samp>.

     <br><dt><code>-freciprocal-math</code><dd><a name="index-freciprocal_002dmath-854"></a>
Allow the reciprocal of a value to be used instead of dividing by
the value if this enables optimizations. For example <code>x / y</code>
can be replaced with <code>x * (1/y)</code> which is useful if <code>(1/y)</code>
is subject to common subexpression elimination. Note that this loses
precision and increases the number of flops operating on the value.

     <p>The default is <samp><span class="option">-fno-reciprocal-math</span></samp>.

     <br><dt><code>-ffinite-math-only</code><dd><a name="index-ffinite_002dmath_002donly-855"></a>Allow optimizations for floating-point arithmetic that assume
that arguments and results are not NaNs or +-Infs.

     <p>This option is not turned on by any <samp><span class="option">-O</span></samp> option since
it can result in incorrect output for programs which depend on
an exact implementation of IEEE or ISO rules/specifications for
math functions. It may, however, yield faster code for programs
that do not require the guarantees of these specifications.

     <p>The default is <samp><span class="option">-fno-finite-math-only</span></samp>.
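
     <p>A minimal sketch of code that depends on NaN handling, and which
therefore should not be compiled with <samp><span class="option">-ffinite-math-only</span></samp>,
is shown below. The helper names are hypothetical. The self-comparison
relies on the IEEE rule that a NaN compares unequal to itself; when the
compiler is allowed to assume that no NaNs occur, it may fold the test to
a constant.

     <pre class="smallexample">          /* Returns 1 if x is a NaN, 0 otherwise.  */
          int is_nan (double x)
          {
            return x != x;
          }

          /* Returns fallback when value is a NaN, otherwise value.  */
          double value_or (double value, double fallback)
          {
            return is_nan (value) ? fallback : value;
          }
</pre>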

     <br><dt><code>-fno-signed-zeros</code><dd><a name="index-fno_002dsigned_002dzeros-856"></a>Allow optimizations for floating point arithmetic that ignore the
signedness of zero. IEEE arithmetic specifies the behavior of
distinct +0.0 and −0.0 values, which then prohibits simplification
of expressions such as x+0.0 or 0.0*x (even with <samp><span class="option">-ffinite-math-only</span></samp>).
This option implies that the sign of a zero result isn't significant.

     <p>The default is <samp><span class="option">-fsigned-zeros</span></samp>.

     <br><dt><code>-fno-trapping-math</code><dd><a name="index-fno_002dtrapping_002dmath-857"></a>Compile code assuming that floating-point operations cannot generate
user-visible traps. These traps include division by zero, overflow,
underflow, inexact result and invalid operation. This option requires
that <samp><span class="option">-fno-signaling-nans</span></samp> be in effect. Setting this option may
allow faster code if one relies on “non-stop” IEEE arithmetic, for example.

     <p>This option should never be turned on by any <samp><span class="option">-O</span></samp> option since
it can result in incorrect output for programs which depend on
an exact implementation of IEEE or ISO rules/specifications for
math functions.

     <p>The default is <samp><span class="option">-ftrapping-math</span></samp>.

     <br><dt><code>-frounding-math</code><dd><a name="index-frounding_002dmath-858"></a>Disable transformations and optimizations that assume default floating
point rounding behavior. This is round-to-zero for all floating point
to integer conversions, and round-to-nearest for all other arithmetic
truncations. This option should be specified for programs that change
the FP rounding mode dynamically, or that may be executed with a
non-default rounding mode.
This option disables constant folding of
floating point expressions at compile-time (which may be affected by
rounding mode) and arithmetic transformations that are unsafe in the
presence of sign-dependent rounding modes.

     <p>The default is <samp><span class="option">-fno-rounding-math</span></samp>.

     <p>This option is experimental and does not currently guarantee to
disable all GCC optimizations that are affected by rounding mode.
Future versions of GCC may provide finer control of this setting
using C99's <code>FENV_ACCESS</code> pragma. This command line option
will be used to specify the default state for <code>FENV_ACCESS</code>.

     <br><dt><code>-fsignaling-nans</code><dd><a name="index-fsignaling_002dnans-859"></a>Compile code assuming that IEEE signaling NaNs may generate user-visible
traps during floating-point operations. Setting this option disables
optimizations that may change the number of exceptions visible with
signaling NaNs. This option implies <samp><span class="option">-ftrapping-math</span></samp>.

     <p>This option causes the preprocessor macro <code>__SUPPORT_SNAN__</code> to
be defined.

     <p>The default is <samp><span class="option">-fno-signaling-nans</span></samp>.

     <p>This option is experimental and does not currently guarantee to
disable all GCC optimizations that affect signaling NaN behavior.

     <br><dt><code>-fsingle-precision-constant</code><dd><a name="index-fsingle_002dprecision_002dconstant-860"></a>Treat floating point constants as single precision constants instead of
implicitly converting them to double precision constants.

     <br><dt><code>-fcx-limited-range</code><dd><a name="index-fcx_002dlimited_002drange-861"></a>When enabled, this option states that a range reduction step is not
needed when performing complex division.
Also, there is no checking
whether the result of a complex multiplication or division is <code>NaN
+ I*NaN</code>, with an attempt to rescue the situation in that case. The
default is <samp><span class="option">-fno-cx-limited-range</span></samp>, but the option is enabled by
<samp><span class="option">-ffast-math</span></samp>.

     <p>This option controls the default setting of the ISO C99
<code>CX_LIMITED_RANGE</code> pragma. Nevertheless, the option applies to
all languages.

     <br><dt><code>-fcx-fortran-rules</code><dd><a name="index-fcx_002dfortran_002drules-862"></a>Complex multiplication and division follow Fortran rules. Range
reduction is done as part of complex division, but there is no checking
whether the result of a complex multiplication or division is <code>NaN
+ I*NaN</code>, with an attempt to rescue the situation in that case.

     <p>The default is <samp><span class="option">-fno-cx-fortran-rules</span></samp>.

     </dl>

     <p>The following options control optimizations that may improve
performance, but are not enabled by any <samp><span class="option">-O</span></samp> options. This
section includes experimental options that may produce broken code.

     <dl>
<dt><code>-fbranch-probabilities</code><dd><a name="index-fbranch_002dprobabilities-863"></a>After running a program compiled with <samp><span class="option">-fprofile-arcs</span></samp>
(see <a href="Debugging-Options.html#Debugging-Options">Options for Debugging Your Program or <samp><span class="command">gcc</span></samp></a>), you can compile it a second time using
<samp><span class="option">-fbranch-probabilities</span></samp>, to improve optimizations based on
the number of times each branch was taken.
When the program
compiled with <samp><span class="option">-fprofile-arcs</span></samp> exits it saves arc execution
counts to a file called <samp><var>sourcename</var><span class="file">.gcda</span></samp> for each source
file. The information in this data file is very dependent on the
structure of the generated code, so you must use the same source code
and the same optimization options for both compilations.

     <p>With <samp><span class="option">-fbranch-probabilities</span></samp>, GCC puts a
‘<samp><span class="samp">REG_BR_PROB</span></samp>’ note on each ‘<samp><span class="samp">JUMP_INSN</span></samp>’ and ‘<samp><span class="samp">CALL_INSN</span></samp>’.
These can be used to improve optimization. Currently, they are only
used in one place: in <samp><span class="file">reorg.c</span></samp>, instead of guessing which path a
branch is most likely to take, the ‘<samp><span class="samp">REG_BR_PROB</span></samp>’ values are used to
exactly determine which path is taken more often.

     <br><dt><code>-fprofile-values</code><dd><a name="index-fprofile_002dvalues-864"></a>If combined with <samp><span class="option">-fprofile-arcs</span></samp>, it adds code so that some
data about values of expressions in the program is gathered.

     <p>With <samp><span class="option">-fbranch-probabilities</span></samp>, it reads back the data gathered
from profiling values of expressions for usage in optimizations.

     <p>Enabled with <samp><span class="option">-fprofile-generate</span></samp> and <samp><span class="option">-fprofile-use</span></samp>.

     <br><dt><code>-fvpt</code><dd><a name="index-fvpt-865"></a>If combined with <samp><span class="option">-fprofile-arcs</span></samp>, it instructs the compiler to add
code to gather information about values of expressions.

     <p>With <samp><span class="option">-fbranch-probabilities</span></samp>, it reads back the data gathered
and actually performs the optimizations based on them.
Currently the optimizations include specialization of division operations
using knowledge of the value of the denominator.

     <br><dt><code>-frename-registers</code><dd><a name="index-frename_002dregisters-866"></a>Attempt to avoid false dependencies in scheduled code by making use
of registers left over after register allocation. This optimization
will most benefit processors with lots of registers. Depending on the
debug information format adopted by the target, however, it can
make debugging impossible, since variables will no longer stay in
a “home register”.

     <p>Enabled by default with <samp><span class="option">-funroll-loops</span></samp> and <samp><span class="option">-fpeel-loops</span></samp>.

     <br><dt><code>-ftracer</code><dd><a name="index-ftracer-867"></a>Perform tail duplication to enlarge superblock size. This transformation
simplifies the control flow of the function, allowing other optimizations to do
a better job.

     <p>Enabled with <samp><span class="option">-fprofile-use</span></samp>.

     <br><dt><code>-funroll-loops</code><dd><a name="index-funroll_002dloops-868"></a>Unroll loops whose number of iterations can be determined at compile time or
upon entry to the loop. <samp><span class="option">-funroll-loops</span></samp> implies
<samp><span class="option">-frerun-cse-after-loop</span></samp>, <samp><span class="option">-fweb</span></samp> and <samp><span class="option">-frename-registers</span></samp>.
It also turns on complete loop peeling (i.e. complete removal of loops with
small constant number of iterations). This option makes code larger, and may
or may not make it run faster.

     <p>Enabled with <samp><span class="option">-fprofile-use</span></samp>.
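
     <p>To illustrate what complete peeling means for a loop with a small
constant number of iterations, the sketch below pairs such a loop with a
hand-written equivalent that has no loop. Both function names are
hypothetical, and the second function only approximates what the compiler
might generate.

     <pre class="smallexample">          /* The trip count (4) is known at compile time.  */
          int dot4 (const int *a, const int *b)
          {
            int sum = 0;
            int i;
            for (i = 0; i != 4; i++)
              sum += a[i] * b[i];
            return sum;
          }

          /* Hand-written equivalent with the loop removed entirely.  */
          int dot4_peeled (const int *a, const int *b)
          {
            return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
          }
</pre>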

     <br><dt><code>-funroll-all-loops</code><dd><a name="index-funroll_002dall_002dloops-869"></a>Unroll all loops, even if their number of iterations is uncertain when
the loop is entered. This usually makes programs run more slowly.
<samp><span class="option">-funroll-all-loops</span></samp> implies the same options as
<samp><span class="option">-funroll-loops</span></samp>.

     <br><dt><code>-fpeel-loops</code><dd><a name="index-fpeel_002dloops-870"></a>Peels loops for which there is enough information that they do not
roll much (from profile feedback). It also turns on complete loop peeling
(i.e. complete removal of loops with small constant number of iterations).

     <p>Enabled with <samp><span class="option">-fprofile-use</span></samp>.

     <br><dt><code>-fmove-loop-invariants</code><dd><a name="index-fmove_002dloop_002dinvariants-871"></a>Enables the loop invariant motion pass in the RTL loop optimizer. Enabled
at level <samp><span class="option">-O1</span></samp>.

     <br><dt><code>-funswitch-loops</code><dd><a name="index-funswitch_002dloops-872"></a>Move branches with loop invariant conditions out of the loop, with duplicates
of the loop on both branches (modified according to result of the condition).

     <br><dt><code>-ffunction-sections</code><dt><code>-fdata-sections</code><dd><a name="index-ffunction_002dsections-873"></a><a name="index-fdata_002dsections-874"></a>Place each function or data item into its own section in the output
file if the target supports arbitrary sections. The name of the
function or the name of the data item determines the section's name
in the output file.

     <p>Use these options on systems where the linker can perform optimizations
to improve locality of reference in the instruction space. Most systems
using the ELF object format and SPARC processors running Solaris 2 have
linkers with such optimizations.
AIX may have these optimizations in
the future.

     <p>Only use these options when there are significant benefits from doing
so. When you specify these options, the assembler and linker will
create larger object and executable files and will also be slower.
You will not be able to use <code>gprof</code> on all systems if you
specify these options, and you may have problems with debugging if
you specify both these options and <samp><span class="option">-g</span></samp>.

     <br><dt><code>-fbranch-target-load-optimize</code><dd><a name="index-fbranch_002dtarget_002dload_002doptimize-875"></a>Perform branch target register load optimization before prologue / epilogue
threading.
The use of target registers can typically be exposed only during reload,
thus hoisting loads out of loops and doing inter-block scheduling needs
a separate optimization pass.

     <br><dt><code>-fbranch-target-load-optimize2</code><dd><a name="index-fbranch_002dtarget_002dload_002doptimize2-876"></a>Perform branch target register load optimization after prologue / epilogue
threading.

     <br><dt><code>-fbtr-bb-exclusive</code><dd><a name="index-fbtr_002dbb_002dexclusive-877"></a>When performing branch target register load optimization, don't reuse
branch target registers within any basic block.

     <br><dt><code>-fstack-protector</code><dd><a name="index-fstack_002dprotector-878"></a>Emit extra code to check for buffer overflows, such as stack smashing
attacks. This is done by adding a guard variable to functions with
vulnerable objects. This includes functions that call <code>alloca</code>, and
functions with buffers larger than 8 bytes. The guards are initialized
when a function is entered and then checked when the function exits.
If a guard check fails, an error message is printed and the program exits.
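     <p>To make the rule above concrete, the sketch below contrasts a function that holds a local buffer larger than 8 bytes with one that holds only scalars. The names <code>greet</code> and <code>add</code> are hypothetical, and the comments only restate the criteria given in the text; they do not claim to show what a particular GCC build emits.

```c
#include <stdio.h>
#include <string.h>

/* Holds a 16-byte local array, so by the rule stated above (buffers
   larger than 8 bytes) it is the kind of function that receives a
   guard variable. Both copies are bounded, so buf is never overrun. */
size_t greet(const char *name, char *out, size_t outlen)
{
    char buf[16];
    snprintf(buf, sizeof buf, "hi %s", name);  /* truncates to 15 chars + NUL */
    snprintf(out, outlen, "%s", buf);
    return strlen(buf);
}

/* Holds only scalar locals, so it does not match the criteria above;
   it would be covered only when every function is protected. */
int add(int a, int b)
{
    int t = a + b;
    return t;
}
```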
1956 1957 <br><dt><code>-fstack-protector-all</code><dd><a name="index-fstack_002dprotector_002dall-879"></a>Like <samp><span class="option">-fstack-protector</span></samp> except that all functions are protected. 1958 1959 <br><dt><code>-fsection-anchors</code><dd><a name="index-fsection_002danchors-880"></a>Try to reduce the number of symbolic address calculations by using 1960shared “anchor” symbols to address nearby objects. This transformation 1961can help to reduce the number of GOT entries and GOT accesses on some 1962targets. 1963 1964 <p>For example, the implementation of the following function <code>foo</code>: 1965 1966 <pre class="smallexample"> static int a, b, c; 1967 int foo (void) { return a + b + c; } 1968</pre> 1969 <p>would usually calculate the addresses of all three variables, but if you 1970compile it with <samp><span class="option">-fsection-anchors</span></samp>, it will access the variables 1971from a common anchor point instead. The effect is similar to the 1972following pseudocode (which isn't valid C): 1973 1974 <pre class="smallexample"> int foo (void) 1975 { 1976 register int *xr = &x; 1977 return xr[&a - &x] + xr[&b - &x] + xr[&c - &x]; 1978 } 1979</pre> 1980 <p>Not all targets support this option. 1981 1982 <br><dt><code>-fremove-local-statics</code><dd><a name="index-fremove_002dlocal_002dstatics-881"></a>Converts function-local static variables to automatic variables when it 1983is safe to do so. This transformation can reduce the number of 1984instructions executed due to automatic variables being cheaper to 1985read/write than static variables. 1986 1987 <br><dt><code>-fpromote-loop-indices</code><dd><a name="index-fpromote_002dloop_002dindices-882"></a>Converts loop indices that have a type shorter than the word size to 1988word-sized quantities. This transformation can reduce the overhead 1989associated with sign/zero-extension and truncation of such variables. 
Using <samp><span class="option">-funsafe-loop-optimizations</span></samp> with this option may result
in more effective optimization.

     <br><dt><code>--param </code><var>name</var><code>=</code><var>value</var><dd><a name="index-param-883"></a>In some places, GCC uses various constants to control the amount of
optimization that is done. For example, GCC will not inline functions
that contain more than a certain number of instructions. You can
control some of these constants on the command line using the
<samp><span class="option">--param</span></samp> option.

     <p>The names of specific parameters, and the meaning of the values, are
tied to the internals of the compiler, and are subject to change
without notice in future releases.

     <p>In each case, the <var>value</var> is an integer. The allowable choices for
<var>name</var> are given in the following table:

          <dl>
<dt><code>struct-reorg-cold-struct-ratio</code><dd>The threshold ratio (as a percentage) between a structure frequency
and the frequency of the hottest structure in the program. This parameter
is used by the struct-reorg optimization enabled by <samp><span class="option">-fipa-struct-reorg</span></samp>.
If the ratio of a structure's frequency, as calculated by profiling,
to the frequency of the hottest structure in the program is less than this
parameter, structure reorganization is not applied to that structure.
The default is 10.

          <br><dt><code>predictable-branch-outcome</code><dd>When a branch is predicted to be taken with probability lower than this threshold
(in percent), it is considered well predictable. The default is 10.

          <br><dt><code>max-crossjump-edges</code><dd>The maximum number of incoming edges to consider for crossjumping.
The algorithm used by <samp><span class="option">-fcrossjumping</span></samp> is O(N^2) in
the number of edges incoming to each block.
Increasing values mean 2021more aggressive optimization, making the compile time increase with 2022probably small improvement in executable size. 2023 2024 <br><dt><code>min-crossjump-insns</code><dd>The minimum number of instructions which must be matched at the end 2025of two blocks before crossjumping will be performed on them. This 2026value is ignored in the case where all instructions in the block being 2027crossjumped from are matched. The default value is 5. 2028 2029 <br><dt><code>max-grow-copy-bb-insns</code><dd>The maximum code size expansion factor when copying basic blocks 2030instead of jumping. The expansion is relative to a jump instruction. 2031The default value is 8. 2032 2033 <br><dt><code>max-goto-duplication-insns</code><dd>The maximum number of instructions to duplicate to a block that jumps 2034to a computed goto. To avoid O(N^2) behavior in a number of 2035passes, GCC factors computed gotos early in the compilation process, 2036and unfactors them as late as possible. Only computed jumps at the 2037end of a basic blocks with no more than max-goto-duplication-insns are 2038unfactored. The default value is 8. 2039 2040 <br><dt><code>max-delay-slot-insn-search</code><dd>The maximum number of instructions to consider when looking for an 2041instruction to fill a delay slot. If more than this arbitrary number of 2042instructions is searched, the time savings from filling the delay slot 2043will be minimal so stop searching. Increasing values mean more 2044aggressive optimization, making the compile time increase with probably 2045small improvement in executable run time. 2046 2047 <br><dt><code>max-delay-slot-live-search</code><dd>When trying to fill delay slots, the maximum number of instructions to 2048consider when searching for a block with valid live register 2049information. Increasing this arbitrarily chosen value means more 2050aggressive optimization, increasing the compile time. 
This parameter 2051should be removed when the delay slot code is rewritten to maintain the 2052control-flow graph. 2053 2054 <br><dt><code>max-gcse-memory</code><dd>The approximate maximum amount of memory that will be allocated in 2055order to perform the global common subexpression elimination 2056optimization. If more memory than specified is required, the 2057optimization will not be done. 2058 2059 <br><dt><code>max-gcse-insertion-ratio</code><dd>If the ratio of expression insertions to deletions is larger than this value 2060for any expression, then RTL PRE will insert or remove the expression and thus 2061leave partially redundant computations in the instruction stream. The default value is 20. 2062 2063 <br><dt><code>max-pending-list-length</code><dd>The maximum number of pending dependencies scheduling will allow 2064before flushing the current state and starting over. Large functions 2065with few branches or calls can create excessively large lists which 2066needlessly consume memory and resources. 2067 2068 <br><dt><code>max-inline-insns-single</code><dd>Several parameters control the tree inliner used in gcc. 2069This number sets the maximum number of instructions (counted in GCC's 2070internal representation) in a single function that the tree inliner 2071will consider for inlining. This only affects functions declared 2072inline and methods implemented in a class declaration (C++). 2073The default value is 400. 2074 2075 <br><dt><code>max-inline-insns-auto</code><dd>When you use <samp><span class="option">-finline-functions</span></samp> (included in <samp><span class="option">-O3</span></samp>), 2076a lot of functions that would otherwise not be considered for inlining 2077by the compiler will be investigated. To those functions, a different 2078(more restrictive) limit compared to functions declared inline can 2079be applied. 2080The default value is 40. 
          <br><dt><code>large-function-insns</code><dd>The limit specifying really large functions. For functions larger than this
limit after inlining, inlining is constrained by
<samp><span class="option">--param large-function-growth</span></samp>. This parameter is useful primarily
to avoid extreme compilation time caused by non-linear algorithms used by the
backend.
The default value is 2700.

          <br><dt><code>large-function-growth</code><dd>Specifies the maximal growth of a large function caused by inlining, in percent.
The default value is 100, which limits large function growth to 2.0 times
the original size.

          <br><dt><code>large-unit-insns</code><dd>The limit specifying a large translation unit. Growth caused by inlining of
units larger than this limit is limited by <samp><span class="option">--param inline-unit-growth</span></samp>.
For small units this might be too tight. For example, consider a unit consisting of a function A
that is inline and a function B that just calls A three times. If B is small relative to
A, the growth of the unit is 300% and yet such inlining is very sane. For very
large units consisting of small inlineable functions, however, the overall unit
growth limit is needed to avoid exponential explosion of code size. Thus for
smaller units, the size is increased to <samp><span class="option">--param large-unit-insns</span></samp>
before applying <samp><span class="option">--param inline-unit-growth</span></samp>. The default is 10000.

          <br><dt><code>inline-unit-growth</code><dd>Specifies the maximal overall growth of the compilation unit caused by inlining.
The default value is 30, which limits unit growth to 1.3 times the original
size.

          <br><dt><code>ipcp-unit-growth</code><dd>Specifies the maximal overall growth of the compilation unit caused by
interprocedural constant propagation. The default value is 10, which limits
unit growth to 1.1 times the original size.
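          <p>The percentages in the growth parameters above translate into size limits by simple arithmetic, sketched below. The helper names <code>growth_limit</code> and <code>unit_growth_limit</code> are invented for this illustration, and the sketch does not model how GCC measures function or unit size internally; it only reproduces the multipliers quoted in the text.

```c
/* Size limit implied by a growth parameter expressed in percent:
   100 gives 2.0 times the original, 30 gives 1.3 times, 10 gives 1.1 times. */
long growth_limit(long original, long growth_percent)
{
    return original + original * growth_percent / 100;
}

/* Unit limit as described for large-unit-insns: a unit smaller than
   large_unit_insns is first raised to that size, and only then is
   inline_unit_growth (in percent) applied. */
long unit_growth_limit(long unit_insns, long large_unit_insns,
                       long inline_unit_growth)
{
    long base = unit_insns > large_unit_insns ? unit_insns : large_unit_insns;
    return growth_limit(base, inline_unit_growth);
}
```

          <p>With the defaults quoted above, a unit of 500 instructions may grow to 13000 (10000 raised by 30%), while a unit of 20000 instructions may grow to 26000.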
2110 2111 <br><dt><code>large-stack-frame</code><dd>The limit specifying large stack frames. While inlining the algorithm is trying 2112to not grow past this limit too much. Default value is 256 bytes. 2113 2114 <br><dt><code>large-stack-frame-growth</code><dd>Specifies maximal growth of large stack frames caused by inlining in percents. 2115The default value is 1000 which limits large stack frame growth to 11 times 2116the original size. 2117 2118 <br><dt><code>max-inline-insns-recursive</code><dt><code>max-inline-insns-recursive-auto</code><dd>Specifies maximum number of instructions out-of-line copy of self recursive inline 2119function can grow into by performing recursive inlining. 2120 2121 <p>For functions declared inline <samp><span class="option">--param max-inline-insns-recursive</span></samp> is 2122taken into account. For function not declared inline, recursive inlining 2123happens only when <samp><span class="option">-finline-functions</span></samp> (included in <samp><span class="option">-O3</span></samp>) is 2124enabled and <samp><span class="option">--param max-inline-insns-recursive-auto</span></samp> is used. The 2125default value is 450. 2126 2127 <br><dt><code>max-inline-recursive-depth</code><dt><code>max-inline-recursive-depth-auto</code><dd>Specifies maximum recursion depth used by the recursive inlining. 2128 2129 <p>For functions declared inline <samp><span class="option">--param max-inline-recursive-depth</span></samp> is 2130taken into account. For function not declared inline, recursive inlining 2131happens only when <samp><span class="option">-finline-functions</span></samp> (included in <samp><span class="option">-O3</span></samp>) is 2132enabled and <samp><span class="option">--param max-inline-recursive-depth-auto</span></samp> is used. The 2133default value is 8. 
          <br><dt><code>min-inline-recursive-probability</code><dd>Recursive inlining is profitable only for functions having deep recursion
on average and can hurt functions having little recursion depth by
increasing the prologue size or the complexity of the function body for other
optimizers.

          <p>When profile feedback is available (see <samp><span class="option">-fprofile-generate</span></samp>) the actual
recursion depth can be guessed from the probability that the function will recurse via
a given call expression. This parameter limits inlining to call expressions
whose probability exceeds the given threshold (in percent). The default value is
10.

          <br><dt><code>early-inlining-insns</code><dd>Specify the growth that the early inliner can make. In effect it increases the amount of
inlining for code having a large abstraction penalty. The default value is 10.

          <br><dt><code>max-early-inliner-iterations</code><dd>Limit on the iterations of the early inliner. This basically bounds the number of nested
indirect calls the early inliner can resolve. Deeper chains are still handled by
late inlining.

          <br><dt><code>comdat-sharing-probability</code><dd>Probability (in percent) that a C++ inline function with comdat visibility
will be shared across multiple compilation units. The default value is 20.

          <br><dt><code>min-vect-loop-bound</code><dd>The minimum number of iterations under which a loop will not get vectorized
when <samp><span class="option">-ftree-vectorize</span></samp> is used. The number of iterations after
vectorization needs to be greater than the value specified by this option
to allow vectorization. The default value is 0.

          <br><dt><code>gcse-cost-distance-ratio</code><dd>Scaling factor in the calculation of the maximum distance an expression
can be moved by GCSE optimizations.
This is currently supported only in the 2163code hoisting pass. The bigger the ratio, the more aggressive code hoisting 2164will be with simple expressions, i.e., the expressions which have cost 2165less than <samp><span class="option">gcse-unrestricted-cost</span></samp>. Specifying 0 will disable 2166hoisting of simple expressions. The default value is 10. 2167 2168 <br><dt><code>gcse-unrestricted-cost</code><dd>Cost, roughly measured as the cost of a single typical machine 2169instruction, at which GCSE optimizations will not constrain 2170the distance an expression can travel. This is currently 2171supported only in the code hoisting pass. The lesser the cost, 2172the more aggressive code hoisting will be. Specifying 0 will 2173allow all expressions to travel unrestricted distances. 2174The default value is 3. 2175 2176 <br><dt><code>max-hoist-depth</code><dd>The depth of search in the dominator tree for expressions to hoist. 2177This is used to avoid quadratic behavior in hoisting algorithm. 2178The value of 0 will avoid limiting the search, but may slow down compilation 2179of huge functions. The default value is 30. 2180 2181 <br><dt><code>max-unrolled-insns</code><dd>The maximum number of instructions that a loop should have if that loop 2182is unrolled, and if the loop is unrolled, it determines how many times 2183the loop code is unrolled. 2184 2185 <br><dt><code>max-average-unrolled-insns</code><dd>The maximum number of instructions biased by probabilities of their execution 2186that a loop should have if that loop is unrolled, and if the loop is unrolled, 2187it determines how many times the loop code is unrolled. 2188 2189 <br><dt><code>max-unroll-times</code><dd>The maximum number of unrollings of a single loop. 2190 2191 <br><dt><code>max-peeled-insns</code><dd>The maximum number of instructions that a loop should have if that loop 2192is peeled, and if the loop is peeled, it determines how many times 2193the loop code is peeled. 
2194 2195 <br><dt><code>max-peel-times</code><dd>The maximum number of peelings of a single loop. 2196 2197 <br><dt><code>max-completely-peeled-insns</code><dd>The maximum number of insns of a completely peeled loop. 2198 2199 <br><dt><code>max-completely-peel-times</code><dd>The maximum number of iterations of a loop to be suitable for complete peeling. 2200 2201 <br><dt><code>max-completely-peel-loop-nest-depth</code><dd>The maximum depth of a loop nest suitable for complete peeling. 2202 2203 <br><dt><code>max-unswitch-insns</code><dd>The maximum number of insns of an unswitched loop. 2204 2205 <br><dt><code>max-unswitch-level</code><dd>The maximum number of branches unswitched in a single loop. 2206 2207 <br><dt><code>lim-expensive</code><dd>The minimum cost of an expensive expression in the loop invariant motion. 2208 2209 <br><dt><code>iv-consider-all-candidates-bound</code><dd>Bound on number of candidates for induction variables below that 2210all candidates are considered for each use in induction variable 2211optimizations. Only the most relevant candidates are considered 2212if there are more candidates, to avoid quadratic time complexity. 2213 2214 <br><dt><code>iv-max-considered-uses</code><dd>The induction variable optimizations give up on loops that contain more 2215induction variable uses. 2216 2217 <br><dt><code>iv-always-prune-cand-set-bound</code><dd>If number of candidates in the set is smaller than this value, 2218we always try to remove unnecessary ivs from the set during its 2219optimization when a new iv is added to the set. 2220 2221 <br><dt><code>scev-max-expr-size</code><dd>Bound on size of expressions used in the scalar evolutions analyzer. 2222Large expressions slow the analyzer. 2223 2224 <br><dt><code>scev-max-expr-complexity</code><dd>Bound on the complexity of the expressions in the scalar evolutions analyzer. 2225Complex expressions slow the analyzer. 
          <br><dt><code>omega-max-vars</code><dd>The maximum number of variables in an Omega constraint system.
The default value is 128.

          <br><dt><code>omega-max-geqs</code><dd>The maximum number of inequalities in an Omega constraint system.
The default value is 256.

          <br><dt><code>omega-max-eqs</code><dd>The maximum number of equalities in an Omega constraint system.
The default value is 128.

          <br><dt><code>omega-max-wild-cards</code><dd>The maximum number of wildcard variables that the Omega solver will
be able to insert. The default value is 18.

          <br><dt><code>omega-hash-table-size</code><dd>The size of the hash table in the Omega solver. The default value is
550.

          <br><dt><code>omega-max-keys</code><dd>The maximal number of keys used by the Omega solver. The default
value is 500.

          <br><dt><code>omega-eliminate-redundant-constraints</code><dd>When set to 1, use expensive methods to eliminate all redundant
constraints. The default value is 0.

          <br><dt><code>vect-max-version-for-alignment-checks</code><dd>The maximum number of runtime checks that can be performed when
doing loop versioning for alignment in the vectorizer. See option
<samp><span class="option">-ftree-vect-loop-version</span></samp> for more information.

          <br><dt><code>vect-max-version-for-alias-checks</code><dd>The maximum number of runtime checks that can be performed when
doing loop versioning for alias in the vectorizer. See option
<samp><span class="option">-ftree-vect-loop-version</span></samp> for more information.

          <br><dt><code>max-iterations-to-track</code><dd>
The maximum number of iterations of a loop that the brute force algorithm
for analysis of the number of iterations of the loop tries to evaluate.

          <br><dt><code>hot-bb-count-fraction</code><dd>Select the fraction of the maximal count of repetitions of a basic block in the program
that a given basic block needs to have to be considered hot.
          <br><dt><code>hot-bb-frequency-fraction</code><dd>Select the fraction of the entry block frequency of executions of a basic block in
a function that a given basic block needs to have to be considered hot.

          <br><dt><code>max-predicted-iterations</code><dd>The maximum number of loop iterations we predict statically. This is useful
in cases where a function contains a single loop with a known bound and another loop
with an unknown bound. We predict the known number of iterations correctly, while
the unknown number of iterations averages to roughly 10. This means that the
loop without bounds would appear artificially cold relative to the other one.

          <br><dt><code>align-threshold</code><dd>
Select the fraction of the maximal frequency of executions of a basic block in
a function that a given basic block must reach to get aligned.

          <br><dt><code>align-loop-iterations</code><dd>
A loop expected to iterate at least the selected number of iterations will get
aligned.

          <br><dt><code>tracer-dynamic-coverage</code><dt><code>tracer-dynamic-coverage-feedback</code><dd>
This value is used to limit superblock formation once the given percentage of
executed instructions is covered. This limits unnecessary code size
expansion.

          <p>The <samp><span class="option">tracer-dynamic-coverage-feedback</span></samp> is used only when profile
feedback is available. The real profiles (as opposed to statically estimated
ones) are much less balanced, allowing the threshold to be a larger value.

          <br><dt><code>tracer-max-code-growth</code><dd>Stop tail duplication once code growth has reached the given percentage. This is
a rather hokey argument, as most of the duplicates will be eliminated later in
cross jumping, so it may be set to much higher values than the desired code
growth.
          <br><dt><code>tracer-min-branch-ratio</code><dd>
Stop reverse growth when the reverse probability of the best edge is less than this
threshold (in percent).

          <br><dt><code>tracer-min-branch-ratio</code><dt><code>tracer-min-branch-ratio-feedback</code><dd>
Stop forward growth if the best edge has a probability lower than this
threshold.

          <p>Similarly to <samp><span class="option">tracer-dynamic-coverage</span></samp> two values are present, one for
compilation for profile feedback and one for compilation without. The value
for compilation with profile feedback needs to be more conservative (higher) in
order to make tracer effective.

          <br><dt><code>max-cse-path-length</code><dd>
Maximum number of basic blocks on a path that CSE considers. The default is 10.

          <br><dt><code>max-cse-insns</code><dd>The maximum number of instructions CSE processes before flushing. The default is 1000.

          <br><dt><code>ggc-min-expand</code><dd>
GCC uses a garbage collector to manage its own memory allocation. This
parameter specifies the minimum percentage by which the garbage
collector's heap should be allowed to expand between collections.
Tuning this may improve compilation speed; it has no effect on code
generation.

          <p>The default is 30% + 70% * (RAM/1GB) with an upper bound of 100% when
RAM >= 1GB. If <code>getrlimit</code> is available, the notion of "RAM" is
the smallest of actual RAM and <code>RLIMIT_DATA</code> or <code>RLIMIT_AS</code>. If
GCC is not able to calculate RAM on a particular platform, the lower
bound of 30% is used. Setting this parameter and
<samp><span class="option">ggc-min-heapsize</span></samp> to zero causes a full collection to occur at
every opportunity. This is extremely slow, but can be useful for
debugging.
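          <p>The default formula for <samp><span class="option">ggc-min-expand</span></samp> can be worked through with a small helper. This is only a sketch of the arithmetic stated above: the function name <code>default_ggc_min_expand</code> is hypothetical, 1GB is assumed to mean 2^30 bytes, and a non-positive argument is used here to stand for "RAM could not be determined".

```c
/* Default ggc-min-expand in percent, following the formula in the text:
   30% + 70% * (RAM / 1GB), capped at 100% once RAM reaches 1GB.
   Assumption for this sketch: 1GB is 2^30 bytes. */
double default_ggc_min_expand(double ram_bytes)
{
    const double one_gb = 1024.0 * 1024.0 * 1024.0;

    if (ram_bytes <= 0.0)       /* RAM unknown: fall back to the lower bound */
        return 30.0;
    if (ram_bytes >= one_gb)    /* upper bound */
        return 100.0;
    return 30.0 + 70.0 * (ram_bytes / one_gb);
}
```

          <p>For a machine with 512MB of RAM this gives 30 + 70 * 0.5 = 65 percent.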
          <br><dt><code>ggc-min-heapsize</code><dd>
Minimum size of the garbage collector's heap before it begins bothering
to collect garbage. The first collection occurs after the heap expands
by <samp><span class="option">ggc-min-expand</span></samp>% beyond <samp><span class="option">ggc-min-heapsize</span></samp>. Again,
tuning this may improve compilation speed, and has no effect on code
generation.

          <p>The default is the smaller of RAM/8, RLIMIT_RSS, or a limit which
tries to ensure that RLIMIT_DATA or RLIMIT_AS are not exceeded, but
with a lower bound of 4096 (four megabytes) and an upper bound of
131072 (128 megabytes). If GCC is not able to calculate RAM on a
particular platform, the lower bound is used. Setting this parameter
very large effectively disables garbage collection. Setting this
parameter and <samp><span class="option">ggc-min-expand</span></samp> to zero causes a full collection
to occur at every opportunity.

          <br><dt><code>max-reload-search-insns</code><dd>The maximum number of instructions reload should look backward through for an equivalent
register. Increasing values mean more aggressive optimization, making the
compile time increase with probably slightly better performance. The default
value is 100.

          <br><dt><code>max-cselib-memory-locations</code><dd>The maximum number of memory locations cselib should take into account.
Increasing values mean more aggressive optimization, making the compile time
increase with probably slightly better performance. The default value is 500.

          <br><dt><code>reorder-blocks-duplicate</code><dt><code>reorder-blocks-duplicate-feedback</code><dd>
Used by the basic block reordering pass to decide whether to use an unconditional
branch or duplicate the code on its destination.
Code is duplicated when its
estimated size is smaller than this value multiplied by the estimated size of
an unconditional jump in the hot spots of the program.

          <p>The <samp><span class="option">reorder-blocks-duplicate-feedback</span></samp> is used only when profile
feedback is available and may be set to higher values than
<samp><span class="option">reorder-blocks-duplicate</span></samp> since information about the hot spots is more
accurate.

          <br><dt><code>max-sched-ready-insns</code><dd>The maximum number of instructions ready to be issued that the scheduler should
consider at any given time during the first scheduling pass. Increasing
values mean more thorough searches, making the compilation time increase
with probably little benefit. The default value is 100.

          <br><dt><code>max-sched-region-blocks</code><dd>The maximum number of blocks in a region to be considered for
interblock scheduling. The default value is 10.

          <br><dt><code>max-pipeline-region-blocks</code><dd>The maximum number of blocks in a region to be considered for
pipelining in the selective scheduler. The default value is 15.

          <br><dt><code>max-sched-region-insns</code><dd>The maximum number of insns in a region to be considered for
interblock scheduling. The default value is 100.

          <br><dt><code>max-pipeline-region-insns</code><dd>The maximum number of insns in a region to be considered for
pipelining in the selective scheduler. The default value is 200.

          <br><dt><code>min-spec-prob</code><dd>The minimum probability (in percent) of reaching a source block
for interblock speculative scheduling. The default value is 40.

          <br><dt><code>max-sched-extend-regions-iters</code><dd>The maximum number of iterations through CFG to extend regions.
0 - disable region extension,
N - do at most N iterations.
The default value is 0.
2388 2389 <br><dt><code>max-sched-insn-conflict-delay</code><dd>The maximum conflict delay for an insn to be considered for speculative motion. 2390The default value is 3. 2391 2392 <br><dt><code>sched-spec-prob-cutoff</code><dd>The minimal probability of speculation success (in percents), so that 2393speculative insn will be scheduled. 2394The default value is 40. 2395 2396 <br><dt><code>sched-mem-true-dep-cost</code><dd>Minimal distance (in CPU cycles) between store and load targeting same 2397memory locations. The default value is 1. 2398 2399 <br><dt><code>selsched-max-lookahead</code><dd>The maximum size of the lookahead window of selective scheduling. It is a 2400depth of search for available instructions. 2401The default value is 50. 2402 2403 <br><dt><code>selsched-max-sched-times</code><dd>The maximum number of times that an instruction will be scheduled during 2404selective scheduling. This is the limit on the number of iterations 2405through which the instruction may be pipelined. The default value is 2. 2406 2407 <br><dt><code>selsched-max-insns-to-rename</code><dd>The maximum number of best instructions in the ready list that are considered 2408for renaming in the selective scheduler. The default value is 2. 2409 2410 <br><dt><code>max-last-value-rtl</code><dd>The maximum size measured as number of RTLs that can be recorded in an expression 2411in combiner for a pseudo register as last known value of that register. The default 2412is 10000. 2413 2414 <br><dt><code>integer-share-limit</code><dd>Small integer constants can use a shared data structure, reducing the 2415compiler's memory usage and increasing its speed. This sets the maximum 2416value of a shared integer constant. The default value is 256. 2417 2418 <br><dt><code>min-virtual-mappings</code><dd>Specifies the minimum number of virtual mappings in the incremental 2419SSA updater that should be registered to trigger the virtual mappings 2420heuristic defined by virtual-mappings-ratio. 
The default value is
100.

          <br><dt><code>virtual-mappings-ratio</code><dd>If the number of virtual mappings is virtual-mappings-ratio bigger
than the number of virtual symbols to be updated, then the incremental
SSA updater switches to a full update for those symbols. The default
ratio is 3.

          <br><dt><code>ssp-buffer-size</code><dd>The minimum size of buffers (i.e. arrays) that will receive stack smashing
protection when <samp><span class="option">-fstack-protector</span></samp> is used.

          <br><dt><code>max-jump-thread-duplication-stmts</code><dd>Maximum number of statements allowed in a block that needs to be
duplicated when threading jumps.

          <br><dt><code>max-fields-for-field-sensitive</code><dd>Maximum number of fields in a structure we will treat in
a field sensitive manner during pointer analysis. The default is zero
for -O0 and -O1, and 100 for -Os, -O2, and -O3.

          <br><dt><code>prefetch-latency</code><dd>Estimate of the average number of instructions that are executed before
a prefetch finishes. The distance we prefetch ahead is proportional
to this constant. Increasing this number may also lead to fewer
streams being prefetched (see <samp><span class="option">simultaneous-prefetches</span></samp>).

          <br><dt><code>simultaneous-prefetches</code><dd>Maximum number of prefetches that can run at the same time.

          <br><dt><code>l1-cache-line-size</code><dd>The size of a cache line in the L1 cache, in bytes.

          <br><dt><code>l1-cache-size</code><dd>The size of the L1 cache, in kilobytes.

          <br><dt><code>l2-cache-size</code><dd>The size of the L2 cache, in kilobytes.

          <br><dt><code>min-insn-to-prefetch-ratio</code><dd>The minimum ratio between the number of instructions and the
number of prefetches to enable prefetching in a loop.
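          <p>The idea of a prefetch distance, which the text above ties to <samp><span class="option">prefetch-latency</span></samp>, can be pictured with a hand-written loop that uses the GCC built-in <code>__builtin_prefetch</code>. This is a sketch under stated assumptions, not what the compiler's own prefetching emits: the function name <code>sum_with_prefetch</code> and the constant <code>PREFETCH_DISTANCE</code> are made up for the illustration, and the value 16 is arbitrary.

```c
#include <stddef.h>

/* Hypothetical tuning constant: how many elements ahead to prefetch.
   A longer prefetch latency would call for a larger distance. */
#define PREFETCH_DISTANCE 16

long sum_with_prefetch(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        /* Request the element we will need PREFETCH_DISTANCE iterations
           from now; the bounds check keeps the address inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE]);
        s += a[i];
    }
    return s;
}
```

          <p>Here the loop body has only a few instructions per prefetch, which is the kind of ratio that <samp><span class="option">min-insn-to-prefetch-ratio</span></samp> is described as guarding against.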

     <br><dt><code>prefetch-min-insn-to-mem-ratio</code><dd>The minimum ratio between the number of instructions and the
number of memory references to enable prefetching in a loop.

     <br><dt><code>use-canonical-types</code><dd>Whether the compiler should use the “canonical” type system. By
default, this should always be 1, which uses a more efficient internal
mechanism for comparing types in C++ and Objective-C++. However, if
bugs in the canonical type system are causing compilation failures,
set this value to 0 to disable canonical types.

     <br><dt><code>switch-conversion-max-branch-ratio</code><dd>Switch initialization conversion will refuse to create arrays that are
bigger than <samp><span class="option">switch-conversion-max-branch-ratio</span></samp> times the number of
branches in the switch.

     <br><dt><code>max-partial-antic-length</code><dd>Maximum length of the partial antic set computed during the tree
partial redundancy elimination optimization (<samp><span class="option">-ftree-pre</span></samp>) when
optimizing at <samp><span class="option">-O3</span></samp> and above. For some sorts of source code
the enhanced partial redundancy elimination optimization can run away,
consuming all of the memory available on the host machine. This
parameter sets a limit on the length of the sets that are computed,
which prevents the runaway behavior. Setting a value of 0 for
this parameter will allow an unlimited set length.

     <br><dt><code>sccvn-max-scc-size</code><dd>Maximum size of a strongly connected component (SCC) during SCCVN
processing. If this limit is hit, SCCVN processing for the whole
function will not be done and optimizations depending on it will
be disabled. The default maximum SCC size is 10000.

     <br><dt><code>ira-max-loops-num</code><dd>IRA uses a regional register allocation by default.
If a function
contains more loops than the number given by this parameter, only that
many of the most frequently executed loops will form regions
for the regional register allocation. The default value of the
parameter is 100.

     <br><dt><code>ira-max-conflict-table-size</code><dd>Although IRA uses a sophisticated algorithm to compress the conflict
table, the table can still be big for huge functions. If the conflict
table for a function could be larger than the size in MB given by this
parameter, the conflict table is not built and a faster, simpler, and
lower-quality register allocation algorithm is used instead. That
algorithm does not use pseudo-register conflicts. The default value of
the parameter is 2000.

     <br><dt><code>ira-loop-reserved-regs</code><dd>IRA can be used to evaluate register pressure in loops more accurately
when deciding whether to move loop invariants (see <samp><span class="option">-O3</span></samp>). The number
of available registers reserved for some other purposes is described
by this parameter. The default value of the parameter is 2, which is the
minimal number of registers needed for execution of a typical
instruction. This value is the best found from numerous experiments.

     <br><dt><code>loop-invariant-max-bbs-in-loop</code><dd>Loop invariant motion can be very expensive, both in compile time and
in the amount of compile-time memory needed, with very large loops. Loops
with more basic blocks than this parameter won't have loop invariant
motion optimization performed on them. The default value of the
parameter is 1000 for -O1 and 10000 for -O2 and above.

     <br><dt><code>max-vartrack-size</code><dd>Sets a maximum number of hash table slots to use during variable
tracking dataflow analysis of any function.
If this limit is exceeded
with variable tracking at assignments enabled, analysis for that
function is retried without it, after removing all debug insns from
the function. If the limit is exceeded even without debug insns, var
tracking analysis is completely disabled for the function. Setting
the parameter to zero makes it unlimited.

     <br><dt><code>min-nondebug-insn-uid</code><dd>Use uids starting at this parameter for nondebug insns. The range below
the parameter is reserved exclusively for debug insns created by
<samp><span class="option">-fvar-tracking-assignments</span></samp>, but debug insns may get
(non-overlapping) uids above it if the reserved range is exhausted.

     <br><dt><code>ipa-sra-ptr-growth-factor</code><dd>IPA-SRA will replace a pointer to an aggregate with one or more new
parameters only when their cumulative size is less than or equal to
<samp><span class="option">ipa-sra-ptr-growth-factor</span></samp> times the size of the original
pointer parameter.

     <br><dt><code>graphite-max-nb-scop-params</code><dd>To avoid exponential effects in the Graphite loop transforms, the
number of parameters in a Static Control Part (SCoP) is bounded. The
default value is 10 parameters. A variable whose value is unknown at
compile time and defined outside a SCoP is a parameter of the SCoP.

     <br><dt><code>graphite-max-bbs-per-function</code><dd>To avoid exponential effects in the detection of SCoPs, the size of
the functions analyzed by Graphite is bounded. The default value is
100 basic blocks.

     <br><dt><code>loop-block-tile-size</code><dd>Loop blocking or strip mining transforms, enabled with
<samp><span class="option">-floop-block</span></samp> or <samp><span class="option">-floop-strip-mine</span></samp>, strip mine each
loop in the loop nest by a given number of iterations.
The strip
length can be changed using the <samp><span class="option">loop-block-tile-size</span></samp>
parameter. The default value is 51 iterations.

     <br><dt><code>devirt-type-list-size</code><dd>IPA-CP attempts to track all possible types passed to a function's
parameter in order to perform devirtualization.
<samp><span class="option">devirt-type-list-size</span></samp> is the maximum number of types it
stores per formal parameter of a function.

     <br><dt><code>lto-partitions</code><dd>Specifies the desired number of partitions produced during WHOPR compilation.
The number of partitions should exceed the number of CPUs used for compilation.
The default value is 32.

     <br><dt><code>lto-minpartition</code><dd>Size of the minimal partition for WHOPR (in estimated instructions).
This avoids the expense of splitting very small programs into too many
partitions.

     <br><dt><code>cxx-max-namespaces-for-diagnostic-help</code><dd>The maximum number of namespaces to consult for suggestions when C++
name lookup fails for an identifier. The default is 1000.

     <br><dt><code>if-to-switch-threshold</code><dd>If-chain to switch conversion, enabled by
<samp><span class="option">-ftree-if-to-switch-conversion</span></samp>, converts chains of ifs of sufficient
length into switches. The parameter <samp><span class="option">if-to-switch-threshold</span></samp> can be
used to set the minimal required length. The default value is 3.

     </dl>
     </dl>

   </body></html>