<html lang="en">
<head>
<title>i386 and x86-64 Options - Using the GNU Compiler Collection (GCC)</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="Using the GNU Compiler Collection (GCC)">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Submodel-Options.html#Submodel-Options" title="Submodel Options">
<link rel="prev" href="HPPA-Options.html#HPPA-Options" title="HPPA Options">
<link rel="next" href="i386-and-x86_002d64-Windows-Options.html#i386-and-x86_002d64-Windows-Options" title="i386 and x86-64 Windows Options">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997,
1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
2010 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.
Copies published by the Free Software Foundation raise
     funds for GNU development.-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp { font-size:smaller }
  span.sc { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="i386-and-x86-64-Options"></a>
<a name="i386-and-x86_002d64-Options"></a>
<p>
Next: <a rel="next" accesskey="n" href="i386-and-x86_002d64-Windows-Options.html#i386-and-x86_002d64-Windows-Options">i386 and x86-64 Windows Options</a>,
Previous: <a rel="previous" accesskey="p" href="HPPA-Options.html#HPPA-Options">HPPA Options</a>,
Up: <a rel="up" accesskey="u" href="Submodel-Options.html#Submodel-Options">Submodel Options</a>
<hr>
</div>

<h4 class="subsection">3.17.15 Intel 386 and AMD x86-64 Options</h4>

<p><a name="index-i386-Options-1332"></a><a name="index-x86_002d64-Options-1333"></a><a name="index-Intel-386-Options-1334"></a><a name="index-AMD-x86_002d64-Options-1335"></a>
These ‘<samp><span class="samp">-m</span></samp>’ options are defined for the i386 and x86-64 family of
computers:

     <dl>
<dt><code>-mtune=</code><var>cpu-type</var><dd><a name="index-mtune-1336"></a>Tune to <var>cpu-type</var> everything applicable about the generated code, except
for the ABI and the set of available instructions. The choices for
<var>cpu-type</var> are:
          <dl>
<dt><em>generic</em><dd>Produce code optimized for the most common IA32/AMD64/EM64T processors.
If you know the CPU on which your code will run, then you should use
the corresponding <samp><span class="option">-mtune</span></samp> option instead of
<samp><span class="option">-mtune=generic</span></samp>. But, if you do not know exactly what CPU users
of your application will have, then you should use this option.

          <p>As new processors are deployed in the marketplace, the behavior of this
option will change. Therefore, if you upgrade to a newer version of
GCC, the code generated by this option will change to reflect the processors
that were most common when that version of GCC was released.

          <p>There is no <samp><span class="option">-march=generic</span></samp> option because <samp><span class="option">-march</span></samp>
indicates the instruction set the compiler can use, and there is no
generic instruction set applicable to all processors. In contrast,
<samp><span class="option">-mtune</span></samp> indicates the processor (or, in this case, collection of
processors) for which the code is optimized.
<br><dt><em>native</em><dd>This selects the CPU to tune for at compilation time by determining
the processor type of the compiling machine. Using <samp><span class="option">-mtune=native</span></samp>
will produce code optimized for the local machine under the constraints
of the selected instruction set. Using <samp><span class="option">-march=native</span></samp> will
enable all instruction subsets supported by the local machine (hence
the result might not run on different machines).
<br><dt><em>i386</em><dd>Original Intel i386 CPU.
<br><dt><em>i486</em><dd>Intel i486 CPU. (No scheduling is implemented for this chip.)
<br><dt><em>i586, pentium</em><dd>Intel Pentium CPU with no MMX support.
<br><dt><em>pentium-mmx</em><dd>Intel PentiumMMX CPU based on Pentium core with MMX instruction set support.
<br><dt><em>pentiumpro</em><dd>Intel PentiumPro CPU.
<br><dt><em>i686</em><dd>Same as <code>generic</code>, but when used as the <samp><span class="option">-march</span></samp> option, the PentiumPro
instruction set will be used, so the code will run on all i686 family chips.
<br><dt><em>pentium2</em><dd>Intel Pentium2 CPU based on PentiumPro core with MMX instruction set support.
<br><dt><em>pentium3, pentium3m</em><dd>Intel Pentium3 CPU based on PentiumPro core with MMX and SSE instruction set
support.
<br><dt><em>pentium-m</em><dd>Low power version of Intel Pentium3 CPU with MMX, SSE and SSE2 instruction set
support. Used by Centrino notebooks.
<br><dt><em>pentium4, pentium4m</em><dd>Intel Pentium4 CPU with MMX, SSE and SSE2 instruction set support.
<br><dt><em>prescott</em><dd>Improved version of Intel Pentium4 CPU with MMX, SSE, SSE2 and SSE3 instruction
set support.
<br><dt><em>nocona</em><dd>Improved version of Intel Pentium4 CPU with 64-bit extensions, MMX, SSE,
SSE2 and SSE3 instruction set support.
<br><dt><em>core2</em><dd>Intel Core2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3
instruction set support.
<br><dt><em>corei7</em><dd>Intel Core i7 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1
and SSE4.2 instruction set support.
<br><dt><em>corei7-avx</em><dd>Intel Core i7 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
SSE4.1, SSE4.2, AVX, AES and PCLMUL instruction set support.
<br><dt><em>atom</em><dd>Intel Atom CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3
instruction set support.
<br><dt><em>k6</em><dd>AMD K6 CPU with MMX instruction set support.
<br><dt><em>k6-2, k6-3</em><dd>Improved versions of AMD K6 CPU with MMX and 3DNow! instruction set support.
<br><dt><em>athlon, athlon-tbird</em><dd>AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE prefetch instructions
support.
<br><dt><em>athlon-4, athlon-xp, athlon-mp</em><dd>Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow!
and full SSE
instruction set support.
<br><dt><em>k8, opteron, athlon64, athlon-fx</em><dd>AMD K8 core based CPUs with x86-64 instruction set support. (This is a superset of
MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit instruction set extensions.)
<br><dt><em>k8-sse3, opteron-sse3, athlon64-sse3</em><dd>Improved versions of k8, opteron and athlon64 with SSE3 instruction set support.
<br><dt><em>amdfam10, barcelona</em><dd>AMD Family 10h core based CPUs with x86-64 instruction set support. (This
is a superset of MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit
instruction set extensions.)
<br><dt><em>winchip-c6</em><dd>IDT Winchip C6 CPU, treated in the same way as i486 with additional MMX instruction
set support.
<br><dt><em>winchip2</em><dd>IDT Winchip2 CPU, treated in the same way as i486 with additional MMX and 3DNow!
instruction set support.
<br><dt><em>c3</em><dd>VIA C3 CPU with MMX and 3DNow! instruction set support. (No scheduling is
implemented for this chip.)
<br><dt><em>c3-2</em><dd>VIA C3-2 CPU with MMX and SSE instruction set support. (No scheduling is
implemented for this chip.)
<br><dt><em>geode</em><dd>Embedded AMD CPU with MMX and 3DNow! instruction set support.
</dl>

     <p>While picking a specific <var>cpu-type</var> will schedule things appropriately
for that particular chip, the compiler will not generate any code that
cannot run on the i386 unless the <samp><span class="option">-march=</span><var>cpu-type</var></samp> option
is also used.

     <br><dt><code>-march=</code><var>cpu-type</var><dd><a name="index-march-1337"></a>Generate instructions for the machine type <var>cpu-type</var>. The choices
for <var>cpu-type</var> are the same as for <samp><span class="option">-mtune</span></samp>. Moreover,
specifying <samp><span class="option">-march=</span><var>cpu-type</var></samp> implies <samp><span class="option">-mtune=</span><var>cpu-type</var></samp>.
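     <p>The difference between tuning and instruction set selection can be observed from
inside a program. The following is a minimal sketch, not part of GCC: the helper
names are hypothetical, while <code>__SSE2__</code>, <code>__SSSE3__</code> and <code>__AVX__</code>
are macros that GCC predefines when the corresponding instruction set is enabled.

<pre class="smallexample">/* Hypothetical helpers: each returns 1 when the compiler was allowed
   to emit the named instruction set for this translation unit.  */
int built_with_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

int built_with_ssse3(void)
{
#ifdef __SSSE3__
    return 1;
#else
    return 0;
#endif
}

int built_with_avx(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}
</pre>
     <p>Compiling this file with only <samp><span class="option">-mtune=core2</span></samp> should leave the results
at whatever the compiler's default instruction set provides, whereas
<samp><span class="option">-march=core2</span></samp> is expected to make <code>built_with_ssse3</code> return 1.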

     <br><dt><code>-mcpu=</code><var>cpu-type</var><dd><a name="index-mcpu-1338"></a>A deprecated synonym for <samp><span class="option">-mtune</span></samp>.

     <br><dt><code>-mfpmath=</code><var>unit</var><dd><a name="index-mfpmath-1339"></a>Generate floating-point arithmetic for the selected unit <var>unit</var>. The choices
for <var>unit</var> are:

          <dl>
<dt>‘<samp><span class="samp">387</span></samp>’<dd>Use the standard 387 floating point coprocessor present on the majority of chips and
emulated otherwise. Code compiled with this option will run almost everywhere.
The temporary results are computed in 80-bit precision instead of the precision
specified by the type, resulting in slightly different results compared to most
other chips. See <samp><span class="option">-ffloat-store</span></samp> for a more detailed description.

          <p>This is the default choice for the i386 compiler.

          <br><dt>‘<samp><span class="samp">sse</span></samp>’<dd>Use scalar floating point instructions present in the SSE instruction set.
This instruction set is supported by Pentium3 and newer chips, and in the AMD line
by Athlon-4, Athlon-xp and Athlon-mp chips. The earlier version of the SSE
instruction set supports only single-precision arithmetic, so double and
extended-precision arithmetic is still done using the 387. A later version, present
only in Pentium4 and AMD x86-64 chips, supports double-precision
arithmetic too.

          <p>For the i386 compiler, you need to use <samp><span class="option">-march=</span><var>cpu-type</var></samp>, <samp><span class="option">-msse</span></samp>
or <samp><span class="option">-msse2</span></samp> switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by default.

          <p>The resulting code should be considerably faster in the majority of cases and avoid
the numerical instability problems of 387 code, but may break some existing
code that expects temporaries to be 80-bit.

          <p>This is the default choice for the x86-64 compiler.

          <br><dt>‘<samp><span class="samp">sse,387</span></samp>’<dt>‘<samp><span class="samp">sse+387</span></samp>’<dt>‘<samp><span class="samp">both</span></samp>’<dd>Attempt to utilize both instruction sets at once. This effectively doubles the
number of available registers and, on chips with separate execution units for
387 and SSE, the execution resources too. Use this option with care, as it is
still experimental, because the GCC register allocator does not model separate
functional units well, resulting in unstable performance.
</dl>

     <br><dt><code>-masm=</code><var>dialect</var><dd><a name="index-masm_003d_0040var_007bdialect_007d-1340"></a>Output asm instructions using the selected <var>dialect</var>. Supported
choices are ‘<samp><span class="samp">intel</span></samp>’ or ‘<samp><span class="samp">att</span></samp>’ (the default). Darwin does
not support ‘<samp><span class="samp">intel</span></samp>’.

     <br><dt><code>-mieee-fp</code><dt><code>-mno-ieee-fp</code><dd><a name="index-mieee_002dfp-1341"></a><a name="index-mno_002dieee_002dfp-1342"></a>Control whether or not the compiler uses IEEE floating point
comparisons. These handle correctly the case where the result of a
comparison is unordered.

     <br><dt><code>-msoft-float</code><dd><a name="index-msoft_002dfloat-1343"></a>Generate output containing library calls for floating point.
<strong>Warning:</strong> the requisite libraries are not part of GCC.
Normally the facilities of the machine's usual C compiler are used, but
this can't be done directly in cross-compilation. You must make your
own arrangements to provide suitable library functions for
cross-compilation.

     <p>On machines where a function returns floating point results in the 80387
register stack, some floating point opcodes may be emitted even if
<samp><span class="option">-msoft-float</span></samp> is used.

     <br><dt><code>-mno-fp-ret-in-387</code><dd><a name="index-mno_002dfp_002dret_002din_002d387-1344"></a>Do not use the FPU registers for return values of functions.

     <p>The usual calling convention has functions return values of types
<code>float</code> and <code>double</code> in an FPU register, even if there
is no FPU. The idea is that the operating system should emulate
an FPU.

     <p>The option <samp><span class="option">-mno-fp-ret-in-387</span></samp> causes such values to be returned
in ordinary CPU registers instead.

     <br><dt><code>-mno-fancy-math-387</code><dd><a name="index-mno_002dfancy_002dmath_002d387-1345"></a>Some 387 emulators do not support the <code>sin</code>, <code>cos</code> and
<code>sqrt</code> instructions for the 387. Specify this option to avoid
generating those instructions. This option is the default on FreeBSD,
OpenBSD and NetBSD. This option is overridden when <samp><span class="option">-march</span></samp>
indicates that the target CPU will always have an FPU and so the
instructions will not need emulation. As of revision 2.6.1, these
instructions are not generated unless you also use the
<samp><span class="option">-funsafe-math-optimizations</span></samp> switch.

     <br><dt><code>-malign-double</code><dt><code>-mno-align-double</code><dd><a name="index-malign_002ddouble-1346"></a><a name="index-mno_002dalign_002ddouble-1347"></a>Control whether GCC aligns <code>double</code>, <code>long double</code>, and
<code>long long</code> variables on a two word boundary or a one word
boundary.
Aligning <code>double</code> variables on a two word boundary will
produce code that runs somewhat faster on a ‘<samp><span class="samp">Pentium</span></samp>’ at the
expense of more memory.

     <p>On x86-64, <samp><span class="option">-malign-double</span></samp> is enabled by default.

     <p><strong>Warning:</strong> if you use the <samp><span class="option">-malign-double</span></samp> switch,
structures containing the above types will be aligned differently than
the published application binary interface specifications for the 386
and will not be binary compatible with structures in code compiled
without that switch.

     <br><dt><code>-m96bit-long-double</code><dt><code>-m128bit-long-double</code><dd><a name="index-m96bit_002dlong_002ddouble-1348"></a><a name="index-m128bit_002dlong_002ddouble-1349"></a>These switches control the size of the <code>long double</code> type. The i386
application binary interface specifies the size to be 96 bits,
so <samp><span class="option">-m96bit-long-double</span></samp> is the default in 32 bit mode.

     <p>Modern architectures (Pentium and newer) would prefer <code>long double</code>
to be aligned to an 8 or 16 byte boundary. In arrays or structures
conforming to the ABI, this would not be possible. So specifying
<samp><span class="option">-m128bit-long-double</span></samp> will align <code>long double</code>
to a 16 byte boundary by padding the <code>long double</code> with an additional
32-bit zero.

     <p>In the x86-64 compiler, <samp><span class="option">-m128bit-long-double</span></samp> is the default choice as
its ABI specifies that <code>long double</code> is to be aligned on a 16 byte boundary.

     <p>Notice that neither of these options enables any extra precision over the x87
standard of 80 bits for a <code>long double</code>.
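
     <p>The effect on storage size, as opposed to precision, can be checked with a short
program. This is a minimal sketch under stated assumptions: the function names are
hypothetical, and <code>__LDBL_MANT_DIG__</code> is a macro predefined by GCC that gives
the width of the <code>long double</code> significand in bits.

<pre class="smallexample">/* Storage size of long double in bytes.  On i386 this is expected to
   be 12 with -m96bit-long-double and 16 with -m128bit-long-double.  */
unsigned long long_double_size(void)
{
    return (unsigned long) sizeof(long double);
}

/* Significand width in bits.  For the x87 extended format this is 64
   under either option, since only the padding changes.  */
int long_double_significand_bits(void)
{
    return __LDBL_MANT_DIG__;
}
</pre>
     <p>On an x87 target, only the first value should differ between the two switches.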

     <p><strong>Warning:</strong> if you override the default value for your target ABI,
structures and arrays containing <code>long double</code> variables will change
in size, and the calling convention for functions taking
<code>long double</code> arguments will be modified. Hence they will not be binary
compatible with arrays or structures in code compiled without that switch.

     <br><dt><code>-mlarge-data-threshold=</code><var>number</var><dd><a name="index-mlarge_002ddata_002dthreshold_003d_0040var_007bnumber_007d-1350"></a>When <samp><span class="option">-mcmodel=medium</span></samp> is specified, data objects larger than
<var>number</var> are placed in the large data section. This value must be the
same across all objects linked into the binary, and defaults to 65535.

     <br><dt><code>-mrtd</code><dd><a name="index-mrtd-1351"></a>Use a different function-calling convention, in which functions that
take a fixed number of arguments return with the <code>ret</code> <var>num</var>
instruction, which pops their arguments while returning. This saves one
instruction in the caller since there is no need to pop the arguments
there.

     <p>You can specify that an individual function is called with this calling
sequence with the function attribute ‘<samp><span class="samp">stdcall</span></samp>’. You can also
override the <samp><span class="option">-mrtd</span></samp> option by using the function attribute
‘<samp><span class="samp">cdecl</span></samp>’. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.

     <p><strong>Warning:</strong> this calling convention is incompatible with the one
normally used on Unix, so you cannot use it if you need to call
libraries compiled with the Unix compiler.

     <p>Also, you must provide function prototypes for all functions that
take variable numbers of arguments (including <code>printf</code>);
otherwise incorrect code will be generated for calls to those
functions.

     <p>In addition, seriously incorrect code will result if you call a
function with too many arguments. (Normally, extra arguments are
harmlessly ignored.)

     <br><dt><code>-mregparm=</code><var>num</var><dd><a name="index-mregparm-1352"></a>Control how many registers are used to pass integer arguments. By
default, no registers are used to pass arguments, and at most 3
registers can be used. You can control this behavior for a specific
function by using the function attribute ‘<samp><span class="samp">regparm</span></samp>’.
See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.

     <p><strong>Warning:</strong> if you use this switch, and
<var>num</var> is nonzero, then you must build all modules with the same
value, including any libraries. This includes the system libraries and
startup modules.

     <br><dt><code>-msseregparm</code><dd><a name="index-msseregparm-1353"></a>Use SSE register passing conventions for float and double arguments
and return values. You can control this behavior for a specific
function by using the function attribute ‘<samp><span class="samp">sseregparm</span></samp>’.
See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.

     <p><strong>Warning:</strong> if you use this switch then you must build all
modules with the same value, including any libraries. This includes
the system libraries and startup modules.

     <br><dt><code>-mvect8-ret-in-mem</code><dd><a name="index-mvect8_002dret_002din_002dmem-1354"></a>Return 8-byte vectors in memory instead of MMX registers. This is the
default on Solaris 8 and 9 and VxWorks to match the ABI of the Sun
Studio compilers until version 12.
Later compiler versions (starting
with Studio 12 Update 1) follow the ABI used by other x86 targets, which
is the default on Solaris 10 and later. <em>Only</em> use this option if
you need to remain compatible with existing code produced by those
previous compiler versions or older versions of GCC.

     <br><dt><code>-mpc32</code><dt><code>-mpc64</code><dt><code>-mpc80</code><dd><a name="index-mpc32-1355"></a><a name="index-mpc64-1356"></a><a name="index-mpc80-1357"></a>
Set 80387 floating-point precision to 32, 64 or 80 bits. When <samp><span class="option">-mpc32</span></samp>
is specified, the significands of results of floating-point operations are
rounded to 24 bits (single precision); <samp><span class="option">-mpc64</span></samp> rounds the
significands of results of floating-point operations to 53 bits (double
precision) and <samp><span class="option">-mpc80</span></samp> rounds the significands of results of
floating-point operations to 64 bits (extended double precision), which is
the default. When this option is used, floating-point operations in higher
precisions are not available to the programmer without setting the FPU
control word explicitly.

     <p>Setting the rounding of floating-point operations to less than the default
80 bits can speed some programs by 2% or more. Note that some mathematical
libraries assume that extended precision (80 bit) floating-point operations
are enabled by default; routines in such libraries could suffer significant
loss of accuracy, typically through so-called "catastrophic cancellation",
when this option is used to set the precision to less than extended precision.

     <br><dt><code>-mstackrealign</code><dd><a name="index-mstackrealign-1358"></a>Realign the stack at entry.
On the Intel x86, the <samp><span class="option">-mstackrealign</span></samp>
option will generate an alternate prologue and epilogue that realigns the
runtime stack if necessary. This supports mixing legacy code that keeps
a 4-byte aligned stack with modern code that keeps a 16-byte aligned stack for
SSE compatibility. See also the attribute <code>force_align_arg_pointer</code>,
applicable to individual functions.

     <br><dt><code>-mpreferred-stack-boundary=</code><var>num</var><dd><a name="index-mpreferred_002dstack_002dboundary-1359"></a>Attempt to keep the stack boundary aligned to a 2 raised to <var>num</var>
byte boundary. If <samp><span class="option">-mpreferred-stack-boundary</span></samp> is not specified,
the default is 4 (16 bytes or 128 bits).

     <br><dt><code>-mincoming-stack-boundary=</code><var>num</var><dd><a name="index-mincoming_002dstack_002dboundary-1360"></a>Assume the incoming stack is aligned to a 2 raised to <var>num</var> byte
boundary. If <samp><span class="option">-mincoming-stack-boundary</span></samp> is not specified,
the one specified by <samp><span class="option">-mpreferred-stack-boundary</span></samp> will be used.

     <p>On Pentium and PentiumPro, <code>double</code> and <code>long double</code> values
should be aligned to an 8 byte boundary (see <samp><span class="option">-malign-double</span></samp>) or
suffer significant run time performance penalties. On Pentium III, the
Streaming SIMD Extension (SSE) data type <code>__m128</code> may not work
properly if it is not 16 byte aligned.

     <p>To ensure proper alignment of these values on the stack, the stack boundary
must be as aligned as that required by any value stored on the stack.
Further, every function must be generated such that it keeps the stack
aligned.
Thus calling a function compiled with a higher preferred
stack boundary from a function compiled with a lower preferred stack
boundary will most likely misalign the stack. It is recommended that
libraries that use callbacks always use the default setting.

     <p>This extra alignment does consume extra stack space, and generally
increases code size. Code that is sensitive to stack space usage, such
as embedded systems and operating system kernels, may want to reduce the
preferred alignment to <samp><span class="option">-mpreferred-stack-boundary=2</span></samp>.

     <br><dt><code>-mmmx</code><dt><code>-mno-mmx</code><dt><code>-msse</code><dt><code>-mno-sse</code><dt><code>-msse2</code><dt><code>-mno-sse2</code><dt><code>-msse3</code><dt><code>-mno-sse3</code><dt><code>-mssse3</code><dt><code>-mno-ssse3</code><dt><code>-msse4.1</code><dt><code>-mno-sse4.1</code><dt><code>-msse4.2</code><dt><code>-mno-sse4.2</code><dt><code>-msse4</code><dt><code>-mno-sse4</code><dt><code>-mavx</code><dt><code>-mno-avx</code><dt><code>-maes</code><dt><code>-mno-aes</code><dt><code>-mpclmul</code><dt><code>-mno-pclmul</code><dt><code>-mfsgsbase</code><dt><code>-mno-fsgsbase</code><dt><code>-mrdrnd</code><dt><code>-mno-rdrnd</code><dt><code>-mf16c</code><dt><code>-mno-f16c</code><dt><code>-msse4a</code><dt><code>-mno-sse4a</code><dt><code>-mfma4</code><dt><code>-mno-fma4</code><dt><code>-mxop</code><dt><code>-mno-xop</code><dt><code>-mlwp</code><dt><code>-mno-lwp</code><dt><code>-m3dnow</code><dt><code>-mno-3dnow</code><dt><code>-mpopcnt</code><dt><code>-mno-popcnt</code><dt><code>-mabm</code><dt><code>-mno-abm</code><dt><code>-mbmi</code><dt><code>-mno-bmi</code><dt><code>-mtbm</code><dt><code>-mno-tbm</code><dd><a name="index-mmmx-1361"></a><a name="index-mno_002dmmx-1362"></a><a name="index-msse-1363"></a><a name="index-mno_002dsse-1364"></a><a name="index-m3dnow-1365"></a><a name="index-mno_002d3dnow-1366"></a>These switches enable or disable the use of
instructions in the MMX,
SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, FSGSBASE, RDRND,
F16C, SSE4A, FMA4, XOP, LWP, ABM, BMI, or 3DNow! extended instruction sets.
These extensions are also available as built-in functions: see
<a href="X86-Built_002din-Functions.html#X86-Built_002din-Functions">X86 Built-in Functions</a>, for details of the functions enabled and
disabled by these switches.

     <p>To have SSE/SSE2 instructions generated automatically from floating-point
code (as opposed to 387 instructions), see <samp><span class="option">-mfpmath=sse</span></samp>.

     <p>GCC suppresses SSEx instructions when <samp><span class="option">-mavx</span></samp> is used. Instead, it
generates new AVX instructions or AVX equivalents for all SSEx instructions
when needed.

     <p>These options will enable GCC to use these extended instructions in
generated code, even without <samp><span class="option">-mfpmath=sse</span></samp>. Applications which
perform runtime CPU detection must compile separate files for each
supported architecture, using the appropriate flags. In particular,
the file containing the CPU detection code should be compiled without
these options.

     <br><dt><code>-mfused-madd</code><dt><code>-mno-fused-madd</code><dd><a name="index-mfused_002dmadd-1367"></a><a name="index-mno_002dfused_002dmadd-1368"></a>Do (don't) generate code that uses the fused multiply/add or multiply/subtract
instructions. The default is to use these instructions.

     <br><dt><code>-mcld</code><dd><a name="index-mcld-1369"></a>This option instructs GCC to emit a <code>cld</code> instruction in the prologue
of functions that use string instructions. String instructions depend on
the DF flag to select between autoincrement or autodecrement mode.
While the
ABI specifies the DF flag to be cleared on function entry, some operating
systems violate this specification by not clearing the DF flag in their
exception dispatchers. The exception handler can be invoked with the DF flag
set, which leads to the wrong direction mode when string instructions are used.
This option can be enabled by default on 32-bit x86 targets by configuring
GCC with the <samp><span class="option">--enable-cld</span></samp> configure option. Generation of <code>cld</code>
instructions can be suppressed with the <samp><span class="option">-mno-cld</span></samp> compiler option
in this case.

     <br><dt><code>-mvzeroupper</code><dd><a name="index-mvzeroupper-1370"></a>This option instructs GCC to emit a <code>vzeroupper</code> instruction
before a transfer of control flow out of the function, to minimize
the AVX to SSE transition penalty and to remove unnecessary zeroupper
intrinsics.

     <br><dt><code>-mcx16</code><dd><a name="index-mcx16-1371"></a>This option will enable GCC to use the CMPXCHG16B instruction in generated code.
CMPXCHG16B allows for atomic operations on 128-bit double quadword (or oword)
data types. This is useful for high resolution counters that could be updated
by multiple processors (or cores). This instruction is generated as part of
atomic built-in functions: see <a href="Atomic-Builtins.html#Atomic-Builtins">Atomic Builtins</a> for details.

     <br><dt><code>-msahf</code><dd><a name="index-msahf-1372"></a>This option will enable GCC to use the SAHF instruction in generated 64-bit code.
Early Intel CPUs with Intel 64 lacked the LAHF and SAHF instructions supported
by AMD64, until the introduction of the Pentium 4 G1 step in December 2005. LAHF and
SAHF are load and store instructions, respectively, for certain status flags.
In 64-bit mode, the SAHF instruction is used to optimize the <code>fmod</code>, <code>drem</code>
or <code>remainder</code> built-in functions: see <a href="Other-Builtins.html#Other-Builtins">Other Builtins</a> for details.

     <br><dt><code>-mmovbe</code><dd><a name="index-mmovbe-1373"></a>This option will enable GCC to use the movbe instruction to implement
<code>__builtin_bswap32</code> and <code>__builtin_bswap64</code>.

     <br><dt><code>-mcrc32</code><dd><a name="index-mcrc32-1374"></a>This option will enable the built-in functions <code>__builtin_ia32_crc32qi</code>,
<code>__builtin_ia32_crc32hi</code>, <code>__builtin_ia32_crc32si</code> and
<code>__builtin_ia32_crc32di</code> to generate the crc32 machine instruction.

     <br><dt><code>-mrecip</code><dd><a name="index-mrecip-1375"></a>This option will enable GCC to use RCPSS and RSQRTSS instructions (and their
vectorized variants RCPPS and RSQRTPS) with an additional Newton-Raphson step
to increase precision instead of DIVSS and SQRTSS (and their vectorized
variants) for single precision floating point arguments. These instructions
are generated only when <samp><span class="option">-funsafe-math-optimizations</span></samp> is enabled
together with <samp><span class="option">-ffinite-math-only</span></samp> and <samp><span class="option">-fno-trapping-math</span></samp>.
Note that while the throughput of the sequence is higher than the throughput
of the non-reciprocal instruction, the precision of the sequence can be
decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994).

     <p>Note that GCC implements 1.0f/sqrtf(x) in terms of RSQRTSS (or RSQRTPS)
already with <samp><span class="option">-ffast-math</span></samp> (or the above option combination), and
doesn't need <samp><span class="option">-mrecip</span></samp>.
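
     <p>The Newton-Raphson step mentioned above can be illustrated in plain C. This is
only a sketch of the arithmetic, not the instruction sequence GCC emits: the
function name is hypothetical, and the estimate <code>x0</code> stands in for the
low-precision result that an instruction such as RCPSS would supply.

<pre class="smallexample">/* One Newton-Raphson refinement of an approximate reciprocal.
   Given x0 close to 1/a, returns x1 = x0 * (2 - a * x0), which
   roughly doubles the number of correct bits in the estimate.  */
float refine_reciprocal(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}
</pre>
     <p>For example, refining the estimate 0.3 for the reciprocal of 3 gives a value near
0.33, and an estimate that is already exact, such as 0.25 for the reciprocal of 4, is
left unchanged.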

     <br><dt><code>-mveclibabi=</code><var>type</var><dd><a name="index-mveclibabi-1376"></a>Specifies the ABI type to use for vectorizing intrinsics using an
external library. Supported types are <code>svml</code> for the Intel short
vector math library and <code>acml</code> for the AMD math core library style
of interfacing. GCC will currently emit calls to <code>vmldExp2</code>,
<code>vmldLn2</code>, <code>vmldLog102</code>, <code>vmldPow2</code>,
<code>vmldTanh2</code>, <code>vmldTan2</code>, <code>vmldAtan2</code>, <code>vmldAtanh2</code>,
<code>vmldCbrt2</code>, <code>vmldSinh2</code>, <code>vmldSin2</code>, <code>vmldAsinh2</code>,
<code>vmldAsin2</code>, <code>vmldCosh2</code>, <code>vmldCos2</code>, <code>vmldAcosh2</code>,
<code>vmldAcos2</code>, <code>vmlsExp4</code>, <code>vmlsLn4</code>, <code>vmlsLog104</code>,
<code>vmlsPow4</code>, <code>vmlsTanh4</code>, <code>vmlsTan4</code>,
<code>vmlsAtan4</code>, <code>vmlsAtanh4</code>, <code>vmlsCbrt4</code>, <code>vmlsSinh4</code>,
<code>vmlsSin4</code>, <code>vmlsAsinh4</code>, <code>vmlsAsin4</code>, <code>vmlsCosh4</code>,
<code>vmlsCos4</code>, <code>vmlsAcosh4</code> and <code>vmlsAcos4</code> for the corresponding
function type when <samp><span class="option">-mveclibabi=svml</span></samp> is used and <code>__vrd2_sin</code>,
<code>__vrd2_cos</code>, <code>__vrd2_exp</code>, <code>__vrd2_log</code>, <code>__vrd2_log2</code>,
<code>__vrd2_log10</code>, <code>__vrs4_sinf</code>, <code>__vrs4_cosf</code>,
<code>__vrs4_expf</code>, <code>__vrs4_logf</code>, <code>__vrs4_log2f</code>,
<code>__vrs4_log10f</code> and <code>__vrs4_powf</code> for the corresponding function type
when <samp><span class="option">-mveclibabi=acml</span></samp> is used. Both <samp><span class="option">-ftree-vectorize</span></samp> and
<samp><span class="option">-funsafe-math-optimizations</span></samp> have to be enabled.
An SVML or ACML ABI-compatible
library must be specified at link time.

     <br><dt><code>-mabi=</code><var>name</var><dd><a name="index-mabi-1377"></a>Generate code for the specified calling convention.  Permissible values
are: ‘<samp><span class="samp">sysv</span></samp>’ for the ABI used on GNU/Linux and other systems and
‘<samp><span class="samp">ms</span></samp>’ for the Microsoft ABI.  The default is to use the Microsoft
ABI when targeting Windows.  On all other systems, the default is the
SYSV ABI.  You can control this behavior for a specific function by
using the function attribute ‘<samp><span class="samp">ms_abi</span></samp>’/‘<samp><span class="samp">sysv_abi</span></samp>’. 
See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.

     <br><dt><code>-mpush-args</code><dt><code>-mno-push-args</code><dd><a name="index-mpush_002dargs-1378"></a><a name="index-mno_002dpush_002dargs-1379"></a>Use PUSH operations to store outgoing parameters.  This method is shorter
and usually as fast as the method using SUB/MOV operations, and is enabled
by default.  In some cases disabling it may improve performance because of
improved scheduling and reduced dependencies.

     <br><dt><code>-maccumulate-outgoing-args</code><dd><a name="index-maccumulate_002doutgoing_002dargs-1380"></a>If enabled, the maximum amount of space required for outgoing arguments is
computed in the function prologue.  This is faster on most modern CPUs
because of reduced dependencies, improved scheduling and reduced stack usage
when the preferred stack boundary is not equal to 2.  The drawback is a notable
increase in code size.  This switch implies <samp><span class="option">-mno-push-args</span></samp>.

     <br><dt><code>-mthreads</code><dd><a name="index-mthreads-1381"></a>Support thread-safe exception handling on ‘<samp><span class="samp">Mingw32</span></samp>’.
Code that relies
on thread-safe exception handling must compile and link all code with the
<samp><span class="option">-mthreads</span></samp> option.  When compiling, <samp><span class="option">-mthreads</span></samp> defines
<samp><span class="option">-D_MT</span></samp>; when linking, it links in a special thread helper library
<samp><span class="option">-lmingwthrd</span></samp> which cleans up per-thread exception handling data.

     <br><dt><code>-mno-align-stringops</code><dd><a name="index-mno_002dalign_002dstringops-1382"></a>Do not align the destination of inlined string operations.  This switch reduces
code size and improves performance in case the destination is already aligned,
but GCC doesn't know about it.

     <br><dt><code>-minline-all-stringops</code><dd><a name="index-minline_002dall_002dstringops-1383"></a>By default GCC inlines string operations only when the destination is known to be
aligned to at least a 4-byte boundary.  This option enables more inlining and increases code
size, but may improve performance of code that depends on fast memcpy, strlen
and memset for short lengths.

     <br><dt><code>-minline-stringops-dynamically</code><dd><a name="index-minline_002dstringops_002ddynamically-1384"></a>For string operations of unknown size, emit run-time checks so that inline code
is used for small blocks, while a library call is used for large blocks.

     <br><dt><code>-mstringop-strategy=</code><var>alg</var><dd><a name="index-mstringop_002dstrategy_003d_0040var_007balg_007d-1385"></a>Override the internal decision heuristic for the particular algorithm used to inline
string operations.  The allowed values are <code>rep_byte</code>,
<code>rep_4byte</code>, <code>rep_8byte</code> for expanding using the i386 <code>rep</code> prefix
of the specified size, <code>byte_loop</code>, <code>loop</code>, <code>unrolled_loop</code> for
expanding an inline loop, and <code>libcall</code> for always expanding a library call.
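          <p>The run-time dispatch described under <samp><span class="option">-minline-stringops-dynamically</span></samp> can be pictured with the following hand-written C sketch.  It is only an analogy: the threshold <code>SMALL_BLOCK_LIMIT</code> and the function name <code>copy_dynamic</code> are hypothetical, and the code GCC actually generates, including any size cutoff, is chosen by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff between the inline path and the library call. */
#define SMALL_BLOCK_LIMIT 64

/* Copy n bytes from src to dst.  Small blocks take a simple inline byte
   loop (comparable in spirit to the byte_loop strategy); large blocks
   fall back to the library routine (comparable to libcall). */
void *copy_dynamic(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (n <= SMALL_BLOCK_LIMIT) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--)
            *d++ = *s++;
        return dst;
    }
    return memcpy(dst, src, n);
}
```

          <p>The point of such a check is that the cost of a function call is avoided for short copies while long copies still benefit from the tuned library routine.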

     <br><dt><code>-momit-leaf-frame-pointer</code><dd><a name="index-momit_002dleaf_002dframe_002dpointer-1386"></a>Don't keep the frame pointer in a register for leaf functions.  This
avoids the instructions to save, set up and restore frame pointers and
makes an extra register available in leaf functions.  The option
<samp><span class="option">-fomit-frame-pointer</span></samp> removes the frame pointer for all functions,
which might make debugging harder.

     <br><dt><code>-mtls-direct-seg-refs</code><dt><code>-mno-tls-direct-seg-refs</code><dd><a name="index-mtls_002ddirect_002dseg_002drefs-1387"></a>Controls whether TLS variables may be accessed with offsets from the
TLS segment register (<code>%gs</code> for 32-bit, <code>%fs</code> for 64-bit),
or whether the thread base pointer must be added.  Whether or not this
is legal depends on the operating system, and whether it maps the
segment to cover the entire TLS area.

          <p>For systems that use GNU libc, the default is on.

     <br><dt><code>-msse2avx</code><dt><code>-mno-sse2avx</code><dd><a name="index-msse2avx-1388"></a>Specify that the assembler should encode SSE instructions with the VEX
prefix.  The option <samp><span class="option">-mavx</span></samp> turns this on by default.

     <br><dt><code>-mfentry</code><dt><code>-mno-fentry</code><dd><a name="index-mfentry-1389"></a>If profiling is active (<samp><span class="option">-pg</span></samp>), put the profiling
counter call before the prologue. 
Note: On x86 architectures the attribute <code>ms_hook_prologue</code>
isn't possible at the moment for <samp><span class="option">-mfentry</span></samp> and <samp><span class="option">-pg</span></samp>.

     <br><dt><code>-m8bit-idiv</code><dt><code>-mno-8bit-idiv</code><dd><a name="index-g_t8bit_002didiv-1390"></a>On some processors, like Intel Atom, 8-bit unsigned integer divide is
much faster than 32-bit/64-bit integer divide.
This option generates a
run-time check.  If both dividend and divisor are within the range of 0
to 255, 8-bit unsigned integer divide is used instead of
32-bit/64-bit integer divide.

     </dl>

  <p>These ‘<samp><span class="samp">-m</span></samp>’ switches are supported in addition to the above
on AMD x86-64 processors in 64-bit environments.

     <dl>
<dt><code>-m32</code><dt><code>-m64</code><dd><a name="index-m32-1391"></a><a name="index-m64-1392"></a>Generate code for a 32-bit or 64-bit environment. 
The 32-bit environment sets int, long and pointer to 32 bits and
generates code that runs on any i386 system. 
The 64-bit environment sets int to 32 bits and long and pointer
to 64 bits and generates code for AMD's x86-64 architecture.  For
darwin only, the -m64 option turns off the <samp><span class="option">-fno-pic</span></samp> and
<samp><span class="option">-mdynamic-no-pic</span></samp> options.

     <br><dt><code>-mno-red-zone</code><dd><a name="index-mno_002dred_002dzone-1393"></a>Do not use a so-called red zone for x86-64 code.  The red zone is mandated
by the x86-64 ABI; it is a 128-byte area beyond the location of the
stack pointer that will not be modified by signal or interrupt handlers
and therefore can be used for temporary data without adjusting the stack
pointer.  The flag <samp><span class="option">-mno-red-zone</span></samp> disables this red zone.

     <br><dt><code>-mcmodel=small</code><dd><a name="index-mcmodel_003dsmall-1394"></a>Generate code for the small code model: the program and its symbols must
be linked in the lower 2 GB of the address space.  Pointers are 64 bits. 
Programs can be statically or dynamically linked.  This is the default
code model.

     <br><dt><code>-mcmodel=kernel</code><dd><a name="index-mcmodel_003dkernel-1395"></a>Generate code for the kernel code model.  The kernel runs in the
negative 2 GB of the address space.
This model has to be used for Linux kernel code.

     <br><dt><code>-mcmodel=medium</code><dd><a name="index-mcmodel_003dmedium-1396"></a>Generate code for the medium model: the program is linked in the lower 2
GB of the address space.  Small symbols are also placed there.  Symbols
with sizes larger than <samp><span class="option">-mlarge-data-threshold</span></samp> are put into
large data or bss sections and can be located above 2 GB.  Programs can
be statically or dynamically linked.

     <br><dt><code>-mcmodel=large</code><dd><a name="index-mcmodel_003dlarge-1397"></a>Generate code for the large model: this model makes no assumptions
about addresses and sizes of sections. 
</dl>

 </body></html>