@c Copyright (C) 1991-2017 Free Software Foundation, Inc.
@c This is part of the GAS manual.
@c For copying conditions, see the file as.texinfo.
@c man end

@ifset GENERIC
@page
@node i386-Dependent
@chapter 80386 Dependent Features
@end ifset
@ifclear GENERIC
@node Machine Dependencies
@chapter 80386 Dependent Features
@end ifclear

@cindex i386 support
@cindex i80386 support
@cindex x86-64 support

The i386 version of @code{@value{AS}} supports the original Intel 386
architecture in both 16-bit and 32-bit modes, as well as the AMD x86-64
architecture, which extends the Intel architecture to 64 bits.

@menu
* i386-Options::           Options
* i386-Directives::        X86 specific directives
* i386-Syntax::            Syntactical considerations
* i386-Mnemonics::         Instruction Naming
* i386-Regs::              Register Naming
* i386-Prefixes::          Instruction Prefixes
* i386-Memory::            Memory References
* i386-Jumps::             Handling of Jump Instructions
* i386-Float::             Floating Point
* i386-SIMD::              Intel's MMX and AMD's 3DNow! SIMD Operations
* i386-LWP::               AMD's Lightweight Profiling Instructions
* i386-BMI::               Bit Manipulation Instruction
* i386-TBM::               AMD's Trailing Bit Manipulation Instructions
* i386-16bit::             Writing 16-bit Code
* i386-Arch::              Specifying an x86 CPU architecture
* i386-Bugs::              AT&T Syntax bugs
* i386-Notes::             Notes
@end menu

@node i386-Options
@section Options

@cindex options for i386
@cindex options for x86-64
@cindex i386 options
@cindex x86-64 options

The i386 version of @code{@value{AS}} has a few machine
dependent options:

@c man begin OPTIONS
@table @gcctabopt
@cindex @samp{--32} option, i386
@cindex @samp{--32} option, x86-64
@cindex @samp{--x32} option, i386
@cindex @samp{--x32} option, x86-64
@cindex @samp{--64} option, i386
@cindex @samp{--64} option, x86-64
@item --32 | --x32 | --64
Select the word size, either 32 bits or 64 bits.
@samp{--32}
implies Intel i386 architecture, while @samp{--x32} and @samp{--64}
imply AMD x86-64 architecture with 32-bit or 64-bit word-size
respectively.

These options are only available with the ELF object file format, and
require that the necessary BFD support has been included (on a 32-bit
platform you have to add --enable-64-bit-bfd to configure to enable 64-bit
usage and use x86-64 as target platform).

@item -n
By default, x86 GAS replaces multiple nop instructions used for
alignment within code sections with multi-byte nop instructions such
as leal 0(%esi,1),%esi. This switch disables the optimization.

@cindex @samp{--divide} option, i386
@item --divide
On SVR4-derived platforms, the character @samp{/} is treated as a comment
character, which means that it cannot be used in expressions. The
@samp{--divide} option turns @samp{/} into a normal character. This does
not disable @samp{/} at the beginning of a line starting a comment, or
affect using @samp{#} for starting a comment.

@cindex @samp{-march=} option, i386
@cindex @samp{-march=} option, x86-64
@item -march=@var{CPU}[+@var{EXTENSION}@dots{}]
This option specifies the target processor. The assembler will
issue an error message if an attempt is made to assemble an instruction
which will not execute on the target processor.
The following
processor names are recognized:
@code{i8086},
@code{i186},
@code{i286},
@code{i386},
@code{i486},
@code{i586},
@code{i686},
@code{pentium},
@code{pentiumpro},
@code{pentiumii},
@code{pentiumiii},
@code{pentium4},
@code{prescott},
@code{nocona},
@code{core},
@code{core2},
@code{corei7},
@code{l1om},
@code{k1om},
@code{iamcu},
@code{k6},
@code{k6_2},
@code{athlon},
@code{opteron},
@code{k8},
@code{amdfam10},
@code{bdver1},
@code{bdver2},
@code{bdver3},
@code{bdver4},
@code{znver1},
@code{btver1},
@code{btver2},
@code{generic32} and
@code{generic64}.

In addition to the basic instruction set, the assembler can be told to
accept various extension mnemonics. For example,
@code{-march=i686+sse4+vmx} extends @var{i686} with @var{sse4} and
@var{vmx}. The following extensions are currently supported:
@code{8087},
@code{287},
@code{387},
@code{687},
@code{no87},
@code{no287},
@code{no387},
@code{no687},
@code{mmx},
@code{nommx},
@code{sse},
@code{sse2},
@code{sse3},
@code{ssse3},
@code{sse4.1},
@code{sse4.2},
@code{sse4},
@code{nosse},
@code{nosse2},
@code{nosse3},
@code{nossse3},
@code{nosse4.1},
@code{nosse4.2},
@code{nosse4},
@code{avx},
@code{avx2},
@code{noavx},
@code{noavx2},
@code{adx},
@code{rdseed},
@code{prfchw},
@code{smap},
@code{mpx},
@code{sha},
@code{rdpid},
@code{ptwrite},
@code{prefetchwt1},
@code{clflushopt},
@code{se1},
@code{clwb},
@code{avx512f},
@code{avx512cd},
@code{avx512er},
@code{avx512pf},
@code{avx512vl},
@code{avx512bw},
@code{avx512dq},
@code{avx512ifma},
@code{avx512vbmi},
@code{avx512_4fmaps},
@code{avx512_4vnniw},
@code{avx512_vpopcntdq},
@code{noavx512f},
@code{noavx512cd},
@code{noavx512er},
@code{noavx512pf},
@code{noavx512vl},
@code{noavx512bw},
@code{noavx512dq},
@code{noavx512ifma},
@code{noavx512vbmi},
@code{noavx512_4fmaps},
@code{noavx512_4vnniw},
@code{noavx512_vpopcntdq},
@code{vmx},
@code{vmfunc},
@code{smx},
@code{xsave},
@code{xsaveopt},
@code{xsavec},
@code{xsaves},
@code{aes},
@code{pclmul},
@code{fsgsbase},
@code{rdrnd},
@code{f16c},
@code{bmi2},
@code{fma},
@code{movbe},
@code{ept},
@code{lzcnt},
@code{hle},
@code{rtm},
@code{invpcid},
@code{clflush},
@code{mwaitx},
@code{clzero},
@code{lwp},
@code{fma4},
@code{xop},
@code{cx16},
@code{syscall},
@code{rdtscp},
@code{3dnow},
@code{3dnowa},
@code{sse4a},
@code{sse5},
@code{svme},
@code{abm} and
@code{padlock}.
Note that rather than extending a basic instruction set, the extension
mnemonics starting with @code{no} revoke the respective functionality.

When the @code{.arch} directive is used with @option{-march}, the
@code{.arch} directive will take precedence.

@cindex @samp{-mtune=} option, i386
@cindex @samp{-mtune=} option, x86-64
@item -mtune=@var{CPU}
This option specifies a processor to optimize for. When used in
conjunction with the @option{-march} option, only instructions
of the processor specified by the @option{-march} option will be
generated.

Valid @var{CPU} values are identical to the processor list of
@option{-march=@var{CPU}}.

@cindex @samp{-msse2avx} option, i386
@cindex @samp{-msse2avx} option, x86-64
@item -msse2avx
This option specifies that the assembler should encode SSE instructions
with VEX prefix.

@cindex @samp{-msse-check=} option, i386
@cindex @samp{-msse-check=} option, x86-64
@item -msse-check=@var{none}
@itemx -msse-check=@var{warning}
@itemx -msse-check=@var{error}
These options control whether the assembler should check SSE instructions.
@option{-msse-check=@var{none}} will make the assembler not check SSE
instructions, which is the default. @option{-msse-check=@var{warning}}
will make the assembler issue a warning for any SSE instruction.
@option{-msse-check=@var{error}} will make the assembler issue an error
for any SSE instruction.

@cindex @samp{-mavxscalar=} option, i386
@cindex @samp{-mavxscalar=} option, x86-64
@item -mavxscalar=@var{128}
@itemx -mavxscalar=@var{256}
These options control how the assembler should encode scalar AVX
instructions. @option{-mavxscalar=@var{128}} will encode scalar
AVX instructions with 128-bit vector length, which is the default.
@option{-mavxscalar=@var{256}} will encode scalar AVX instructions
with 256-bit vector length.

@cindex @samp{-mevexlig=} option, i386
@cindex @samp{-mevexlig=} option, x86-64
@item -mevexlig=@var{128}
@itemx -mevexlig=@var{256}
@itemx -mevexlig=@var{512}
These options control how the assembler should encode length-ignored
(LIG) EVEX instructions. @option{-mevexlig=@var{128}} will encode LIG
EVEX instructions with 128-bit vector length, which is the default.
@option{-mevexlig=@var{256}} and @option{-mevexlig=@var{512}} will
encode LIG EVEX instructions with 256-bit and 512-bit vector length,
respectively.

@cindex @samp{-mevexwig=} option, i386
@cindex @samp{-mevexwig=} option, x86-64
@item -mevexwig=@var{0}
@itemx -mevexwig=@var{1}
These options control how the assembler should encode w-ignored (WIG)
EVEX instructions. @option{-mevexwig=@var{0}} will encode WIG
EVEX instructions with evex.w = 0, which is the default.
@option{-mevexwig=@var{1}} will encode WIG EVEX instructions with
evex.w = 1.

@cindex @samp{-mmnemonic=} option, i386
@cindex @samp{-mmnemonic=} option, x86-64
@item -mmnemonic=@var{att}
@itemx -mmnemonic=@var{intel}
This option specifies the instruction mnemonic set used for matching instructions.
The @code{.att_mnemonic} and @code{.intel_mnemonic} directives will
take precedence.

@cindex @samp{-msyntax=} option, i386
@cindex @samp{-msyntax=} option, x86-64
@item -msyntax=@var{att}
@itemx -msyntax=@var{intel}
This option specifies instruction syntax when processing instructions.
The @code{.att_syntax} and @code{.intel_syntax} directives will
take precedence.

@cindex @samp{-mnaked-reg} option, i386
@cindex @samp{-mnaked-reg} option, x86-64
@item -mnaked-reg
This option specifies that registers don't require a @samp{%} prefix.
The @code{.att_syntax} and @code{.intel_syntax} directives will take precedence.

@cindex @samp{-madd-bnd-prefix} option, i386
@cindex @samp{-madd-bnd-prefix} option, x86-64
@item -madd-bnd-prefix
This option forces the assembler to add BND prefix to all branches, even
if such prefix was not explicitly specified in the source code.

@cindex @samp{-mshared} option, i386
@cindex @samp{-mshared} option, x86-64
@item -mshared
On ELF target, the assembler normally optimizes out non-PLT relocations
against defined non-weak global branch targets with default visibility.
The @samp{-mshared} option tells the assembler to generate code which
may go into a shared library where all non-weak global branch targets
with default visibility can be preempted. The resulting code is
slightly bigger. This option only affects the handling of branch
instructions.

@cindex @samp{-mbig-obj} option, x86-64
@item -mbig-obj
On x86-64 PE/COFF target this option forces the use of big object file
format, which allows more than 32768 sections.

@cindex @samp{-momit-lock-prefix=} option, i386
@cindex @samp{-momit-lock-prefix=} option, x86-64
@item -momit-lock-prefix=@var{no}
@itemx -momit-lock-prefix=@var{yes}
These options control how the assembler should encode lock prefix.
This option is intended as a workaround for processors that fail on
lock prefix. This option can only be safely used with single-core,
single-thread computers.
@option{-momit-lock-prefix=@var{yes}} will omit all lock prefixes.
@option{-momit-lock-prefix=@var{no}} will encode lock prefix as usual,
which is the default.

@cindex @samp{-mfence-as-lock-add=} option, i386
@cindex @samp{-mfence-as-lock-add=} option, x86-64
@item -mfence-as-lock-add=@var{no}
@itemx -mfence-as-lock-add=@var{yes}
These options control how the assembler should encode lfence, mfence and
sfence.
@option{-mfence-as-lock-add=@var{yes}} will encode lfence, mfence and
sfence as @samp{lock addl $0x0, (%rsp)} in 64-bit mode and
@samp{lock addl $0x0, (%esp)} in 32-bit mode.
@option{-mfence-as-lock-add=@var{no}} will encode lfence, mfence and
sfence as usual, which is the default.

@cindex @samp{-mrelax-relocations=} option, i386
@cindex @samp{-mrelax-relocations=} option, x86-64
@item -mrelax-relocations=@var{no}
@itemx -mrelax-relocations=@var{yes}
These options control whether the assembler should generate relax
relocations, R_386_GOT32X, in 32-bit mode, or R_X86_64_GOTPCRELX and
R_X86_64_REX_GOTPCRELX, in 64-bit mode.
@option{-mrelax-relocations=@var{yes}} will generate relax relocations.
@option{-mrelax-relocations=@var{no}} will not generate relax
relocations. The default can be controlled by a configure option
@option{--enable-x86-relax-relocations}.

@cindex @samp{-mevexrcig=} option, i386
@cindex @samp{-mevexrcig=} option, x86-64
@item -mevexrcig=@var{rne}
@itemx -mevexrcig=@var{rd}
@itemx -mevexrcig=@var{ru}
@itemx -mevexrcig=@var{rz}
These options control how the assembler should encode SAE-only
EVEX instructions. @option{-mevexrcig=@var{rne}} will encode RC bits
of EVEX instruction with 00, which is the default.
@option{-mevexrcig=@var{rd}}, @option{-mevexrcig=@var{ru}}
and @option{-mevexrcig=@var{rz}} will encode SAE-only EVEX instructions
with 01, 10 and 11 RC bits, respectively.

@cindex @samp{-mamd64} option, x86-64
@cindex @samp{-mintel64} option, x86-64
@item -mamd64
@itemx -mintel64
This option specifies that the assembler should accept only AMD64 or
Intel64 ISA in 64-bit mode. The default is to accept both.

@end table
@c man end

@node i386-Directives
@section x86 specific Directives

@cindex machine directives, x86
@cindex x86 machine directives
@table @code

@cindex @code{lcomm} directive, COFF
@item .lcomm @var{symbol} , @var{length}[, @var{alignment}]
Reserve @var{length} (an absolute expression) bytes for a local common
denoted by @var{symbol}. The section and value of @var{symbol} are
those of the new local common. The addresses are allocated in the bss
section, so that at run-time the bytes start off zeroed. Since
@var{symbol} is not declared global, it is normally not visible to
@code{@value{LD}}. The optional third parameter, @var{alignment},
specifies the desired alignment of the symbol in the bss section.

This directive is only available for COFF based x86 targets.

@c FIXME: Document other x86 specific directives ?
@c Eg: .code16gcc,
@c .largecomm

@end table

@node i386-Syntax
@section i386 Syntactical Considerations
@menu
* i386-Variations::           AT&T Syntax versus Intel Syntax
* i386-Chars::                Special Characters
@end menu

@node i386-Variations
@subsection AT&T Syntax versus Intel Syntax

@cindex i386 intel_syntax pseudo op
@cindex intel_syntax pseudo op, i386
@cindex i386 att_syntax pseudo op
@cindex att_syntax pseudo op, i386
@cindex i386 syntax compatibility
@cindex syntax compatibility, i386
@cindex x86-64 intel_syntax pseudo op
@cindex intel_syntax pseudo op, x86-64
@cindex x86-64 att_syntax pseudo op
@cindex att_syntax pseudo op, x86-64
@cindex x86-64 syntax compatibility
@cindex syntax compatibility, x86-64

@code{@value{AS}} now supports assembly using Intel assembler syntax.
@code{.intel_syntax} selects Intel mode, and @code{.att_syntax} switches
back to the usual AT&T mode for compatibility with the output of
@code{@value{GCC}}. Either of these directives may have an optional
argument, @code{prefix}, or @code{noprefix} specifying whether registers
require a @samp{%} prefix. AT&T System V/386 assembler syntax is quite
different from Intel syntax. We mention these differences because
almost all 80386 documents use Intel syntax.
Notable differences
between the two syntaxes are:

@cindex immediate operands, i386
@cindex i386 immediate operands
@cindex register operands, i386
@cindex i386 register operands
@cindex jump/call operands, i386
@cindex i386 jump/call operands
@cindex operand delimiters, i386

@cindex immediate operands, x86-64
@cindex x86-64 immediate operands
@cindex register operands, x86-64
@cindex x86-64 register operands
@cindex jump/call operands, x86-64
@cindex x86-64 jump/call operands
@cindex operand delimiters, x86-64
@itemize @bullet
@item
AT&T immediate operands are preceded by @samp{$}; Intel immediate
operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
AT&T register operands are preceded by @samp{%}; Intel register operands
are undelimited. AT&T absolute (as opposed to PC relative) jump/call
operands are prefixed by @samp{*}; they are undelimited in Intel syntax.

@cindex i386 source, destination operands
@cindex source, destination operands; i386
@cindex x86-64 source, destination operands
@cindex source, destination operands; x86-64
@item
AT&T and Intel syntax use the opposite order for source and destination
operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The
@samp{source, dest} convention is maintained for compatibility with
previous Unix assemblers. Note that @samp{bound}, @samp{invlpga}, and
instructions with 2 immediate operands, such as the @samp{enter}
instruction, do @emph{not} have reversed order. @ref{i386-Bugs}.

@cindex mnemonic suffixes, i386
@cindex sizes operands, i386
@cindex i386 size suffixes
@cindex mnemonic suffixes, x86-64
@cindex sizes operands, x86-64
@cindex x86-64 size suffixes
@item
In AT&T syntax the size of memory operands is determined from the last
character of the instruction mnemonic.
Mnemonic suffixes of @samp{b},
@samp{w}, @samp{l} and @samp{q} specify byte (8-bit), word (16-bit), long
(32-bit) and quadruple word (64-bit) memory references. Intel syntax accomplishes
this by prefixing memory operands (@emph{not} the instruction mnemonics) with
@samp{byte ptr}, @samp{word ptr}, @samp{dword ptr} and @samp{qword ptr}. Thus,
Intel @samp{mov al, byte ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T
syntax.

In 64-bit code, @samp{movabs} can be used to encode the @samp{mov}
instruction with the 64-bit displacement or immediate operand.

@cindex return instructions, i386
@cindex i386 jump, call, return
@cindex return instructions, x86-64
@cindex x86-64 jump, call, return
@item
Immediate form long jumps and calls are
@samp{lcall/ljmp $@var{section}, $@var{offset}} in AT&T syntax; the
Intel syntax is
@samp{call/jmp far @var{section}:@var{offset}}. Also, the far return
instruction
is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
@samp{ret far @var{stack-adjust}}.

@cindex sections, i386
@cindex i386 sections
@cindex sections, x86-64
@cindex x86-64 sections
@item
The AT&T assembler does not provide support for multiple section
programs. Unix style systems expect all programs to be single sections.
@end itemize

@node i386-Chars
@subsection Special Characters

@cindex line comment character, i386
@cindex i386 line comment character
The presence of a @samp{#} appearing anywhere on a line indicates the
start of a comment that extends to the end of that line.

If a @samp{#} appears as the first character of a line then the whole
line is treated as a comment, but in this case the line can also be a
logical line number directive (@pxref{Comments}) or a preprocessor
control command (@pxref{Preprocessing}).
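
As a minimal sketch of the two @samp{#} comment forms described above
(the instruction shown is arbitrary and only serves as something to
annotate):

@smallexample
# this whole line is a comment
        movl    $1, %eax        # comment running to the end of the line
@end smallexample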

If the @option{--divide} command line option has not been specified
then the @samp{/} character appearing anywhere on a line also
introduces a line comment.

@cindex line separator, i386
@cindex statement separator, i386
@cindex i386 line separator
The @samp{;} character can be used to separate statements on the same
line.

@node i386-Mnemonics
@section i386-Mnemonics
@subsection Instruction Naming

@cindex i386 instruction naming
@cindex instruction naming, i386
@cindex x86-64 instruction naming
@cindex instruction naming, x86-64

Instruction mnemonics are suffixed with one character modifiers which
specify the size of operands. The letters @samp{b}, @samp{w}, @samp{l}
and @samp{q} specify byte, word, long and quadruple word operands. If
no suffix is specified by an instruction then @code{@value{AS}} tries to
fill in the missing suffix based on the destination register operand
(the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
@samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
assembler which assumes that a missing mnemonic suffix implies long
operand size. (This incompatibility does not affect compiler output
since compilers always explicitly specify the mnemonic suffix.)

Almost all instructions have the same names in AT&T and Intel format.
There are a few exceptions. The sign extend and zero extend
instructions need two sizes to specify them. They need a size to
sign/zero extend @emph{from} and a size to sign/zero extend @emph{to}. This
is accomplished by using two instruction mnemonic suffixes in AT&T
syntax. Base names for sign extend and zero extend are
@samp{movs@dots{}} and @samp{movz@dots{}} in AT&T syntax (@samp{movsx}
and @samp{movzx} in Intel syntax).
The instruction mnemonic suffixes
are tacked on to this base name, the @emph{from} suffix before the
@emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
@samp{wl} (from word to long), @samp{bq} (from byte to quadruple word),
@samp{wq} (from word to quadruple word), and @samp{lq} (from long to
quadruple word).

@cindex encoding options, i386
@cindex encoding options, x86-64

Different encoding options can be specified via optional mnemonic
suffix. @samp{.s} suffix swaps 2 register operands in encoding when
moving from one register to another. @samp{.d8} or @samp{.d32} suffix
prefers 8-bit or 32-bit displacement in encoding.

@cindex conversion instructions, i386
@cindex i386 conversion instructions
@cindex conversion instructions, x86-64
@cindex x86-64 conversion instructions
The Intel-syntax conversion instructions

@itemize @bullet
@item
@samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},

@item
@samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},

@item
@samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},

@item
@samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},

@item
@samp{cdqe} --- sign-extend dword in @samp{%eax} to quad in @samp{%rax}
(x86-64 only),

@item
@samp{cqo} --- sign-extend quad in @samp{%rax} to octuple in
@samp{%rdx:%rax} (x86-64 only),
@end itemize

@noindent
are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, @samp{cltd}, @samp{cltq}, and
@samp{cqto} in AT&T naming. @code{@value{AS}} accepts either naming for these
instructions.
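
The naming rules above can be illustrated with a short sketch in AT&T
syntax. The register choices are arbitrary and the fragment is not taken
from any particular program:

@smallexample
        movsbl  %al, %edx       # sign-extend byte %al into long %edx
        movzwl  %ax, %ecx       # zero-extend word %ax into long %ecx
        cltd                    # AT&T name: sign-extend %eax into %edx:%eax
        cdq                     # Intel name for the same instruction
@end smallexample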

@cindex jump instructions, i386
@cindex call instructions, i386
@cindex jump instructions, x86-64
@cindex call instructions, x86-64
Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
convention.

@subsection AT&T Mnemonic versus Intel Mnemonic

@cindex i386 mnemonic compatibility
@cindex mnemonic compatibility, i386

@code{@value{AS}} supports assembly using Intel mnemonic.
@code{.intel_mnemonic} selects Intel mnemonic with Intel syntax, and
@code{.att_mnemonic} switches back to the usual AT&T mnemonic with AT&T
syntax for compatibility with the output of @code{@value{GCC}}.
Several x87 instructions, @samp{fadd}, @samp{fdiv}, @samp{fdivp},
@samp{fdivr}, @samp{fdivrp}, @samp{fmul}, @samp{fsub}, @samp{fsubp},
@samp{fsubr} and @samp{fsubrp}, are implemented in AT&T System V/386
assembler with different mnemonics from those in Intel IA32 specification.
@code{@value{GCC}} generates those instructions with AT&T mnemonic.

@node i386-Regs
@section Register Naming

@cindex i386 registers
@cindex registers, i386
@cindex x86-64 registers
@cindex registers, x86-64
Register operands are always prefixed with @samp{%}. The 80386 registers
consist of

@itemize @bullet
@item
the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
@samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
frame pointer), and @samp{%esp} (the stack pointer).

@item
the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
@samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.

@item
the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
@samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These
are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
@samp{%cx}, and @samp{%dx})

@item
the 6 section registers @samp{%cs} (code section), @samp{%ds}
(data section), @samp{%ss} (stack section), @samp{%es}, @samp{%fs},
and @samp{%gs}.

@item
the 5 processor control registers @samp{%cr0}, @samp{%cr2},
@samp{%cr3}, @samp{%cr4}, and @samp{%cr8}.

@item
the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
@samp{%db3}, @samp{%db6}, and @samp{%db7}.

@item
the 2 test registers @samp{%tr6} and @samp{%tr7}.

@item
the 8 floating point register stack @samp{%st} or equivalently
@samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
@samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
These registers are overloaded by 8 MMX registers @samp{%mm0},
@samp{%mm1}, @samp{%mm2}, @samp{%mm3}, @samp{%mm4}, @samp{%mm5},
@samp{%mm6} and @samp{%mm7}.

@item
the 8 128-bit SSE registers @samp{%xmm0}, @samp{%xmm1}, @samp{%xmm2},
@samp{%xmm3}, @samp{%xmm4}, @samp{%xmm5}, @samp{%xmm6} and @samp{%xmm7}.
@end itemize

The AMD x86-64 architecture extends the register set by:

@itemize @bullet
@item
enhancing the 8 32-bit registers to 64-bit: @samp{%rax} (the
accumulator), @samp{%rbx}, @samp{%rcx}, @samp{%rdx}, @samp{%rdi},
@samp{%rsi}, @samp{%rbp} (the frame pointer), @samp{%rsp} (the stack
pointer)

@item
the 8 extended registers @samp{%r8}--@samp{%r15}.

@item
the 8 32-bit low ends of the extended registers: @samp{%r8d}--@samp{%r15d}.

@item
the 8 16-bit low ends of the extended registers: @samp{%r8w}--@samp{%r15w}.

@item
the 8 8-bit low ends of the extended registers: @samp{%r8b}--@samp{%r15b}.

@item
the 4 8-bit registers: @samp{%sil}, @samp{%dil}, @samp{%bpl}, @samp{%spl}.

@item
the 8 debug registers: @samp{%db8}--@samp{%db15}.

@item
the 8 128-bit SSE registers: @samp{%xmm8}--@samp{%xmm15}.
@end itemize

With the AVX extensions more registers were made available:

@itemize @bullet

@item
the 16 256-bit SSE @samp{%ymm0}--@samp{%ymm15} (only the first 8
available in 32-bit mode). The bottom 128 bits are overlaid with the
@samp{%xmm0}--@samp{%xmm15} registers.

@end itemize

In 64-bit mode, the AVX2 extensions made more registers available:

@itemize @bullet

@item
the 16 128-bit registers @samp{%xmm16}--@samp{%xmm31} and the 16 256-bit
registers @samp{%ymm16}--@samp{%ymm31}.

@end itemize

The AVX512 extensions added the following registers:

@itemize @bullet

@item
the 32 512-bit registers @samp{%zmm0}--@samp{%zmm31} (only the first 8
available in 32-bit mode). The bottom 128 bits are overlaid with the
@samp{%xmm0}--@samp{%xmm31} registers and the bottom 256 bits are
overlaid with the @samp{%ymm0}--@samp{%ymm31} registers.

@item
the 8 mask registers @samp{%k0}--@samp{%k7}.

@end itemize

@node i386-Prefixes
@section Instruction Prefixes

@cindex i386 instruction prefixes
@cindex instruction prefixes, i386
@cindex prefixes, i386
Instruction prefixes are used to modify the following instruction. They
are used to repeat string instructions, to provide section overrides, to
perform bus lock operations, and to change operand and address sizes.
(Most instructions that normally operate on 32-bit operands will use
16-bit operands if the instruction has an ``operand size'' prefix.)
Instruction prefixes are best written on the same line as the instruction
they act upon.
For example, the @samp{scas} (scan string) instruction is
repeated with:

@smallexample
        repne scas %es:(%edi),%al
@end smallexample

You may also place prefixes on the lines immediately preceding the
instruction, but this circumvents checks that @code{@value{AS}} does
with prefixes, and will not work with all prefixes.

Here is a list of instruction prefixes:

@cindex section override prefixes, i386
@itemize @bullet
@item
Section override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
@samp{fs}, @samp{gs}. These are automatically added by
using the @var{section}:@var{memory-operand} form for memory references.

@cindex size prefixes, i386
@item
Operand/Address size prefixes @samp{data16} and @samp{addr16}
change 32-bit operands/addresses into 16-bit operands/addresses,
while @samp{data32} and @samp{addr32} change 16-bit ones (in a
@code{.code16} section) into 32-bit operands/addresses. These prefixes
@emph{must} appear on the same line of code as the instruction they
modify. For example, in a 16-bit @code{.code16} section, you might
write:

@smallexample
        addr32 jmpl *(%ebx)
@end smallexample

@cindex bus lock prefixes, i386
@cindex inhibiting interrupts, i386
@item
The bus lock prefix @samp{lock} inhibits interrupts during execution of
the instruction it precedes. (This is only valid with certain
instructions; see an 80386 manual for details).

@cindex coprocessor wait, i386
@item
The wait for coprocessor prefix @samp{wait} waits for the coprocessor to
complete the current instruction. This should never be needed for the
80386/80387 combination.

@cindex repeat prefixes, i386
@item
The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
to string instructions to make them repeat @samp{%ecx} times (@samp{%cx}
times if the current address size is 16 bits).
@cindex REX prefixes, i386
@item
The @samp{rex} family of prefixes is used by x86-64 to encode
extensions to i386 instruction set. The @samp{rex} prefix has four
bits --- an operand size override (@code{64}) used to change operand size
from 32-bit to 64-bit and X, Y and Z extension bits used to extend the
register set.

You may write the @samp{rex} prefixes directly. The @samp{rex64xyz}
instruction emits @samp{rex} prefix with all the bits set. By omitting
the @code{64}, @code{x}, @code{y} or @code{z} you may write other
prefixes as well. Normally, there is no need to write the prefixes
explicitly, since gas will automatically generate them based on the
instruction operands.
@end itemize

@node i386-Memory
@section Memory References

@cindex i386 memory references
@cindex memory references, i386
@cindex x86-64 memory references
@cindex memory references, x86-64
An Intel syntax indirect memory reference of the form

@smallexample
@var{section}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
@end smallexample

@noindent
is translated into the AT&T syntax

@smallexample
@var{section}:@var{disp}(@var{base}, @var{index}, @var{scale})
@end smallexample

@noindent
where @var{base} and @var{index} are the optional 32-bit base and
index registers, @var{disp} is the optional displacement, and
@var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
to calculate the address of the operand. If no @var{scale} is
specified, @var{scale} is taken to be 1. @var{section} specifies the
optional section register for the memory operand, and may override the
default section register (see an 80386 manual for section register
defaults). Note that section overrides in AT&T syntax @emph{must}
be preceded by a @samp{%}.
If you specify a section override which
coincides with the default section register, @code{@value{AS}} does @emph{not}
output any section register override prefixes to assemble the given
instruction.  Thus, section overrides can be specified to emphasize which
section register is used for a given memory operand.

Here are some examples of Intel and AT&T style memory references:

@table @asis
@item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
@var{base} is @samp{%ebp}; @var{disp} is @samp{-4}.  @var{section} is
missing, and the default section is used (@samp{%ss} for addressing with
@samp{%ebp} as the base register).  @var{index} and @var{scale} are both
missing.

@item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
@var{index} is @samp{%eax} (scaled by a @var{scale} of 4); @var{disp} is
@samp{foo}.  All other fields are missing.  The section register here
defaults to @samp{%ds}.

@item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
This uses the value pointed to by @samp{foo} as a memory operand.
Note that @var{base} and @var{index} are both missing, but there is only
@emph{one} @samp{,}.  This is a syntactic exception.

@item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
This selects the contents of the variable @samp{foo} with section
register @var{section} being @samp{%gs}.
@end table

Absolute (as opposed to PC relative) call and jump operands must be
prefixed with @samp{*}.  If no @samp{*} is specified, @code{@value{AS}}
always chooses PC relative addressing for jump/call labels.

Any instruction that has a memory operand, but no register operand,
@emph{must} specify its size (byte, word, long, or quadruple) with an
instruction mnemonic suffix (@samp{b}, @samp{w}, @samp{l} or @samp{q},
respectively).

The x86-64 architecture adds RIP (instruction pointer relative)
addressing.  This addressing mode is specified by using @samp{rip} as a
base register.
Only constant offsets are valid.  For example:

@table @asis
@item AT&T: @samp{1234(%rip)}, Intel: @samp{[rip + 1234]}
Points to the address 1234 bytes past the end of the current
instruction.

@item AT&T: @samp{symbol(%rip)}, Intel: @samp{[rip + symbol]}
Refers to @code{symbol} in an RIP-relative way; this encoding is shorter
than the default absolute addressing.
@end table

Other addressing modes remain unchanged in the x86-64 architecture, except
that the registers used are 64-bit instead of 32-bit.

@node i386-Jumps
@section Handling of Jump Instructions

@cindex jump optimization, i386
@cindex i386 jump optimization
@cindex jump optimization, x86-64
@cindex x86-64 jump optimization
Jump instructions are always optimized to use the smallest possible
displacements.  This is accomplished by using byte (8-bit) displacement
jumps whenever the target is sufficiently close.  If a byte displacement
is insufficient, a long displacement is used.  We do not support
word (16-bit) displacement jumps in 32-bit mode (i.e., prefixing the jump
instruction with the @samp{data16} instruction prefix), since the 80386
insists upon masking @samp{%eip} to 16 bits after the word displacement
is added.  (See also @ref{i386-Arch}.)

Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
@samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in byte
displacements, so that if you use these instructions (@code{@value{GCC}} does
not use them) you may get an error message (and incorrect code).
The AT&T
80386 assembler tries to get around this problem by expanding @samp{jcxz foo}
to

@smallexample
         jcxz cx_zero
         jmp cx_nonzero
cx_zero: jmp foo
cx_nonzero:
@end smallexample

@node i386-Float
@section Floating Point

@cindex i386 floating point
@cindex floating point, i386
@cindex x86-64 floating point
@cindex floating point, x86-64
All 80387 floating point types except packed BCD are supported.
(BCD support may be added without much difficulty.)  These data
types are 16-, 32-, and 64-bit integers, and single (32-bit),
double (64-bit), and extended (80-bit) precision floating point.
Each supported type has an instruction mnemonic suffix and a constructor
associated with it.  Instruction mnemonic suffixes specify the operand's
data type.  Constructors build these data types into memory.

@cindex @code{float} directive, i386
@cindex @code{single} directive, i386
@cindex @code{double} directive, i386
@cindex @code{tfloat} directive, i386
@cindex @code{float} directive, x86-64
@cindex @code{single} directive, x86-64
@cindex @code{double} directive, x86-64
@cindex @code{tfloat} directive, x86-64
@itemize @bullet
@item
Floating point constructors are @samp{.float} or @samp{.single},
@samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
These correspond to instruction mnemonic suffixes @samp{s}, @samp{l},
and @samp{t}.  @samp{t} stands for 80-bit (ten byte) real.  The 80387
only supports this format via the @samp{fldt} (load 80-bit real to stack
top) and @samp{fstpt} (store 80-bit real and pop stack) instructions.

@cindex @code{word} directive, i386
@cindex @code{long} directive, i386
@cindex @code{int} directive, i386
@cindex @code{quad} directive, i386
@cindex @code{word} directive, x86-64
@cindex @code{long} directive, x86-64
@cindex @code{int} directive, x86-64
@cindex @code{quad} directive, x86-64
@item
Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
@samp{.quad} for the 16-, 32-, and 64-bit integer formats.  The
corresponding instruction mnemonic suffixes are @samp{s} (short),
@samp{l} (long), and @samp{q} (quad).  As with the 80-bit real format,
the 64-bit @samp{q} format is only present in the @samp{fildq} (load
quad integer to stack top) and @samp{fistpq} (store quad integer and pop
stack) instructions.
@end itemize

Register to register operations should not use instruction mnemonic suffixes.
@samp{fstl %st, %st(1)} will give a warning, and be assembled as if you
wrote @samp{fst %st, %st(1)}, since all register to register operations
use 80-bit floating point operands.  (Contrast this with @samp{fstl %st, mem},
which converts @samp{%st} from 80-bit to 64-bit floating point format,
then stores the result in the 8-byte location @samp{mem}.)

@node i386-SIMD
@section Intel's MMX and AMD's 3DNow! SIMD Operations

@cindex MMX, i386
@cindex 3DNow!, i386
@cindex SIMD, i386
@cindex MMX, x86-64
@cindex 3DNow!, x86-64
@cindex SIMD, x86-64

@code{@value{AS}} supports Intel's MMX instruction set (SIMD
instructions for integer data), available on Intel's Pentium MMX
processors and Pentium II processors, AMD's K6 and K6-2 processors,
Cyrix' M2 processor, and probably others.  It also supports AMD's 3DNow!@:
instruction set (SIMD instructions for 32-bit floating point data)
available on AMD's K6-2 processor and possibly others in the future.
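
As a rough illustration of how MMX code looks in AT&T syntax, the
following sketch adds two arrays of eight bytes.  The use of
@samp{%esi}, @samp{%edi} and @samp{%edx} as pointers is an assumption
made for this example only; the MMX registers are written @samp{%mm0}
through @samp{%mm7}, and the destination operand comes last.

@smallexample
        movq    (%esi), %mm0    # load eight packed bytes
        paddb   (%edi), %mm0    # add the eight bytes at (%edi), lane by lane
        movq    %mm0, (%edx)    # store the eight sums
        emms                    # leave MMX state before any x87 code runs
@end smallexample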

Currently, @code{@value{AS}} does not support Intel's floating point
SIMD, Katmai (KNI).

The eight 64-bit MMX registers, also used by 3DNow!, are called @samp{%mm0},
@samp{%mm1}, @dots{} @samp{%mm7}.  They contain eight 8-bit integers, four
16-bit integers, two 32-bit integers, one 64-bit integer, or two 32-bit
floating point values.  The MMX registers cannot be used at the same time
as the floating point stack.

See Intel and AMD documentation, keeping in mind that the operand order in
instructions is reversed from the Intel syntax.

@node i386-LWP
@section AMD's Lightweight Profiling Instructions

@cindex LWP, i386
@cindex LWP, x86-64

@code{@value{AS}} supports AMD's Lightweight Profiling (LWP)
instruction set, available on AMD's Family 15h (Orochi) processors.

LWP enables applications to collect and manage performance data, and
react to performance events.  The collection of performance data
requires no context switches.  LWP runs in the context of a thread and
so several counters can be used independently across multiple threads.
LWP can be used in both 64-bit and legacy 32-bit modes.

For detailed information on the LWP instruction set, see the
@cite{AMD Lightweight Profiling Specification} available at
@uref{http://developer.amd.com/cpu/LWP,Lightweight Profiling Specification}.

@node i386-BMI
@section Bit Manipulation Instructions

@cindex BMI, i386
@cindex BMI, x86-64

@code{@value{AS}} supports the Bit Manipulation (BMI) instruction set.

The BMI set provides several instructions implementing individual
bit manipulation operations such as isolation, masking, setting, or
resetting.

@c Need to add a specification citation here when available.
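
As an illustrative sketch (not taken from a specification), here are a
few BMI instructions in AT&T operand order, with the destination last.
The register choices are arbitrary.

@smallexample
        andn    %ecx, %ebx, %eax    # %eax = ~%ebx & %ecx
        blsi    %ebx, %eax          # %eax = lowest set bit of %ebx, isolated
        blsmsk  %ebx, %eax          # %eax = mask up to and including that bit
        blsr    %ebx, %eax          # %eax = %ebx with its lowest set bit reset
@end smallexample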

@node i386-TBM
@section AMD's Trailing Bit Manipulation Instructions

@cindex TBM, i386
@cindex TBM, x86-64

@code{@value{AS}} supports AMD's Trailing Bit Manipulation (TBM)
instruction set, available on AMD's BDVER2 processors (Trinity and
Viperfish).

The TBM set provides instructions implementing individual bit
manipulation operations such as isolating, masking, setting, resetting,
complementing, and operations on trailing zeros and ones.

@c Need to add a specification citation here when available.

@node i386-16bit
@section Writing 16-bit Code

@cindex i386 16-bit code
@cindex 16-bit code, i386
@cindex real-mode code, i386
@cindex @code{code16gcc} directive, i386
@cindex @code{code16} directive, i386
@cindex @code{code32} directive, i386
@cindex @code{code64} directive, i386
@cindex @code{code64} directive, x86-64
While @code{@value{AS}} normally writes only ``pure'' 32-bit i386 code
or 64-bit x86-64 code depending on the default configuration,
it also supports writing code to run in real mode or in 16-bit protected
mode code segments.  To do this, put a @samp{.code16} or
@samp{.code16gcc} directive before the assembly language instructions to
be run in 16-bit mode.  You can switch @code{@value{AS}} to writing
32-bit code with the @samp{.code32} directive or 64-bit code with the
@samp{.code64} directive.

@samp{.code16gcc} provides experimental support for generating 16-bit
code from gcc, and differs from @samp{.code16} in that @samp{call},
@samp{ret}, @samp{enter}, @samp{leave}, @samp{push}, @samp{pop},
@samp{pusha}, @samp{popa}, @samp{pushf}, and @samp{popf} instructions
default to 32-bit size.  This is so that the stack pointer is
manipulated in the same way over function calls, allowing access to
function parameters at the same stack offsets as in 32-bit mode.
@samp{.code16gcc} also automatically adds address size prefixes where
necessary to use the 32-bit addressing modes that gcc generates.

The code which @code{@value{AS}} generates in 16-bit mode will not
necessarily run on a 16-bit pre-80386 processor.  To write code that
runs on such a processor, you must refrain from using @emph{any} 32-bit
constructs which require @code{@value{AS}} to output address or operand
size prefixes.

Note that writing 16-bit code instructions by explicitly specifying a
prefix or an instruction mnemonic suffix within a 32-bit code section
generates different machine instructions than those generated for a
16-bit code segment.  In a 32-bit code section, the following code
generates the machine opcode bytes @samp{66 6a 04}, which pushes the
value @samp{4} onto the stack, decrementing @samp{%esp} by 2.

@smallexample
        pushw $4
@end smallexample

The same code in a 16-bit code section would generate the machine
opcode bytes @samp{6a 04} (i.e., without the operand size prefix), which
is correct since the processor default operand size is assumed to be 16
bits in a 16-bit code section.

@node i386-Arch
@section Specifying CPU Architecture

@cindex arch directive, i386
@cindex i386 arch directive
@cindex arch directive, x86-64
@cindex x86-64 arch directive

@code{@value{AS}} may be told to assemble for a particular CPU
(sub-)architecture with the @code{.arch @var{cpu_type}} directive.  This
directive enables a warning when gas detects an instruction that is not
supported on the CPU specified.
The choices for @var{cpu_type} are:

@multitable @columnfractions .20 .20 .20 .20
@item @samp{i8086} @tab @samp{i186} @tab @samp{i286} @tab @samp{i386}
@item @samp{i486} @tab @samp{i586} @tab @samp{i686} @tab @samp{pentium}
@item @samp{pentiumpro} @tab @samp{pentiumii} @tab @samp{pentiumiii} @tab @samp{pentium4}
@item @samp{prescott} @tab @samp{nocona} @tab @samp{core} @tab @samp{core2}
@item @samp{corei7} @tab @samp{l1om} @tab @samp{k1om} @tab @samp{iamcu}
@item @samp{k6} @tab @samp{k6_2} @tab @samp{athlon} @tab @samp{k8}
@item @samp{amdfam10} @tab @samp{bdver1} @tab @samp{bdver2} @tab @samp{bdver3}
@item @samp{bdver4} @tab @samp{znver1} @tab @samp{btver1} @tab @samp{btver2}
@item @samp{generic32} @tab @samp{generic64}
@item @samp{.mmx} @tab @samp{.sse} @tab @samp{.sse2} @tab @samp{.sse3}
@item @samp{.ssse3} @tab @samp{.sse4.1} @tab @samp{.sse4.2} @tab @samp{.sse4}
@item @samp{.avx} @tab @samp{.vmx} @tab @samp{.smx} @tab @samp{.ept}
@item @samp{.clflush} @tab @samp{.movbe} @tab @samp{.xsave} @tab @samp{.xsaveopt}
@item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.fsgsbase}
@item @samp{.rdrnd} @tab @samp{.f16c} @tab @samp{.avx2} @tab @samp{.bmi2}
@item @samp{.lzcnt} @tab @samp{.invpcid} @tab @samp{.vmfunc} @tab @samp{.hle}
@item @samp{.rtm} @tab @samp{.adx} @tab @samp{.rdseed} @tab @samp{.prfchw}
@item @samp{.smap} @tab @samp{.mpx} @tab @samp{.sha} @tab @samp{.prefetchwt1}
@item @samp{.clflushopt} @tab @samp{.xsavec} @tab @samp{.xsaves} @tab @samp{.se1}
@item @samp{.avx512f} @tab @samp{.avx512cd} @tab @samp{.avx512er} @tab @samp{.avx512pf}
@item @samp{.avx512vl} @tab @samp{.avx512bw} @tab @samp{.avx512dq} @tab @samp{.avx512ifma}
@item @samp{.avx512vbmi} @tab @samp{.avx512_4fmaps} @tab @samp{.avx512_4vnniw}
@item @samp{.avx512_vpopcntdq} @tab @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite}
@item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a}
@tab @samp{.sse5}
@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} @tab @samp{.abm}
@item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop} @tab @samp{.cx16}
@item @samp{.padlock} @tab @samp{.clzero} @tab @samp{.mwaitx}
@end multitable

Apart from the warning, there are only two other effects on
@code{@value{AS}} operation.  First, if you specify a CPU other than
@samp{i486}, then shift-by-one instructions such as @samp{sarl $1, %eax}
will automatically use a two-byte opcode sequence.  The larger three-byte
opcode sequence is used on the 486 (and when no architecture is
specified) because it executes faster on the 486.  Note that you can
explicitly request the two-byte opcode by writing @samp{sarl %eax}.
Second, if you specify @samp{i8086}, @samp{i186}, or @samp{i286},
@emph{and} @samp{.code16} or @samp{.code16gcc}, then byte offset
conditional jumps will be promoted when necessary to a two-instruction
sequence consisting of a conditional jump of the opposite sense around
an unconditional jump to the target.

Following the CPU architecture (but not a sub-architecture; those are the
names starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
control automatic promotion of conditional jumps.  @samp{jumps} is the
default, and enables jump promotion: all external jumps will be of the long
variety, and file-local jumps will be promoted as necessary
(@pxref{i386-Jumps}).  @samp{nojumps} leaves external conditional jumps as
byte offset jumps, and warns about file-local conditional jumps that
@code{@value{AS}} promotes.
Unconditional jumps are treated as for @samp{jumps}.

For example:

@smallexample
 .arch i8086,nojumps
@end smallexample

@node i386-Bugs
@section AT&T Syntax bugs

The UnixWare assembler, and probably other AT&T derived ix86 Unix
assemblers, generate floating point instructions with reversed source
and destination registers in certain cases.  Unfortunately, gcc and
possibly many other programs use this reversed syntax, so we're stuck
with it.

For example:

@smallexample
        fsub %st,%st(3)
@end smallexample
@noindent
results in @samp{%st(3)} being updated to @samp{%st - %st(3)} rather
than the expected @samp{%st(3) - %st}.  This happens with all the
non-commutative arithmetic floating point operations with two register
operands where the source register is @samp{%st} and the destination
register is @samp{%st(i)}.

@node i386-Notes
@section Notes

@cindex i386 @code{mul}, @code{imul} instructions
@cindex @code{mul} instruction, i386
@cindex @code{imul} instruction, i386
@cindex @code{mul} instruction, x86-64
@cindex @code{imul} instruction, x86-64
There is some trickery concerning the @samp{mul} and @samp{imul}
instructions that deserves mention.  The 16-, 32-, 64- and 128-bit expanding
multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
for @samp{imul}) can be output only in the one operand form.  Thus,
@samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
the expanding multiply would clobber the @samp{%edx} register, and this
would confuse @code{@value{GCC}} output.  Use @samp{imul %ebx} to get the
64-bit product in @samp{%edx:%eax}.

We have added a two operand form of @samp{imul} when the first operand
is an immediate mode expression and the second operand is a register.
This is just a shorthand, so that multiplying @samp{%eax} by 69, for
example, can be done with @samp{imul $69, %eax} rather than @samp{imul
$69, %eax, %eax}.
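
To summarize the @samp{imul} forms discussed in this section, here is a
short sketch for 32-bit code.  The registers and the constant are
arbitrary choices for illustration.

@smallexample
        imul    %ebx             # expanding: %edx:%eax = %eax * %ebx
        imul    %ebx, %eax       # not expanding: %eax = %eax * %ebx
        imul    $69, %eax        # shorthand for the next line
        imul    $69, %eax, %eax  # %eax = %eax * 69
@end smallexample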