X86Disassembler.h revision 234353
//===-- X86Disassembler.h - Disassembler for x86 and x86_64 -----*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// The X86 disassembler is a table-driven disassembler for the 16-, 32-, and
// 64-bit X86 instruction sets. The main decode sequence for an assembly
// instruction in this disassembler is:
//
// 1. Read the prefix bytes and determine the attributes of the instruction.
//    These attributes, recorded in enum attributeBits
//    (X86DisassemblerDecoderCommon.h), form a bitmask. The table CONTEXTS_SYM
//    provides a mapping from bitmasks to contexts, which are represented by
//    enum InstructionContext (ibid.).
//
// 2. Read the opcode, and determine what kind of opcode it is. The
//    disassembler distinguishes four kinds of opcodes, which are enumerated in
//    OpcodeType (X86DisassemblerDecoderCommon.h): one-byte (0xnn), two-byte
//    (0x0f 0xnn), three-byte-38 (0x0f 0x38 0xnn), or three-byte-3a
//    (0x0f 0x3a 0xnn). Mandatory prefixes are treated as part of the context.
//
// 3. Depending on the opcode type, look in one of four ClassDecision structures
//    (X86DisassemblerDecoderCommon.h). Use the opcode class to determine which
//    OpcodeDecision (ibid.) to look the opcode up in. Look up the opcode to
//    get a ModRMDecision (ibid.).
//
// 4. Some instructions, such as escape opcodes or extended opcodes, or even
//    instructions that have ModRM*Reg / ModRM*Mem forms in LLVM, need the
//    ModR/M byte to complete decode. The ModRMDecision's type is an entry from
//    ModRMDecisionType (X86DisassemblerDecoderCommon.h) that indicates if the
//    ModR/M byte is required and how to interpret it.
//
// 5. After resolving the ModRMDecision, the disassembler has a unique ID
//    of type InstrUID (X86DisassemblerDecoderCommon.h). Looking this ID up in
//    INSTRUCTIONS_SYM yields the name of the instruction and the encodings and
//    meanings of its operands.
//
// 6. For each operand, its encoding is an entry from OperandEncoding
//    (X86DisassemblerDecoderCommon.h) and its type is an entry from
//    OperandType (ibid.). The encoding indicates how to read it from the
//    instruction; the type indicates how to interpret the value once it has
//    been read. For example, a register operand could be stored in the R/M
//    field of the ModR/M byte, the REG field of the ModR/M byte, or added to
//    the main opcode. This is orthogonal to its meaning (a GPR or an XMM
//    register, for instance). Given this information, the operands can be
//    extracted and interpreted.
//
// 7. As the last step, the disassembler translates the instruction information
//    and operands into a format understandable by the client - in this case, an
//    MCInst for use by the MC infrastructure.
//
// The disassembler is broken broadly into two parts: the table emitter that
// emits the instruction decode tables discussed above during compilation, and
// the disassembler itself. The table emitter is documented in more detail in
// utils/TableGen/X86DisassemblerEmitter.h.
//
// X86Disassembler.h contains the public interface for the disassembler,
//   adhering to the MCDisassembler interface.
// X86Disassembler.cpp contains the code responsible for step 7, and for
//   invoking the decoder to execute steps 1-6.
// X86DisassemblerDecoderCommon.h contains the definitions needed by both the
//   table emitter and the disassembler.
// X86DisassemblerDecoder.h contains the public interface of the decoder,
//   factored out into C for possible use by other projects.
// X86DisassemblerDecoder.c contains the source code of the decoder, which is
//   responsible for steps 1-6.
//
//===----------------------------------------------------------------------===//

#ifndef X86DISASSEMBLER_H
#define X86DISASSEMBLER_H

#define INSTRUCTION_SPECIFIER_FIELDS \
  const char* name;

#define INSTRUCTION_IDS \
  unsigned instructionIDs;

#include "X86DisassemblerDecoderCommon.h"

#undef INSTRUCTION_SPECIFIER_FIELDS
#undef INSTRUCTION_IDS

#include "llvm/MC/MCDisassembler.h"

namespace llvm {

class MCInst;
class MCInstrInfo;
class MCSubtargetInfo;
class MemoryObject;
class raw_ostream;

struct EDInstInfo;

namespace X86Disassembler {

/// X86GenericDisassembler - Generic disassembler for all X86 platforms.
/// All each platform class should have to do is subclass the constructor, and
/// provide a different disassemblerMode value.
class X86GenericDisassembler : public MCDisassembler {
  const MCInstrInfo *MII;
public:
  /// Constructor - Initializes the disassembler.
  ///
  /// @param mode - The X86 architecture mode to decode for.
  X86GenericDisassembler(const MCSubtargetInfo &STI, DisassemblerMode mode,
                         const MCInstrInfo *MII);
private:
  ~X86GenericDisassembler();
public:

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &instr,
                              uint64_t &size,
                              const MemoryObject &region,
                              uint64_t address,
                              raw_ostream &vStream,
                              raw_ostream &cStream) const;

  /// getEDInfo - See MCDisassembler.
  const EDInstInfo *getEDInfo() const;
private:
  DisassemblerMode fMode;
};

} // namespace X86Disassembler

} // namespace llvm

#endif
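
The first two steps of the decode sequence described in the header comment (prefix bytes to an attribute bitmask, then opcode bytes to one of the four opcode kinds) can be illustrated with a small standalone sketch. This is not the decoder from X86DisassemblerDecoder.c. Every name below (`sketch::AttrBits`, `sketch::OpcodeKind`, `sketch::Decoded`, `sketch::classify`) is hypothetical, only the 0x66, 0xf2 and 0xf3 prefixes are recognized, and REX, segment-override and address-size prefixes are ignored for brevity.

```cpp
// Toy illustration of decode steps 1 and 2. All names are hypothetical;
// the real enums live in X86DisassemblerDecoderCommon.h.
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for enum attributeBits.
enum AttrBits : unsigned {
  ATTR_NONE   = 0,
  ATTR_OPSIZE = 1u << 0,  // 0x66 prefix seen
  ATTR_XS     = 1u << 1,  // 0xf3 prefix seen
  ATTR_XD     = 1u << 2   // 0xf2 prefix seen
};

// The four opcode kinds named in the header comment.
enum OpcodeKind { ONEBYTE, TWOBYTE, THREEBYTE_38, THREEBYTE_3A };

struct Decoded {
  unsigned   attrs;     // bitmask accumulated from prefix bytes
  OpcodeKind kind;      // which opcode map the opcode belongs to
  uint8_t    opcode;    // the final opcode byte (0xnn)
  size_t     consumed;  // number of bytes read, valid only if ok
  bool       ok;        // false if the buffer ended before the opcode
};

inline Decoded classify(const uint8_t *bytes, size_t len) {
  assert(bytes != nullptr || len == 0);
  Decoded d = {ATTR_NONE, ONEBYTE, 0, 0, false};
  size_t i = 0;

  // Step 1: consume prefix bytes and accumulate attribute bits.
  for (; i < len; ++i) {
    if (bytes[i] == 0x66)      d.attrs |= ATTR_OPSIZE;
    else if (bytes[i] == 0xf3) d.attrs |= ATTR_XS;
    else if (bytes[i] == 0xf2) d.attrs |= ATTR_XD;
    else break;
  }

  // Step 2: read the opcode and decide which of the four kinds it is.
  if (i >= len) return d;
  if (bytes[i] == 0x0f) {
    if (++i >= len) return d;
    if (bytes[i] == 0x38 || bytes[i] == 0x3a) {
      d.kind = (bytes[i] == 0x38) ? THREEBYTE_38 : THREEBYTE_3A;
      if (++i >= len) return d;
    } else {
      d.kind = TWOBYTE;
    }
  }
  d.opcode = bytes[i];
  d.consumed = i + 1;
  d.ok = true;
  return d;
}

} // namespace sketch
```

In this sketch the `attrs` bitmask plays the role that the header comment assigns to the attribute bits fed into CONTEXTS_SYM, and `kind` would select one of the four ClassDecision structures in step 3. A caller would pass a byte buffer and its length to `sketch::classify` and check `ok` before using the other fields.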