@c \input texinfo
@c %**start of header
@c @setfilename agentexpr.info
@c @settitle GDB Agent Expressions
@c @setchapternewpage off
@c %**end of header

@c This file is part of the GDB manual.
@c
@c Copyright (C) 2003, 2004, 2005, 2006
@c Free Software Foundation, Inc.
@c
@c See the file gdb.texinfo for copying conditions.

@c Revision: $Id: agentexpr.texi,v 1.2 1998/12/09 21:23:46 jimb Exp $

@node Agent Expressions
@appendix The GDB Agent Expression Mechanism

In some applications, it is not feasible for the debugger to interrupt
the program's execution long enough for the developer to learn anything
helpful about its behavior. If the program's correctness depends on its
real-time behavior, delays introduced by a debugger might cause the
program to fail, even when the code itself is correct. It is useful to
be able to observe the program's behavior without interrupting it.

Using GDB's @code{trace} and @code{collect} commands, the user can
specify locations in the program, and arbitrary expressions to evaluate
when those locations are reached. Later, using the @code{tfind}
command, she can examine the values those expressions had when the
program hit the trace points. The expressions may also denote objects
in memory --- structures or arrays, for example --- whose values GDB
should record; while visiting a particular tracepoint, the user may
inspect those objects as if they were in memory at that moment.
However, because GDB records these values without interacting with the
user, it can do so quickly and unobtrusively, hopefully not disturbing
the program's behavior.

When GDB is debugging a remote target, the GDB @dfn{agent} code running
on the target computes the values of the expressions itself.
To avoid
having a full symbolic expression evaluator on the agent, GDB translates
expressions in the source language into a simpler bytecode language, and
then sends the bytecode to the agent; the agent then executes the
bytecode, and records the values for GDB to retrieve later.

The bytecode language is simple; there are forty-odd opcodes, the bulk
of which are the usual vocabulary of C operators (addition, subtraction,
shifts, and so on) and various sizes of literals and memory reference
operations. The bytecode interpreter operates strictly on machine-level
values --- various sizes of integers and floating point numbers --- and
requires no information about types or symbols; thus, the interpreter's
internal data structures are simple, and each bytecode requires only a
few native machine instructions to implement it. The interpreter is
small, and strict limits on the memory and time required to evaluate an
expression are easy to determine, making it suitable for use by the
debugging agent in real-time applications.

@menu
* General Bytecode Design::      Overview of the interpreter.
* Bytecode Descriptions::        What each one does.
* Using Agent Expressions::      How agent expressions fit into the big picture.
* Varying Target Capabilities::  How to discover what the target can do.
* Tracing on Symmetrix::         Special info for implementation on EMC's
                                 boxes.
* Rationale::                    Why we did it this way.
@end menu


@c @node Rationale
@c @section Rationale


@node General Bytecode Design
@section General Bytecode Design

The agent represents bytecode expressions as an array of bytes. Each
instruction is one byte long (thus the term @dfn{bytecode}). Some
instructions are followed by operand bytes; for example, the @code{goto}
instruction is followed by a destination for the jump.

The bytecode interpreter is a stack-based machine; most instructions pop
their operands off the stack, perform some operation, and push the
result back on the stack for the next instruction to consume. Each
element of the stack may contain either an integer or a floating point
value; these values are as many bits wide as the largest integer that
can be directly manipulated in the source language. Stack elements
carry no record of their type; bytecode could push a value as an
integer, then pop it as a floating point value. However, GDB will not
generate code which does this. In C, one might define the type of a
stack element as follows:
@example
union agent_val @{
  LONGEST l;
  DOUBLEST d;
@};
@end example
@noindent
where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
the largest integer and floating point types on the machine.

By the time the bytecode interpreter reaches the end of the expression,
the value of the expression should be the only value left on the stack.
For tracing applications, @code{trace} bytecodes in the expression will
have recorded the necessary data, and the value on the stack may be
discarded. For other applications, like conditional breakpoints, the
value may be useful.

Separate from the stack, the interpreter has two registers:
@table @code
@item pc
The address of the next bytecode to execute.

@item start
The address of the start of the bytecode expression, necessary for
interpreting the @code{goto} and @code{if_goto} instructions.

@end table
@noindent
Neither of these registers is directly visible to the bytecode language
itself, but they are useful for defining the meanings of the bytecode
operations.
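
The machine state described above can be sketched in C roughly as
follows. This is only an illustration, not the agent's actual source:
the names @code{agent_state}, @code{agent_push}, @code{agent_pop}, and
the fixed depth @code{AGENT_STACK_MAX} are hypothetical, and
@code{LONGEST} and @code{DOUBLEST} are assumed here to be
@code{long long} and @code{double}.

```c
#include <stddef.h>

/* Assumed typedefs; a real agent would pick the widest types the
   target supports.  */
typedef long long LONGEST;
typedef double DOUBLEST;

union agent_val {
  LONGEST l;
  DOUBLEST d;
};

#define AGENT_STACK_MAX 32      /* hypothetical stack limit */

struct agent_state {
  const unsigned char *start;   /* start of the bytecode expression */
  const unsigned char *pc;      /* next bytecode to execute */
  union agent_val stack[AGENT_STACK_MAX];
  size_t depth;                 /* number of live stack elements */
};

/* Push V; return 0 on success, -1 on stack overflow.  */
static int
agent_push (struct agent_state *s, union agent_val v)
{
  if (s->depth >= AGENT_STACK_MAX)
    return -1;
  s->stack[s->depth++] = v;
  return 0;
}

/* Pop into *V; return 0 on success, -1 on stack underflow.  */
static int
agent_pop (struct agent_state *s, union agent_val *v)
{
  if (s->depth == 0)
    return -1;
  *v = s->stack[--s->depth];
  return 0;
}
```

Returning a status from every push and pop is one possible way to let
the interpreter stop cleanly on a malformed expression rather than
corrupt its own stack.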

There are no instructions to perform side effects on the running
program, or call the program's functions; we assume that these
expressions are only used for unobtrusive debugging, not for patching
the running code.

Most bytecode instructions do not distinguish between the various sizes
of values, and operate on full-width values; the upper bits of the
values are simply ignored, since they do not usually make a difference
to the value computed. The exceptions to this rule are:
@table @asis

@item memory reference instructions (@code{ref}@var{n})
There are distinct instructions to fetch different word sizes from
memory. Once on the stack, however, the values are treated as full-size
integers. They may need to be sign-extended; the @code{ext} instruction
exists for this purpose.

@item the sign-extension instruction (@code{ext} @var{n})
This instruction clearly needs to know which portion of its operand is
to be extended to occupy the full length of the word.

@end table

If the interpreter is unable to evaluate an expression completely for
some reason (a memory location is inaccessible, or a divisor is zero,
for example), we say that interpretation ``terminates with an error''.
This means that the problem is reported back to the interpreter's caller
in some helpful way. In general, code using agent expressions should
assume that they may attempt to divide by zero, fetch arbitrary memory
locations, and misbehave in other ways.
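
The sign-extension step and the ``terminates with an error'' convention
might look like this in C. This is a sketch under stated assumptions:
the function names, the @code{agent_status} codes, and the 64-bit stack
element width are hypothetical and are not taken from any real agent.

```c
#include <stdint.h>

/* Hypothetical status codes reported back to the interpreter's caller.  */
enum agent_status {
  AGENT_OK = 0,
  AGENT_DIV_BY_ZERO = 1,
  AGENT_OVERFLOW = 2
};

/* Sign-extend the low N bits of A to the full 64-bit width.  If N is
   at least the element width the value is returned unchanged; N == 0
   is also left unchanged, which is an assumption of this sketch.  */
static int64_t
agent_ext (int64_t a, unsigned n)
{
  uint64_t low, sign;

  if (n == 0 || n >= 64)
    return a;
  low = (uint64_t) a & ((UINT64_C (1) << n) - 1);   /* keep low N bits */
  sign = UINT64_C (1) << (n - 1);                   /* bit N-1 */
  return (int64_t) ((low ^ sign) - sign);
}

/* Signed division that reports a zero divisor instead of trapping.
   The INT64_MIN / -1 check is an extra guard assumed by this sketch;
   the text above only mentions the zero-divisor case.  */
static enum agent_status
agent_div_signed (int64_t a, int64_t b, int64_t *quotient)
{
  if (b == 0)
    return AGENT_DIV_BY_ZERO;
  if (a == INT64_MIN && b == -1)
    return AGENT_OVERFLOW;
  *quotient = a / b;
  return AGENT_OK;
}
```

A caller would typically stop interpreting as soon as any helper
returns something other than @code{AGENT_OK} and pass that status back
to whoever requested the evaluation.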

Even complicated C expressions compile to a few bytecode instructions;
for example, the expression @code{x + y * z} would typically produce
code like the following, assuming that @code{x} and @code{y} live in
registers, and @code{z} is a global variable holding a 32-bit
@code{int}:
@example
reg 1
reg 2
const32 @i{address of z}
ref32
ext 32
mul
add
end
@end example

In detail, these mean:
@table @code

@item reg 1
Push the value of register 1 (presumably holding @code{x}) onto the
stack.

@item reg 2
Push the value of register 2 (holding @code{y}).

@item const32 @i{address of z}
Push the address of @code{z} onto the stack.

@item ref32
Fetch a 32-bit word from the address at the top of the stack; replace
the address on the stack with the value. Thus, we replace the address
of @code{z} with @code{z}'s value.

@item ext 32
Sign-extend the value on the top of the stack from 32 bits to full
length. This is necessary because @code{z} is a signed integer.

@item mul
Pop the top two numbers on the stack, multiply them, and push their
product. Now the top of the stack contains the value of the expression
@code{y * z}.

@item add
Pop the top two numbers, add them, and push the sum. Now the top of the
stack contains the value of @code{x + y * z}.

@item end
Stop executing; the value left on the stack top is the value to be
recorded.

@end table


@node Bytecode Descriptions
@section Bytecode Descriptions

Each bytecode description has the following form:

@table @asis

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}

Pop the top two stack items, @var{a} and @var{b}, as integers; push
their sum, as an integer.

@end table

In this example, @code{add} is the name of the bytecode, and
@code{(0x02)} is the one-byte value used to encode the bytecode, in
hexadecimal. The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
the stack before and after the bytecode executes. Beforehand, the stack
must contain at least two values, @var{a} and @var{b}; since the top of
the stack is to the right, @var{b} is on the top of the stack, and
@var{a} is underneath it. After execution, the bytecode will have
popped @var{a} and @var{b} from the stack, and replaced them with a
single value, @var{a+b}. There may be other values on the stack below
those shown, but the bytecode affects only those shown.

Here is another example:

@table @asis

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
Push the 8-bit integer constant @var{n} on the stack, without sign
extension.

@end table

In this example, the bytecode @code{const8} takes an operand @var{n}
directly from the bytecode stream; the operand follows the @code{const8}
bytecode itself. We write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.

For the @code{const8} bytecode, there are no stack items given before
the @result{}; this simply means that the bytecode consumes no values
from the stack. If a bytecode consumes no values, or produces no
values, the list on either side of the @result{} may be empty.

If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
treats it as an integer. If a value is written as @var{addr}, then the
bytecode treats it as an address.
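
To make the encoding concrete, here is a minimal sketch of an
interpreter loop for a handful of bytecodes, using the opcode values
and operand encodings listed in this section. It is illustrative only:
@code{agent_eval}, the eight-entry stack, the 64-bit element width, and
the @code{regs} array standing in for target registers are all
assumptions, and memory references, tracing, and branches are omitted.

```c
#include <stddef.h>
#include <stdint.h>

enum {
  OP_ADD = 0x02, OP_MUL = 0x04, OP_EXT = 0x16,
  OP_CONST8 = 0x22, OP_CONST16 = 0x23, OP_REG = 0x26, OP_END = 0x27
};

#define STACK_MAX 8   /* hypothetical stack depth */

/* Evaluate LEN bytes of bytecode at CODE.  REGS (NREGS entries) is a
   stand-in for the target's registers.  On success store the top of
   the stack in *RESULT and return 0; on any error return -1.  */
static int
agent_eval (const unsigned char *code, size_t len,
            const int64_t *regs, size_t nregs, int64_t *result)
{
  uint64_t stack[STACK_MAX];
  size_t depth = 0, pc = 0;

  while (pc < len)
    {
      unsigned char op = code[pc++];
      uint64_t n;

      switch (op)
        {
        case OP_ADD:
        case OP_MUL:
          if (depth < 2)
            return -1;
          depth--;
          /* Unsigned arithmetic wraps, giving the same low bits as
             signed arithmetic would.  */
          if (op == OP_ADD)
            stack[depth - 1] += stack[depth];
          else
            stack[depth - 1] *= stack[depth];
          break;

        case OP_EXT:
          if (depth < 1 || pc >= len)
            return -1;
          n = code[pc++];
          if (n > 0 && n < 64)
            {
              uint64_t sign = UINT64_C (1) << (n - 1);
              uint64_t low = stack[depth - 1] & ((sign << 1) - 1);
              stack[depth - 1] = (low ^ sign) - sign;
            }
          break;

        case OP_CONST8:
          if (depth >= STACK_MAX || pc >= len)
            return -1;
          stack[depth++] = code[pc++];
          break;

        case OP_CONST16:
        case OP_REG:
          /* Both take a 16-bit operand, most significant byte first,
             fetched one byte at a time.  */
          if (depth >= STACK_MAX || len - pc < 2)
            return -1;
          n = ((uint64_t) code[pc] << 8) | code[pc + 1];
          pc += 2;
          if (op == OP_REG)
            {
              if (n >= nregs)
                return -1;
              n = (uint64_t) regs[n];
            }
          stack[depth++] = n;
          break;

        case OP_END:
          if (depth < 1)
            return -1;
          *result = (int64_t) stack[depth - 1];
          return 0;

        default:
          return -1;   /* bytecode not handled by this sketch */
        }
    }
  return -1;   /* ran off the end without an `end' bytecode */
}
```

For instance, the bytes @code{26 00 01 26 00 02 22 fd 16 08 04 02 27}
(hexadecimal) encode @code{reg 1}, @code{reg 2}, @code{const8 0xfd},
@code{ext 8}, @code{mul}, @code{add}, @code{end}; with register 1
holding 10 and register 2 holding 4, this sketch computes 10 + 4 * -3,
that is, -2.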

We do not fully describe the floating point operations here; although
this design can be extended in a clean way to handle floating point
values, they are not of immediate interest to the customer, so we avoid
describing them, to save time.


@table @asis

@item @code{float} (0x01): @result{}

Prefix for floating-point bytecodes. Not implemented yet.

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
Pop two integers from the stack, and push their sum, as an integer.

@item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
Pop two integers from the stack, subtract the top value from the
next-to-top value, and push the difference.

@item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
Pop two integers from the stack, multiply them, and push the product on
the stack. Note that, when one multiplies two @var{n}-bit numbers
yielding another @var{n}-bit number, it is irrelevant whether the
numbers are signed or not; the results are the same.

@item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the quotient. If the divisor is zero, terminate
with an error.

@item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the quotient. If the divisor is zero,
terminate with an error.

@item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the remainder.
If the divisor is zero,
terminate with an error.

@item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} left by @var{b} bits, and
push the result.

@item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting copies of the top bit at the high end, and push the result.

@item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting zero bits at the high end, and push the result.

@item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
Pop an integer from the stack; if it is zero, push the value one;
otherwise, push the value zero.

@item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
Pop two integers from the stack, and push their bitwise @code{and}.

@item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
Pop two integers from the stack, and push their bitwise @code{or}.

@item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
Pop two integers from the stack, and push their bitwise
exclusive-@code{or}.

@item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
Pop an integer from the stack, and push its bitwise complement.

@item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
Pop two integers from the stack; if they are equal, push the value one;
otherwise, push the value zero.

@item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
Pop two signed integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
Pop two unsigned integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
Pop an unsigned value from the stack; treating it as an @var{n}-bit
twos-complement value, extend it to full length. This means that all
bits to the left of bit @var{n-1} (where the least significant bit is bit
0) are set to the value of bit @var{n-1}. Note that @var{n} may be
larger than or equal to the width of the stack elements of the bytecode
engine; in this case, the bytecode should have no effect.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{ext} bytecode.

@item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
Pop an unsigned value from the stack; zero all but the bottom @var{n}
bits. This means that all bits to the left of bit @var{n-1} (where the
least significant bit is bit 0) are set to zero.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{zero_ext} bytecode.

@item @code{ref8} (0x17): @var{addr} @result{} @var{a}
@itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
@itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
@itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
Pop an address @var{addr} from the stack. For bytecode
@code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
natural target endianness.
Push the fetched value as an unsigned
integer.

Note that @var{addr} may not be aligned in any particular way; the
@code{ref@var{n}} bytecodes should operate correctly for any address.

If attempting to access memory at @var{addr} would cause a processor
exception of some sort, terminate with an error.

@item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
@itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
@itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
@itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
@itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
Not implemented yet.

@item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
Push another copy of the stack's top element.

@item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
Exchange the top two items on the stack.

@item @code{pop} (0x29): @var{a} @result{}
Discard the top value on the stack.

@item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
Pop an integer off the stack; if it is non-zero, branch to the given
offset in the bytecode string. Otherwise, continue to the next
instruction in the bytecode stream. In other words, if @var{a} is
non-zero, set the @code{pc} register to @code{start} + @var{offset}.
Thus, an offset of zero denotes the beginning of the expression.

The @var{offset} is stored as a sixteen-bit unsigned value, stored
immediately following the @code{if_goto} bytecode. It is always stored
most significant byte first, regardless of the target's normal
endianness. The offset is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the offset one byte at a time.

@item @code{goto} (0x21) @var{offset}: @result{}
Branch unconditionally to @var{offset}; in other words, set the
@code{pc} register to @code{start} + @var{offset}.

The offset is stored in the same way as for the @code{if_goto} bytecode.

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
@itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
@itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
@itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
Push the integer constant @var{n} on the stack, without sign extension.
To produce a small negative value, push a small twos-complement value,
and then sign-extend it using the @code{ext} bytecode.

The constant @var{n} is stored in the appropriate number of bytes
following the @code{const}@var{b} bytecode. The constant @var{n} is
always stored most significant byte first, regardless of the target's
normal endianness. The constant is not guaranteed to fall at any
particular alignment within the bytecode stream; thus, on machines where
fetching a 16-bit value at an unaligned address raises an exception, you
should fetch @var{n} one byte at a time.

@item @code{reg} (0x26) @var{n}: @result{} @var{a}
Push the value of register number @var{n}, without sign extension. The
registers are numbered following GDB's conventions.

The register number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{reg} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the register number one byte at a time.

@item @code{trace} (0x0c): @var{addr} @var{size} @result{}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.

@item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB. @var{size} is a single byte
unsigned integer following the @code{trace_quick} opcode.

This bytecode is equivalent to the sequence @code{dup const8 @var{size}
trace}, but we provide it anyway to save space in bytecode strings.

@item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
Identical to @code{trace_quick}, except that @var{size} is a 16-bit
big-endian unsigned integer, not a single byte. This should probably
have been named @code{trace_quick16}, for consistency.

@item @code{end} (0x27): @result{}
Stop executing bytecode; the result should be the top element of the
stack. If the purpose of the expression was to compute an lvalue or a
range of memory, then the next-to-top of the stack is the lvalue's
address, and the top of the stack is the lvalue's size, in bytes.

@end table


@node Using Agent Expressions
@section Using Agent Expressions

Here is a sketch of a full non-stop debugging cycle, showing how agent
expressions fit into the process.

@itemize @bullet

@item
The user selects trace points in the program's code at which GDB should
collect data.

@item
The user specifies expressions to evaluate at each trace point. These
expressions may denote objects in memory, in which case those objects'
contents are recorded as the program runs, or computed values, in which
case the values themselves are recorded.

@item
GDB transmits the tracepoints and their associated expressions to the
GDB agent, running on the debugging target.

@item
The agent arranges to be notified when a trace point is hit. Note that,
on some systems, the target operating system is completely responsible
for collecting the data; see @ref{Tracing on Symmetrix}.

@item
When execution on the target reaches a trace point, the agent evaluates
the expressions associated with that trace point, and records the
resulting values and memory ranges.

@item
Later, when the user selects a given trace event and inspects the
objects and expression values recorded, GDB talks to the agent to
retrieve recorded data as necessary to meet the user's requests. If the
user asks to see an object whose contents have not been recorded, GDB
reports an error.

@end itemize


@node Varying Target Capabilities
@section Varying Target Capabilities

Some targets don't support floating-point, and some would rather not
have to deal with @code{long long} operations. Also, different targets
will have different stack sizes, and different bytecode buffer lengths.

Thus, GDB needs a way to ask the target about itself. We haven't worked
out the details yet, but in general, GDB should be able to send the
target a packet asking it to describe itself. The reply should be a
packet whose length is explicit, so we can add new information to the
packet in future revisions of the agent, without confusing old versions
of GDB, and it should contain a version number.
It should contain at
least the following information:

@itemize @bullet

@item
whether floating point is supported

@item
whether @code{long long} is supported

@item
maximum acceptable size of bytecode stack

@item
maximum acceptable length of bytecode expressions

@item
which registers are actually available for collection

@item
whether the target supports disabled tracepoints

@end itemize



@node Tracing on Symmetrix
@section Tracing on Symmetrix

This section documents the API used by the GDB agent to collect data on
Symmetrix systems.

Cygnus originally implemented these tracing features to help EMC
Corporation debug their Symmetrix high-availability disk drives. The
Symmetrix application code already includes substantial tracing
facilities; the GDB agent for the Symmetrix system uses those facilities
for its own data collection, via the API described here.

@deftypefn Function DTC_RESPONSE adbg_find_memory_in_frame (FRAME_DEF *@var{frame}, char *@var{address}, char **@var{buffer}, unsigned int *@var{size})
Search the trace frame @var{frame} for memory saved from @var{address}.
If the memory is available, provide the address of the buffer holding
it; otherwise, provide the address of the next saved area.

@itemize @bullet

@item
If the memory at @var{address} was saved in @var{frame}, set
@code{*@var{buffer}} to point to the buffer in which that memory was
saved, set @code{*@var{size}} to the number of bytes from @var{address}
that are saved at @code{*@var{buffer}}, and return
@code{OK_TARGET_RESPONSE}. (Clearly, in this case, the function will
always set @code{*@var{size}} to a value greater than zero.)

@item
If @var{frame} does not record any memory at @var{address}, set
@code{*@var{size}} to the distance from @var{address} to the start of
the saved region with the lowest address higher than @var{address}. If
there is no memory saved from any higher address, set @code{*@var{size}}
to zero. Return @code{NOT_FOUND_TARGET_RESPONSE}.
@end itemize

These two possibilities allow the caller to either retrieve the data, or
walk the address space to the next saved area.
@end deftypefn

This function allows the GDB agent to map the regions of memory saved in
a particular frame, and retrieve their contents efficiently.

This function also provides a clean interface between the GDB agent and
the Symmetrix tracing structures, making it easier to adapt the GDB
agent to future versions of the Symmetrix system, and vice versa. This
function searches all data saved in @var{frame}, whether the data is
there at the request of a bytecode expression, or because it falls in
one of the format's memory ranges, or because it was saved from the top
of the stack. EMC can arbitrarily change and enhance the tracing
mechanism, but as long as this function works properly, all collected
memory is visible to GDB.

The function itself is straightforward to implement. A single pass over
the trace frame's stack area, memory ranges, and expression blocks can
yield the address of the buffer (if the requested address was saved),
and also note the address of the next higher range of memory, to be
returned when the search fails.

As an example, suppose the trace frame @code{f} has saved sixteen bytes
from address @code{0x8000} in a buffer at @code{0x1000}, and thirty-two
bytes from address @code{0xc000} in a buffer at @code{0x1010}.
Here are
some sample calls, and the effect each would have:

@table @code

@item adbg_find_memory_in_frame (f, (char *) 0x8000, &buffer, &size)
This would set @code{buffer} to @code{0x1000}, set @code{size} to
sixteen, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves
sixteen bytes from @code{0x8000} at @code{0x1000}.

@item adbg_find_memory_in_frame (f, (char *) 0x8004, &buffer, &size)
This would set @code{buffer} to @code{0x1004}, set @code{size} to
twelve, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves the
twelve bytes from @code{0x8004} starting four bytes into the buffer at
@code{0x1000}. This shows that request addresses may fall in the middle
of saved areas; the function should return the address and size of the
remainder of the buffer.

@item adbg_find_memory_in_frame (f, (char *) 0x8100, &buffer, &size)
This would set @code{size} to @code{0x3f00} and return
@code{NOT_FOUND_TARGET_RESPONSE}, since there is no memory saved in
@code{f} from the address @code{0x8100}, and the next memory available
is at @code{0x8100 + 0x3f00}, or @code{0xc000}. This shows that request
addresses may fall outside of all saved memory ranges; the function
should indicate the next saved area, if any.

@item adbg_find_memory_in_frame (f, (char *) 0x7000, &buffer, &size)
This would set @code{size} to @code{0x1000} and return
@code{NOT_FOUND_TARGET_RESPONSE}, since the next saved memory is at
@code{0x7000 + 0x1000}, or @code{0x8000}.

@item adbg_find_memory_in_frame (f, (char *) 0xf000, &buffer, &size)
This would set @code{size} to zero, and return
@code{NOT_FOUND_TARGET_RESPONSE}. This shows how the function tells the
caller that no further memory ranges have been saved.

@end table

As another example, here is a function which will print out the
addresses of all memory saved in the trace frame @code{frame} on the
Symmetrix INLINES console:
@example
void
print_frame_addresses (FRAME_DEF *frame)
@{
  char *addr;
  char *buffer;
  unsigned int size;    /* matches the unsigned int * parameter */

  addr = 0;
  for (;;)
    @{
      /* Either find out how much memory we have here, or discover
         where the next saved region is.  */
      if (adbg_find_memory_in_frame (frame, addr, &buffer, &size)
          == OK_TARGET_RESPONSE)
        printp ("saved %x to %x\n", addr, addr + size);
      if (size == 0)
        break;
      addr += size;
    @}
@}
@end example

Note that there is not necessarily any connection between the order in
which the data is saved in the trace frame, and the order in which
@code{adbg_find_memory_in_frame} will return those memory ranges. The
code above will always print the saved memory regions in order of
increasing address, while the underlying frame structure might store the
data in a random order.

[[This section should cover the rest of the Symmetrix functions the stub
relies upon, too.]]

@node Rationale
@section Rationale

Some of the design decisions apparent above are arguable.

@table @b

@item What about stack overflow/underflow?
GDB should be able to query the target to discover its stack size.
Given that information, GDB can determine at translation time whether a
given expression will overflow the stack. But this spec isn't about
what kinds of error-checking GDB ought to do.

@item Why are you doing everything in LONGEST?

Speed isn't important, but agent code size is; using LONGEST brings in a
bunch of support code to do things like division, etc. So this is a
serious concern.

First, note that you don't need different bytecodes for different
operand sizes.
You can generate code without @emph{knowing} how big the
stack elements actually are on the target. If the target only supports
32-bit ints, and you don't send any 64-bit bytecodes, everything just
works. The observation here is that the MIPS and the Alpha have only
fixed-size registers, and you can still get C's semantics even though
most instructions only operate on full-sized words. You just need to
make sure everything is properly sign-extended at the right times. So
there is no need for 32- and 64-bit variants of the bytecodes. Just
implement everything using the largest size you support.

GDB should certainly check to see what sizes the target supports, so the
user can get an error earlier, rather than later. But this information
is not necessary for correctness.


@item Why don't you have @code{>} or @code{<=} operators?
I want to keep the interpreter small, and we don't need them. We can
combine the @code{less_} opcodes with @code{log_not}, and swap the order
of the operands, yielding all four asymmetrical comparison operators.
For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
x)}.

@item Why do you have @code{log_not}?
@itemx Why do you have @code{ext}?
@itemx Why do you have @code{zero_ext}?
These are all easily synthesized from other instructions, but I expect
them to be used frequently, and they're simple, so I include them to
keep bytecode strings short.

@code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
the relational operators.

@code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
@var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
should be signed. See the next bulleted item.

@code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
log_and}; it's used whenever we push the value of a register, because we
can't assume the upper bits of the register aren't garbage.

@item Why not have sign-extending variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators, and we
need the @code{ext} bytecode anyway for accessing bitfields.

@item Why not have constant-address variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators again, and
@code{const32 @var{address} ref32} is only one byte longer.

@item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
GDB will generate bytecode that fetches multi-byte values at unaligned
addresses whenever the executable's debugging information tells it to.
Furthermore, GDB does not know the value the pointer will have when GDB
generates the bytecode, so it cannot determine whether a particular
fetch will be aligned or not.

In particular, structure bitfields may be several bytes long, but follow
no alignment rules; members of packed structures are not necessarily
aligned either.

In general, there are many cases where unaligned references occur in
correct C code, either at the programmer's explicit request, or at the
compiler's discretion. Thus, it is simpler to make the GDB agent
bytecodes work correctly in all circumstances than to make GDB guess in
each case whether the compiler did the usual thing.

@item Why are there no side-effecting operators?
Because our current client doesn't want them? That's a cheap answer. I
think the real answer is that I'm afraid of implementing function
calls. We should re-visit this issue after the present contract is
delivered.

@item Why aren't the @code{goto} ops PC-relative?
The interpreter has the base address around anyway for PC bounds
checking, and it seemed simpler.

@item Why is there only one offset size for the @code{goto} ops?
Offsets are currently sixteen bits. I'm not happy with this situation
either:

Suppose we have multiple branch ops with different offset sizes. As I
generate code left-to-right, all my jumps are forward jumps (there are
no loops in expressions), so I never know the target when I emit the
jump opcode. Thus, I have to either always assume the largest offset
size, or do jump relaxation on the code after I generate it, which seems
like a big waste of time.

I can imagine a reasonable expression being longer than 256 bytes. I
can't imagine one being longer than 64k. Thus, we need 16-bit offsets.
This kind of reasoning is so bogus, but relaxation is pathetic.

The other approach would be to generate code right-to-left. Then I'd
always know my offset size. That might be fun.

@item Where is the function call bytecode?

When we add side-effects, we should add this.

@item Why does the @code{reg} bytecode take a 16-bit register number?

Intel's IA-64 architecture has 128 general-purpose registers,
and 128 floating-point registers, and I'm sure it has some random
control registers. Eight bits would not be enough to number them all.

@item Why do we need @code{trace} and @code{trace_quick}?
Because GDB needs to record all the memory contents and registers an
expression touches. If the user wants to evaluate an expression
@code{x->y->z}, the agent must record the values of @code{x} and
@code{x->y} as well as the value of @code{x->y->z}.

@item Don't the @code{trace} bytecodes make the interpreter less general?
They do mean that the interpreter contains special-purpose code, but
that doesn't mean the interpreter can only be used for that purpose. If
an expression doesn't use the @code{trace} bytecodes, they don't get in
its way.

@item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
In general, you do want your operators to consume their arguments; it's
consistent, and generally reduces the amount of stack rearrangement
necessary. However, @code{trace_quick} is a kludge to save space; it
only exists so we needn't write @code{dup const8 @var{SIZE} trace}
before every memory reference. Therefore, it's okay for it not to
consume its arguments; it's meant for a specific context in which we
know exactly what it should do with the stack. If we're going to have a
kludge, it should be an effective kludge.

@item Why does @code{trace16} exist?
That opcode was added by the customer that contracted Cygnus for the
data tracing work. I personally think it is unnecessary; objects that
large will be quite rare, so it is okay to use @code{dup const16
@var{size} trace} in those cases.

Whatever we decide to do with @code{trace16}, we should at least leave
opcode 0x30 reserved, to remain compatible with the customer who added
it.

@end table

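As a rough illustration of the equivalences claimed in the rationale
above, the following C sketch models @code{ext}, @code{zero_ext}, and the
comparisons built from @code{less_signed} and @code{log_not}. It is not
taken from any agent implementation. Every function name is
hypothetical, the stack element size @var{s} is assumed to be 64 bits,
and the sign extension assumes the compiler performs an arithmetic right
shift on negative values, which is the behavior @code{rsh_signed} is
meant to have.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stack elements are 64 bits wide, i.e. s = 64.  */
#define SKETCH_STACK_BITS 64

/* Model of `ext n': const8 (s-n) lsh const8 (s-n) rsh_signed.
   Relies on >> being an arithmetic shift for negative values.  */
int64_t
sketch_ext (int64_t value, unsigned n)
{
  unsigned shift;

  assert (n >= 1 && n <= SKETCH_STACK_BITS);
  shift = SKETCH_STACK_BITS - n;
  /* Shift left as unsigned so that no signed overflow occurs.  */
  return (int64_t) ((uint64_t) value << shift) >> shift;
}

/* Model of `zero_ext n': const<m> <mask> log_and, where the mask
   keeps the low n bits.  */
int64_t
sketch_zero_ext (int64_t value, unsigned n)
{
  uint64_t mask;

  assert (n >= 1 && n <= SKETCH_STACK_BITS);
  mask = (n == SKETCH_STACK_BITS) ? UINT64_MAX : (((uint64_t) 1 << n) - 1);
  return (int64_t) ((uint64_t) value & mask);
}

/* The two primitives the comparisons are built from.  */
int64_t
sketch_less_signed (int64_t a, int64_t b)
{
  return a < b;
}

int64_t
sketch_log_not (int64_t a)
{
  return a == 0;
}

/* x > y is y < x: swap the operands.  */
int64_t
sketch_greater (int64_t x, int64_t y)
{
  return sketch_less_signed (y, x);
}

/* x <= y is ! (y < x): swap the operands, then negate.  */
int64_t
sketch_less_equal (int64_t x, int64_t y)
{
  return sketch_log_not (sketch_less_signed (y, x));
}

/* x >= y is ! (x < y): negate only.  */
int64_t
sketch_greater_equal (int64_t x, int64_t y)
{
  return sketch_log_not (sketch_less_signed (x, y));
}
```

With these definitions, @code{sketch_ext (0xff, 8)} yields -1 and
@code{sketch_zero_ext (-1, 16)} yields 0xffff, matching what a signed
8-bit fetch and an unsigned 16-bit register push would need.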