1\." $Revision: 1.8 $
2\." $Date: 2005/04/01 18:04:37 $
3\."
4\."
5\." the following line may be removed if the ff ligature works on your machine
6.lg 0
7\." set up heading formats
8.ds HF 3 3 3 3 3 2 2
9.ds HP +2 +2 +1 +0 +0
10\." ==============================================
11\." Put current date in the following at each rev
12.ds vE rev 1.18, 31 March 2005
13\." ==============================================
14\." ==============================================
15.nr Hs 5
16.nr Hb 5
17.ds | |
18.ds ~ ~
19.ds ' '
20.if t .ds Cw \&\f(CW
21.if n .ds Cw \fB
22.de Cf          \" Place every other arg in Cw font, beginning with first
23.if \\n(.$=1 \&\*(Cw\\$1\fP
24.if \\n(.$=2 \&\*(Cw\\$1\fP\\$2
25.if \\n(.$=3 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP
26.if \\n(.$=4 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4
27.if \\n(.$=5 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP
28.if \\n(.$=6 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6
29.if \\n(.$=7 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP
30.if \\n(.$=8 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP\\$8
.if \\n(.$=9 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP\\$8\*(Cw\\$9\fP
33..
34.SA 1
35.TL
36MIPS Extensions to DWARF Version 2.0
37.AF ""
.AU "Silicon Graphics Computer Systems"
39.PF "'\*(vE '- \\\\nP -''"
40.PM ""
41.AS 1
42This document describes the MIPS/Silicon Graphics extensions to the "DWARF
43Information Format" (version 2.0.0 dated July 27, 1993).
44DWARF3 draft 8 (or draft 9) is out as of 2005, and
45is mentioned below where applicable.
46MIPS/IRIX compilers emit DWARF2 (with extensions).
47.P
48Rather than alter the base documents to describe the extensions
49we provide this separate document.
50.P
51The extensions documented here are subject to change.
52.P
This document also describes known bugs resulting in incorrect dwarf usage.
54.P
55\*(vE
56.AE
57.MT 4
58.H 1 "INTRODUCTION"
59This document describes MIPS extensions to the DWARF
60debugging information format.
61The extensions documented here are subject to change at
62any time.
63.H 1 "64 BIT DWARF"
The DWARF2 spec has no provision for 64 bit offsets.
SGI-IRIX/MIPS Elf64 objects contain DWARF2 with all offsets
(and addresses) as 64bit values.
This non-standard extension was adopted in 1992.
Nothing in the dwarf itself identifies the dwarf as 64bit.
This 64bit-offset extension cannot be mixed with 32bit-offset dwarf
in a single object or executable, and SGI-IRIX/MIPS compilers
and tools do not mix the sizes.
72.P
73In 2001 DWARF3 adopted a very different 64bit-offset
74format which can be mixed usefully with 32bit-offset DWARF2 or DWARF3.
75It is not very likely SGI-IRIX/MIPS compilers will switch to the 
76now-standard
77DWARF3 64bit-offset scheme, but such a switch is theoretically
78possible and would be a good idea.
79.P
80SGI-IRIX/MIPS Elf32 objects
81contain DWARF2 with all offsets (and addresses) 32 bits.
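.P
As a rough illustration of the two conventions, the following C sketch
picks an offset size either from the ELF class (the SGI-IRIX/MIPS rule
described above) or from the DWARF3 initial-length escape value 0xffffffff.
The function and enum names are invented for this example and
are not part of libdwarf.
.nf
```c
#include <stdint.h>

/* ELF class values from the ELF specification (e_ident[EI_CLASS]). */
enum { ELF_CLASS_32 = 1, ELF_CLASS_64 = 2 };

/* SGI-IRIX/MIPS convention: the offset size follows the ELF class,
   because nothing in the DWARF2 data itself records it.
   Returns 0 for an unrecognized class. */
unsigned sgi_offset_size(int elf_class)
{
    if (elf_class == ELF_CLASS_64) return 8;
    if (elf_class == ELF_CLASS_32) return 4;
    return 0;
}

/* Standard DWARF3 convention, for contrast: a 32-bit initial length
   of 0xffffffff announces that the unit uses 64-bit offsets. */
unsigned dwarf3_offset_size(uint32_t initial_length)
{
    return initial_length == 0xffffffffu ? 8 : 4;
}
```
.fi
A consumer of IRIX objects would rely on the first function only, since
the dwarf data carries no marker of its own.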
82.H 1 "How much symbol information is emitted"
83The following standard DWARF V2 sections may be emitted:
84.AL
85.LI 
86 .debug_abbrev
87contains
88abbreviations supporting the .debug_info section.
89.LI
90 .debug_info
91contains
92Debug Information Entries (DIEs).
93.LI 
94 .debug_frame
95contains
96stack frame descriptions.
97.LI 
98 .debug_line
99contains
100line number information.
101.LI 
102 .debug_aranges
103contains
104address range descriptions.
105.LI 
106 .debug_pubnames
107contains
108names of global functions and data.
109.P
110The following
111are MIPS extensions.
These were created to allow debuggers to
113know names without having to look at
114the .debug_info section.
115.LI 
116 .debug_weaknames
117is a MIPS extension
118containing .debug_pubnames-like entries describing weak
119symbols.
120.LI 
121 .debug_funcnames
122is a MIPS extension
123containing .debug_pubnames-like entries describing file-static
124functions (C static functions).
125The gcc extension of nested subprograms (like Pascal)
126adds non-global non-static functions.  These should be treated like
127static functions and gcc should add such to this section
128so that IRIX libexc(3C) will work correctly.
129Similarly, Ada functions which are non-global should be here too
130so that libexc(3C) can work.
131Putting it another way, every function (other than inline code)
132belongs either in .debug_pubnames or in .debug_funcnames
133or else libexc(3C) cannot find the function name.
134.LI 
135 .debug_varnames
136is a MIPS extension
137containing .debug_pubnames-like entries describing file-static
138data symbols (C static variables).
139.LI 
140 .debug_typenames
141is a MIPS extension
142containing .debug_pubnames-like entries describing file-level
143types.
144.P
145The following are not currently emitted.
146.LI 
147 .debug_macinfo
148Macro information is not currently emitted.
149.LI 
150 .debug_loc
151Location lists are not currently emitted.
152.LI
153 .debug_str
154The string section is not currently emitted.
155.LE
156.H 2 "Overview of information emitted"
157We emit debug information in 3 flavors.
158We mention C here.
159The situation is essentially identical for f77, f90, and C++.
160.AL
161.LI 
162"default C"
We emit line information and DIEs for each subprogram,
but no local symbols and no type information.
165Frame information is output.
166The DW_AT_producer string has the optimization level: for example
167"-O2".
168We put so much in the DW_AT_producer that the string
169is a significant user of space in .debug_info --
170this is perhaps a poor use of space.
When optimizing, the IRIX CC/cc option -DEBUG:optimize_space
eliminates such wasted space.
Debuggers currently use only the absence of -g
in DW_AT_producer
as a hint as to how a 'step' command should be interpreted, and
the rest of the string is not used for anything (unless
a human looks at it for some reason), so if space-on-disk
is an issue, it is quite appropriate to use -DEBUG:optimize_space
and save disk space.
180Every function definition (not inline instances though) is mentioned
181in either .debug_pubnames or .debug_funcnames.
182This is crucial to allow libexc(3C) stack-traceback to work and
183show function names (for all languages).
184.LI 
185"C with full symbols"
186All possible info is emitted.
187DW_AT_producer string has all options that might be of interest,
188which includes -D's, -U's, and the -g option.
189These options look like they came from the command line.
We put so much in the DW_AT_producer that the string
is a significant user of space in .debug_info;
this is perhaps a poor use of space.
Debuggers currently use only the -g
in DW_AT_producer
195as a hint as to how a 'step' command should be interpreted, and
196the rest of the string is not used for anything (unless
197a human looks at it for some reason).
198Every function definition (not inline instances though) is mentioned
199in either .debug_pubnames or .debug_funcnames.
200This is crucial to allow libexc(3C) stack-traceback to work and
201show function names (for all languages).
202.LI 
203"Assembler (-g, non -g are the same)"
204Frame information is output.
205No type information is emitted, but DIEs are prepared
206for globals.
207.LE
208.H 2 "Detecting 'full symbols' (-g)"
209The debugger depends on the existence of
210the DW_AT_producer string to determine if the
211compilation unit has full symbols or not.
It looks for -g or -g[123] and accepts these as
full symbols, but an absent -g or a present -g0
is taken to mean that only basic symbols are defined and there
are no local symbols and no type information.
216.P
217In various contexts the debugger will think the program  is
218stripped or 'was not compiled with -g' unless the -g
219is in the DW_AT_producer string.
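.P
The test described above might be sketched in C as follows.
This is only an illustration: we assume the options in the producer
string are separated by blanks, and the function name is hypothetical
(it is not the debugger's actual code).
.nf
```c
#include <string.h>

/* Returns 1 if the DW_AT_producer string carries -g, -g1, -g2 or -g3
   as a separate blank-delimited option.
   Returns 0 for -g0 or when no -g option is present. */
int producer_has_full_symbols(const char *producer)
{
    const char *p = producer;

    while (p != NULL && *p != 0) {
        size_t len;

        while (*p == ' ') p++;     /* skip blanks between options */
        len = strcspn(p, " ");     /* length of this option */
        if (len == 2 && p[0] == '-' && p[1] == 'g') return 1;
        if (len == 3 && p[0] == '-' && p[1] == 'g' &&
            p[2] >= '1' && p[2] <= '3') return 1;
        p += len;
    }
    return 0;
}
```
.fi
A missing DW_AT_producer attribute can be passed as a null pointer and
is treated like an absent -g.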
220.H 2 "DWARF and strip(1)"
221The DWARF section ".debug_frame" is marked SHF_MIPS_NOSTRIP
222and is not stripped by the strip(1) program.
223This is because the section is needed for doing
224stack back traces (essential for C++
225and Ada exception handling).
226.P
227All .debug_* sections are marked with elf type
228SHT_MIPS_DWARF.
229Applications needing to access the various DWARF sections
230must use the section name to discriminate between them. 
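.P
A minimal sketch of such name-based discrimination follows.
The SHT_MIPS_DWARF value is the one found in common MIPS ELF headers;
the enum and function names are made up for this example.
.nf
```c
#include <string.h>

#define SHT_MIPS_DWARF 0x7000001eu  /* from the MIPS ELF ABI headers */

enum dwarf_section {
    SEC_NOT_DWARF,
    SEC_INFO,
    SEC_ABBREV,
    SEC_FRAME,
    SEC_LINE,
    SEC_OTHER_DWARF
};

/* The section type only says "this is DWARF";
   the section name says which DWARF section it is. */
enum dwarf_section classify_section(unsigned long sh_type,
                                    const char *name)
{
    if (sh_type != SHT_MIPS_DWARF) return SEC_NOT_DWARF;
    if (strcmp(name, ".debug_info") == 0) return SEC_INFO;
    if (strcmp(name, ".debug_abbrev") == 0) return SEC_ABBREV;
    if (strcmp(name, ".debug_frame") == 0) return SEC_FRAME;
    if (strcmp(name, ".debug_line") == 0) return SEC_LINE;
    return SEC_OTHER_DWARF;
}
```
.fi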
231
232.H 2 "Evaluating location expressions"
233When the debugger evaluates location expressions, it does so
234in 2 stages. In stage one it simply looks for the trivial
235location expressions and treats those as special cases.
236.P
237If the location expression is not trivial, it enters stage two.
238In this case it uses a stack to evaluate the expression.
239.P
240If the application is a 32-bit application, it does the operations
241on 32-bit values (address size values).  Even though registers
242can be 64 bits in a 32-bit program all evaluations are done in 
24332-bit quantities, so an attempt to calculate a 32-bit quantity
244by taking the difference of 2 64-bit register values will not
245work.  The notion is that the stack machine is, by the dwarf
246definition, working in address size units.
247.P
These values are then expanded to 64-bit values (addresses or
offsets).   This expansion does not involve sign-extension.
250.P
251If the application is a 64-bit application, then the stack
252values are all 64 bits and all operations are done on 64 bits.
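.P
The 32-bit case can be pictured with the following C sketch.
It assumes that 'does not involve sign-extension' means the upper 32 bits
of the widened result are zero; the function names are hypothetical.
.nf
```c
#include <stdint.h>

/* Value placed on the expression stack of a 32-bit application when
   a 64-bit register is read: only the low 32 bits survive. */
uint32_t push_register_32bit_app(uint64_t register_value)
{
    return (uint32_t)(register_value & 0xffffffffu);
}

/* Final result widened to 64 bits with no sign-extension. */
uint64_t widen_result(uint32_t stack_value)
{
    return (uint64_t)stack_value;
}
```
.fi
For example, a register holding 0x0000000100000010 contributes only 0x10
to the evaluation, and a 32-bit result of 0xfffffff0 widens to
0x00000000fffffff0.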
253.H 3 "The fbreg location op"
254Compilers shipped with IRIX 6.0 and 6.1
255do not emit the fbreg location expression
256and never emit the DW_AT_frame_base attribute that it
257depends on. 
However, this changed
with release 6.2, and both are now emitted routinely.
260
261.H 1 "Frame Information"
262.H 2 "Initial Instructions"
263The DWARF V2 spec 
264provides for "initial instructions" in each CIE (page 61,
265section 6.4.1).
266However, it does not say whether there are default
267values for each column (register).
268.P
269Rather than force every CIE to have a long list
270of bytes to initialize all 32 integer registers,
271we define that the default values of all registers
272(as returned by libdwarf in the frame interface)
273are 'same value'.
274This is a good choice for many non-register-windows
275implementations.
276.H 2 "Augmentation string in debug_frame"
The augmentation string we use in shipped compilers (up thru
irix6.2) is the empty string.
IRIX6.2 and later use an augmentation string that is
the empty string (""),
"z", or "mti v1",
where the "v1" is a version number (version 1).
283.P
284We do not believe that "mti v1" was emitted  as the
285augmentation string in any shipped compiler.
286.P
287.H 3 "CIE processing based on augmentation string:"
If the augmentation string begins with 'z', then it is followed
immediately by an unsigned_leb_128 number giving the code alignment factor.
Next is a signed_leb_128 number giving the data alignment factor.
Next is an unsigned byte giving the number of the return address register.
Next is an unsigned_leb_128 number giving the length of the 'augmentation'
fields (the length of augmentation bytes, not
including the unsigned_leb_128 length itself).
As of release 6.2, the length of the CIE augmentation fields is 0.
What this means is that it is possible to add new
augmentations, z1, z2, etc and yet allow an old consumer to
understand the entire CIE, as it can bypass the
augmentation it does not understand because the
length of the augmentation fields is present.
This presumes, of course, that all augmentation fields are
simply additional information,
not some 'changing of the meaning of
an existing field'.
Currently there is no CIE data in the augmentation for things
beginning with 'z'.
307.P
If the augmentation string is "mti v1" or "" then it is followed
immediately by an unsigned_leb_128 number giving the code alignment factor.
Next is a signed_leb_128 number giving the data alignment factor.
Next is an unsigned byte giving the number of the return address register.
312.P
If the augmentation string is something else, then the
code alignment factor is assumed to be 4, the data alignment
factor is assumed to be -1, and the return
address register is assumed to be 31. These values are arbitrary.
The library (libdwarf) assumes it does not understand the rest of the CIE.
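.P
The three cases above can be sketched in C roughly as follows.
This is not the libdwarf source: the struct, the function names and the
absence of bounds checking are simplifications made for this example,
and the input is assumed to point at the augmentation string of a
well-formed CIE.
.nf
```c
#include <stdint.h>
#include <string.h>

struct cie_fields {
    uint64_t code_align;
    int64_t  data_align;
    unsigned return_reg;
    uint64_t aug_len;     /* meaningful only for strings starting with z */
    int      understood;  /* 0 when the default values were assumed */
};

static uint64_t read_uleb(const unsigned char **pp)
{
    uint64_t result = 0;
    unsigned shift = 0;
    unsigned char byte;

    do {
        byte = *(*pp)++;
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return result;
}

static int64_t read_sleb(const unsigned char **pp)
{
    uint64_t result = 0;
    unsigned shift = 0;
    unsigned char byte;

    do {
        byte = *(*pp)++;
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;   /* sign-extend */
    return (int64_t)result;
}

/* aug points at the NUL-terminated augmentation string in the CIE. */
struct cie_fields parse_cie_tail(const unsigned char *aug)
{
    struct cie_fields f = { 4, -1, 31, 0, 0 };  /* unknown-string defaults */
    const char *s = (const char *)aug;
    const unsigned char *p = aug + strlen(s) + 1;
    int is_z = (s[0] == 'z');

    if (!is_z && strcmp(s, "") != 0 && strcmp(s, "mti v1") != 0)
        return f;                      /* rest of CIE not understood */
    f.code_align = read_uleb(&p);
    f.data_align = read_sleb(&p);
    f.return_reg = *p++;
    if (is_z)
        f.aug_len = read_uleb(&p);     /* lets a consumer skip the fields */
    f.understood = 1;
    return f;
}
```
.fi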
318.P
319.H 3 "FDE processing based on augmentation"
320If the CIE augmentation string 
321for an fde begins with 'z'
322then the next FDE field after the address_range field
323is an 
324unsigned_leb_128 number giving the length of the 'augmentation'
325fields, and those fields follow immediately.
326
327.H 4 "FDE augmentation fields"
328.P
329If the CIE augmentation string is "mti v1" or ""
330then the FDE is exactly as described in the Dwarf Document section 6.4.1.
331.P
332Else, if the CIE augmentation string begins with "z"
333then the next field after the FDE augmentation length field
334is a Dwarf_Sword size offset into
335exception tables.
336If the CIE augmentation string does not begin with "z"
337(and is neither "mti v1" nor "")
338the FDE augmentation fields are skipped (not understood).
339Note that libdwarf actually (as of MIPSpro7.3 and earlier)
340only tests that the initial character of the augmentation
341string is 'z', and ignores the rest of the string, if any.
342So in reality the test is for a _prefix_ of 'z'.
343.P
344If the CIE augmentation string neither starts with 'z' nor is ""
345nor is "mti v1" then libdwarf (incorrectly) assumes that the
346table defining instructions start next.  
347Processing (in libdwarf) will be incorrect.
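.P
The FDE rules above reduce to a small decision on the CIE augmentation
string, sketched here in C.
The enum and function names are invented for illustration; the prefix
test on 'z' mirrors the libdwarf behavior noted above.
.nf
```c
#include <string.h>

enum fde_layout {
    FDE_STANDARD,        /* exactly as in DWARF2 section 6.4.1 */
    FDE_HAS_AUG_LENGTH,  /* uleb128 length follows address_range */
    FDE_NOT_UNDERSTOOD   /* augmentation fields cannot be decoded */
};

enum fde_layout fde_layout_for(const char *cie_augmentation)
{
    /* Only the first character is tested, so z1, z2 etc also match. */
    if (cie_augmentation[0] == 'z')
        return FDE_HAS_AUG_LENGTH;
    if (strcmp(cie_augmentation, "") == 0 ||
        strcmp(cie_augmentation, "mti v1") == 0)
        return FDE_STANDARD;
    return FDE_NOT_UNDERSTOOD;
}
```
.fi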
348.H 2 "Stack Pointer recovery from debug_frame"
349There is no identifiable means in
350DWARF2 to say that the stack register is
351recovered by any particular operation.
352A 'register rule' works if the caller's
353stack pointer was copied to another
354register.
355An 'offset(N)' rule works if the caller's
356stack pointer was stored on the stack.
357However if the stack pointer is
358some register value plus/minus some offset,
359there is no means to say this in an FDE.
360For MIPS/IRIX, the recovered stack pointer
361of the next frame up the stack (towards main())
362is simply the CFA value of the current
363frame, and the CFA value is
364precisely a register (value of a register)
365or a register plus offset (value of a register
366plus offset).  This is a software convention.
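.P
Under this convention an unwinder needs nothing beyond the CFA rule to
recover the caller's stack pointer.
A minimal C sketch, with a hypothetical struct and function name and
with the register file modeled as a plain array:
.nf
```c
#include <stdint.h>

/* CFA rule: the value of a register plus a (possibly zero) offset. */
struct cfa_rule {
    unsigned reg;
    int64_t  offset;
};

/* MIPS/IRIX software convention: the stack pointer of the next frame
   up the stack is simply the CFA of the current frame. */
uint64_t caller_stack_pointer(const struct cfa_rule *rule,
                              const uint64_t *regs)
{
    return regs[rule->reg] + (uint64_t)rule->offset;
}
```
.fi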
367.H 1 "egcs dwarf extensions (egcs-1.1.2 extensions)"
368This and following egcs sections describe
369the extensions currently shown in egcs dwarf2.
370Note that egcs has chosen to adopt tag and
371attribute naming as if their choices were
372standard dwarf, not as if they were extensions.
373However, they are properly numbered as extensions.
374
375.H 2 "DW_TAG_format_label             0x4101" 
376For FORTRAN 77, Fortran 90.
377Details of use not defined in egcs source, so
378unclear if used.
379.H 2 "DW_TAG_function_template        0x4102"
380For C++.
381Details of use not defined in egcs source, so
382unclear if used.
383.H 2 "DW_TAG_class_template           0x4103" 
384For C++.
385Details of use not defined in egcs source, so
386unclear if used.
387.H 2 "DW_AT_sf_names                          0x2101"
388Apparently only output in DWARF1, not DWARF2.
389.H 2 "DW_AT_src_info                          0x2102"
390Apparently only output in DWARF1, not DWARF2.
391.H 2 "DW_AT_mac_info                          0x2103"
392Apparently only output in DWARF1, not DWARF2.
393.H 2 "DW_AT_src_coords                        0x2104"
394Apparently only output in DWARF1, not DWARF2.
395.H 2 "DW_AT_body_begin                        0x2105"
396Apparently only output in DWARF1, not DWARF2.
397.H 2 "DW_AT_body_end                          0x2106"
398Apparently only output in DWARF1, not DWARF2.
399
400.H 1 "egcs .eh_frame (non-sgi) (egcs-1.1.2 extensions)"
401egcs-1.1.2 (and earlier egcs)
402emits by default a section named .eh_frame 
403for ia32 (and possibly other platforms) which
404is nearly identical to .debug_frame in format and content.
405This section is used for helping handle C++ exceptions.
406.P
407Because after linking there are sometimes zero-ed out bytes
408at the end of the eh_frame section, the reader code in
409dwarf_frame.c considers a zero cie/fde length as an indication
410that it is the end of the section.
411.P
412.H 2 "CIE_id 0"
413The section is an ALLOCATED section in an executable, and
414is therefore mapped into memory at run time.
415The CIE_pointer (aka CIE_id, section 6.4.1
416of the DWARF2 document) is the field that 
417distinguishes a CIE from an FDE.
418The designers of the egcs .eh_frame section
419decided to make the CIE_id
420be 0 as the CIE_pointer definition is
421.in +2
422the number of bytes from the CIE-pointer in the FDE back to the
423applicable CIE.
424.in -2
In a dwarf .debug_frame section, the CIE_pointer is the
offset in .debug_frame of the CIE for this fde, and
since the offset of some CIE can be zero, the CIE_id
cannot be 0, but must be all 1 bits.
429Note that the dwarf2.0 spec does specify the value of CIE_id
430as 0xffffffff
431(see section 7.23 of v2.0.0),
432though earlier versions of this extensions document
433incorrectly said it was not specified in the dwarf
434document.
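.P
A reader handling both sections therefore needs a section-dependent test
to tell a CIE from an FDE.
The following C sketch covers 32bit-offset data only; the names are ours
and are not taken from dwarf_frame.c.
.nf
```c
#include <stdint.h>

enum frame_section { SECTION_DEBUG_FRAME, SECTION_EH_FRAME };

/* id_field is the word following the length in a CIE or FDE. */
int entry_is_cie(enum frame_section section, uint32_t id_field)
{
    if (section == SECTION_EH_FRAME)
        return id_field == 0;            /* egcs convention */
    return id_field == 0xffffffffu;      /* DWARF2 CIE_id, all 1 bits */
}

/* A zero length is taken as the end of the eh_frame section. */
int length_ends_eh_frame(uint32_t length)
{
    return length == 0;
}
```
.fi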
435.H 2 "augmentation eh"
436The augmentation string in each CIE is "eh"
437which, with its following NUL character, aligns
438the following word to a 32bit boundary.
439Following the augmentation string is a 32bit
440word with the address of the __EXCEPTION_TABLE__,
441part of the exception handling data for egcs.
442.H 2 "DW_CFA_GNU_window_save   0x2d"
443This is effectively a flag for architectures with
444register windows, and tells the unwinder code that
445it must look to a previous frame for the
446correct register window set.
447As of this writing, egcs  gcc/frame.c
448indicates this is for SPARC register windows.
449.H 2 "DW_CFA_GNU_args_size     0x2e"
450DW_CFA_GNU_args_size has a single uleb128 argument
451which is the size, in bytes, of the function's stack
452at that point in the function.
453.H 2 "__EXCEPTION_TABLE__"
Each entry is, by default, a series of three 32bit words:
.nf
word 0: low pc address
word 1: high pc address
word 2: pointer to exception handler code
.fi
The end of the table is
signaled by 2 words of -1 (not 3 words!).
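.P
A walk over a table of this shape might look like the following C sketch.
Whether the high pc is inclusive or exclusive is not stated here, so the
sketch assumes it is exclusive; the function name is hypothetical.
.nf
```c
#include <stdint.h>

/* Returns the handler word of the first entry covering pc,
   or 0 if the terminator is reached first.
   Entries are 3 words each; 2 words of -1 end the table. */
uint32_t find_handler(const uint32_t *table, uint32_t pc)
{
    const uint32_t *entry = table;

    while (!(entry[0] == 0xffffffffu && entry[1] == 0xffffffffu)) {
        if (pc >= entry[0] && pc < entry[1])
            return entry[2];
        entry += 3;
    }
    return 0;
}
```
.fi
Because the terminator is only 2 words long, the loop must test it before
touching the third word of an entry.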
460.H 1 "Interpretations of the DWARF V2 spec"
461.H 2 "template TAG spellings"
The DWARF V2 spec spells two tags in two ways.
463DW_TAG_template_type_param 
464(listed in Figure 1, page 7)
465is spelled DW_TAG_template_type_parameter
466in the body of the document (section 3.3.7, page 28).
467We have adopted the spelling 
468DW_TAG_template_type_param.
469.P
470DW_TAG_template_value_param
471(listed in Figure 1, page 7)
472is spelled DW_TAG_template_value_parameter
473in the body of the document (section 3.3.7, page 28).
474We have adopted the spelling 
475DW_TAG_template_value_parameter.
476.P
477We recognize that the choices adopted are neither consistently
478the longer nor the shorter name.
479This inconsistency was an accident.
.H 2 "DW_FORM_ref_addr confusing"
481Section 7.5.4, Attribute Encodings, describes
482DW_FORM_ref_addr.
483The description says the reference is the size of an address
484on the target architecture.
485This is surely a mistake, because on a 16bit-pointer-architecture
486it would mean that the reference could not exceed
48716 bits, which makes only
488a limited amount of  sense as the reference is from one
489part of the dwarf to another, and could (in theory)
490be *on the disk* and not limited to what fits in memory.
Since MIPS pointers are 32 bits (at the smallest),
the restriction is not a problem for MIPS/SGI.
493The 32bit pointer ABIs are limited to 32 bit section sizes
494anyway (as a result of implementation details).
495And the 64bit pointer ABIs currently have the same limit
496as a result of how the compilers and tools are built
497(this has not proven to be a limit in practice, so far).
498.P
499This has been clarified in the DWARF3 spec and the IRIX use
500of DW_FORM_ref_addr being an offset is correct.
501.H 2 ".debug_macinfo in a debugger"
502It seems quite difficult, in general, to
503tie specific text(code) addresses to points in the
504stream of macro information for a particular compilation unit.
505So it's been difficult to see how to design a consumer
506interface to libdwarf for macro information.
507.P
The best (simple to implement, easy for a debugger user to
understand) candidate seems to be that
the debugger asks for macros of a given name in a compilation
unit, and libdwarf responds with *all* the macros of that name.
512.H 3 "only a single choice exists"
If there is exactly one macro of the given name, it is usable in
expressions (if the debugger is able to evaluate such).
.H 3 "multiple macros with same name"
516If there are multiple macros with the same name
517in a compilation unit, 
518the debugger (and the debugger user and the application
519programmer) have
520a problem: confusion is quite possible.
521If the macros are simple the
522debugger  user can simply substitute by hand in an expression.
523If the macros are complicated hand substitution will be
524impractical, and the debugger will have to identify the
525choices and let the debugger user choose an interpretation.
526.H 2 "Section 6.1.2 Lookup by address problem"
527Each entry is a beginning-address followed by a length.
528And the distinguished entry 0,0 is used to denote
529the end of a range of entries.
530.P
531This means that one must be careful not to emit a zero length,
532as in a .o (object file) the beginning address of
533a normal entry might be 0 (it is a section offset after all),
534and the resulting 0,0 would be taken as end-of-range, not
535as a valid entry.
536A dwarf dumper would have trouble with such data
537in an object file.
538.P
In an a.out or shared object (dynamic shared object, DSO)
no text will be at address zero, so in such objects this problem does
not arise.
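.P
The hazard is easy to reproduce with a reader that stops at the first
0,0 pair, as in this C sketch (the struct and function names are
illustrative only):
.nf
```c
#include <stdint.h>
#include <stddef.h>

struct arange {
    uint64_t start;
    uint64_t length;
};

/* Counts entries up to, but not including, the 0,0 terminator. */
size_t count_aranges(const struct arange *list)
{
    size_t n = 0;

    while (!(list[n].start == 0 && list[n].length == 0))
        n++;
    return n;
}
```
.fi
If a producer emits a zero-length entry at section offset 0 in the middle
of a list, this reader returns early and never sees the entries after it.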
542.H 2 "Section 5.10 Subrange Type Entries problem"
543It is specified that  DW_AT_upper_bound (and lower bound)
544must be signed entries if there is no object type
545info to specify the bound type (Sec 5.10, end of section). 
546One cannot tell (with some
547dwarf constant types) what the signedness is from the
548form itself (like DW_FORM_data1), so it is necessary
549to determine the object and type according to the rules 
550in 5.10 and then if all that fails, the type is signed.
551It's a bit complicated and earlier versions of  mips_extensions
552incorrectly said signedness was not defined.
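.P
As a small illustration of why the signedness matters, the following C
sketch decodes a one byte DW_FORM_data1 bound both ways.
The function name and the flag argument are assumptions of this example;
a real consumer would derive the flag from the rules in section 5.10.
.nf
```c
#include <stdint.h>

/* raw is the single byte of a DW_FORM_data1 bound.
   type_is_unsigned is nonzero only when the section 5.10 rules
   found an unsigned type; otherwise the bound is signed. */
int64_t bound_from_data1(uint8_t raw, int type_is_unsigned)
{
    if (type_is_unsigned || raw < 0x80)
        return (int64_t)raw;
    return (int64_t)raw - 256;   /* two's complement, 8 bits */
}
```
.fi
The same byte 0xff thus yields 255 when the type is known to be unsigned
and -1 in the default (signed) case.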
553.H 2 "Section 5.5.6 Class Template Instantiations problem"
554Lots of room for implementor to canonicalize
555template declarations.  Ie various folks won't agree.
556This is not serious since a given compiler
557will be consistent with itself and  debuggers
558will have to cope!
559.H 2 "Section 2.4.3.4  # 11. operator spelling"  
DW_OP_add should be DW_OP_plus (page 14)
(this mistake occurs in just one place on the page).
562.H 2 "No clear specification of C++ static funcs"
563There is no clear way to tell if a C++ member function
564is a static member or a non-static member function.
565(dwarf2read.c in gdb 4.18, for example, has this observation)
566.H 2 "Misspelling of DW_AT_const_value"
567Twice in appendix 1, DW_AT_const_value is misspelled
568as DW_AT_constant_value.
.H 2 "Mistake in Attribute Encodings"
570Section 7.5.4, "Attribute Encodings"
571has a brief discussion of "constant"
572which says there are 6 forms of constants.
573It is incorrect in that it fails to mention (or count)
574the block forms, which are clearly allowed by
575section 4.1 "Data Object Entries" (see entry number 9 in
576the numbered list, on constants).
577.H 2 "DW_OP_bregx"
578The description of DW_OP_bregx in 2.4.3.2 (Register Based
579Addressing) is slightly misleading, in that it
580lists the offset first.
581As section 7.7.1 (Location Expression)
582makes clear, in the encoding the register number
583comes first.
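.P
A decoder following the section 7.7.1 order reads the register number
first and the offset second, as in this C sketch.
The struct and function names are ours; the input is assumed to be well
formed, with operands small enough to fit in 64 bits.
.nf
```c
#include <stdint.h>
#include <stddef.h>

#define DW_OP_bregx 0x92   /* opcode value from the DWARF2 spec */

struct bregx {
    uint64_t reg;
    int64_t  offset;
};

/* Decodes one DW_OP_bregx operation starting at p.
   Returns the number of bytes consumed, or 0 if p does not
   start with DW_OP_bregx. */
size_t decode_bregx(const unsigned char *p, struct bregx *out)
{
    const unsigned char *start = p;
    uint64_t value = 0;
    unsigned shift = 0;
    unsigned char byte;

    if (*p++ != DW_OP_bregx) return 0;

    /* First operand: register number, unsigned LEB128. */
    do {
        byte = *p++;
        value |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    out->reg = value;

    /* Second operand: offset, signed LEB128. */
    value = 0;
    shift = 0;
    do {
        byte = *p++;
        value |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40))
        value |= ~(uint64_t)0 << shift;   /* sign-extend */
    out->offset = (int64_t)value;

    return (size_t)(p - start);
}
```
.fi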
584.H 1 "MIPS attributes"
585.H 2 "DW_AT_MIPS_fde"
586This extension to Dwarf appears only on subprogram TAGs and has as
587its value the offset, in the .debug_frame section, of the fde which
588describes the frame of this function.  It is an optimization of
589sorts to have this present.
590
591.H 2 "DW_CFA_MIPS_advance_loc8 0x1d"
This obvious extension to the dwarf call frame instructions enables encoding of 8 byte
advance_loc values (for cases when such must be relocatable,
and thus must be full length).  Applicable only to 64-bit objects.
595
596.H 2 "DW_TAG_MIPS_loop        0x4081"
597For future use. Not currently emitted.
598Places to be emitted and attributes that this might own
599not finalized.
600
601.H 2 "DW_AT_MIPS_loop_begin   0x2002"
602For future use. Not currently emitted.
603Attribute form and content not finalized.
604
605.H 2 "DW_AT_MIPS_tail_loop_begin  0x2003"
606For future use. Not currently emitted.
607Attribute form and content not finalized.
608
609.H 2 "DW_AT_MIPS_epilog_begin     0x2004"
610For future use. Not currently emitted.
611Attribute form and content not finalized.
612
613.H 2 "DW_AT_MIPS_loop_unroll_factor  0x2005"
614For future use. Not currently emitted.
615Attribute form and content not finalized.
616
617.H 2 "DW_AT_MIPS_software_pipeline_depth   0x2006"
618For future use. Not currently emitted.
619Attribute form and content not finalized.
620.H 2 "DW_AT_MIPS_linkage_name                 0x2007"
621The rules for mangling C++ names are not part of the
622C++ standard and are different for different versions
623of C++.  With this extension, the compiler emits
624both the DW_AT_name for things with mangled names 
625(recall that DW_AT_name is NOT the mangled form)
626and also emits DW_AT_MIPS_linkage_name whose value
627is the mangled name.
628.P
629This makes looking for the mangled name in other linker
630information straightforward.
631It also is passed (by the debugger) to the
632libmangle routines to generate names to present to the
633debugger user.
634.H 2 "DW_AT_MIPS_stride            0x2008"
635F90 allows assumed shape arguments and pointers to describe
non-contiguous memory. A (runtime) descriptor contains address,
bounds and stride information; rank and element size are known
during compilation. The extent in each dimension is given by the
639bounds in a DW_TAG_subrange_type, but the stride cannot be
640represented in conventional dwarf. DW_AT_MIPS_stride was added as
641an attribute of a DW_TAG_subrange_type to describe the
642location of the stride.
643Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
644.P
645If the stride is constant (ie: can be inferred from the type in the
646usual manner) DW_AT_MIPS_stride is absent. 
647.P
648If DW_AT_MIPS_stride is present, the attribute contains a reference
649to a DIE which describes the location holding the stride, and the
650DW_AT_stride_size field of DW_TAG_array_type is ignored if
651present.  The value of the stride is the number of 
6524 byte words between
653elements along that axis.
654.P
655This applies to 
656.nf
a) Intrinsic types whose size is greater than
   or equal to 4 bytes, ie: real*4, integer*8,
   complex etc, but not character types.
660
661b) Derived types (ie: structs) of any size, 
662   unless all components are of type character.
663.fi
664
665.H 2 "DW_AT_MIPS_abstract_name              0x2009"
This attribute only appears in a DW_TAG_inlined_subroutine DIE.
667The value of this attribute is a string.
668When IPA inlines a routine and the abstract origin is
669in another compilation unit, there is a problem with putting
670in a reference, since the ordering and timing of the 
creation of references is unpredictable with reference to
672the DIE and compilation unit the reference refers to. 
673.P
674Since there may be NO ordering of the compilation units that
675allows a correct reference to be done without some kind of patching,
676and since even getting the information from one place to another
677is a problem, the compiler simply passes the problem on to the debugger.
678.P
679The debugger must match the DW_AT_MIPS_abstract_name 
680in the concrete
681inlined instance DIE 
682with the  DW_AT_MIPS_abstract_name
683in the abstract inlined subroutine DIE.
684.P
685A dwarf-consumer-centric view of this  and other inline
686issues could be expressed as follows:
687.nf
If DW_TAG_subprogram
  If has DW_AT_inline, is abstract instance root
  If has DW_AT_abstract_origin, is out-of-line instance
    of function (need abstract origin for some data)
    (abstract root in same CU (conceptually anywhere
    a ref can reach, but reaching outside of CU is
    a problem for ipa: see DW_AT_MIPS_abstract_name))
  If has DW_AT_MIPS_abstract_name, is abstract instance
    root (must have DW_AT_inline) and this name is used to
    match with the abstract root

If DW_TAG_inlined_subroutine
  Is concrete inlined subprogram instance.
  If has DW_AT_abstract_origin, it is a CU-local inline.
  If it has DW_AT_MIPS_abstract_name it is an
    inline whose abstract root is in another file (CU).
704.fi
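.P
The consumer-centric rules listed above could be coded along the
following lines.
This C sketch works on a simplified view of a DIE (three presence flags);
all type and function names are hypothetical and the order of the tests
is our reading of the list, not a statement of how any debugger does it.
.nf
```c
enum die_tag { TAG_SUBPROGRAM, TAG_INLINED_SUBROUTINE, TAG_OTHER };

struct die_view {
    enum die_tag tag;
    int has_inline;              /* DW_AT_inline present */
    int has_abstract_origin;     /* DW_AT_abstract_origin present */
    int has_mips_abstract_name;  /* DW_AT_MIPS_abstract_name present */
};

enum inline_role {
    ROLE_NONE,
    ROLE_ABSTRACT_ROOT,
    ROLE_OUT_OF_LINE_INSTANCE,
    ROLE_CU_LOCAL_INLINE,
    ROLE_CROSS_CU_INLINE
};

enum inline_role classify_die(const struct die_view *d)
{
    if (d->tag == TAG_SUBPROGRAM) {
        /* An abstract root carrying DW_AT_MIPS_abstract_name must
           also carry DW_AT_inline, so one test covers both cases. */
        if (d->has_inline) return ROLE_ABSTRACT_ROOT;
        if (d->has_abstract_origin) return ROLE_OUT_OF_LINE_INSTANCE;
        return ROLE_NONE;
    }
    if (d->tag == TAG_INLINED_SUBROUTINE) {
        if (d->has_abstract_origin) return ROLE_CU_LOCAL_INLINE;
        if (d->has_mips_abstract_name) return ROLE_CROSS_CU_INLINE;
    }
    return ROLE_NONE;
}
```
.fi
For a ROLE_CROSS_CU_INLINE DIE the debugger would then search other
compilation units for an abstract root with the same name string.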
705
706.H 2 "DW_AT_MIPS_clone_origin               0x200a"
707This attribute appears only in a cloned subroutine.
708The procedure is cloned from the same compilation unit.
709The value of this attribute is a reference to 
710the original routine in this compilation unit.
711.P
712The 'original' routine means the routine which has all the
713original code.  The cloned routines will always have
714been 'specialized' by IPA.   
715A routine with DW_AT_MIPS_clone_origin
716will also have the DW_CC_nocall value of the DW_AT_calling_convention
717attribute.
718
719.H 2 "DW_AT_MIPS_has_inlines               0x200b"
720This attribute  may appear in a DW_TAG_subprogram DIE.
721If present and it has the value True, then the subprogram
722has inlined functions somewhere in the body.
723.P
724By default, at startup, the debugger may not look for 
725inlined functions in scopes inside the outer function.
726.P
727This is a hint to the debugger to look for the inlined functions
728so the debugger can set breakpoints on these in case the user
729requests 'stop in foo' and foo is inlined.
730.H 2 "DW_AT_MIPS_stride_byte                  0x200c"
731Created for f90 pointer and assumed shape
732arrays.
733Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
734A variant of DW_AT_MIPS_stride. 
735This stride is interpreted as a byte count. 
736Used for integer*1 and character arrays
737and arrays of derived type
738whose components are all character.
739.H 2 "DW_AT_MIPS_stride_elem                  0x200d"
740Created for f90 pointer and assumed shape
741arrays.
742Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
743A variant of DW_AT_MIPS_stride. 
744This stride is interpreted as a byte-pair (2 byte) count. 
745Used for integer*2 arrays.
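.P
The three stride attributes differ only in the unit of the stride value,
which the following C sketch converts to bytes.
The attribute numbers are those given in the headings above; the function
name is hypothetical, and the value is assumed to have been fetched
already from the location the attribute refers to.
.nf
```c
#include <stdint.h>

#define DW_AT_MIPS_stride       0x2008  /* units of 4 byte words */
#define DW_AT_MIPS_stride_byte  0x200c  /* units of bytes */
#define DW_AT_MIPS_stride_elem  0x200d  /* units of 2 bytes */

/* Returns the distance in bytes between elements along one axis,
   or 0 if attr is not one of the three stride attributes. */
int64_t stride_in_bytes(unsigned attr, int64_t stride_value)
{
    switch (attr) {
    case DW_AT_MIPS_stride:      return stride_value * 4;
    case DW_AT_MIPS_stride_byte: return stride_value;
    case DW_AT_MIPS_stride_elem: return stride_value * 2;
    default:                     return 0;
    }
}
```
.fi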
746.H 2 "DW_AT_MIPS_ptr_dopetype                 0x200e"
747See following.
748.H 2 "DW_AT_MIPS_allocatable_dopetype         0x200f"
749See following.
750.H 2 "DW_AT_MIPS_assumed_shape_dopetype       0x2010"
751DW_AT_MIPS_assumed_shape_dopetype, DW_AT_MIPS_allocatable_dopetype,
752and DW_AT_MIPS_ptr_dopetype have an attribute value
753which is a reference to a Fortran 90 Dope Vector.
754These attributes are introduced in MIPSpro7.3.
755They only apply to f90 arrays (where they are
756needed to describe arrays never properly described
757before in debug information).
758C, C++, f77, and most f90 arrays continue to be described
759in standard dwarf.
760.P
761The distinction between these three attributes is the f90 syntax
762distinction: keywords 'pointer' and 'allocatable' with the absence
763of these keywords on an assumed shape array being the third case.
764.P
765A "Dope Vector" is a struct (C struct) which describes
766a dynamically-allocatable array.
767In objects with full debugging the C struct will be
768in the dwarf information (of the f90 object, represented like C).
769A debugger will use the link to find the main struct DopeVector
770and will use that information to decode the dope vector.
771At the outer allocatable/assumed-shape/pointer
772the DW_AT_location points at the dope vector (so debugger
773calculations use that as a base).
774.H 2 "Overview of debugger use of dope vectors"
775Fundamentally, we build two distinct
776representations of the arrays and pointers.
777One, in dwarf, represents the statically-representable
778information (the types and
779variable/type-names, without type size information).
780The other, using dope vectors in memory, represents
781the run-time data of sizes.
782A debugger must process the two representations
783in parallel (and merge them) to deal with  user expressions in
784a debugger.
785.H 2 "Example f90 code for use in explanation"
[Note:
787We want dwarf output with *exactly* 
788this little (arbitrary) example.
789Not yet available.
790end Note]
791Consider the following code.
792.nf
793       type array_ptr
794	  real   :: myvar
795          real, dimension (:), pointer :: ap
796       end type array_ptr
797
798       type (array_ptr), allocatable, dimension (:) :: arrays
799
800       allocate (arrays(20))
801       do i = 1,20
802          allocate (arrays(i)%ap(i))
803       end do
804.fi
805arrays is an allocatable array (1 dimension) whose size is
806not known at compile time (it has
807a Dope Vector).  At run time, the
allocate statement creates 20 array_ptr dope vectors
809and marks the base arrays dopevector as allocated.
810The myvar variable is just there to add complexity to
811the example :-)
812.nf
813In the loop, arrays(1)%ap(1) 
814    is allocated as a single element array of reals.
815In the loop, arrays(2)%ap(2) 
816    is allocated as an array of two reals.
817...
818In the loop, arrays(20)%ap(20) 
819    is allocated as an array of twenty reals.
820.fi
821.H 2 "the problem with standard dwarf and this example"
822.sp
In dwarf, there is no way to find the array bounds of arrays(3)%ap,
for example (they are 1:3 in f90 syntax).
A location expression in a lower bound attribute of the ap array
cannot involve the 3: the 3 is known only at debug time and
does not appear in the running binary, so the
location expression has no way to get to it.
829And of course the 3 must actually index across the array of
830dope vectors in 'arrays' in our implementation, but that is less of
831a problem than the problem with the '3'.
832.sp
833Plus dwarf has no way to find the 'allocated' flag in the
834dope vector (so the debugger can know when the allocate is done
835for a particular arrays(j)%ap).
836.sp
837Consequently, the calculation of array bounds and indices
838for these dynamically created f90 arrays
is now pushed off into the debugger, which must know the
840field names and usages of the dope vector C structure and
841use the field offsets etc to find data arrays.
842C, C++, f77, and most f90 arrays continue to be described
843in standard dwarf.
844At the outer allocatable/assumed-shape/pointer
845the DW_AT_location points at the dope vector (so debugger
846calculations use that as a base).
847.P
848It would have been nice to design a dwarf extension
849to handle the above problems, but
850the methods considered to date were not
851any more consistent with standard dwarf than
852this dope vector centric approach: essentially just
853as much work in the debugger appeared necessary either way.
854A better (more dwarf-ish) 
855design would be welcome information.
856
857.H 2 "A simplified sketch of the dwarf information"
858[Note:
859Needs to be written.
860end Note]
861
862.H 2 "A simplified sketch of the dope vector information"
863[Note:
864This one is simplified.
865Details left out that should be here. Amplify.
866end Note]
867This is an overly simplified version of a dope vector,
868presented as an initial hint.
869Full details presented later.
870.nf
871struct simplified{
872  void *base; // pointer to the data this describes
873  long  el_len;
  int   assoc:1;
  int   ptr_alloc:1;
876  int   num_dims:3;
877  struct dims_s {
878    long lb;
879    long ext;
880    long str_m;
881  } dims[7];
882};
883.fi
884Only 'num_dims' elements of dims[] are actually used.
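.P
A minimal sketch of how a debugger might turn one dims[] entry
into f90 style bounds follows.
It assumes that 'ext' is the extent (the number of elements) of the
dimension, which this document does not yet confirm; the struct and
function names are hypothetical.
Under that assumption, arrays(3)%ap in the example, with lb 1 and ext 3,
would come out as 1:3.
.nf
```c
/* Hypothetical in-debugger copy of one dims[] entry; the field
 * names follow the simplified struct above. */
struct dv_dim {
    long lb;     /* lower bound */
    long ext;    /* ASSUMPTION: extent, i.e. number of elements */
    long str_m;  /* meaning not documented here; unused below */
};

/*
 * Compute the inclusive bounds lb:ub of dimension 'd' (0-based).
 * Only the first 'num_dims' entries of dims[] are in use.
 * Returns 0 on success, -1 if 'd' is not a dimension in use.
 */
int dv_bounds(const struct dv_dim *dims, int num_dims, int d,
              long *lb_out, long *ub_out)
{
    if (num_dims < 0 || num_dims > 7 || d < 0 || d >= num_dims)
        return -1;
    *lb_out = dims[d].lb;
    *ub_out = dims[d].lb + dims[d].ext - 1;
    return 0;
}
```
.fi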
885
886.H 2 "The dwarf information"
887
888Here is dwarf information from the compiler for
889the example above, as printed by dwarfdump(1)
890.nf
891[Note:
892The following may not be the test.
893Having field names with '.' in the name is 
894not such a good idea, as it conflicts with the
895use of '.' in dbx extended naming.
896Something else, like _$, would be much easier
897to work with in dbx (customers won't care about this,
898for the most part, 
899but folks working on dbx will, and in those
900rare circumstances when a customer cares,
901the '.' will be a real problem in dbx.).
902Note that to print something about .base., in dbx one
903would have to do
904	whatis `.base.`
905where that is the grave accent, or back-quote I am using.
With extended naming one would do
907	whatis `.dope.`.`.base.`
908which is hard to type and hard to read.
909end Note]
910
911<2><  388>      DW_TAG_array_type
912                DW_AT_name                  .base.
913                DW_AT_type                  <815>
914                DW_AT_declaration           yes(1)
915<3><  401>      DW_TAG_subrange_type
916                DW_AT_lower_bound           0
917                DW_AT_upper_bound           0
918<2><  405>      DW_TAG_pointer_type
919                DW_AT_type                  <388>
920                DW_AT_byte_size             4
921                DW_AT_address_class         0
922<2><  412>      DW_TAG_structure_type
923                DW_AT_name                  .flds.
924                DW_AT_byte_size             28
925<3><  421>      DW_TAG_member
926                DW_AT_name                  el_len
927                DW_AT_type                  <815>
928                DW_AT_data_member_location  DW_OP_consts 0
929<3><  436>      DW_TAG_member
930                DW_AT_name                  assoc
931                DW_AT_type                  <841>
932                DW_AT_byte_size             0
933                DW_AT_bit_offset            0
934                DW_AT_bit_size              1
935                DW_AT_data_member_location  DW_OP_consts 4
936<3><  453>      DW_TAG_member
937                DW_AT_name                  ptr_alloc
938                DW_AT_type                  <841>
939                DW_AT_byte_size             0
940                DW_AT_bit_offset            1
941                DW_AT_bit_size              1
942                DW_AT_data_member_location  DW_OP_consts 4
943<3><  474>      DW_TAG_member
944                DW_AT_name                  p_or_a
945                DW_AT_type                  <841>
946                DW_AT_byte_size             0
947                DW_AT_bit_offset            2
948                DW_AT_bit_size              2
949                DW_AT_data_member_location  DW_OP_consts 4
950<3><  492>      DW_TAG_member
951                DW_AT_name                  a_contig
952                DW_AT_type                  <841>
953                DW_AT_byte_size             0
954                DW_AT_bit_offset            4
955                DW_AT_bit_size              1
956                DW_AT_data_member_location  DW_OP_consts 4
957<3><  532>      DW_TAG_member
958                DW_AT_name                  num_dims
959                DW_AT_type                  <841>
960                DW_AT_byte_size             0
961                DW_AT_bit_offset            29
962                DW_AT_bit_size              3
963                DW_AT_data_member_location  DW_OP_consts 8
964<3><  572>      DW_TAG_member
965                DW_AT_name                  type_code
966                DW_AT_type                  <841>
967                DW_AT_byte_size             0
968                DW_AT_bit_offset            0
969                DW_AT_bit_size              32
970                DW_AT_data_member_location  DW_OP_consts 16
971<3><  593>      DW_TAG_member
972                DW_AT_name                  orig_base
973                DW_AT_type                  <841>
974                DW_AT_data_member_location  DW_OP_consts 20
975<3><  611>      DW_TAG_member
976                DW_AT_name                  orig_size
977                DW_AT_type                  <815>
978                DW_AT_data_member_location  DW_OP_consts 24
979<2><  630>      DW_TAG_structure_type
980                DW_AT_name                  .dope_bnd.
981                DW_AT_byte_size             12
982<3><  643>      DW_TAG_member
983                DW_AT_name                  lb
984                DW_AT_type                  <815>
985                DW_AT_data_member_location  DW_OP_consts 0
986<3><  654>      DW_TAG_member
987                DW_AT_name                  ext
988                DW_AT_type                  <815>
989                DW_AT_data_member_location  DW_OP_consts 4
990<3><  666>      DW_TAG_member
991                DW_AT_name                  str_m
992                DW_AT_type                  <815>
993                DW_AT_data_member_location  DW_OP_consts 8
994<2><  681>      DW_TAG_array_type
995                DW_AT_name                  .dims.
996                DW_AT_type                  <630>
997                DW_AT_declaration           yes(1)
998<3><  694>      DW_TAG_subrange_type
999                DW_AT_lower_bound           0
1000                DW_AT_upper_bound           0
1001<2><  698>      DW_TAG_structure_type
1002                DW_AT_name                  .dope.
1003                DW_AT_byte_size             44
1004<3><  707>      DW_TAG_member
1005                DW_AT_name                  base
1006                DW_AT_type                  <405>
1007                DW_AT_data_member_location  DW_OP_consts 0
1008<3><  720>      DW_TAG_member
1009                DW_AT_name                  .flds
1010                DW_AT_type                  <412>
1011                DW_AT_data_member_location  DW_OP_consts 4
1012<3><  734>      DW_TAG_member
1013                DW_AT_name                  .dims.
1014                DW_AT_type                  <681>
1015                DW_AT_data_member_location  DW_OP_consts 32
1016<2><  750>      DW_TAG_variable
1017                DW_AT_type                  <815>
1018                DW_AT_location              DW_OP_fbreg -32
1019                DW_AT_artificial            yes(1)
1020<2><  759>      DW_TAG_variable
1021                DW_AT_type                  <815>
1022                DW_AT_location              DW_OP_fbreg -28
1023                DW_AT_artificial            yes(1)
1024<2><  768>      DW_TAG_variable
1025                DW_AT_type                  <815>
1026                DW_AT_location              DW_OP_fbreg -24
1027                DW_AT_artificial            yes(1)
1028<2><  777>      DW_TAG_array_type
1029                DW_AT_type                  <815>
1030                DW_AT_declaration           yes(1)
1031<3><  783>      DW_TAG_subrange_type
1032                DW_AT_lower_bound           <750>
1033                DW_AT_count                 <759>
1034                DW_AT_MIPS_stride           <768>
1035<2><  797>      DW_TAG_variable
1036                DW_AT_decl_file             1
1037                DW_AT_decl_line             1
1038                DW_AT_name                  ARRAY
1039                DW_AT_type                  <698>
1040                DW_AT_location              DW_OP_fbreg -64 DW_OP_deref
1041<1><  815>      DW_TAG_base_type
1042                DW_AT_name                  INTEGER_4
1043                DW_AT_encoding              DW_ATE_signed
1044                DW_AT_byte_size             4
1045<1><  828>      DW_TAG_base_type
1046                DW_AT_name                  INTEGER_8
1047                DW_AT_encoding              DW_ATE_signed
1048                DW_AT_byte_size             8
1049<1><  841>      DW_TAG_base_type
1050                DW_AT_name                  INTEGER*4
1051                DW_AT_encoding              DW_ATE_unsigned
1052                DW_AT_byte_size             4
1053<1><  854>      DW_TAG_base_type
1054                DW_AT_name                  INTEGER*8
1055                DW_AT_encoding              DW_ATE_unsigned
1056                DW_AT_byte_size             8
1057
1058.fi
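.P
The bit field members in the dump above (assoc, ptr_alloc, p_or_a,
a_contig, num_dims) can be decoded with the DWARF 2 convention that
DW_AT_bit_offset counts from the most significant bit of the storage unit.
The sketch below assumes a 32 bit storage unit (the members show
DW_AT_byte_size 0, so the width is taken from the 4 byte type <841>)
and assumes the caller has already read that unit as a big-endian value.
The function name is hypothetical.
.nf
```c
#include <stdint.h>

/*
 * Extract a bit field from a storage unit, DWARF 2 style:
 * bit_offset is the number of bits to the left of the field,
 * counted from the most significant bit of the unit.
 * Returns 0 for arguments that do not describe a valid field.
 */
uint32_t dv_bitfield(uint32_t unit, unsigned unit_bits,
                     unsigned bit_offset, unsigned bit_size)
{
    unsigned shift;
    uint32_t mask;

    if (unit_bits == 0 || unit_bits > 32 || bit_size == 0 ||
        bit_offset + bit_size > unit_bits)
        return 0;
    shift = unit_bits - bit_offset - bit_size;
    mask = (bit_size >= 32) ? 0xffffffffu : ((1u << bit_size) - 1u);
    return (unit >> shift) & mask;
}
```
.fi
With these assumptions, num_dims (bit offset 29, bit size 3) is the low
three bits of its word, and assoc (bit offset 0, bit size 1) is the
top bit of its word.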
1059.H 2 "The dope vector structure details"
1060A dope vector is the following C struct, "dopevec.h".
1061Not all the fields are of use to a debugger.
1062It may be that not all fields will show up
1063in the f90 dwarf (since not all are of interest to debuggers).
.P
[Note:
Need details on the use of each field.
And need to know which are really 32 bits and which
are 32 or 64.
end Note]
.P
The following struct
is a representation of all the dope vector fields.
It suppresses irrelevant detail and may not
exactly match the layout in memory (a debugger must
examine the dwarf to find the fields, not
compile this structure into the debugger!).
.nf
1078struct .dope. {
1079 void *base;   // pointer to data
1080 struct .flds. {
1081  long el_len; // length of element in bytes?
1082  unsigned int assoc:1;     //means?
1083  unsigned int ptr_alloc:1;     //means?
1084  unsigned int p_or_a:2;    //means?
1085  unsigned int a_contig:1;  // means?
1086  unsigned int num_dims: 3; // 0 thru 7
1087  unsigned int type_code:32; //values?
1088  unsigned int orig_base; //void *? means?
1089  long         orig_size; // means?
1090 } .flds;
1091 
1092 struct .dope_bnd. {
1093   long lb   ; // lower bound 
1094   long ext  ;  // means?
1095   long str_m; // means?
1096 } .dims[7];
};
1098.fi
1099
1100.H 2 "DW_AT_MIPS_assumed_size       0x2011"
1101This flag was invented to deal with f90 arrays.
1102For example:
1103
1104.nf
1105      pointer (rptr, axx(1))
1106      pointer (iptr, ita(*))
1107      rptr = malloc (100*8)
1108      iptr = malloc (100*4)
1109.fi
1110
1111This flag attribute has the value 'yes' (true, on) if and only if
1112the size is unbounded, as iptr is.
1113Both may show an explicit upper bound of 1 in the dwarf,
1114but this flag notifies the debugger that there is explicitly
1115no user-provided size.
1116
1117So if a user asks for a printout of  the rptr allocated
1118array, the default will be of a single entry (as
1119there is a user slice bound in the source).
1120In contrast, there is no explicit upper bound on the iptr
1121(ita) array so the default slice will use the current bound
1122(a value calculated from the malloc size, see the dope vector).
1123
Given explicit requests, more of rptr (axx) can be shown
1125than the default.
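.P
The default slice rule described above might be sketched as follows.
This is only an illustration: the function name is hypothetical, and
it assumes the debugger can obtain the current allocation size and the
element size at run time (the text says only that the bound is
calculated from the malloc size).
.nf
```c
/*
 * Number of elements a debugger might print by default.
 *   assumed_size : value of the DW_AT_MIPS_assumed_size flag
 *   lower, upper : bounds recorded in the dwarf
 *   alloc_bytes  : current allocation size (ASSUMED to be available
 *                  to the debugger at run time)
 *   elem_bytes   : element size in bytes
 */
long default_slice_count(int assumed_size, long lower, long upper,
                         long alloc_bytes, long elem_bytes)
{
    if (!assumed_size)
        return upper - lower + 1;     /* user gave an explicit bound */
    if (elem_bytes <= 0)
        return 0;                     /* cannot derive a count */
    return alloc_bytes / elem_bytes;  /* derive from the allocation */
}
```
.fi
For the example, axx (flag off, bounds 1:1) gives one element, while
ita (flag on, 100*4 bytes of 4 byte elements) gives 100.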
1126
1127.H 1 "Line information and Source Position"
1128DWARF does not define the meaning of the term 'source statement'.
1129Nor does it define any way to find the first user-written
1130executable code in a function.
1131.P
1132It does define that a source statement  has a file name,
1133a line number, and a column position (see Sec 6.2, Line Number
1134Information of the Dwarf Version 2 document).
1135We will call those 3 source coordinates a 'source position'
1136in this document.  We'll try not to accidentally call the
1137source position a 'line number' since that is ambiguous
1138as to what it means.
1139
1140.H 2 "Definition of Statement"
1141.P
1142A function prolog is a statement.
1143.P
1144A C, C++, Pascal, or Fortran statement is a statement.
1145.P
Each initialized local variable in C, C++ is a statement
in that its initialization generates a source position.
This means that
	int x = 3, y = 4;
is two statements.
1151.P
1152For C, C++:
The 3 parts a,b,c in for(a;b;c) {d;} are individual statements.
The condition portion of a while() and do {} while() is
a statement.  (Of course, d; can be any number of statements.)
1156.P
1157For Fortran, the controlling expression of a DO loop is a statement.
1158Is a 'continue' statement in Fortran a DWARF statement?
1159.P
1160Each function return, whether user coded or generated by the
1161compiler, is a statement.  This is so one can step over (in
1162a debugger) the final user-coded statement 
1163(exclusive of the return statement if any) in a function
while not leaving the function scope.
1165.P
1166
1167.H 2 "Finding The First User Code in a Function"
1168
1169.nf
1170Consider:
1171int func(int a)
1172{                    /* source position 1 */
1173	float b = a; /* source position 2 */
1174	int x;       
1175	x = b + 2;   /* source position 3 */
1176}                    /* source position 4 */
1177.fi
1178.P
1179The DIE for a function gives the address range of the function,
including function prolog(s) and epilog(s).
1181.P
There is no scope block for the outer user scope of a
function, and thus no beginning address for the outer
user scope (the DWARF committee explicitly rejected the idea
of having a user scope block).
It is therefore necessary to use the source position information to find
the first user-executable statement.
1188.P
1189This means that the user code for a function must be presumed
1190to begin at the code location of the second source position in
1191the function address range.
1192.P
1193If a function has exactly one source position, the function
1194presumably consists solely of a return.
1195.P
1196If a function has exactly two source positions, the function
1197may consist of a function prolog and a return or a single user
1198statement and a return (there may be no prolog code needed in a
1199leaf function).  In this case, there is no way to be sure which
1200is the first source position of user code, so the rule is to
1201presume that the first address is user code.
1202.P
1203If a function consists of 3 or more source positions, one
1204should assume that the first source position is function prolog and
1205the second is the first user executable code.
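.P
The three cases above reduce to a small rule on the number of source
positions found in the function address range.
A minimal sketch follows; the function name is hypothetical, and the
positions are assumed to be numbered from 0 in ascending address order.
.nf
```c
#include <stddef.h>

/*
 * Given the number of source positions in a function's address
 * range, return the 0-based index of the position presumed to be
 * the first user code, or -1 if there are no source positions.
 */
int first_user_position(size_t n_positions)
{
    if (n_positions == 0)
        return -1;
    if (n_positions <= 2)
        return 0;   /* one: presumably just a return;
                       two: presume the first address is user code */
    return 1;       /* three or more: first is prolog, second is user */
}
```
.fi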
1206
1207.H 2 "Using debug_frame Information to find first user statement"
1208In addition to the line information, the debug_frame information
1209can be
1210useful in determining the first user source line.
1211.P
Given that a function has more than 1 source position,
find the code location of the second source position, then
1214examine the debug_frame information to determine if the Canonical
1215Frame Address (cfa) is updated before the second source position
1216code location.
1217If the cfa is updated, then one can be pretty sure that the
1218code for the first source position is function prolog code.
1219.P
1220Similarly, if the cfa is restored in the code for
1221a source position, the source position is likely to
1222represent a function exit block.
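.P
The prolog check can be sketched as a simple range test.
The code below assumes the caller has already walked the debug_frame
information for the function and collected the code addresses at which
the cfa rule changes; that list, and the function name, are hypothetical
and not part of any library interface.
.nf
```c
#include <stddef.h>

/*
 * Returns 1 if the cfa is updated at some address in
 * [func_low, second_pos), that is, before the code location of
 * the second source position; returns 0 otherwise.
 * 'cfa_change_addr' is a caller-built list of code addresses at
 * which the debug_frame rules change the cfa (an ASSUMPTION of
 * this sketch).
 */
int first_position_is_prolog(unsigned long func_low,
                             unsigned long second_pos,
                             const unsigned long *cfa_change_addr,
                             size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (cfa_change_addr[i] >= func_low &&
            cfa_change_addr[i] < second_pos)
            return 1;
    }
    return 0;
}
```
.fi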
1223
1224.H 2 "Debugger Use Of Source Position"
1225Command line debuggers, such as dbx and gdb, will ordinarily
1226want to consider multiple statements on one line to be a single
1227statement: doing otherwise is distressing to users since it
1228causes a 'step' command to appear to have no effect.
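.P
One way a command line debugger might treat multiple statements on a
line as a single step is to skip line table rows until the file or line
changes, ignoring the column.
The following is a sketch under that assumption; the row structure and
function name are hypothetical.
.nf
```c
#include <stddef.h>

/* Hypothetical decoded row of the line table. */
struct src_pos {
    int file;
    int line;
    int column;
    unsigned long addr;
};

/*
 * Starting from row 'cur', return the index of the next row whose
 * (file, line) differs from that of 'cur', ignoring the column.
 * Returns n if there is no such row.
 */
size_t next_step_row(const struct src_pos *rows, size_t n, size_t cur)
{
    size_t i;

    if (cur >= n)
        return n;
    for (i = cur + 1; i < n; i++) {
        if (rows[i].file != rows[cur].file ||
            rows[i].line != rows[cur].line)
            return i;
    }
    return n;
}
```
.fi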
1229.P
1230An exception for command line debuggers is in determining the
1231first user statement: as detailed above, there one wants to
1232consider the full  source position and will want to consider
the function return a separate statement.  It is difficult, however,
to make 'step' treat the function return as a separate statement
reliably if a function is coded all on one line or if the last
line of user code before the return is on the same line as the
return.
1238.P
1239A graphical debugger has none of these problems if it simply
highlights the portion of the line being executed.  In that
case, stepping will appear natural even when stepping within a
line.
1243.H 1 "Known Bugs"
1244Up through at least MIPSpro7.2.1
the compiler has been emitting form DW_FORM_data1, DW_FORM_data2,
or DW_FORM_data4
for DW_AT_const_value in DW_TAG_enumerator.
And dwarfdump and debuggers have read this with dwarf_formudata()
or dwarf_formsdata() and gotten some values incorrect.
For example, a value of 128 was printed by debuggers as a negative value.
Since dwarfdump and the compilers were not written to use the
value the same way, their output differed.
For negative enumerator values the compiler has been emitting 32bit values
in a DW_FORM_data4.
The compiler should probably be emitting a DW_FORM_sdata for
enumerator values.
And consumers of enumerator values should then call dwarf_formsdata().
However, right now, debuggers should call dwarf_formudata() and only if
it fails, call dwarf_formsdata().
1259Anything else will break backward compatibility with
1260the objects produced earlier.
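.P
The 128 problem can be seen by reading the same fixed size data both ways.
The sketch below is self-contained and does not use libdwarf; the
function names are hypothetical, and 'raw' is assumed to hold the
1, 2 or 4 byte constant already assembled into an integer.
.nf
```c
#include <stdint.h>

/* Zero-extend the low 'size' bytes of 'raw'.
 * Sizes other than 1 and 2 are treated as 4. */
uint64_t read_as_unsigned(uint32_t raw, unsigned size)
{
    if (size == 1)
        return raw & 0xffu;
    if (size == 2)
        return raw & 0xffffu;
    return raw;
}

/* Sign-extend the low 'size' bytes of 'raw'.
 * Sizes other than 1 and 2 are treated as 4. */
int64_t read_as_signed(uint32_t raw, unsigned size)
{
    unsigned bits;
    int64_t v;

    if (size != 1 && size != 2)
        size = 4;
    bits = size * 8;
    v = (int64_t)read_as_unsigned(raw, size);
    if (v & ((int64_t)1 << (bits - 1)))
        v -= ((int64_t)1 << bits);   /* top bit set: negative */
    return v;
}
```
.fi
An enumerator value of 128 stored in one byte is 0x80: read as unsigned
it is 128, read as signed it is -128.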
1261.SK
1262.S
1263.TC
1264.CS
1265