passes.texi revision 117395
1@c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
2@c 1999, 2000, 2001 Free Software Foundation, Inc.
3@c This is part of the GCC manual.
4@c For copying conditions, see the file gcc.texi.
5
6@node Passes
7@chapter Passes and Files of the Compiler
8@cindex passes and files of the compiler
9@cindex files and passes of the compiler
10@cindex compiler passes and files
11
12@cindex top level of compiler
13The overall control structure of the compiler is in @file{toplev.c}.  This
14file is responsible for initialization, decoding arguments, opening and
15closing files, and sequencing the passes.
16
17@cindex parsing pass
18The parsing pass is invoked only once, to parse the entire input.  A
19high level tree representation is then generated from the input,
20one function at a time.  This tree code is then transformed into RTL
21intermediate code, and processed.  The files involved in transforming
22the trees into RTL are @file{expr.c}, @file{expmed.c}, and
23@file{stmt.c}.
24@c Note, the above files aren't strictly the only files involved. It's
25@c all over the place (function.c, final.c,etc).  However, those are
26@c the files that are supposed to be directly involved, and have
27@c their purpose listed as such, so i've only listed them.
28The order of trees that are processed, is not
29necessarily the same order they are generated from
30the input, due to deferred inlining, and other considerations.
31
32@findex rest_of_compilation
33@findex rest_of_decl_compilation
34Each time the parsing pass reads a complete function definition or
35top-level declaration, it calls either the function
36@code{rest_of_compilation}, or the function
37@code{rest_of_decl_compilation} in @file{toplev.c}, which are
38responsible for all further processing necessary, ending with output of
39the assembler language.  All other compiler passes run, in sequence,
40within @code{rest_of_compilation}.  When that function returns from
41compiling a function definition, the storage used for that function
42definition's compilation is entirely freed, unless it is an inline
43function, or was deferred for some reason (this can occur in
44templates, for example).
45(@pxref{Inline,,An Inline Function is As Fast As a Macro,gcc,Using the
46GNU Compiler Collection (GCC)}).
47
48Here is a list of all the passes of the compiler and their source files.
49Also included is a description of where debugging dumps can be requested
50with @option{-d} options.
51
52@itemize @bullet
53@item
54Parsing.  This pass reads the entire text of a function definition,
55constructing a high level tree representation.  (Because of the semantic
56analysis that takes place during this pass, it does more than is
57formally considered to be parsing.)
58
59The tree representation does not entirely follow C syntax, because it is
60intended to support other languages as well.
61
62Language-specific data type analysis is also done in this pass, and every
63tree node that represents an expression has a data type attached.
64Variables are represented as declaration nodes.
65
66The language-independent source files for parsing are
67@file{tree.c}, @file{fold-const.c}, and @file{stor-layout.c}.
68There are also header files @file{tree.h} and @file{tree.def}
69which define the format of the tree representation.
70
71C preprocessing, for language front ends, that want or require it, is
72performed by cpplib, which is covered in separate documentation.  In
73particular, the internals are covered in @xref{Top, ,Cpplib internals,
74cppinternals, Cpplib Internals}.
75
76@c Avoiding overfull is tricky here.
77The source files to parse C are
78@file{c-convert.c},
79@file{c-decl.c},
80@file{c-errors.c},
81@file{c-lang.c},
82@file{c-objc-common.c},
83@file{c-parse.in},
84@file{c-aux-info.c},
85and
86@file{c-typeck.c},
87along with a header file
88@file{c-tree.h}
89and some files shared with Objective-C and C++.
90
91The source files for parsing C++ are in @file{cp/}.
92They are @file{parse.y},
93@file{class.c},
94@file{cvt.c}, @file{decl.c}, @file{decl2.c},
95@file{except.c},
96@file{expr.c}, @file{init.c}, @file{lex.c},
97@file{method.c}, @file{ptree.c},
98@file{search.c}, @file{spew.c},
99@file{semantics.c}, @file{tree.c},
100@file{typeck2.c}, and
101@file{typeck.c}, along with header files @file{cp-tree.def},
102@file{cp-tree.h}, and @file{decl.h}.
103
104The special source files for parsing Objective-C are in @file{objc/}.
105They are @file{objc-act.c}, @file{objc-tree.def}, and @file{objc-act.h}.
106Certain C-specific files are used for this as well.
107
108The files
109@file{c-common.c},
110@file{c-common.def},
111@file{c-format.c},
112@file{c-opts.c},
113@file{c-pragma.c},
114@file{c-semantics.c},
115and
116@file{c-lex.c},
117along with header files
118@file{c-common.h},
119@file{c-dump.h},
120and
121@file{c-pragma.h},
122are also used for all of the above languages.
123
124
125@cindex Tree optimization
126@item
127Tree optimization.   This is the optimization of the tree
128representation, before converting into RTL code.
129
130@cindex inline on trees, automatic
131Currently, the main optimization performed here is tree-based
132inlining.
133This is implemented in @file{tree-inline.c} and used by both C and C++.
134Note that tree based inlining turns off rtx based inlining (since it's more
135powerful, it would be a waste of time to do rtx based inlining in
136addition).
137
138@cindex constant folding
139@cindex arithmetic simplifications
140@cindex simplifications, arithmetic
141Constant folding and some arithmetic simplifications are also done
142during this pass, on the tree representation.
143The routines that perform these tasks are located in @file{fold-const.c}.
144
145@cindex RTL generation
146@item
147RTL generation.  This is the conversion of syntax tree into RTL code.
148
149@cindex target-parameter-dependent code
150This is where the bulk of target-parameter-dependent code is found,
151since often it is necessary for strategies to apply only when certain
152standard kinds of instructions are available.  The purpose of named
153instruction patterns is to provide this information to the RTL
154generation pass.
155
156@cindex tail recursion optimization
157Optimization is done in this pass for @code{if}-conditions that are
158comparisons, boolean operations or conditional expressions.  Tail
159recursion is detected at this time also.  Decisions are made about how
160best to arrange loops and how to output @code{switch} statements.
161
162@c Avoiding overfull is tricky here.
163The source files for RTL generation include
164@file{stmt.c},
165@file{calls.c},
166@file{expr.c},
167@file{explow.c},
168@file{expmed.c},
169@file{function.c},
170@file{optabs.c}
171and @file{emit-rtl.c}.
172Also, the file
173@file{insn-emit.c}, generated from the machine description by the
174program @code{genemit}, is used in this pass.  The header file
175@file{expr.h} is used for communication within this pass.
176
177@findex genflags
178@findex gencodes
179The header files @file{insn-flags.h} and @file{insn-codes.h},
180generated from the machine description by the programs @code{genflags}
181and @code{gencodes}, tell this pass which standard names are available
182for use and which patterns correspond to them.
183
184Aside from debugging information output, none of the following passes
185refers to the tree structure representation of the function (only
186part of which is saved).
187
188@cindex inline on rtx, automatic
189The decision of whether the function can and should be expanded inline
190in its subsequent callers is made at the end of rtl generation.  The
191function must meet certain criteria, currently related to the size of
192the function and the types and number of parameters it has.  Note that
193this function may contain loops, recursive calls to itself
194(tail-recursive functions can be inlined!), gotos, in short, all
195constructs supported by GCC@.  The file @file{integrate.c} contains
196the code to save a function's rtl for later inlining and to inline that
197rtl when the function is called.  The header file @file{integrate.h}
198is also used for this purpose.
199
200@opindex dr
201The option @option{-dr} causes a debugging dump of the RTL code after
202this pass.  This dump file's name is made by appending @samp{.rtl} to
203the input file name.
204
205@c Should the exception handling pass be talked about here?
206
207@cindex sibling call optimization
208@item
209Sibling call optimization.   This pass performs tail recursion
210elimination, and tail and sibling call optimizations.  The purpose of
211these optimizations is to reduce the overhead of function calls,
212whenever possible.
213
214The source file of this pass is @file{sibcall.c}
215
216@opindex di
217The option @option{-di} causes a debugging dump of the RTL code after
218this pass is run.  This dump file's name is made by appending
219@samp{.sibling} to the input file name.
220
221@cindex jump optimization
222@cindex unreachable code
223@cindex dead code
224@item
225Jump optimization.  This pass simplifies jumps to the following
226instruction, jumps across jumps, and jumps to jumps.  It deletes
227unreferenced labels and unreachable code, except that unreachable code
228that contains a loop is not recognized as unreachable in this pass.
229(Such loops are deleted later in the basic block analysis.)  It also
230converts some code originally written with jumps into sequences of
231instructions that directly set values from the results of comparisons,
232if the machine has such instructions.
233
234Jump optimization is performed two or three times.  The first time is
235immediately following RTL generation.  The second time is after CSE,
236but only if CSE says repeated jump optimization is needed.  The
237last time is right before the final pass.  That time, cross-jumping
238and deletion of no-op move instructions are done together with the
239optimizations described above.
240
241The source file of this pass is @file{jump.c}.
242
243@opindex dj
244The option @option{-dj} causes a debugging dump of the RTL code after
245this pass is run for the first time.  This dump file's name is made by
246appending @samp{.jump} to the input file name.
247
248
249@cindex register use analysis
250@item
251Register scan.  This pass finds the first and last use of each
252register, as a guide for common subexpression elimination.  Its source
253is in @file{regclass.c}.
254
255@cindex jump threading
256@item
257@opindex fthread-jumps
258Jump threading.  This pass detects a condition jump that branches to an
259identical or inverse test.  Such jumps can be @samp{threaded} through
260the second conditional test.  The source code for this pass is in
261@file{jump.c}.  This optimization is only performed if
262@option{-fthread-jumps} is enabled.
263
264@cindex SSA optimizations
265@cindex Single Static Assignment optimizations
266@opindex fssa
267@item
268Static Single Assignment (SSA) based optimization passes.  The
269SSA conversion passes (to/from) are turned on by the @option{-fssa}
270option (it is also done automatically if you enable an SSA optimization pass).
271These passes utilize a form called Static Single Assignment.  In SSA form,
272each variable (pseudo register) is only set once, giving you def-use
273and use-def chains for free, and enabling a lot more optimization
274passes to be run in linear time.
275Conversion to and from SSA form is handled by functions in
276@file{ssa.c}.
277
278@opindex de
279The option @option{-de} causes a debugging dump of the RTL code after
280this pass.  This dump file's name is made by appending @samp{.ssa} to
281the input file name.
282@itemize @bullet
283@cindex SSA Conditional Constant Propagation
284@cindex Conditional Constant Propagation, SSA based
285@cindex conditional constant propagation
286@opindex fssa-ccp
287@item
288SSA Conditional Constant Propagation.  Turned on by the @option{-fssa-ccp}
289option.  This pass performs conditional constant propagation to simplify
290instructions including conditional branches.  This pass is more aggressive
291than the constant propagation done by the CSE and GCSE passes, but operates
292in linear time.
293
294@opindex dW
295The option @option{-dW} causes a debugging dump of the RTL code after
296this pass.  This dump file's name is made by appending @samp{.ssaccp} to
297the input file name.
298
299@cindex SSA DCE
300@cindex DCE, SSA based
301@cindex dead code elimination
302@opindex fssa-dce
303@item
304SSA Aggressive Dead Code Elimination.  Turned on by the @option{-fssa-dce}
305option.  This pass performs elimination of code considered unnecessary because
306it has no externally visible effects on the program.  It operates in
307linear time.
308
309@opindex dX
310The option @option{-dX} causes a debugging dump of the RTL code after
311this pass.  This dump file's name is made by appending @samp{.ssadce} to
312the input file name.
313@end itemize
314
315@cindex common subexpression elimination
316@cindex constant propagation
317@item
318Common subexpression elimination.  This pass also does constant
319propagation.  Its source files are @file{cse.c}, and @file{cselib.c}.
320If constant  propagation causes conditional jumps to become
321unconditional or to become no-ops, jump optimization is run again when
322CSE is finished.
323
324@opindex ds
325The option @option{-ds} causes a debugging dump of the RTL code after
326this pass.  This dump file's name is made by appending @samp{.cse} to
327the input file name.
328
329@cindex global common subexpression elimination
330@cindex constant propagation
331@cindex copy propagation
332@item
333Global common subexpression elimination.  This pass performs two
334different types of GCSE  depending on whether you are optimizing for
335size or not (LCM based GCSE tends to increase code size for a gain in
336speed, while Morel-Renvoise based GCSE does not).
337When optimizing for size, GCSE is done using Morel-Renvoise Partial
338Redundancy Elimination, with the exception that it does not try to move
339invariants out of loops---that is left to  the loop optimization pass.
340If MR PRE GCSE is done, code hoisting (aka unification) is also done, as
341well as load motion.
342If you are optimizing for speed, LCM (lazy code motion) based GCSE is
343done.  LCM is based on the work of Knoop, Ruthing, and Steffen.  LCM
344based GCSE also does loop invariant code motion.  We also perform load
345and store motion when optimizing for speed.
346Regardless of which type of GCSE is used, the GCSE pass also performs
347global constant and  copy propagation.
348
349The source file for this pass is @file{gcse.c}, and the LCM routines
350are in @file{lcm.c}.
351
352@opindex dG
353The option @option{-dG} causes a debugging dump of the RTL code after
354this pass.  This dump file's name is made by appending @samp{.gcse} to
355the input file name.
356
357@cindex loop optimization
358@cindex code motion
359@cindex strength-reduction
360@item
361Loop optimization.  This pass moves constant expressions out of loops,
362and optionally does strength-reduction and loop unrolling as well.
363Its source files are @file{loop.c} and @file{unroll.c}, plus the header
364@file{loop.h} used for communication between them.  Loop unrolling uses
365some functions in @file{integrate.c} and the header @file{integrate.h}.
366Loop dependency analysis routines are contained in @file{dependence.c}.
367
368@opindex dL
369The option @option{-dL} causes a debugging dump of the RTL code after
370this pass.  This dump file's name is made by appending @samp{.loop} to
371the input file name.
372
373@item
374@opindex frerun-cse-after-loop
375If @option{-frerun-cse-after-loop} was enabled, a second common
376subexpression elimination pass is performed after the loop optimization
377pass.  Jump threading is also done again at this time if it was specified.
378
379@opindex dt
380The option @option{-dt} causes a debugging dump of the RTL code after
381this pass.  This dump file's name is made by appending @samp{.cse2} to
382the input file name.
383
384@cindex data flow analysis
385@cindex analysis, data flow
386@cindex basic blocks
387@item
388Data flow analysis (@file{flow.c}).  This pass divides the program
389into basic blocks (and in the process deletes unreachable loops); then
390it computes which pseudo-registers are live at each point in the
391program, and makes the first instruction that uses a value point at
392the instruction that computed the value.
393
394@cindex autoincrement/decrement analysis
395This pass also deletes computations whose results are never used, and
396combines memory references with add or subtract instructions to make
397autoincrement or autodecrement addressing.
398
399@opindex df
400The option @option{-df} causes a debugging dump of the RTL code after
401this pass.  This dump file's name is made by appending @samp{.flow} to
402the input file name.  If stupid register allocation is in use, this
403dump file reflects the full results of such allocation.
404
405@cindex instruction combination
406@item
407Instruction combination (@file{combine.c}).  This pass attempts to
408combine groups of two or three instructions that are related by data
409flow into single instructions.  It combines the RTL expressions for
410the instructions by substitution, simplifies the result using algebra,
411and then attempts to match the result against the machine description.
412
413@opindex dc
414The option @option{-dc} causes a debugging dump of the RTL code after
415this pass.  This dump file's name is made by appending @samp{.combine}
416to the input file name.
417
418@cindex if conversion
419@item
420If-conversion is a transformation that transforms control dependencies
421into data dependencies (IE it transforms conditional code into a
422single control stream).
423It is implemented in the file @file{ifcvt.c}.
424
425@opindex dE
426The option @option{-dE} causes a debugging dump of the RTL code after
427this pass.  This dump file's name is made by appending @samp{.ce} to
428the input file name.
429
430@cindex register movement
431@item
432Register movement (@file{regmove.c}).  This pass looks for cases where
433matching constraints would force an instruction to need a reload, and
434this reload would be a register-to-register move.  It then attempts
435to change the registers used by the instruction to avoid the move
436instruction.
437
438@opindex dN
439The option @option{-dN} causes a debugging dump of the RTL code after
440this pass.  This dump file's name is made by appending @samp{.regmove}
441to the input file name.
442
443@cindex instruction scheduling
444@cindex scheduling, instruction
445@item
446Instruction scheduling (@file{sched.c}).  This pass looks for
447instructions whose output will not be available by the time that it is
448used in subsequent instructions.  (Memory loads and floating point
449instructions often have this behavior on RISC machines).  It re-orders
450instructions within a basic block to try to separate the definition and
451use of items that otherwise would cause pipeline stalls.
452
453Instruction scheduling is performed twice.  The first time is immediately
454after instruction combination and the second is immediately after reload.
455
456@opindex dS
457The option @option{-dS} causes a debugging dump of the RTL code after this
458pass is run for the first time.  The dump file's name is made by
459appending @samp{.sched} to the input file name.
460
461@cindex register allocation
462@item
463Register allocation.  These passes make sure that all occurrences of pseudo
464registers are eliminated, either by allocating them to a hard register,
465replacing them by an equivalent expression (e.g.@: a constant) or by placing
466them on the stack.  This is done in several subpasses:
467
468@itemize @bullet
469@cindex register class preference pass
470@item
471Register class preferencing.  The RTL code is scanned to find out
472which register class is best for each pseudo register.  The source
473file is @file{regclass.c}.
474
475@cindex local register allocation
476@item
477Local register allocation (@file{local-alloc.c}).  This pass allocates
478hard registers to pseudo registers that are used only within one basic
479block.  Because the basic block is linear, it can use fast and
480powerful techniques to do a very good job.
481
482@opindex dl
483The option @option{-dl} causes a debugging dump of the RTL code after
484this pass.  This dump file's name is made by appending @samp{.lreg} to
485the input file name.
486
487@cindex global register allocation
488@item
489Global register allocation (@file{global.c}).  This pass
490allocates hard registers for the remaining pseudo registers (those
491whose life spans are not contained in one basic block).
492
493@cindex graph coloring register allocation
494@opindex fnew-ra
495@opindex dl
496@item
497Graph coloring register allocator.  The files @file{ra.c}, @file{ra-build.c},
498@file{ra-colorize.c}, @file{ra-debug.c}, @file{ra-rewrite.c} together with
499the header @file{ra.h} contain another register allocator, which is used
500when the option @option{-fnew-ra} is given.  In that case it is run instead
501of the above mentioned local and global register allocation passes, and the
502option @option{-dl} causes a debugging dump of its work.
503
504@cindex reloading
505@item
506Reloading.  This pass renumbers pseudo registers with the hardware
507registers numbers they were allocated.  Pseudo registers that did not
508get hard registers are replaced with stack slots.  Then it finds
509instructions that are invalid because a value has failed to end up in
510a register, or has ended up in a register of the wrong kind.  It fixes
511up these instructions by reloading the problematical values
512temporarily into registers.  Additional instructions are generated to
513do the copying.
514
515The reload pass also optionally eliminates the frame pointer and inserts
516instructions to save and restore call-clobbered registers around calls.
517
518Source files are @file{reload.c} and @file{reload1.c}, plus the header
519@file{reload.h} used for communication between them.
520
521@opindex dg
522The option @option{-dg} causes a debugging dump of the RTL code after
523this pass.  This dump file's name is made by appending @samp{.greg} to
524the input file name.
525@end itemize
526
527@cindex instruction scheduling
528@cindex scheduling, instruction
529@item
530Instruction scheduling is repeated here to try to avoid pipeline stalls
531due to memory loads generated for spilled pseudo registers.
532
533@opindex dR
534The option @option{-dR} causes a debugging dump of the RTL code after
535this pass.  This dump file's name is made by appending @samp{.sched2}
536to the input file name.
537
538@cindex basic block reordering
539@cindex reordering, block
540@item
541Basic block reordering.  This pass implements profile guided code
542positioning.  If profile information is not available, various types of
543static analysis are performed to make the predictions normally coming
544from the profile feedback (IE execution frequency, branch probability,
545etc).  It is implemented in the file @file{bb-reorder.c}, and the
546various prediction routines are in @file{predict.c}.
547
548@opindex dB
549The option @option{-dB} causes a debugging dump of the RTL code after
550this pass.  This dump file's name is made by appending @samp{.bbro} to
551the input file name.
552
553@cindex delayed branch scheduling
554@cindex scheduling, delayed branch
555@item
556Delayed branch scheduling.  This optional pass attempts to find
557instructions that can go into the delay slots of other instructions,
558usually jumps and calls.  The source file name is @file{reorg.c}.
559
560@opindex dd
561The option @option{-dd} causes a debugging dump of the RTL code after
562this pass.  This dump file's name is made by appending @samp{.dbr}
563to the input file name.
564
565@cindex branch shortening
566@item
567Branch shortening.  On many RISC machines, branch instructions have a
568limited range.  Thus, longer sequences of instructions must be used for
569long branches.  In this pass, the compiler figures out what how far each
570instruction will be from each other instruction, and therefore whether
571the usual instructions, or the longer sequences, must be used for each
572branch.
573
574@cindex register-to-stack conversion
575@item
576Conversion from usage of some hard registers to usage of a register
577stack may be done at this point.  Currently, this is supported only
578for the floating-point registers of the Intel 80387 coprocessor.   The
579source file name is @file{reg-stack.c}.
580
581@opindex dk
582The options @option{-dk} causes a debugging dump of the RTL code after
583this pass.  This dump file's name is made by appending @samp{.stack}
584to the input file name.
585
586@cindex final pass
587@cindex peephole optimization
588@item
589Final.  This pass outputs the assembler code for the function.  It is
590also responsible for identifying spurious test and compare
591instructions.  Machine-specific peephole optimizations are performed
592at the same time.  The function entry and exit sequences are generated
593directly as assembler code in this pass; they never exist as RTL@.
594
595The source files are @file{final.c} plus @file{insn-output.c}; the
596latter is generated automatically from the machine description by the
597tool @file{genoutput}.  The header file @file{conditions.h} is used
598for communication between these files.
599
600@cindex debugging information generation
601@item
602Debugging information output.  This is run after final because it must
603output the stack slot offsets for pseudo registers that did not get
604hard registers.  Source files are @file{dbxout.c} for DBX symbol table
605format, @file{sdbout.c} for SDB symbol table format,  @file{dwarfout.c}
606for DWARF symbol table format, files @file{dwarf2out.c} and
607@file{dwarf2asm.c} for DWARF2 symbol table format, and @file{vmsdbgout.c}
608for VMS debug symbol table format.
609@end itemize
610
611Some additional files are used by all or many passes:
612
613@itemize @bullet
614@item
615Every pass uses @file{machmode.def} and @file{machmode.h} which define
616the machine modes.
617
618@item
619Several passes use @file{real.h}, which defines the default
620representation of floating point constants and how to operate on them.
621
622@item
623All the passes that work with RTL use the header files @file{rtl.h}
624and @file{rtl.def}, and subroutines in file @file{rtl.c}.  The tools
625@code{gen*} also use these files to read and work with the machine
626description RTL@.
627
628@item
629All the tools that read the machine description use support routines
630found in @file{gensupport.c}, @file{errors.c}, and @file{read-rtl.c}.
631
632@findex genconfig
633@item
634Several passes refer to the header file @file{insn-config.h} which
635contains a few parameters (C macro definitions) generated
636automatically from the machine description RTL by the tool
637@code{genconfig}.
638
639@cindex instruction recognizer
640@item
641Several passes use the instruction recognizer, which consists of
642@file{recog.c} and @file{recog.h}, plus the files @file{insn-recog.c}
643and @file{insn-extract.c} that are generated automatically from the
644machine description by the tools @file{genrecog} and
645@file{genextract}.
646
647@item
648Several passes use the header files @file{regs.h} which defines the
649information recorded about pseudo register usage, and @file{basic-block.h}
650which defines the information recorded about basic blocks.
651
652@item
653@file{hard-reg-set.h} defines the type @code{HARD_REG_SET}, a bit-vector
654with a bit for each hard register, and some macros to manipulate it.
655This type is just @code{int} if the machine has few enough hard registers;
656otherwise it is an array of @code{int} and some of the macros expand
657into loops.
658
659@item
660Several passes use instruction attributes.  A definition of the
661attributes defined for a particular machine is in file
662@file{insn-attr.h}, which is generated from the machine description by
663the program @file{genattr}.  The file @file{insn-attrtab.c} contains
664subroutines to obtain the attribute values for insns and information
665about processor pipeline characteristics for the instruction
666scheduler.  It is generated from the machine description by the
667program @file{genattrtab}.
668@end itemize
669