169689Skan@c -*-texinfo-*-
169689Skan@c Copyright (C) 2001, 2003, 2004, 2005 Free Software Foundation, Inc.
169689Skan@c This is part of the GCC manual.
169689Skan@c For copying conditions, see the file gcc.texi.
169689Skan
169689Skan@c ---------------------------------------------------------------------
169689Skan@c Control Flow Graph
169689Skan@c ---------------------------------------------------------------------
169689Skan
169689Skan@node Control Flow
169689Skan@chapter Control Flow Graph
169689Skan@cindex CFG, Control Flow Graph
169689Skan@findex basic-block.h
169689Skan
169689SkanA control flow graph (CFG) is a data structure built on top of the
169689Skanintermediate code representation (the RTL or @code{tree} instruction
169689Skanstream) abstracting the control flow behavior of a function that is
169689Skanbeing compiled.  The CFG is a directed graph where the vertices
169689Skanrepresent basic blocks and edges represent possible transfer of
169689Skancontrol flow from one basic block to another.  The data structures
169689Skanused to represent the control flow graph are defined in
169689Skan@file{basic-block.h}.
169689Skan
169689Skan@menu
169689Skan* Basic Blocks::           The definition and representation of basic blocks.
169689Skan* Edges::                  Types of edges and their representation.
169689Skan* Profile information::    Representation of frequencies and probabilities.
* Maintaining the CFG::    Keeping the control flow graph up to date.
169689Skan* Liveness information::   Using and maintaining liveness information.
169689Skan@end menu
169689Skan
169689Skan
169689Skan@node Basic Blocks
169689Skan@section Basic Blocks
169689Skan
169689Skan@cindex basic block
169689Skan@findex basic_block
169689SkanA basic block is a straight-line sequence of code with only one entry
169689Skanpoint and only one exit.  In GCC, basic blocks are represented using
169689Skanthe @code{basic_block} data type.
169689Skan
@findex next_bb, prev_bb, FOR_EACH_BB
Two pointer members of the @code{basic_block} structure are
@code{next_bb} and @code{prev_bb}.  These are used to keep a
doubly linked chain of basic blocks in the same order as the
underlying instruction stream.  The chain of basic blocks is updated
transparently by the provided API for manipulating the CFG@.  The macro
@code{FOR_EACH_BB} can be used to visit all the basic blocks in
lexicographical order.  Dominator traversals are also possible using
@code{walk_dominator_tree}.  Given two basic blocks A and B, block A
dominates block B if A is @emph{always} executed before B@.
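
As a rough illustration of the chain described above, the following
self-contained C sketch models a doubly linked chain of blocks and a
traversal macro in the spirit of @code{FOR_EACH_BB}.  The type
@code{toy_bb}, the macro @code{TOY_FOR_EACH_BB} and the helper functions
are hypothetical and are not part of GCC; only the member names
@code{next_bb} and @code{prev_bb} are taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's basic_block; only the chain
   pointers and an index are modelled.  */
struct toy_bb
{
  int index;
  struct toy_bb *prev_bb;
  struct toy_bb *next_bb;
};

/* Visit every block from FIRST onwards in chain order, in the spirit
   of FOR_EACH_BB.  */
#define TOY_FOR_EACH_BB(bb, first) \
  for ((bb) = (first); (bb) != NULL; (bb) = (bb)->next_bb)

/* Link NEW_BB into the chain directly after AFTER.  */
static void
toy_link_block (struct toy_bb *new_bb, struct toy_bb *after)
{
  assert (new_bb != NULL && after != NULL);
  new_bb->prev_bb = after;
  new_bb->next_bb = after->next_bb;
  if (after->next_bb != NULL)
    after->next_bb->prev_bb = new_bb;
  after->next_bb = new_bb;
}

/* Count the blocks reachable from FIRST through next_bb.  */
static int
toy_count_blocks (struct toy_bb *first)
{
  struct toy_bb *bb;
  int n = 0;

  TOY_FOR_EACH_BB (bb, first)
    n++;
  return n;
}
```

In this sketch the caller links blocks explicitly; in GCC the chain is
kept up to date by the CFG manipulation API, so passes normally do not
touch these pointers by hand.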
169689Skan
169689Skan@findex BASIC_BLOCK
169689SkanThe @code{BASIC_BLOCK} array contains all basic blocks in an
169689Skanunspecified order.  Each @code{basic_block} structure has a field
169689Skanthat holds a unique integer identifier @code{index} that is the
169689Skanindex of the block in the @code{BASIC_BLOCK} array.
169689SkanThe total number of basic blocks in the function is
169689Skan@code{n_basic_blocks}.  Both the basic block indices and
169689Skanthe total number of basic blocks may vary during the compilation
169689Skanprocess, as passes reorder, create, duplicate, and destroy basic
169689Skanblocks.  The index for any block should never be greater than
169689Skan@code{last_basic_block}.
169689Skan
169689Skan@findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR
169689SkanSpecial basic blocks represent possible entry and exit points of a
169689Skanfunction.  These blocks are called @code{ENTRY_BLOCK_PTR} and
169689Skan@code{EXIT_BLOCK_PTR}.  These blocks do not contain any code, and are
169689Skannot elements of the @code{BASIC_BLOCK} array.  Therefore they have
169689Skanbeen assigned unique, negative index numbers.
169689Skan
169689SkanEach @code{basic_block} also contains pointers to the first
169689Skaninstruction (the @dfn{head}) and the last instruction (the @dfn{tail})
169689Skanor @dfn{end} of the instruction stream contained in a basic block.  In
169689Skanfact, since the @code{basic_block} data type is used to represent
169689Skanblocks in both major intermediate representations of GCC (@code{tree}
169689Skanand RTL), there are pointers to the head and end of a basic block for
169689Skanboth representations.
169689Skan
@findex NOTE_INSN_BASIC_BLOCK, CODE_LABEL, notes
For RTL, these pointers are @code{rtx head, end}.  In the RTL function
representation, the head pointer always points either to a
@code{NOTE_INSN_BASIC_BLOCK} or to a @code{CODE_LABEL}, if present.
In the RTL representation of a function, the instruction stream
contains not only the ``real'' instructions, but also @dfn{notes}.
Any function that moves or duplicates the basic blocks needs
to take care of updating these notes.  Many of these notes expect
that the instruction stream consists of linear regions, making such
updates difficult.  The @code{NOTE_INSN_BASIC_BLOCK} note is the only
kind of note that may appear in the instruction stream contained in a
basic block.  The instruction stream of a basic block always follows a
@code{NOTE_INSN_BASIC_BLOCK}, but zero or more @code{CODE_LABEL}
nodes can precede the block note.  A basic block ends with a control
flow instruction or with the last instruction before the following
@code{CODE_LABEL} or @code{NOTE_INSN_BASIC_BLOCK}.  A @code{CODE_LABEL}
cannot appear in the instruction stream of a basic block.
169689Skan
@findex can_fallthru
@cindex table jump
In addition to notes, the jump table vectors are also represented as
``pseudo-instructions'' inside the insn stream.  These vectors never
appear in the basic block and should always be placed just after the
table jump instructions referencing them.  After removing the
table-jump it is often difficult to eliminate the code computing the
address and referencing the vector, so cleaning up these vectors is
postponed until after liveness analysis.  Thus the jump table vectors
may appear in the insn stream unreferenced and without any purpose.
Before any edge is made @dfn{fall-thru}, the @code{can_fallthru}
function must be called to check that no such construct is in the way.
169689Skan
@cindex block statement iterators
For the @code{tree} representation, the head and end of the basic block
are pointed to by the @code{stmt_list} field, but this special
@code{tree} should never be referenced directly.  Instead, at the tree
level abstract containers and iterators are used to access statements
and expressions in basic blocks.  These iterators are called
@dfn{block statement iterators} (BSIs).  Grep for @code{^bsi}
in the various @file{tree-*} files.
The following snippet will pretty-print all the statements of the
program in the GIMPLE representation.
169689Skan
169689Skan@smallexample
169689SkanFOR_EACH_BB (bb)
169689Skan  @{
169689Skan     block_stmt_iterator si;
169689Skan
169689Skan     for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
169689Skan       @{
169689Skan          tree stmt = bsi_stmt (si);
169689Skan          print_generic_stmt (stderr, stmt, 0);
169689Skan       @}
169689Skan  @}
169689Skan@end smallexample
169689Skan
169689Skan
169689Skan@node Edges
169689Skan@section Edges
169689Skan
169689Skan@cindex edge in the flow graph
169689Skan@findex edge
169689SkanEdges represent possible control flow transfers from the end of some
169689Skanbasic block A to the head of another basic block B@.  We say that A is
169689Skana predecessor of B, and B is a successor of A@.  Edges are represented
169689Skanin GCC with the @code{edge} data type.  Each @code{edge} acts as a
169689Skanlink between two basic blocks: the @code{src} member of an edge
169689Skanpoints to the predecessor basic block of the @code{dest} basic block.
169689SkanThe members @code{preds} and @code{succs} of the @code{basic_block} data
169689Skantype point to type-safe vectors of edges to the predecessors and
169689Skansuccessors of the block.
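
The relationship between @code{src}, @code{dest}, @code{preds} and
@code{succs} can be sketched with the following self-contained C model.
The types @code{toy_edge} and @code{toy_block}, the fixed-size arrays and
the function @code{toy_connect} are hypothetical simplifications; GCC
itself uses type-safe vectors rather than arrays of a fixed capacity.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_EDGES 4   /* arbitrary small capacity for the sketch */

struct toy_block;

/* Hypothetical stand-in for GCC's edge type.  */
struct toy_edge
{
  struct toy_block *src;   /* predecessor block */
  struct toy_block *dest;  /* successor block */
};

/* Hypothetical stand-in for basic_block, with plain arrays in place
   of the type-safe edge vectors.  */
struct toy_block
{
  struct toy_edge *preds[TOY_MAX_EDGES];
  struct toy_edge *succs[TOY_MAX_EDGES];
  int n_preds;
  int n_succs;
};

/* Initialize E as an edge from A to B and record it on both ends:
   in the successor list of A and in the predecessor list of B.  */
static void
toy_connect (struct toy_edge *e, struct toy_block *a, struct toy_block *b)
{
  assert (a->n_succs < TOY_MAX_EDGES && b->n_preds < TOY_MAX_EDGES);
  e->src = a;
  e->dest = b;
  a->succs[a->n_succs++] = e;
  b->preds[b->n_preds++] = e;
}
```

The point of the sketch is that a single edge object is shared by both
blocks, so following @code{e->src} from a predecessor list or
@code{e->dest} from a successor list always reaches the other endpoint.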
169689Skan
169689Skan@cindex edge iterators
169689SkanWhen walking the edges in an edge vector, @dfn{edge iterators} should
169689Skanbe used.  Edge iterators are constructed using the
169689Skan@code{edge_iterator} data structure and several methods are available
169689Skanto operate on them:
169689Skan
169689Skan@ftable @code
169689Skan@item ei_start
169689SkanThis function initializes an @code{edge_iterator} that points to the
169689Skanfirst edge in a vector of edges.
169689Skan
169689Skan@item ei_last
169689SkanThis function initializes an @code{edge_iterator} that points to the
169689Skanlast edge in a vector of edges.
169689Skan
@item ei_end_p
This predicate is @code{true} if an @code{edge_iterator} has moved
past the last edge in an edge vector.

@item ei_one_before_end_p
This predicate is @code{true} if an @code{edge_iterator} points to
the last edge in an edge vector, i.e.@: one position before the end.
169689Skan
169689Skan@item ei_next
169689SkanThis function takes a pointer to an @code{edge_iterator} and makes it
169689Skanpoint to the next edge in the sequence.
169689Skan
169689Skan@item ei_prev
169689SkanThis function takes a pointer to an @code{edge_iterator} and makes it
169689Skanpoint to the previous edge in the sequence.
169689Skan
169689Skan@item ei_edge
169689SkanThis function returns the @code{edge} currently pointed to by an
169689Skan@code{edge_iterator}.
169689Skan
@item ei_safe_edge
This function returns the @code{edge} currently pointed to by an
@code{edge_iterator}, but returns @code{NULL} if the iterator is
pointing at the end of the sequence.  This function has been provided
for existing code that makes the assumption that a @code{NULL} edge
indicates the end of the sequence.
169689Skan
169689Skan@end ftable
169689Skan
169689SkanThe convenience macro @code{FOR_EACH_EDGE} can be used to visit all of
169689Skanthe edges in a sequence of predecessor or successor edges.  It must
169689Skannot be used when an element might be removed during the traversal,
169689Skanotherwise elements will be missed.  Here is an example of how to use
169689Skanthe macro:
169689Skan
169689Skan@smallexample
169689Skanedge e;
169689Skanedge_iterator ei;
169689Skan
169689SkanFOR_EACH_EDGE (e, ei, bb->succs)
169689Skan  @{
169689Skan     if (e->flags & EDGE_FALLTHRU)
169689Skan       break;
169689Skan  @}
169689Skan@end smallexample
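
To see why elements can be missed when edges are removed during a
traversal, consider the following self-contained C sketch.  It is a toy
model, not GCC code: @code{toy_edge_vec}, @code{toy_unordered_remove} and
@code{toy_remove_negative} are hypothetical names, each @code{int} stands
for one edge, and removal by moving the last element into the freed slot
is an assumption made for the sketch.

```c
#include <assert.h>

/* Toy edge vector: each int stands for one edge.  All names here are
   hypothetical and are not GCC's.  */
struct toy_edge_vec
{
  int edges[8];
  int length;
};

/* Remove the element at position IX by moving the last element into
   its slot, so the element order is not preserved.  */
static void
toy_unordered_remove (struct toy_edge_vec *v, int ix)
{
  assert (ix >= 0 && ix < v->length);
  v->edges[ix] = v->edges[v->length - 1];
  v->length--;
}

/* Remove every negative element.  The index only advances when
   nothing was removed, because a removal moves a not yet visited
   element into the current slot.  */
static void
toy_remove_negative (struct toy_edge_vec *v)
{
  int ix = 0;

  while (ix < v->length)
    {
      if (v->edges[ix] < 0)
        toy_unordered_remove (v, ix);
      else
        ix++;
    }
}
```

A loop that advanced the index unconditionally, as a @code{for}-style
macro does, would step over the element that was just moved into the
current slot.  With real edge iterators the same shape can be written by
testing @code{ei_safe_edge} in the loop condition and calling
@code{ei_next} only when the current edge was kept.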
169689Skan
169689Skan@findex fall-thru
169689SkanThere are various reasons why control flow may transfer from one block
169689Skanto another.  One possibility is that some instruction, for example a
169689Skan@code{CODE_LABEL}, in a linearized instruction stream just always
169689Skanstarts a new basic block.  In this case a @dfn{fall-thru} edge links
169689Skanthe basic block to the first following basic block.  But there are
169689Skanseveral other reasons why edges may be created.  The @code{flags}
169689Skanfield of the @code{edge} data type is used to store information
169689Skanabout the type of edge we are dealing with.  Each edge is of one of
169689Skanthe following types:
169689Skan
169689Skan@table @emph
169689Skan@item jump
169689SkanNo type flags are set for edges corresponding to jump instructions.
169689SkanThese edges are used for unconditional or conditional jumps and in
169689SkanRTL also for table jumps.  They are the easiest to manipulate as they
169689Skanmay be freely redirected when the flow graph is not in SSA form.
169689Skan
@item fall-thru
@findex EDGE_FALLTHRU, force_nonfallthru
Fall-thru edges are present in the case where the basic block may
continue execution to the following one without branching.  These edges
have the @code{EDGE_FALLTHRU} flag set.  Unlike other types of edges,
these edges must come into the basic block immediately following in the
instruction stream.  The function @code{force_nonfallthru} is
available to insert an unconditional jump in the case that redirection
is needed.  Note that this may require creation of a new basic block.
169689Skan
@item exception handling
@cindex exception handling
@findex EDGE_ABNORMAL, EDGE_EH
Exception handling edges represent possible control transfers from a
trapping instruction to an exception handler.  The definition of
``trapping'' varies.  In C++, only function calls can throw, but for
Java, exceptions like division by zero or segmentation fault are
defined and thus each instruction possibly throwing this kind of
exception needs to be handled as a control flow instruction.  Exception
edges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set.
169689Skan
@findex purge_dead_edges
When updating the instruction stream it is easy to change a possibly
trapping instruction to a non-trapping one, by simply removing the
exception edge.  The opposite conversion is difficult, but should not
happen anyway.  The edges can be eliminated via a
@code{purge_dead_edges} call.
169689Skan
@findex REG_EH_REGION, EDGE_ABNORMAL_CALL
In the RTL representation, the destination of an exception edge is
specified by a @code{REG_EH_REGION} note attached to the insn.
In the case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is set
too.  In the @code{tree} representation, this extra flag is not set.
169689Skan
@findex may_trap_p, tree_could_trap_p
In the RTL representation, the predicate @code{may_trap_p} may be used
to check whether an instruction may still trap or not.  For the tree
representation, the @code{tree_could_trap_p} predicate is available,
but this predicate only checks for possible memory traps, as in
dereferencing an invalid pointer location.
169689Skan
169689Skan
@item sibling calls
@cindex sibling call
@findex EDGE_ABNORMAL, EDGE_SIBCALL
Sibling calls or tail calls terminate the function in a non-standard
way and thus an edge to the exit must be present.
@code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such a case.
These edges only exist in the RTL representation.
169689Skan
@item computed jumps
@cindex computed jump
@findex EDGE_ABNORMAL
Computed jumps contain edges to all labels in the function referenced
from the code.  All those edges have the @code{EDGE_ABNORMAL} flag set.
The edges used to represent computed jumps often cause compile time
performance problems, since functions consisting of many taken labels
and many computed jumps may have @emph{very} dense flow graphs, so
these edges need to be handled with special care.  During the earlier
stages of the compilation process, GCC tries to avoid such dense flow
graphs by factoring computed jumps.  For example, given the following
series of jumps,
169689Skan
169689Skan@smallexample
169689Skan  goto *x;
169689Skan  [ ... ]
169689Skan
169689Skan  goto *x;
169689Skan  [ ... ]
169689Skan
169689Skan  goto *x;
169689Skan  [ ... ]
169689Skan@end smallexample
169689Skan
169689Skan@noindent
169689Skanfactoring the computed jumps results in the following code sequence
169689Skanwhich has a much simpler flow graph:
169689Skan
169689Skan@smallexample
169689Skan  goto y;
169689Skan  [ ... ]
169689Skan
169689Skan  goto y;
169689Skan  [ ... ]
169689Skan
169689Skan  goto y;
169689Skan  [ ... ]
169689Skan
169689Skany:
169689Skan  goto *x;
169689Skan@end smallexample
169689Skan
However, the classic problem with this transformation is that it has a
runtime cost in the resulting code: an extra jump.  Therefore, the
computed jumps are un-factored in the later passes of the compiler.
Be aware of that when you work on passes in that area.  There have
been numerous examples already where the compile time for code with
unfactored computed jumps caused some serious headaches.
169689Skan
@item nonlocal goto handlers
@cindex nonlocal goto handler
@findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
GCC allows nested functions to return into the caller using a
@code{goto} to a label passed as an argument to the callee.  The labels
passed to nested functions contain special code to clean up after the
function call.  Such sections of code are referred to as ``nonlocal goto
receivers''.  If a function contains such nonlocal goto receivers, an
edge from the call to the label is created with the
@code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set.
169689Skan
@item function entry points
@cindex function entry point, alternate function entry point
@findex LABEL_ALTERNATE_NAME
By definition, execution of a function starts at basic block 0, so there
is always an edge from the @code{ENTRY_BLOCK_PTR} to basic block 0.
There is no @code{tree} representation for alternate entry points at
this moment.  In RTL, alternate entry points are specified by a
@code{CODE_LABEL} with @code{LABEL_ALTERNATE_NAME} defined.  This
feature is currently used for multiple entry point prologues and is
limited to post-reload passes only.  This can be used by back-ends to
emit alternate prologues for functions called from different contexts.
In the future, full support for multiple entry functions defined by
Fortran 90 needs to be implemented.
169689Skan
@item function exits
In the pre-reload representation a function terminates after the last
instruction in the insn chain and no explicit return instructions are
used.  This corresponds to the fall-thru edge into the exit block.
After reload, optimal RTL epilogues are used that use explicit
(conditional) return instructions that are represented by edges with no
flags set.
169689Skan
169689Skan@end table
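
The way the flag combinations listed above distinguish the edge types can
be sketched as a small classifier.  This is only an illustration: the
@code{TOY_EDGE_*} values and the function @code{toy_edge_kind} are
hypothetical, the numeric values are chosen arbitrarily for the sketch,
and the real flag definitions live in @file{basic-block.h}.

```c
#include <assert.h>
#include <string.h>

/* Illustrative flag values only; the real definitions live in
   basic-block.h and may differ.  */
#define TOY_EDGE_FALLTHRU       0x01
#define TOY_EDGE_ABNORMAL       0x02
#define TOY_EDGE_ABNORMAL_CALL  0x04
#define TOY_EDGE_EH             0x08
#define TOY_EDGE_SIBCALL        0x10

/* Name the kind of edge described by FLAGS.  The more specific flags
   are tested first, since several kinds also set TOY_EDGE_ABNORMAL
   and a trapping call may set both the EH and ABNORMAL_CALL flags.  */
static const char *
toy_edge_kind (int flags)
{
  assert (flags >= 0);
  if (flags & TOY_EDGE_EH)
    return "exception handling";
  if (flags & TOY_EDGE_SIBCALL)
    return "sibling call";
  if (flags & TOY_EDGE_ABNORMAL_CALL)
    return "nonlocal goto";
  if (flags & TOY_EDGE_ABNORMAL)
    return "computed jump";
  if (flags & TOY_EDGE_FALLTHRU)
    return "fall-thru";
  return "jump";
}
```

Note that an edge with no flags set is reported as a plain jump, which
under this sketch also covers the explicit return edges used after
reload.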
169689Skan
169689Skan
169689Skan@node Profile information
169689Skan@section Profile information
169689Skan
@cindex profile representation
In many cases a compiler must make a choice whether to trade speed in
one part of code for speed in another, or to trade code size for code
speed.  In such cases it is useful to know information about how often
some given block will be executed.  That is the purpose for
maintaining a profile within the flow graph.
GCC can handle profile information obtained through @dfn{profile
feedback}, but it can also estimate branch probabilities based on
statistics and heuristics.
169689Skan
169689Skan@cindex profile feedback
169689SkanThe feedback based profile is produced by compiling the program with
169689Skaninstrumentation, executing it on a train run and reading the numbers
169689Skanof executions of basic blocks and edges back to the compiler while
169689Skanre-compiling the program to produce the final executable.  This method
169689Skanprovides very accurate information about where a program spends most
169689Skanof its time on the train run.  Whether it matches the average run of
169689Skancourse depends on the choice of train data set, but several studies
169689Skanhave shown that the behavior of a program usually changes just
169689Skanmarginally over different data sets.
169689Skan
169689Skan@cindex Static profile estimation
169689Skan@cindex branch prediction
169689Skan@findex predict.def
169689SkanWhen profile feedback is not available, the compiler may be asked to
169689Skanattempt to predict the behavior of each branch in the program using a
169689Skanset of heuristics (see @file{predict.def} for details) and compute
169689Skanestimated frequencies of each basic block by propagating the
169689Skanprobabilities over the graph.
169689Skan
@findex frequency, count, BB_FREQ_BASE
Each @code{basic_block} contains two integer fields to represent
profile information: @code{frequency} and @code{count}.  The
@code{frequency} is an estimate of how often a basic block is executed
within a function.  It is represented as an integer scaled in the
range from 0 to @code{BB_FREQ_BASE}.  The most frequently executed
basic block in a function is initially set to @code{BB_FREQ_BASE} and
the rest of the frequencies are scaled accordingly.  During
optimization, the frequency of the most frequent basic block can either
decrease (for instance by loop unrolling) or grow (for instance by
cross-jumping optimization), so scaling sometimes has to be performed
multiple times.
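
The scaling step can be sketched as follows.  This is a minimal,
self-contained illustration rather than GCC's own code: the function
@code{toy_rescale_frequencies} is hypothetical, and the value 10000 used
for @code{TOY_BB_FREQ_BASE} is an assumption made for the sketch.

```c
#include <assert.h>

#define TOY_BB_FREQ_BASE 10000  /* assumed scale; see basic-block.h */

/* Rescale FREQ[0..N-1] in place so that the largest entry becomes
   TOY_BB_FREQ_BASE.  All entries must be non-negative.  Results are
   rounded to the nearest integer.  */
static void
toy_rescale_frequencies (long *freq, int n)
{
  long max = 0;
  int i;

  for (i = 0; i < n; i++)
    {
      assert (freq[i] >= 0);
      if (freq[i] > max)
        max = freq[i];
    }
  if (max == 0)
    return;  /* nothing to scale */
  for (i = 0; i < n; i++)
    freq[i] = (freq[i] * TOY_BB_FREQ_BASE + max / 2) / max;
}
```

For instance, if a transformation leaves the hottest block at twice the
base value, one call brings that block back to the base and halves every
other frequency, up to rounding.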
169689Skan
@findex gcov_type
The @code{count} contains hard-counted numbers of executions measured
during training runs and is nonzero only when profile feedback is
available.  This value is represented as the host's widest integer
(typically a 64-bit integer) of the special type @code{gcov_type}.
169689Skan
Most optimization passes can use only the frequency information of a
basic block, but a few passes may want to know hard execution counts.
The frequencies should always match the counts after scaling; however,
while the profile information is being updated, numerical errors may
accumulate into quite large discrepancies.
169689Skan
@findex REG_BR_PROB_BASE, EDGE_FREQUENCY
Each edge also contains a branch probability field: an integer in the
range from 0 to @code{REG_BR_PROB_BASE}.  It represents the probability
of passing control from the end of the @code{src} basic block to the
@code{dest} basic block, i.e.@: the probability that control will flow
along this edge.  The @code{EDGE_FREQUENCY} macro is available to
compute how frequently a given edge is taken.  There is a @code{count}
field for each edge as well, representing the same information as for a
basic block.
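
One plausible way to derive an edge frequency from these two quantities
is to weight the frequency of the source block by the branch
probability.  The sketch below does exactly that; the function
@code{toy_edge_frequency} is hypothetical, the base value 10000 is an
assumption, and the exact rounding used by @code{EDGE_FREQUENCY} should
be checked in @file{basic-block.h}.

```c
#include <assert.h>

#define TOY_REG_BR_PROB_BASE 10000  /* assumed scale */

/* Estimated frequency of an edge: the frequency of its source block
   weighted by the branch probability, rounded to nearest.  */
static int
toy_edge_frequency (int src_frequency, int probability)
{
  assert (src_frequency >= 0);
  assert (probability >= 0 && probability <= TOY_REG_BR_PROB_BASE);
  return (src_frequency * probability + TOY_REG_BR_PROB_BASE / 2)
         / TOY_REG_BR_PROB_BASE;
}
```

With these assumptions, a block of frequency 8000 whose outgoing edge has
probability 2500 (that is, 25%) gives an edge frequency of 2000.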
169689Skan
The basic block frequencies are not represented in the instruction
stream, but in the RTL representation the edge frequencies are
represented for conditional jumps (via the @code{REG_BR_PROB}
note) since they are used when instructions are output to the
assembly file and the flow graph is no longer maintained.
169689Skan
169689Skan@cindex reverse probability
169689SkanThe probability that control flow arrives via a given edge to its
169689Skandestination basic block is called @dfn{reverse probability} and is not
169689Skandirectly represented, but it may be easily computed from frequencies
169689Skanof basic blocks.
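
As a hedged sketch of such a computation, the reverse probability of an
edge can be taken as the share of the destination block's frequency that
arrives over that edge.  The function @code{toy_reverse_probability} and
the base value 10000 are assumptions made for this illustration and do
not correspond to a GCC interface.

```c
#include <assert.h>

#define TOY_PROB_BASE 10000  /* assumed probability scale */

/* Probability, scaled to TOY_PROB_BASE, that control reaching the
   destination block arrived over an edge whose source block has
   frequency SRC_FREQUENCY and whose branch probability is
   PROBABILITY.  Returns 0 when the destination is never executed.  */
static int
toy_reverse_probability (int src_frequency, int probability,
                         int dest_frequency)
{
  long edge_freq;

  assert (src_frequency >= 0 && dest_frequency >= 0);
  assert (probability >= 0 && probability <= TOY_PROB_BASE);
  if (dest_frequency == 0)
    return 0;
  /* Frequency with which the edge itself is taken.  */
  edge_freq = ((long) src_frequency * probability + TOY_PROB_BASE / 2)
              / TOY_PROB_BASE;
  /* Share of the destination's executions that come from this edge.  */
  return (int) ((edge_freq * TOY_PROB_BASE + dest_frequency / 2)
                / dest_frequency);
}
```

For example, an edge taken with frequency 2000 into a block of frequency
4000 has a reverse probability of 5000, i.e.@: half of the executions of
the destination arrive over that edge.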
169689Skan
@findex redirect_edge_and_branch
Updating profile information is a delicate task that unfortunately
cannot be easily integrated with the CFG manipulation API@.  Many of the
functions and hooks to modify the CFG, such as
@code{redirect_edge_and_branch}, do not have enough information to
easily update the profile, so updating it is in the majority of cases
left up to the caller.  It is difficult to uncover bugs in the profile
updating code, because they manifest themselves only by producing
worse code, and checking profile consistency is not possible because
of numeric error accumulation.  Hence special attention needs to be
given to this issue in each pass that modifies the CFG@.
169689Skan
@findex REG_BR_PROB_BASE, BB_FREQ_BASE, count
It is important to point out that @code{REG_BR_PROB_BASE} and
@code{BB_FREQ_BASE} are both set low enough that the square of any
frequency or probability in the flow graph can be computed without
overflow.  The @code{count} field, however, cannot safely be squared,
as modern CPUs are fast enough to execute @math{2^{32}} operations
quickly.
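
The difference can be made concrete with a small overflow check.  The
sketch below assumes a 64-bit signed counter type standing in for
@code{gcov_type} and a base value of 10000; both are assumptions, and
@code{toy_square_fits_int64} is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if X * X can be represented in a signed 64-bit
   integer.  X must be non-negative.  The test divides instead of
   multiplying so that it cannot overflow itself.  */
static int
toy_square_fits_int64 (int64_t x)
{
  assert (x >= 0);
  return x == 0 || x <= INT64_MAX / x;
}
```

Under these assumptions the square of a frequency or probability of at
most 10000 is at most 100000000 and fits easily, whereas an execution
count of @math{2^{32}} already has a square of @math{2^{64}}, which does
not fit.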
169689Skan
169689Skan
169689Skan@node Maintaining the CFG
169689Skan@section Maintaining the CFG
169689Skan@findex cfghooks.h
169689Skan
169689SkanAn important task of each compiler pass is to keep both the control
169689Skanflow graph and all profile information up-to-date.  Reconstruction of
169689Skanthe control flow graph after each pass is not an option, since it may be
169689Skanvery expensive and lost profile information cannot be reconstructed at
169689Skanall.
169689Skan
169689SkanGCC has two major intermediate representations, and both use the
169689Skan@code{basic_block} and @code{edge} data types to represent control
169689Skanflow.  Both representations share as much of the CFG maintenance code
169689Skanas possible.  For each representation, a set of @dfn{hooks} is defined
169689Skanso that each representation can provide its own implementation of CFG
169689Skanmanipulation routines when necessary.  These hooks are defined in
169689Skan@file{cfghooks.h}.  There are hooks for almost all common CFG
169689Skanmanipulations, including block splitting and merging, edge redirection
169689Skanand creating and deleting basic blocks.  These hooks should provide
169689Skaneverything you need to maintain and manipulate the CFG in both the RTL
169689Skanand @code{tree} representation.
169689Skan
At the moment, the basic block boundaries are maintained transparently
when modifying instructions, so there is rarely a need to move them
manually (for example, when someone wants to output an instruction
outside any basic block explicitly).
Often the CFG may be better viewed as an integral part of the
instruction chain than as a structure built on top of it.  However, in
principle the control flow graph for the @code{tree} representation is
@emph{not} an integral part of the representation, in that a function
tree may be expanded without first building a flow graph for the
@code{tree} representation at all.  This happens when compiling
without any @code{tree} optimization enabled.  When the @code{tree}
optimizations are enabled and the instruction stream is rewritten in
SSA form, the CFG is very tightly coupled with the instruction stream.
In particular, statement insertion and removal has to be done with
care.  In fact, the whole @code{tree} representation cannot be easily
used or maintained without proper maintenance of the CFG
simultaneously.
169689Skan
@findex BLOCK_FOR_INSN, bb_for_stmt
In the RTL representation, each instruction has a
@code{BLOCK_FOR_INSN} value that represents a pointer to the basic block
that contains the instruction.  In the @code{tree} representation, the
function @code{bb_for_stmt} returns a pointer to the basic block
containing the queried statement.
169689Skan
@cindex block statement iterators
When changes need to be applied to a function in its @code{tree}
representation, @dfn{block statement iterators} should be used.  These
iterators provide an integrated abstraction of the flow graph and the
instruction stream.  Block statement iterators are
constructed using the @code{block_stmt_iterator} data structure and
several modifiers are available, including the following:
169689Skan
169689Skan@ftable @code
169689Skan@item bsi_start
169689SkanThis function initializes a @code{block_stmt_iterator} that points to
169689Skanthe first non-empty statement in a basic block.
169689Skan
169689Skan@item bsi_last
169689SkanThis function initializes a @code{block_stmt_iterator} that points to
169689Skanthe last statement in a basic block.
169689Skan
169689Skan@item bsi_end_p
169689SkanThis predicate is @code{true} if a @code{block_stmt_iterator}
169689Skanrepresents the end of a basic block.
169689Skan
169689Skan@item bsi_next
169689SkanThis function takes a @code{block_stmt_iterator} and makes it point to
169689Skanits successor.
169689Skan
169689Skan@item bsi_prev
169689SkanThis function takes a @code{block_stmt_iterator} and makes it point to
169689Skanits predecessor.
169689Skan
169689Skan@item bsi_insert_after
169689SkanThis function inserts a statement after the @code{block_stmt_iterator}
169689Skanpassed in.  The final parameter determines whether the statement
169689Skaniterator is updated to point to the newly inserted statement, or left
169689Skanpointing to the original statement.
169689Skan
@item bsi_insert_before
This function inserts a statement before the @code{block_stmt_iterator}
passed in.  The final parameter determines whether the statement
iterator is updated to point to the newly inserted statement, or left
pointing to the original statement.

@item bsi_remove
This function removes the statement pointed to by the
@code{block_stmt_iterator} passed in and rechains the remaining
statements in a basic block, if any.
169689Skan@end ftable
169689Skan
@findex BB_HEAD, BB_END
In the RTL representation, the macros @code{BB_HEAD} and @code{BB_END}
may be used to get the head and end @code{rtx} of a basic block.  No
abstract iterators are defined for traversing the insn chain, but you
can just use @code{NEXT_INSN} and @code{PREV_INSN} instead.
@xref{Insns}.
169689Skan
@findex purge_dead_edges
Usually a code-manipulating pass simplifies the instruction stream and
the flow of control, possibly eliminating some edges.  This may for
example happen when a conditional jump is replaced with an
unconditional jump, but also when simplifying a possibly trapping
instruction to a non-trapping one while compiling Java.  Updating of
edges is not transparent and each optimization pass is required to do so
manually.  However, only a few cases occur in practice.  The pass may
call @code{purge_dead_edges} on a given basic block to remove
superfluous edges, if any.
169689Skan
@findex redirect_edge_and_branch, redirect_jump
Another common scenario is redirection of branch instructions, but
this is best modeled as redirection of edges in the control flow graph
and thus use of @code{redirect_edge_and_branch} is preferred over more
low-level functions, such as @code{redirect_jump}, that operate on the
RTL chain only.  The CFG hooks defined in @file{cfghooks.h} should
provide the complete API required for manipulating and maintaining the
CFG@.
169689Skan
@findex split_block
It is also possible that a pass has to insert a control flow instruction
into the middle of a basic block, thus creating an entry point in the
middle of the basic block, which is impossible by definition.  The
block must be split to make sure it only has one entry point, i.e.@: the
head of the basic block.  The CFG hook @code{split_block} may be used
when an instruction in the middle of a basic block has to become the
target of a jump or branch instruction.
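
The effect of splitting can be sketched with a toy model in which a block
is just a range of positions in an instruction array.  The type
@code{toy_range_bb} and the function @code{toy_split_block} are
hypothetical, and the sketch does not reproduce the interface of the
real @code{split_block} hook; it only shows that the tail becomes a new
block reached from the head by a fall-thru edge.

```c
#include <assert.h>

/* Toy block: a half-open range [first, last) of positions in some
   instruction array, plus the index of the block control falls
   through to, or -1.  Hypothetical; not GCC's basic_block.  */
struct toy_range_bb
{
  int first;
  int last;
  int fallthru_to;
};

/* Split BLOCKS[IX] before position AT.  The new tail block is stored
   in BLOCKS[NEW_IX], inherits the old fall-thru target, and becomes
   the fall-thru successor of the shortened head block.  */
static void
toy_split_block (struct toy_range_bb *blocks, int ix, int new_ix, int at)
{
  struct toy_range_bb *head = &blocks[ix];
  struct toy_range_bb *tail = &blocks[new_ix];

  assert (ix != new_ix);
  assert (at > head->first && at < head->last);
  tail->first = at;
  tail->last = head->last;
  tail->fallthru_to = head->fallthru_to;
  head->last = at;
  head->fallthru_to = new_ix;
}
```

After the split, the position @code{at} is the head of its own block, so
a jump may target it without giving any block a second entry point.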
169689Skan
169689Skan@findex insert_insn_on_edge
169689Skan@findex commit_edge_insertions
169689Skan@findex bsi_insert_on_edge
169689Skan@findex bsi_commit_edge_inserts
169689Skan@cindex edge splitting
169689SkanFor a global optimizer, a common operation is to split edges in the
169689Skanflow graph and insert instructions on them.  In the RTL
169689Skanrepresentation, this can be easily done using the
169689Skan@code{insert_insn_on_edge} function that emits an instruction
169689Skan``on the edge'', caching it for a later @code{commit_edge_insertions}
169689Skancall that will take care of moving the inserted instructions off the
169689Skanedge into the instruction stream contained in a basic block.  This
169689Skanincludes the creation of new basic blocks where needed.  In the
169689Skan@code{tree} representation, the equivalent functions are
169689Skan@code{bsi_insert_on_edge}, which queues a statement for insertion
169689Skanon an edge, and @code{bsi_commit_edge_inserts}, which flushes
169689Skanthe queued statements into the actual instruction stream.
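
The two-phase ``queue, then commit'' scheme can be sketched as follows.
This is a toy model under stated assumptions: the names
@code{toy_insert_on_edge} and @code{toy_commit_edge}, the fixed-size
arrays, and the policy of always splitting the edge are all invented for
this example.  The real commit functions choose where to place the
instructions and create a new block only where needed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

struct toy_bb
{
  int insns[TOY_MAX];
  int n_insns;
};

struct toy_edge
{
  struct toy_bb *src, *dest;
  int pending[TOY_MAX];         /* Insns queued "on the edge".  */
  int n_pending;
};

/* Queue INSN on edge E.  Nothing reaches any basic block yet.  */
static void
toy_insert_on_edge (struct toy_edge *e, int insn)
{
  assert (e->n_pending < TOY_MAX);
  e->pending[e->n_pending++] = insn;
}

/* Flush the queue of E.  In this toy version the edge is always split:
   the queued insns go into NEW_BB, E is redirected to NEW_BB, and
   NEW_EDGE connects NEW_BB to the old destination.  Return 1 if the
   edge was split, 0 if nothing was pending.  */
static int
toy_commit_edge (struct toy_edge *e, struct toy_bb *new_bb,
                 struct toy_edge *new_edge)
{
  int i;

  if (e->n_pending == 0)
    return 0;

  new_bb->n_insns = 0;
  for (i = 0; i < e->n_pending; i++)
    new_bb->insns[new_bb->n_insns++] = e->pending[i];
  e->n_pending = 0;

  new_edge->src = new_bb;
  new_edge->dest = e->dest;
  new_edge->n_pending = 0;
  e->dest = new_bb;
  return 1;
}
```

Deferring the commit lets a pass queue insertions on many edges while
the set of basic blocks stays unchanged, and pay for the graph changes
once at the end.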
169689Skan
169689SkanWhile debugging an optimization pass, the @code{verify_flow_info}
169689Skanfunction may be useful for finding bugs in the control flow graph
169689Skanupdating code.
169689Skan
169689SkanNote that at present, the representation of control flow in the
169689Skan@code{tree} representation is discarded before expanding to RTL@.
169689SkanIn the long term, the CFG should be maintained and ``expanded'' to the
169689SkanRTL representation along with the function @code{tree} itself.
169689Skan
169689Skan
169689Skan@node Liveness information
169689Skan@section Liveness information
169689Skan@cindex Liveness representation
169689SkanLiveness information is useful to determine whether some register is
169689Skan``live'' at a given point of the program, i.e.@: whether it contains a
169689Skanvalue that may be used at a later point in the program.  This information is
169689Skanused, for instance, during register allocation, as the pseudo
169689Skanregisters only need to be assigned to a unique hard register or to a
169689Skanstack slot if they are live.  The hard registers and stack slots may
169689Skanbe freely reused for other values when a register is dead.
169689Skan
169689Skan@findex REG_DEAD, REG_UNUSED
169689SkanThe liveness information is stored partly in the RTL instruction
169689Skanstream and partly in the flow graph.  Local information is stored in
169689Skanthe instruction stream:
169689SkanEach instruction may contain @code{REG_DEAD} notes representing that
169689Skanthe value of a given register is no longer needed, or
169689Skan@code{REG_UNUSED} notes representing that the value computed by the
169689Skaninstruction is never used.  The second is useful for instructions
169689Skancomputing multiple values at once.
169689Skan
169689Skan@findex global_live_at_start, global_live_at_end
169689SkanGlobal liveness information is stored in the control flow graph.
169689SkanEach basic block contains two bitmaps, @code{global_live_at_start} and
169689Skan@code{global_live_at_end} representing liveness of each register at
169689Skanthe entry and exit of the basic block.  The file @file{flow.c}
169689Skancontains functions to compute liveness of each register at any given
169689Skanplace in the instruction stream using this information.
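
One way to derive liveness at a point inside a block from the
information at the block boundary is a backward scan from the end of the
block.  The sketch below illustrates that idea only.  It is not the code
from @file{flow.c}.  The type @code{toy_regset} (one bit per register),
the @code{uses} and @code{sets} masks, and the function
@code{toy_live_before} are assumptions made for this example.

```c
#include <assert.h>

/* Bit N set means register N is live (toy replacement for a bitmap).  */
typedef unsigned int toy_regset;

struct toy_insn
{
  toy_regset uses;              /* Registers read by the insn.  */
  toy_regset sets;              /* Registers written by the insn.  */
};

/* Return the registers live immediately before insn INDEX of a block
   holding N insns, given the registers live at the end of the block.  */
static toy_regset
toy_live_before (const struct toy_insn *insns, int n, int index,
                 toy_regset live_at_end)
{
  toy_regset live = live_at_end;
  int i;

  assert (index >= 0 && index < n);
  for (i = n - 1; i >= index; i--)
    {
      live &= ~insns[i].sets;   /* A set kills the previous value.  */
      live |= insns[i].uses;    /* A use makes the register live.  */
    }
  return live;
}
```

For a block consisting of @code{r1 = r0 + r0} followed by
@code{r2 = r1 + r3}, with only @code{r2} live at the end, the scan
yields @code{r1} and @code{r3} live before the second insn, and
@code{r0} and @code{r3} live before the first.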
169689Skan
169689Skan@findex BB_DIRTY, clear_bb_flags, update_life_info_in_dirty_blocks
169689SkanLiveness is expensive to compute and thus it is desirable to keep it
169689Skanup to date during code modifying passes.  This can be easily
169689Skanaccomplished using the @code{flags} field of a basic block.  Functions
169689Skanmodifying the instruction stream automatically set the @code{BB_DIRTY}
169689Skanflag of a modified basic block, so the pass may simply
169689Skanuse @code{clear_bb_flags} before doing any modifications and then ask
169689Skanthe data flow module to have liveness updated via the
169689Skan@code{update_life_info_in_dirty_blocks} function.
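
The dirty-flag protocol described above can be sketched with a toy
model.  All names below (@code{TOY_BB_DIRTY}, @code{toy_emit_insn},
@code{toy_update_dirty} and the @code{live_updates} counter) are
hypothetical and exist only in this example.  The real functions have
different signatures and recompute actual liveness sets instead of
bumping a counter.

```c
#include <assert.h>

#define TOY_BB_DIRTY 1

struct toy_bb
{
  int flags;
  int n_insns;
  int live_updates;     /* How often liveness was recomputed here.  */
};

/* Clear the flags of all N blocks before the pass starts its changes.  */
static void
toy_clear_bb_flags (struct toy_bb *bbs, int n)
{
  int i;

  for (i = 0; i < n; i++)
    bbs[i].flags = 0;
}

/* Any function that modifies the insn stream marks the block dirty.  */
static void
toy_emit_insn (struct toy_bb *bb)
{
  bb->n_insns++;
  bb->flags |= TOY_BB_DIRTY;
}

/* Recompute liveness only in dirty blocks and return how many blocks
   were updated.  */
static int
toy_update_dirty (struct toy_bb *bbs, int n)
{
  int i, updated = 0;

  assert (n >= 0);
  for (i = 0; i < n; i++)
    if (bbs[i].flags & TOY_BB_DIRTY)
      {
        bbs[i].live_updates++;
        bbs[i].flags &= ~TOY_BB_DIRTY;
        updated++;
      }
  return updated;
}
```

A pass following this pattern clears the flags, performs its
modifications through the flag-setting helpers, and finally requests a
single update, so untouched blocks are never rescanned.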
169689Skan
169689SkanThis scheme works reliably as long as no control flow graph
169689Skantransformations are done.  The task of updating liveness after control
169689Skanflow graph changes is more difficult as normal iterative data flow
169689Skananalysis may produce invalid results or get into an infinite cycle
169689Skanwhen the initial solution is not below the desired one.  Only simple
169689Skantransformations, like splitting basic blocks or inserting on edges,
169689Skanare safe, as functions to implement them already know how to update
169689Skanliveness information locally.