@c Copyright (c) 2006 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::		Representation and analysis of loops.
* Loop querying::		Getting information about loops.
* Loop manipulation::		Loop manipulation functions.
* LCSSA::			Loop-closed SSA form.
* Scalar evolutions::   	Induction variables on GIMPLE.
* loop-iv::			Induction variables on RTL.
* Number of iterations::	Number of iterations analysis.
* Dependency analysis::		Data dependency analysis.
* Lambda::			Linear loop transformations framework.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and functions
that can be used to build, modify and analyze this representation.  Most
of the interfaces and data structures are declared in @file{cfgloop.h}.
At the moment, loop structures are analyzed and this information is
updated only by the optimization passes that deal with loops, but some
efforts are being made to make it available throughout most of the
optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC, however, allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information is used to determine whether
the latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG.  The loops are organized in a containment hierarchy (tree) such
that all the loops immediately contained inside loop L are the children
of L in the tree.  This tree is represented by the @code{struct loops}
structure.  The root of this tree is a fake loop that contains all
blocks in the function.  Each of the loops is represented in a
@code{struct loop} structure.  Each loop is assigned an index (the
@code{num} field of the @code{struct loop} structure), and the pointer
to the loop is stored in the corresponding field of the @code{parray}
field of the loops structure.  The index of a sub-loop is always greater
than the index of its super-loop.  The indices do not have to be
contiguous; there may be empty (@code{NULL}) entries in the
@code{parray}, created by deleting loops.  The index of a loop never
changes.  The first unused index is stored in the @code{num} field of
the loops structure.

Each basic block contains the reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG.  It is recommended to use the global variable @code{current_loops}
to contain the @code{struct loops} structure, especially if the loop
structures are updated throughout several passes.  Many of the loop
manipulation functions assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulating the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
exit edge, this edge is stored in the @code{single_exit} field of the
loop structure.  @code{NULL} is stored there otherwise.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@item @code{single_exit}: The exit edge of the loop, if the loop has
exactly one exit and the loops were analyzed with
@code{LOOPS_HAVE_MARKED_SINGLE_EXITS}.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimate the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and in breadth-first search order, respectively.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Specialized versions of several low-level CFG functions that also
update loop structures are provided:

@itemize
@item @code{loop_split_edge_with}: Splits an edge, and places a
specified RTL code on it.  On GIMPLE, the function can still be used,
but the code must be @code{NULL}.
@item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on an edge,
splitting it if necessary.  Only works on GIMPLE.
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{loop_commit_inserts}: Commits insertions scheduled on edges,
and sets loops for the new blocks.  This function can only be used on
GIMPLE.
@item @code{split_loop_exit_edge}: Splits the exit edge of a loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than to use generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

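The effect of the rewrite can be shown on a schematic, GIMPLE-like
fragment (illustrative pseudocode, not actual compiler output).  Before
the rewrite, @code{x_1} defined inside the loop is used after it; LCSSA
inserts an exit PHI so that only @code{x_2} is visible outside:

```
# before rewrite_into_loop_closed_ssa:
loop:
  x_1 = ...;
  if (cond) goto loop;
use (x_1);        # SSA name used outside its defining loop

# after the rewrite:
loop:
  x_1 = ...;
  if (cond) goto loop;
x_2 = phi (x_1);  # exit PHI created to close the loop
use (x_2);
```

If the loop is later unrolled, only the arguments of the exit PHI need
updating; the uses beyond it are untouched.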
@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE.  They enable us to represent variables with
complicated behavior in a simple and consistent way (we only use it to
express values of polynomial induction variables, but it is possible to
extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use scalar evolutions analysis,
@code{scev_initialize} must be used.  To stop using SCEV,
@code{scev_finalize} should be used.  SCEV analysis caches results in
order to save time and memory.  This cache however is made invalid by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (...)
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  In such cases, the extra expressive power of SCEV
is not useful, and may complicate the optimizations; the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only handles
affine induction variables, and only in one loop at a time.  The
interface is declared in @file{cfgloop.h}.  Before analyzing induction
variables in a loop L, the @code{iv_analysis_loop_init} function must be
called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  In many
cases, it is not possible to determine the number of iterations
unconditionally -- the determined number is correct only if some
assumptions are satisfied.  The analysis tries to verify these
conditions using the information contained in the program; if it fails,
the conditions are returned together with the result.  The following
information and conditions are provided by the analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored to the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_iterations_in_loop} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_iterations_in_loop} works only for single-exit loops,
and it returns a number of iterations higher by one with respect to all
other functions (i.e., it returns the number of executions of the exit
statement, not of the loop latch).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

479169689SkanThree types of data references are currently handled: ARRAY_REF, 
480169689SkanINDIRECT_REF and COMPONENT_REF. The data structure for the data reference 
481169689Skanis @code{data_reference}, where @code{data_reference_p} is a name of a 
482169689Skanpointer to the data reference structure. The structure contains the 
483169689Skanfollowing elements:
484169689Skan
485169689Skan@itemize
486169689Skan@item @code{base_object_info}: Provides information about the base object 
487169689Skanof the data reference and its access functions. These access functions 
488169689Skanrepresent the evolution of the data reference in the loop relative to 
489169689Skanits base, in keeping with the classical meaning of the data reference 
490169689Skanaccess function for the support of arrays. For example, for a reference 
491169689Skan@code{a.b[i][j]}, the base object is @code{a.b} and the access functions, 
492169689Skanone for each array subscript, are: 
493169689Skan@code{@{i_init, + i_step@}_1, @{j_init, +, j_step@}_2}.
494169689Skan
495169689Skan@item @code{first_location_in_loop}: Provides information about the first 
496169689Skanlocation accessed by the data reference in the loop and about the access 
497169689Skanfunction used to represent evolution relative to this location. This data 
498169689Skanis used to support pointers, and is not used for arrays (for which we 
499169689Skanhave base objects). Pointer accesses are represented as a one-dimensional
500169689Skanaccess that starts from the first location accessed in the loop. For 
501169689Skanexample:
502169689Skan
503169689Skan@smallexample
504169689Skan      for1 i
505169689Skan         for2 j
506169689Skan          *((int *)p + i + j) = a[i][j];
507169689Skan@end smallexample
508169689Skan
509169689SkanThe access function of the pointer access is @code{@{0, + 4B@}_for2} 
510169689Skanrelative to @code{p + i}. The access functions of the array are 
511169689Skan@code{@{i_init, + i_step@}_for1} and @code{@{j_init, +, j_step@}_for2} 
512169689Skanrelative to @code{a}.
513169689Skan
Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.
The current strategy for data dependence tests is as follows:

@enumerate
@item If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object}; if they are equal, apply
dependence tests using the access functions based on the base objects.
@item Otherwise, if both @code{a} and @code{b} are represented as
pointers, compare @code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests using the access functions
based on the first locations.
@item If @code{a} and @code{b} are represented differently, only try to
prove that the bases are definitely different.
@end enumerate

@item Aliasing information.
@item Alignment information.
@end itemize
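The dispatch between the two representations can be sketched in C as
follows.  This is a hand-written illustration, not code from
@file{tree-data-ref.h}: the helper predicates used here are hypothetical
names for the checks described above.

@smallexample
/* Sketch of the dependence-test dispatch.  The helpers
   (array_base_known_p, pointer_access_p, first_locations_equal_p,
   apply_subscript_tests, bases_definitely_different_p) are
   illustrative names, not real GCC functions.  Returns true if the
   two references may depend on each other.  */
static bool
references_may_depend_p (struct data_reference *a,
                         struct data_reference *b)
@{
  if (array_base_known_p (a) && array_base_known_p (b))
    @{
      /* Both arrays: compare base objects, then test the
         base-object-relative access functions.  */
      if (operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), 0))
        return apply_subscript_tests (a, b);
    @}
  else if (pointer_access_p (a) && pointer_access_p (b))
    @{
      /* Both pointers: compare first locations, then test the
         first-location-relative access functions.  */
      if (first_locations_equal_p (a, b))
        return apply_subscript_tests (a, b);
    @}

  /* Mixed representations or unequal bases: the only remaining hope
     is to prove the bases definitely different.  */
  return !bases_definitely_different_p (a, b);
@}
@end smallexample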

The structure describing the relation between two data references is
@code{data_dependence_relation}, and a pointer to such a structure is
abbreviated @code{ddr_p}.  This structure contains:
@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and a dependence could potentially exist
between the data references, and to @code{NULL_TREE} if there exists a
dependence relation between the data references; in the last case the
description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that indicates whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize
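A consumer of the analysis typically examines @code{are_dependent}
first.  A minimal sketch, using the accessor macros from
@file{tree-data-ref.h} (the surrounding pass and the @code{ddr}
variable are assumed):

@smallexample
if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
  @{
    /* Independence proved: the two references may be reordered.  */
  @}
else if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
  @{
    /* No useful result: conservatively assume a dependence.  */
  @}
else
  @{
    /* A dependence exists; its description is in the subscripts
       and, when representable, the distance/direction vectors.  */
  @}
@end smallexample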

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.
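For example, a pass that has run the analysis might dump its results as
follows (a sketch: it assumes @code{dump_file} is open and that
@code{datarefs} and @code{ddrs} were filled in by the analysis):

@smallexample
dump_data_references (dump_file, datarefs);  /* The references.  */
dump_dist_dir_vectors (dump_file, ddrs);     /* Just the vectors.  */
dump_ddrs (dump_file, ddrs);                 /* Full verbosity.  */
@end smallexample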
573169689Skan
574169689Skan@node Lambda
575169689Skan@section Linear loop transformations framework
576169689Skan@cindex Linear loop transformations framework
577169689Skan
578169689SkanLambda is a framework that allows transformations of loops using
579169689Skannon-singular matrix based transformations of the iteration space and
580169689Skanloop bounds. This allows compositions of skewing, scaling, interchange,
581169689Skanand reversal transformations.  These transformations are often used to
582169689Skanimprove cache behavior or remove inner loop dependencies to allow
583169689Skanparallelization and vectorization to take place.
584169689Skan

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices out of
any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.
600169689Skan
601169689SkanGiven a transformed loopnest, conversion back into gcc IR is done by
602169689Skan@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
603169689Skanloops so that they match the transformed loopnest.
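
Putting these pieces together, a transformation pass drives the
framework roughly as follows.  This is a sketch: argument lists are
abbreviated with @code{...}, and the construction of the transformation
matrix @code{trans} and of the dependence information is not shown.

@smallexample
lambda_loopnest nest;

/* Convert the GCC loopnest into Lambda's internal form.  */
nest = gcc_loopnest_to_lambda_loopnest (loops, loop, ...);
if (nest == NULL)
  return;   /* The loopnest cannot be handled by Lambda.  */

/* The caller must check legality before transforming.  */
if (!lambda_transform_legal_p (trans, depth, dependence_relations))
  return;   /* The matrix would violate a dependence.  */

/* Apply the matrix, then rewrite the loops to match.  */
nest = lambda_loopnest_transform (nest, trans);
lambda_loopnest_to_gcc_loopnest (loop, ..., nest, trans);
@end smallexample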