@c Copyright (c) 2006 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and functions
that can be used to build, modify and analyze this representation.  Most
of the interfaces and data structures are declared in @file{cfgloop.h}.
At the moment, loop structures are analyzed and this information is
updated only by the optimization passes that deal with loops, but some
efforts are being made to make it available throughout most of the
optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from inside the
loop.  Loops with several latches may appear if several loops share a
single header, or if there is branching in the middle of the loop.  The
representation of loops in GCC, however, allows only loops with a
single latch.  During loop analysis, the headers of such loops are
split and forwarder blocks are created in order to disambiguate their
structures.  A heuristic based on profile information is used to
determine whether the latches correspond to sub-loops or to control
flow in a single loop.  This means that the analysis sometimes changes
the CFG, and if you run it in the middle of an optimization pass, you
must be able to deal with the new blocks.

The body of the loop is the set of blocks that are dominated by its
header and reachable from its latch against the direction of the edges
in the CFG.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside a loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.
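The containment tree can be pictured with a minimal standalone sketch.  The structure below is a simplified illustration mirroring the @code{num}, @code{outer}, @code{inner} and @code{next} fields described in this chapter; the name @code{loop_sketch} and the helper function are invented for the sketch and are not GCC's actual declarations (those live in @file{cfgloop.h}):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified sketch of the loop containment tree: each loop points to
   its immediate super-loop (outer), its first sub-loop (inner) and its
   next sibling (next).  Not GCC's real struct loop.  */
struct loop_sketch
{
  int num;                     /* index of the loop */
  struct loop_sketch *outer;   /* immediate super-loop, NULL for the root */
  struct loop_sketch *inner;   /* first sub-loop */
  struct loop_sketch *next;    /* next sibling in the tree */
};

/* The depth of a loop is the number of its super-loops; the fake root
   loop that contains the whole function has depth 0.  */
static int
loop_depth_sketch (const struct loop_sketch *loop)
{
  int depth = 0;
  while (loop->outer != NULL)
    {
      depth++;
      loop = loop->outer;
    }
  return depth;
}
```

Walking the @code{outer} pointers to the root recovers the depth of any loop, which is how the containment hierarchy is meant to be traversed.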
Each loop is assigned an index (the @code{num} field of the
@code{struct loop} structure), and the pointer to the loop is stored in
the corresponding entry of the @code{parray} field of the loops
structure.  The index of a sub-loop is always greater than the index of
its super-loop.  The indices do not have to be contiguous; there may be
empty (@code{NULL}) entries in the @code{parray}, created by deleting
loops.  The index of a loop never changes.  The first unused index is
stored in the @code{num} field of the loops structure.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG.  It is recommended to use the global variable @code{current_loops}
to contain the @code{struct loops} structure, especially if the loop
structures are updated throughout several passes.  Many of the loop
manipulation functions assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented as an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
exit edge, this edge is stored in the @code{single_exit} field of the
loop structure.  @code{NULL} is stored there otherwise.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.
On GIMPLE, the @code{cleanup_tree_cfg_loop} function can be used to
clean up the CFG while updating the loop structures, provided that
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most
useful fields of the loop structure (that are kept up-to-date at all
times) are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@item @code{single_exit}: The exit edge of the loop, if the loop has
exactly one exit and the loops were analyzed with
@code{LOOPS_HAVE_MARKED_SINGLE_EXITS}.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps information about loops to a file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimate the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in the reversed CFG, in an order
consistent with the dominance relation, and in breadth-first search
order, respectively.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Specialized versions of several low-level CFG functions that also
update loop structures are provided:

@itemize
@item @code{loop_split_edge_with}: Splits an edge, and places a
specified RTL code on it.  On GIMPLE, the function can still be used,
but the code must be @code{NULL}.
@item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on an
edge, splitting it if necessary.  Only works on GIMPLE.
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{loop_commit_inserts}: Commits insertions scheduled on edges,
and sets loops for the new blocks.  This function can only be used on
GIMPLE.
@item @code{split_loop_exit_edge}: Splits the exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on
non-innermost loops, they are mostly untested in that case, and at the
moment, they are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These
functions create a copy of a loop, and a branch before them that
selects one of them depending on the prescribed condition.  This is
useful for optimizations that need to verify some assumptions at
runtime (one of the copies of the loop is usually left unchanged, while
the other one is transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must
be created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means that LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use
it to express values of polynomial induction variables, but it is
possible to extend it).  The interfaces to SCEV analysis are declared
in @file{tree-scalar-evolution.h}.  To use the scalar evolutions
analysis, @code{scev_initialize} must be called first; to stop using
SCEV, @code{scev_finalize} should be called.  SCEV analysis caches its
results in order to save time and memory.  This cache, however, is
invalidated by most of the loop transformations, including removal of
code.  If such a transformation is performed, @code{scev_reset} must be
called to clear the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.
To resolve these (potentially recursive) references, the
@code{instantiate_parameters} or @code{resolve_mixers} functions must
be used.  @code{instantiate_parameters} is useful when you use the
results of SCEV only for some analysis, and when you work with a whole
nest of loops at once.  It will try replacing all SSA names by their
SCEV in all loops, including the super-loops of the current loop, thus
providing complete information about the behavior of the variable in
the loop nest.  @code{resolve_mixers} is useful if you work with only
one loop at a time, and if you possibly need to create code based on
the value of the induction variable.  It will only resolve the SSA
names defined in the current loop, leaving the SSA names defined
outside unchanged, even if their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step
may contain further polynomial chrecs).  The type of the expression and
of the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (...)
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then
overflow in the addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict attention just to affine induction
variables.  In this case, the extra expressive power of SCEV is not
useful, and may complicate the optimizations.  The @code{simple_iv}
function may then be used to analyze a value -- the result is a
loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simple and only allows
analysis of affine induction variables, in one loop at a time.  The
interface is declared in @file{cfgloop.h}.  Before analyzing induction
variables in a loop L, the @code{iv_analysis_loop_init} function must
be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment
in the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  In many
cases, it is not possible to determine the number of iterations
unconditionally -- the determined number is correct only if some
assumptions are satisfied.
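To see why such assumptions arise, consider a loop of the form @code{for (i = 0; i != n; i += 4)}: its latch executes @code{n / 4} times only under the assumption that @code{n} is divisible by 4; otherwise @code{i} steps over @code{n} and the induction variable wraps around.  The following standalone sketch (plain C, not GCC code) simulates this:

```c
#include <assert.h>

/* Count latch executions of `for (i = 0; i != n; i += 4)'.  The closed
   form n / 4 is only valid under the assumption n % 4 == 0; otherwise
   the unsigned IV steps over n and wraps around, which is why
   number-of-iterations analysis must return such assumptions along
   with its result.  */
static unsigned int
simulated_latch_executions (unsigned int n)
{
  unsigned int i = 0, count = 0;
  while (i != n)
    {
      i += 4;
      count++;
      if (count > 1000000)  /* cut the simulation short when it wraps */
        break;
    }
  return count;
}
```

When @code{n % 4 == 0} the simulation agrees with the closed form; for other values the loop does not terminate in @code{n / 4} steps, so the closed form is only usable together with the divisibility assumption.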
The analysis tries to verify these conditions using the information
contained in the program; if it fails, the conditions are returned
together with the result.  The following information and conditions are
provided by the analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through
a given exit can be determined using the
@code{number_of_iterations_exit} function.  On RTL, the results are
returned in the @code{struct niter_desc} structure.  The corresponding
function is named @code{check_simple_exit}.  There are also functions
that pass through all the exits of a loop and try to find one with an
easy-to-determine number of iterations -- @code{find_loop_niter} on
GIMPLE and @code{find_simple_exit} on RTL.
Finally, there are functions that provide the same information, but
additionally cache it, so that repeated calls to number of iterations
analysis are not so costly -- @code{number_of_iterations_in_loop} on
GIMPLE and @code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_iterations_in_loop} works only for single-exit loops,
and the value it returns for the number of iterations is higher by one
than that of all the other functions (i.e., it returns the number of
executions of the exit statement, not of the loop latch).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references of a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing
the dependence relations will contain @code{n*n} elements.
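The sizing of that second vector can be made concrete with a tiny sketch.  The row-major pair indexing used below is an assumption chosen purely for illustration, not a documented GCC invariant; only the @code{n*n} total comes from the text above:

```c
#include <assert.h>

/* With n data references there is one dependence relation for every
   ordered pair of references, hence n * n relations in total.  One
   possible flat layout indexes the pair (source, sink) row-major;
   this particular layout is an illustrative assumption.  */
static int
relation_index (int n_refs, int source, int sink)
{
  return source * n_refs + sink;
}

static int
num_relations (int n_refs)
{
  return n_refs * n_refs;
}
```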
However, if the analyzed loop contains side effects, such as calls that
potentially can interfere with the data references in the currently
analyzed loop, the analysis stops while scanning the loop body for data
references, and inserts a single @code{chrec_dont_know} in the
dependence relation array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution
order, and the data references of each statement are pushed at the end
of the data reference array.  Two data references thus occur in the
array of data references in the same order in which they syntactically
occur in the program.  This syntactic order is important in some
classical data dependence tests, and mapping this order to the elements
of this array avoids costly queries to the loop body representation.

Three types of data references are currently handled:
@code{ARRAY_REF}, @code{INDIRECT_REF} and @code{COMPONENT_REF}.  The
data structure for a data reference is @code{data_reference}, where
@code{data_reference_p} is the name of a pointer to the data reference
structure.  The structure contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.
For example, for a reference @code{a.b[i][j]}, the base object is
@code{a.b} and the access functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent evolution relative to this location.
This data is used to support pointers, and is not used for arrays (for
which we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
      for1 i
         for2 j
          *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +,
j_step@}_for2} relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we
can't prove that the access is confined to the boundaries of a certain
object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based
on base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation}, and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and a dependence could potentially exist
between them, or to @code{NULL_TREE} if there is a dependence between
the data references; in the last case the description of this dependence
relation is given in the @code{subscripts}, @code{dir_vects}, and
@code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.
For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.

@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.); Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices
out of any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.