@c Copyright (c) 2006 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related
analyses (induction variable analysis and number of iterations
analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are
declared in @file{cfgloop.h}.  At the moment, loop structures are
analyzed and this information is updated only by the optimization
passes that deal with loops, but some efforts are being made to make it
available throughout most of the optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split
and forwarder blocks are created in order to disambiguate their
structures.  A heuristic based on profile information is used to
determine whether the latches correspond to sub-loops or to control
flow in a single loop.  This means that the analysis sometimes changes
the CFG, and if you run it in the middle of an optimization pass, you
must be able to deal with the new blocks.

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned
an index (@code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{parray} field of the loops structure.  The index of a sub-loop is
always greater than the index of its super-loop.  The indices do not
have to be continuous, there may be empty (@code{NULL}) entries in the
@code{parray} created by deleting loops.  The index of a loop never
changes.  The first unused index is stored in the @code{num} field of
the loops structure.
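All loops can thus be enumerated by walking @code{parray} directly.  A
minimal sketch, assuming @code{current_loops} has been initialized as
described below:

@smallexample
unsigned i;
struct loop *loop;

/* Entry 0 is the fake root loop; real loops start at index 1.  */
for (i = 1; i < current_loops->num; i++)
  @{
    loop = current_loops->parray[i];
    if (!loop)
      continue;  /* A NULL entry is left behind by a deleted loop.  */

    /* ... process loop ... */
  @}
@end smallexample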
Each basic block contains the reference to the innermost loop it
belongs to (@code{loop_father}).  For this reason, it is only possible
to have one @code{struct loops} structure initialized at the same time
for each CFG.  It is recommended to use the global variable
@code{current_loops} to contain the @code{struct loops} structure,
especially if the loop structures are updated throughout several
passes.  Many of the loop manipulation functions assume that dominance
information is up-to-date.

The loops are analyzed using the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in
such a way that each loop has only one entry edge, and additionally,
the source block of this entry edge has only one successor.  This
creates a natural place where the code can be moved out of the loop,
and ensures that the entry edge of the loop leads from its immediate
super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
exit edge, this edge is stored in the @code{single_exit} field of the
loop structure.  @code{NULL} is stored there otherwise.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.
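A typical use in a pass therefore looks roughly as follows.  This is
only a sketch; the exact signatures of @code{loop_optimizer_init} and
@code{loop_optimizer_finalize} are declared in @file{cfgloop.h} and may
differ between GCC versions:

@smallexample
/* Compute the loop structures, with the properties that most of the
   loop manipulation functions expect.  */
current_loops = loop_optimizer_init (LOOPS_HAVE_PREHEADERS
                                     | LOOPS_HAVE_SIMPLE_LATCHES
                                     | LOOPS_HAVE_MARKED_SINGLE_EXITS);

/* ... analyze and transform the loops ... */

/* Release the loop structures.  */
loop_optimizer_finalize (current_loops);
current_loops = NULL;
@end smallexample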
The CFG manipulation functions in general do not update loop
structures.  Specialized versions that additionally do so are provided
for the most common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop}
function can be used to clean up the CFG while updating the loop
structures, if @code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most
useful fields of the loop structure (that are kept up-to-date at all
times) are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the
first sub-loop, and the sibling of the loop in the loops tree.
@item @code{single_exit}: The exit edge of the loop, if the loop has
exactly one exit and the loops were analyzed with
@code{LOOPS_HAVE_MARKED_SINGLE_EXITS}.
@end itemize

There are other fields in the loop structures, many of them used only
by some of the passes, or not updated during CFG changes; in general,
they should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two
loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and breadth-first search order, respectively.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic
block is executed exactly once during each iteration of a loop (that
is, it does not belong to a sub-loop, and it dominates the latch of the
loop).
@end itemize
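For instance, the blocks of a loop can be enumerated and classified as
follows.  A minimal sketch; @code{get_loop_body} returns a freshly
allocated array of @code{loop->num_nodes} blocks that the caller must
free:

@smallexample
basic_block *body = get_loop_body (loop);
unsigned i;

for (i = 0; i < loop->num_nodes; i++)
  if (just_once_each_iteration_p (loop, body[i]))
    @{
      /* body[i] is executed exactly once in each iteration.  */
    @}

free (body);
@end smallexample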
@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Specialized versions of several low-level CFG functions that also
update loop structures are provided:

@itemize
@item @code{loop_split_edge_with}: Splits an edge, and places a
specified RTL code on it.  On GIMPLE, the function can still be used,
but the code must be @code{NULL}.
@item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on an
edge, splitting it if necessary.  Only works on GIMPLE.
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{loop_commit_inserts}: Commits insertions scheduled on
edges, and sets loops for the new blocks.  This function can only be
used on GIMPLE.
@item @code{split_loop_exit_edge}: Splits the exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on
non-innermost loops, they are mostly untested in that case, and at the
moment, they are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works
on GIMPLE.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment; see the sketch after this list.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These
functions create a copy of a loop, and a branch before them that
selects one of them depending on the prescribed condition.  This is
useful for optimizations that need to verify some assumptions at
runtime (one of the copies of the loop is usually left unchanged, while
the other one is transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize
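For example, a new induction variable counting 0, 1, 2, @dots{} could
be created roughly as follows.  This is only a sketch; the exact
signature of @code{create_iv} and the choice of
@code{unsigned_type_node} are assumptions, and may vary between GCC
versions:

@smallexample
block_stmt_iterator incr_pos;
bool insert_after;
tree var_before, var_after;

/* Find the canonical place for the increment of the new IV.  */
standard_iv_increment_position (loop, &incr_pos, &insert_after);

/* Create an IV with initial value 0 and step 1.  */
create_iv (build_int_cst (unsigned_type_node, 0),
           build_int_cst (unsigned_type_node, 1),
           NULL_TREE, loop, &incr_pos, insert_after,
           &var_before, &var_after);
@end smallexample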
@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify
the loop in which the analysis should be performed -- the scalar
evolution analysis always returns the results with respect to the loop
in which the SSA name is defined.
@item It makes updating of SSA form during loop transformations
simpler.  Without LCSSA, operations like loop unrolling may force
creation of PHI nodes arbitrarily far from the loop, while in LCSSA,
the SSA form can be updated locally.  However, since we only keep real
operands in LCSSA, we cannot use this advantage (we could have local
updating of real operands, but it is not much more efficient than using
generic SSA form updating for it as well; the amount of changes to SSA
is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.
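For illustration, a schematic GIMPLE fragment in loop-closed SSA form.
The PHI node defining @code{x_3} in the exit block closes the use of
@code{x_1} outside the loop; code after the loop uses @code{x_3}
instead of @code{x_1}:

@smallexample
<bb header>:
  x_1 = PHI <x_0(preheader), x_2(latch)>;
  x_2 = x_1 + 1;
  if (cond)
    goto <bb exit>;
  else
    goto <bb latch>;

<bb exit>:
  # loop-closing PHI node
  x_3 = PHI <x_1(header)>;
  use (x_3);
@end smallexample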
@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use
it to express values of polynomial induction variables, but it is
possible to extend it).  The interfaces to SCEV analysis are declared
in @file{tree-scalar-evolution.h}.  To start using scalar evolutions
analysis, @code{scev_initialize} must be called; to stop using it, call
@code{scev_finalize}.  SCEV analysis caches results in order to save
time and memory.  This cache however is made invalid by most of the
loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of
SCEV only for some analysis, and when you work with a whole nest of
loops at once.  It will try replacing all SSA names by their SCEV in
all loops, including the super-loops of the current loop, thus
providing complete information about the behavior of the variable in
the loop nest.  @code{resolve_mixers} is useful if you work with only
one loop at a time, and if you possibly need to create code based on
the value of the induction variable.  It will only resolve the SSA
names defined in the current loop, leaving the SSA names defined
outside unchanged, even if their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step
may contain further polynomial chrecs).  The type of the expression and
of the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (...)
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then
the overflow in addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  In this case, the extra expressive power of SCEV
is not useful, and may complicate the optimizations.  The
@code{simple_iv} function may then be used to analyze a value -- the
result is a loop-invariant base and step.
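A typical query then looks roughly as follows.  A minimal sketch,
assuming @code{scev_initialize} has already been called:

@smallexample
/* Analyze the evolution of NAME with respect to LOOP, and resolve
   references to other SSA names in the whole loop nest.  */
tree chrec = analyze_scalar_evolution (loop, name);
chrec = instantiate_parameters (loop, chrec);

if (chrec == chrec_dont_know)
  @{
    /* The evolution of NAME could not be determined.  */
  @}
@end smallexample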
@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only allows
analysis of affine induction variables, and only in one loop at once.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment
in the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be zero,
@code{mult} must be one, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.
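Put together, analyzing a register @code{reg} used in insn @code{insn}
inside @code{loop} might look like the following sketch (the variable
names are only for illustration):

@smallexample
struct rtx_iv iv;

iv_analysis_loop_init (loop);
if (iv_analyze (insn, reg, &iv))
  @{
    /* iv now describes reg as an affine induction variable; e.g.,
       iv.base and iv.step give its base and step.  */
  @}
iv_analysis_done ();
@end smallexample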
@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  In many
cases, it is not possible to determine the number of iterations
unconditionally -- the determined number is correct only if some
assumptions are satisfied.  The analysis tries to verify these
conditions using the information contained in the program; if it fails,
the conditions are returned together with the result.  The following
information and conditions are provided by the analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE:
If this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through
a given exit can be determined using the
@code{number_of_iterations_exit} function.  On RTL, the results are
returned in the @code{struct niter_desc} structure.  The corresponding
function is named @code{check_simple_exit}.  There are also functions
that iterate over all the exits of a loop and try to find one with an
easy-to-determine number of iterations -- @code{find_loop_niter} on
GIMPLE and @code{find_simple_exit} on RTL.  Finally, there are
functions that provide the same information, but additionally cache it,
so that repeated queries for the number of iterations are not so costly
-- @code{number_of_iterations_in_loop} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if any assumptions would be required.  The
function @code{number_of_iterations_in_loop} works only for single-exit
loops, and it returns a number of iterations higher by one with respect
to all the other functions (i.e., it returns the number of executions
of the exit statement, not of the loop latch).
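On GIMPLE, such a query could look roughly like the following sketch;
the exact signature of @code{number_of_iterations_exit} may differ
between GCC versions:

@smallexample
struct tree_niter_desc niter_desc;

/* Try to determine how many times the latch of LOOP is executed
   before the loop is left through EXIT.  */
if (number_of_iterations_exit (loop, exit, &niter_desc, false))
  @{
    /* niter_desc.niter gives the number of iterations, valid
       provided that niter_desc.assumptions holds and
       niter_desc.may_be_zero is false.  */
  @}
@end smallexample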
@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing
the dependence relations will contain @code{n*n} elements.  However if
the analyzed loop contains side effects, such as calls that can
potentially interfere with the data references in the currently
analyzed loop, the analysis stops while scanning the loop body for data
references, and inserts a single @code{chrec_dont_know} in the
dependence relation array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution
order, and the data references of each statement are pushed at the end
of the data reference array.  Two data references syntactically occur
in the program in the same order as in the array of data references.
This syntactic order is important in some classical data dependence
tests, and mapping this order to the elements of this array avoids
costly queries to the loop body representation.

Three types of data references are currently handled:
@code{ARRAY_REF}, @code{INDIRECT_REF} and @code{COMPONENT_REF}.  The
data structure for the data reference is @code{data_reference}, where
@code{data_reference_p} is the name of a pointer to the data reference
structure.  The structure contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.  For example, for
a reference @code{a.b[i][j]}, the base object is @code{a.b} and the
access functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent evolution relative to this location.
This data is used to support pointers, and is not used for arrays (for
which we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
for1 i
  for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +,
j_step@}_for2} relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we
can't prove that the access is confined to the boundaries of a certain
object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based
on base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based
on first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to
@code{chrec_known} if the analysis has proved that there is no
dependence between these two data references, to
@code{chrec_dont_know} if the analysis was not able to determine any
useful result and potentially there could exist a dependence between
these data references, and to @code{NULL_TREE} if there exists a
dependence relation between the data references; in the latter case,
the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences under the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.
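Putting this together, a caller might use the analysis as in the
following sketch.  The vector types, the allocation sizes, and the
boolean parameter are assumptions based on the interface declared in
@file{tree-data-ref.h}, and may differ between GCC versions:

@smallexample
VEC (data_reference_p, heap) *datarefs
  = VEC_alloc (data_reference_p, heap, 10);
VEC (ddr_p, heap) *dependence_relations
  = VEC_alloc (ddr_p, heap, 10);

/* Collect the data references of LOOP and compute the dependence
   relations between them.  */
compute_data_dependences_for_loop (loop, true, &datarefs,
                                   &dependence_relations);
dump_ddrs (dump_file, dependence_relations);

free_dependence_relations (dependence_relations);
free_data_refs (datarefs);
@end smallexample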
@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to
the loop (dependence respecting, etc.); Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices out of
any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.
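For example, for a loopnest of depth two, the matrix

@smallexample
| 0 1 |
| 1 0 |
@end smallexample

@noindent
interchanges the two loops (iteration @code{(i, j)} is mapped to
@code{(j, i)}), while the matrix

@smallexample
| 1  0 |
| 0 -1 |
@end smallexample

@noindent
reverses the direction of the inner loop.  Both matrices are
non-singular, so Lambda can apply them; whether they are legal for a
particular loopnest must still be checked with
@code{lambda_transform_legal_p}.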