@c Copyright (c) 2006 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::		Representation and analysis of loops.
* Loop querying::		Getting information about loops.
* Loop manipulation::		Loop manipulation functions.
* LCSSA::			Loop-closed SSA form.
* Scalar evolutions::   	Induction variables on GIMPLE.
* loop-iv::			Induction variables on RTL.
* Number of iterations::	Number of iterations analysis.
* Dependency analysis::		Data dependency analysis.
* Lambda::			Linear loop transformations framework.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  At the moment, loop structures are analyzed and
this information is updated only by the optimization passes that deal
with loops, but some efforts are being made to make it available
throughout most of the optimization passes.

In general, a natural loop has one entry block (the header) and possibly
several back edges (latches) leading to the header from inside the loop.
Loops with several latches may appear if several loops share a single
header, or if there is branching in the middle of the loop.  The
representation of loops in GCC, however, allows only loops with a single
latch.  During loop analysis, the headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information is used to determine whether
the latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.

The body of the loop is the set of blocks that are dominated by its
header and reachable from its latch against the direction of the edges
in the CFG.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside a loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{parray} field of the loops structure.  The index of a sub-loop is
always greater than the index of its super-loop.  The indices do not
have to be contiguous; there may be empty (@code{NULL}) entries in the
@code{parray} created by deleting loops.  The index of a loop never
changes.  The first unused index is stored in the @code{num} field of
the loops structure.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG.  It is recommended to use the global variable @code{current_loops}
to contain the @code{struct loops} structure, especially if the loop
structures are updated throughout several passes.  Many of the loop
manipulation functions assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where code can be moved out of the loop, and ensures that
the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulating the loops significantly easier.  Most
of the loop manipulation functions assume that the loops are in this
shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
exit edge, this edge is stored in the @code{single_exit} field of the
loop structure.  @code{NULL} is stored there otherwise.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loop structures, provided
that @code{current_loops} is set.
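
For illustration, a loop pass typically brackets its work as follows.
This is pseudocode only -- a sketch of the calls described above, not
exact signatures, which vary between GCC versions:

@smallexample
current_loops = loop_optimizer_init (LOOPS_HAVE_PREHEADERS
                                     | LOOPS_HAVE_SIMPLE_LATCHES);
/* ... analyze and transform loops, keeping dominance
   information and the loop structures up to date ...  */
loop_optimizer_finalize (current_loops);
current_loops = NULL;
@end smallexample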

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (which are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@item @code{single_exit}: The exit edge of the loop, if the loop has
exactly one exit and the loops were analyzed with
@code{LOOPS_HAVE_MARKED_SINGLE_EXITS}.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimate the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in the reversed CFG, ordered by the
dominance relation, and in breadth-first search order, respectively.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from the loops.
@end itemize

Specialized versions of several low-level CFG functions that also update
the loop structures are provided:

@itemize
@item @code{loop_split_edge_with}: Splits an edge, and places a
specified RTL code on it.  On GIMPLE, the function can still be used,
but the code must be @code{NULL}.
@item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on an edge,
splitting it if necessary.  Only works on GIMPLE.
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{loop_commit_inserts}: Commits insertions scheduled on edges,
and sets the loops for the new blocks.  This function can only be used
on GIMPLE.
@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, some higher-level loop transformations are implemented.  While
some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment they are
only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on the tree level, one extra condition
is enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' (LCSSA).  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of the SSA form during loop transformations
simpler.  Without LCSSA, operations like loop unrolling may force
creation of PHI nodes arbitrarily far from the loop, while in LCSSA, the
SSA form can be updated locally.  However, since we only keep real
operands in LCSSA, we cannot use this advantage (we could have local
updating of real operands, but it is not much more efficient than using
generic SSA form updating for it as well; the amount of changes to the
SSA is the same).
@end itemize

This, however, means that LCSSA must be kept up to date.  This is
usually straightforward, unless you create a new value in a loop and use
it outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite the SSA form to
LCSSA, and @code{verify_loop_closed_ssa} checks that the invariant of
LCSSA is preserved.
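
For illustration, a value defined in a loop and used after it gets an
extra PHI node on the exit in LCSSA.  The following GIMPLE-like
pseudocode uses made-up names for the example:

@smallexample
loop:
  i_1 = phi (0, i_2);
  i_2 = i_1 + 1;
  if (i_2 < n_3) goto loop; else goto exit;
exit:
  # LCSSA PHI node created at the loop exit
  i_4 = phi (i_2);
  # all uses after the loop refer to i_4, not directly to i_2
  return i_4;
@end smallexample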

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent the results of induction
variable analysis on GIMPLE.  They enable us to represent variables with
complicated behavior in a simple and consistent way (we only use it to
express values of polynomial induction variables, but it is possible to
extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use the scalar evolutions analysis,
@code{scev_initialize} must be called first; to stop using SCEV,
@code{scev_finalize} should be called.  SCEV analysis caches its results
in order to save time and memory.  This cache, however, is invalidated
by most of the loop transformations, including removal of code.  If such
a transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try to replace all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (...)
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has a signed type, then
overflow in the addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict attention just to affine induction
variables.  In that case, the extra expressive power of SCEV is not
useful, and may complicate the optimizations.  The @code{simple_iv}
function may then be used to analyze a value -- the result is a
loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only allows
analysis of affine induction variables, and only in one loop at a time.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  In many
cases, it is not possible to determine the number of iterations
unconditionally -- the determined number is correct only if some
assumptions are satisfied.  The analysis tries to verify these
conditions using the information contained in the program; if it fails,
the conditions are returned together with the result.  The following
information and conditions are provided by the analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_iterations_in_loop} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if any assumptions would be needed.  The function
@code{number_of_iterations_in_loop} works only for single-exit loops,
and it returns a value for the number of iterations that is higher by
one with respect to all the other functions (i.e., it returns the number
of executions of the exit statement, not of the loop latch).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references in a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that can potentially
interfere with the data references in the currently analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references thus occur in the array of
data references in the same order as they syntactically occur in the
program.  This syntactic order is important in some classical data
dependence tests, and mapping this order to the elements of this array
avoids costly queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}.  The data structure for a
data reference is @code{data_reference}, where @code{data_reference_p}
is the name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.  For example, for a
reference @code{a.b[i][j]}, the base object is @code{a.b} and the access
functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent the evolution relative to this
location.  This data is used to support pointers, and is not used for
arrays (for which we have base objects).  Pointer accesses are
represented as a one-dimensional access that starts from the first
location accessed in the loop.  For example:

@smallexample
      for1 i
         for2 j
          *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we
cannot prove that the access is confined to the boundaries of a certain
object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows: if both
@code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object}; if they are equal, apply
dependence tests (using access functions based on the base objects).
Otherwise, if both @code{a} and @code{b} are represented as pointers,
compare @code{a.first_location} and @code{b.first_location}; if they are
equal, apply dependence tests (using access functions based on the first
location).  However, if @code{a} and @code{b} are represented
differently, only try to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and potentially there could exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in the
latter case, the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a subscript
is the tuple composed of the access functions for a given dimension.
For example, given @code{A[f1][f2][f3]} and @code{B[g1][g2][g3]}, there
are three subscripts: @code{(f1, g1), (f2, g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.

@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix-based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return @code{NULL}.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices out of
any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.