@c Copyright (c) 2006 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::		Representation and analysis of loops.
* Loop querying::		Getting information about loops.
* Loop manipulation::		Loop manipulation functions.
* LCSSA::			Loop-closed SSA form.
* Scalar evolutions::   	Induction variables on GIMPLE.
* loop-iv::			Induction variables on RTL.
* Number of iterations::	Number of iterations analysis.
* Dependency analysis::		Data dependency analysis.
* Lambda::			Linear loop transformations framework.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and functions
that can be used to build, modify and analyze this representation.  Most
of the interfaces and data structures are declared in @file{cfgloop.h}.
At the moment, loop structures are analyzed and this information is
updated only by the optimization passes that deal with loops, but some
efforts are being made to make it available throughout most of the
optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC, however, allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information is used to determine whether
the latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG.  The loops are organized in a containment hierarchy (tree) such
that all the loops immediately contained inside loop L are the children
of L in the tree.  This tree is represented by the @code{struct loops}
structure.  The root of this tree is a fake loop that contains all
blocks in the function.  Each of the loops is represented in a
@code{struct loop} structure.  Each loop is assigned an index (the
@code{num} field of the @code{struct loop} structure), and the pointer
to the loop is stored in the corresponding field of the @code{parray}
field of the loops structure.  The index of a sub-loop is always greater
than the index of its super-loop.  The indices do not have to be
contiguous; there may be empty (@code{NULL}) entries in the
@code{parray}, created by deleting loops.  The index of a loop never
changes.  The first unused index is stored in the @code{num} field of
the loops structure.

Each basic block contains the reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG.  It is recommended to use the global variable @code{current_loops}
to contain the @code{struct loops} structure, especially if the loop
structures are updated throughout several passes.  Many of the loop
manipulation functions assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulating the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
exit edge, this edge is stored in the @code{single_exit} field of the
loop structure.  @code{NULL} is stored there otherwise.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@item @code{single_exit}: The exit edge of the loop, if the loop has
exactly one exit and the loops were analyzed with
@code{LOOPS_HAVE_MARKED_SINGLE_EXITS}.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimate the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and in breadth-first search order, respectively.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Specialized versions of several low-level CFG functions that also
update loop structures are provided:

@itemize
@item @code{loop_split_edge_with}: Splits an edge, and places a
specified RTL code on it.  On GIMPLE, the function can still be used,
but the code must be @code{NULL}.
@item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on an edge,
splitting it if necessary.  Only works on GIMPLE.
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{loop_commit_inserts}: Commits insertions scheduled on edges,
and sets loops for the new blocks.  This function can only be used on
GIMPLE.
@item @code{split_loop_exit_edge}: Splits the exit edge of a loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than to use generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

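The effect of the rewrite can be shown on a schematic, GIMPLE-like
fragment (illustrative pseudocode, not actual compiler output).  Before
the rewrite, @code{x_1} defined inside the loop is used after it; LCSSA
inserts an exit PHI so that only @code{x_2} is visible outside:

```
# before rewrite_into_loop_closed_ssa:
loop:
  x_1 = ...;
  if (cond) goto loop;
use (x_1);        # SSA name used outside its defining loop

# after the rewrite:
loop:
  x_1 = ...;
  if (cond) goto loop;
x_2 = phi (x_1);  # exit PHI created to close the loop
use (x_2);
```

If the loop is later unrolled, only the arguments of the exit PHI need
updating; the uses beyond it are untouched.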
@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE.  They enable us to represent variables with
complicated behavior in a simple and consistent way (we only use it to
express values of polynomial induction variables, but it is possible to
extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use scalar evolutions analysis,
@code{scev_initialize} must be used.  To stop using SCEV,
@code{scev_finalize} should be used.  SCEV analysis caches results in
order to save time and memory.  This cache however is made invalid by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (...)
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  In such cases, the extra expressive power of SCEV
is not useful, and may complicate the optimizations; the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only handles
affine induction variables, and only in one loop at a time.  The
interface is declared in @file{cfgloop.h}.  Before analyzing induction
variables in a loop L, the @code{iv_analysis_loop_init} function must be
called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  In many
cases, it is not possible to determine the number of iterations
unconditionally -- the determined number is correct only if some
assumptions are satisfied.  The analysis tries to verify these
conditions using the information contained in the program; if it fails,
the conditions are returned together with the result.  The following
information and conditions are provided by the analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored to the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_iterations_in_loop} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_iterations_in_loop} works only for single-exit loops,
and it returns a number of iterations higher by one with respect to all
other functions (i.e., it returns the number of executions of the exit
statement, not of the loop latch).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

479169689SkanThree types of data references are currently handled: ARRAY_REF, 
480169689SkanINDIRECT_REF and COMPONENT_REF. The data structure for the data reference 
481169689Skanis @code{data_reference}, where @code{data_reference_p} is a name of a 
482169689Skanpointer to the data reference structure. The structure contains the 
483169689Skanfollowing elements:
484169689Skan
485169689Skan@itemize
486169689Skan@item @code{base_object_info}: Provides information about the base object 
487169689Skanof the data reference and its access functions. These access functions 
488169689Skanrepresent the evolution of the data reference in the loop relative to 
489169689Skanits base, in keeping with the classical meaning of the data reference 
490169689Skanaccess function for the support of arrays. For example, for a reference 
491169689Skan@code{a.b[i][j]}, the base object is @code{a.b} and the access functions, 
492169689Skanone for each array subscript, are: 
493169689Skan@code{@{i_init, + i_step@}_1, @{j_init, +, j_step@}_2}.
494169689Skan
495169689Skan@item @code{first_location_in_loop}: Provides information about the first 
496169689Skanlocation accessed by the data reference in the loop and about the access 
497169689Skanfunction used to represent evolution relative to this location. This data 
498169689Skanis used to support pointers, and is not used for arrays (for which we 
499169689Skanhave base objects). Pointer accesses are represented as a one-dimensional
500169689Skanaccess that starts from the first location accessed in the loop. For 
501169689Skanexample:
502169689Skan
503169689Skan@smallexample
504169689Skan      for1 i
505169689Skan         for2 j
506169689Skan          *((int *)p + i + j) = a[i][j];
507169689Skan@end smallexample
508169689Skan
509169689SkanThe access function of the pointer access is @code{@{0, + 4B@}_for2} 
510169689Skanrelative to @code{p + i}. The access functions of the array are 
511169689Skan@code{@{i_init, + i_step@}_for1} and @code{@{j_init, +, j_step@}_for2} 
512169689Skanrelative to @code{a}.
513169689Skan
Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.
The current strategy for data dependence tests is as follows:

@enumerate
@item If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object}; if they are equal, apply
dependence tests using the access functions based on the base objects.
@item Otherwise, if both @code{a} and @code{b} are represented as
pointers, compare @code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests using the access functions
based on the first locations.
@item If @code{a} and @code{b} are represented differently, only try to
prove that the bases are definitely different.
@end enumerate

@item Aliasing information.
@item Alignment information.
@end itemize
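The dispatch between the two representations can be sketched in C as
follows.  This is a hand-written illustration, not code from
@file{tree-data-ref.h}: the helper predicates used here are hypothetical
names for the checks described above.

@smallexample
/* Sketch of the dependence-test dispatch.  The helpers
   (array_base_known_p, pointer_access_p, first_locations_equal_p,
   apply_subscript_tests, bases_definitely_different_p) are
   illustrative names, not real GCC functions.  Returns true if the
   two references may depend on each other.  */
static bool
references_may_depend_p (struct data_reference *a,
                         struct data_reference *b)
@{
  if (array_base_known_p (a) && array_base_known_p (b))
    @{
      /* Both arrays: compare base objects, then test the
         base-object-relative access functions.  */
      if (operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), 0))
        return apply_subscript_tests (a, b);
    @}
  else if (pointer_access_p (a) && pointer_access_p (b))
    @{
      /* Both pointers: compare first locations, then test the
         first-location-relative access functions.  */
      if (first_locations_equal_p (a, b))
        return apply_subscript_tests (a, b);
    @}

  /* Mixed representations or unequal bases: the only remaining hope
     is to prove the bases definitely different.  */
  return !bases_definitely_different_p (a, b);
@}
@end smallexample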

The structure describing the relation between two data references is
@code{data_dependence_relation}, and a pointer to such a structure is
abbreviated @code{ddr_p}.  This structure contains:
@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and a dependence could potentially exist
between the data references, and to @code{NULL_TREE} if there exists a
dependence relation between the data references; in the last case the
description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that indicates whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize
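A consumer of the analysis typically examines @code{are_dependent}
first.  A minimal sketch, using the accessor macros from
@file{tree-data-ref.h} (the surrounding pass and the @code{ddr}
variable are assumed):

@smallexample
if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
  @{
    /* Independence proved: the two references may be reordered.  */
  @}
else if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
  @{
    /* No useful result: conservatively assume a dependence.  */
  @}
else
  @{
    /* A dependence exists; its description is in the subscripts
       and, when representable, the distance/direction vectors.  */
  @}
@end smallexample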

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.
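For example, a pass that has run the analysis might dump its results as
follows (a sketch: it assumes @code{dump_file} is open and that
@code{datarefs} and @code{ddrs} were filled in by the analysis):

@smallexample
dump_data_references (dump_file, datarefs);  /* The references.  */
dump_dist_dir_vectors (dump_file, ddrs);     /* Just the vectors.  */
dump_ddrs (dump_file, ddrs);                 /* Full verbosity.  */
@end smallexample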
573169689Skan
574169689Skan@node Lambda
575169689Skan@section Linear loop transformations framework
576169689Skan@cindex Linear loop transformations framework
577169689Skan
578169689SkanLambda is a framework that allows transformations of loops using
579169689Skannon-singular matrix based transformations of the iteration space and
580169689Skanloop bounds. This allows compositions of skewing, scaling, interchange,
581169689Skanand reversal transformations.  These transformations are often used to
582169689Skanimprove cache behavior or remove inner loop dependencies to allow
583169689Skanparallelization and vectorization to take place.
584169689Skan

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices out of
any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.
600169689Skan
601169689SkanGiven a transformed loopnest, conversion back into gcc IR is done by
602169689Skan@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
603169689Skanloops so that they match the transformed loopnest.
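
Putting these pieces together, a transformation pass drives the
framework roughly as follows.  This is a sketch: argument lists are
abbreviated with @code{...}, and the construction of the transformation
matrix @code{trans} and of the dependence information is not shown.

@smallexample
lambda_loopnest nest;

/* Convert the GCC loopnest into Lambda's internal form.  */
nest = gcc_loopnest_to_lambda_loopnest (loops, loop, ...);
if (nest == NULL)
  return;   /* The loopnest cannot be handled by Lambda.  */

/* The caller must check legality before transforming.  */
if (!lambda_transform_legal_p (trans, depth, dependence_relations))
  return;   /* The matrix would violate a dependence.  */

/* Apply the matrix, then rewrite the loops to match.  */
nest = lambda_loopnest_transform (nest, trans);
lambda_loopnest_to_gcc_loopnest (loop, ..., nest, trans);
@end smallexample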