ibtracer
1/11/05

Description:
ibtracer builds a source route into a UD packet and validates the path
taken to a destination. It is based on a client/server architecture and
relies on a special ibtracer IB agent in each node along the way. It can
deal with switches that do not currently run this agent, but that part of
the path cannot be validated.

Syntax:
ibtracer [-I mthca0] [-p port] [-r <# retries>] [-t <timeout in msec>] \
         [-l LID] [-g DGID] [-v]

Architecture:
IBA 1.2 defines a new set of vendor-specific MADs that include an OUI.
OpenIB will use one of these classes (0x30) to implement ibtracer (and the
vendor MAD option of ibping).

Note that these are general service MADs and rely on the network being up.
If the network is not up, directed route (DR) SMPs must be used instead.
There are a number of separate SMP tools for this.

The OpenIB vendor-specific MAD agent will support the following attributes:
ClassPortInfo (0x0001) and SourceRoute (0x0010). There may be an additional
attribute (TBD) to support ibping, but it can be served by the same agent.

Only the VendorGet method needs to be supported by this agent. No traps
are currently defined for this class.

Although the ibtracer client sends these vendor MADs on outgoing ports,
it is the server (agent) that must validate the incoming port. As a
result, it is the expected incoming port at the next hop that is added to
the SourceRoute attribute. SourceRoute requests (Gets) and responses
(GetResps) are exchanged directly between the source node where the
ibtracer command is initiated and each hop along the way, until the
destination is reached. As the hops to the destination are walked, the
incoming ports are added to the SourceRoute attribute, and each hop that
receives the packet checks that it did arrive on the expected port. If it
did not, an error is indicated in the status field (status 7). In either
case, the port it did arrive on is put in the SourceRoute attribute in the
GetResp.

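The agent-side check described above can be sketched as follows. This is
only an illustration in Python, not the agent implementation. The names
SourceRoute, handle_source_route and STATUS_OK are hypothetical; only the
status value 7 and the attribute fields come from this document. The sketch
also assumes that the hop count indexes the vector entry holding the port
expected at the receiving hop, which the text does not state.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

STATUS_OK = 0             # hypothetical name for "no error"
STATUS_PORT_MISMATCH = 7  # status 7 is the value given in the text


@dataclass
class SourceRoute:
    actual_port: int = 0  # valid on response only
    hop_count: int = 0    # assumed: number of vector entries in use
    ports: List[int] = field(default_factory=lambda: [0] * 64)


def handle_source_route(req: SourceRoute,
                        arrived_port: int) -> Tuple[int, SourceRoute]:
    """Agent side: compare expected and actual incoming port, build GetResp."""
    if not 1 <= req.hop_count <= 64:
        raise ValueError("hop count out of range")
    # Assumption: the entry for this hop is the last one the client filled in.
    expected = req.ports[req.hop_count - 1]
    status = STATUS_OK if expected == arrived_port else STATUS_PORT_MISMATCH
    # In either case the port the request really arrived on is returned.
    resp = SourceRoute(arrived_port, req.hop_count, list(req.ports))
    return status, resp
```

Returning the actual port even on a mismatch lets the client report where
the packet really entered the node, not just that validation failed.
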
One of DLID or DGID must be specified in the ibtracer invocation.
If DGID is specified, a PathRecord request is made to the SA
to obtain the DLID. Other than that, no SM or SA is involved with
ibtracer, although the SM is needed to set up the forwarding tables.

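The invocation syntax and the DLID/DGID rule could be expressed with
Python's argparse as in the sketch below. The option letters come from the
Syntax section. The default values (device, port 1, 3 retries, 1000 msec)
and the destination names are assumptions made for illustration, and the
sketch reads "one of" as "exactly one".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="ibtracer")
    # All defaults below are assumed; the document does not give them.
    p.add_argument("-I", dest="device", default="mthca0", help="IB device")
    p.add_argument("-p", dest="port", type=int, default=1, help="local port")
    p.add_argument("-r", dest="retries", type=int, default=3,
                   help="number of retries")
    p.add_argument("-t", dest="timeout_ms", type=int, default=1000,
                   help="timeout in msec")
    p.add_argument("-v", dest="verbose", action="store_true",
                   help="display the entire path")
    # Exactly one of -l / -g must be given (assumed reading of "one of").
    dest = p.add_mutually_exclusive_group(required=True)
    dest.add_argument("-l", dest="dlid", type=lambda s: int(s, 0),
                      help="destination LID (decimal or 0x hex)")
    dest.add_argument("-g", dest="dgid", help="destination GID")
    return p
```

For example, build_parser().parse_args(["-l", "0x12", "-v"]) yields a
namespace with dlid set to 18 and verbose set, while an empty argument
list is rejected because neither -l nor -g was given.
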
Once the DLID is obtained from either the command invocation or the SA,
a DR SMP is sent to the next hop to get its PortInfo attribute, which
supplies the base LID for that hop. A VendorGet(ClassPortInfo) is
then attempted to see if this management class is supported on that node.
If it is (a VendorGetResp is received), a VendorGet(SourceRoute)
to the next hop LID is attempted after updating the SourceRoute attribute
with the local port number from the returned PortInfo attribute. Upon
receipt of the VendorGet(SourceRoute), the receiving agent validates the
port number it was received on against the port number in the SourceRoute
attribute. It indicates failure when they do not match, and in either case
the port it was received on is put back into the VendorGetResp(SourceRoute).

If the next hop does not support this management class, this is indicated
(if -v is enabled) and the algorithm proceeds with the next hop. The
algorithm terminates when the next hop LID is the DLID (factoring in
the LMC).

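The client-side walk can be sketched against a simulated path. Everything
here is hypothetical scaffolding: the Hop class stands in for the real DR
SMP and vendor MAD exchanges, smp_port models the local port number
returned in PortInfo, and mad_port models the port the LID-routed
VendorGet(SourceRoute) actually arrives on. The sketch assumes the walk
continues after a mismatch and reports an overall result, which the text
leaves open.

```python
from dataclasses import dataclass
from typing import List

MAX_HOPS = 64  # size of the port vector in the SourceRoute attribute


@dataclass
class Hop:
    """One simulated node on the path (stand-in for real MAD traffic)."""
    base_lid: int
    lmc: int
    smp_port: int           # local port number reported via DR SMP PortInfo
    mad_port: int           # port the vendor MAD really arrives on
    has_agent: bool = True  # answers VendorGet(ClassPortInfo)?


def lid_matches(dlid: int, base_lid: int, lmc: int) -> bool:
    """True if dlid is one of the 2**lmc LIDs starting at base_lid."""
    return base_lid <= dlid < base_lid + (1 << lmc)


def trace(path: List[Hop], dlid: int, verbose: bool = False) -> bool:
    """Walk the hops; return True only if every checked hop validated."""
    expected_ports: List[int] = []
    ok = True
    for hop in path[:MAX_HOPS]:
        # DR SMP Get(PortInfo): learn base LID and expected incoming port.
        expected_ports.append(hop.smp_port)
        if not hop.has_agent:
            # No VendorGetResp(ClassPortInfo): skip validation for this hop.
            if verbose:
                print(f"LID {hop.base_lid}: no agent, hop not validated")
        else:
            # VendorGet(SourceRoute): the agent compares expected vs actual.
            matched = hop.mad_port == expected_ports[-1]
            ok = ok and matched
            if verbose:
                state = "ok" if matched else "mismatch (status 7)"
                print(f"LID {hop.base_lid}: in port {hop.mad_port} {state}")
        if lid_matches(dlid, hop.base_lid, hop.lmc):
            return ok
    return False  # destination was never reached
```

With an LMC of 2 and a base LID of 8, the destination answers to LIDs 8
through 11, so a trace to DLID 10 terminates at that node.
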
Note that rather than doing much of this with DR SMPs directly,
SA requests could be used instead (PortInfoRecords and LinkRecords,
or TraceRecords). Whether the various SMs support these SA attributes
would need to be investigated (OpenSM being the most important for
OpenIB). TraceRecords are optional and are not believed to be currently
supported, but the job can be done with just PortInfoRecords and
LinkRecords.

Since vendor MADs are UD, there is a retransmission strategy (timeout/retry).
Both values have defaults but are settable on the command line.

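A timeout/retry loop of this kind might look like the sketch below. The
send callable is hypothetical: it is assumed to transmit one MAD, wait up
to the given number of seconds, and return None on timeout. The default
values are placeholders, since the document does not state the real ones.

```python
from typing import Callable, Optional


def send_with_retry(send: Callable[[float], Optional[bytes]],
                    retries: int = 3,
                    timeout_ms: int = 1000) -> Optional[bytes]:
    """Call send(timeout_seconds) until a response arrives or tries run out."""
    # One initial attempt plus up to `retries` retransmissions.
    for _ in range(retries + 1):
        resp = send(timeout_ms / 1000.0)
        if resp is not None:
            return resp
    return None  # every attempt timed out
```

Treating -r as the number of retransmissions after the first attempt is
one possible reading; counting it as the total number of attempts would
only change the range bound.
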
The -v option displays the entire path. Note that the incoming (rather
than outgoing) ports are displayed. Without -v, just success or failure
is displayed.

Reversible paths are used for the responses. Note that the path from A to
B might not be the same as the path from B to A, so ibtracer needs to be
run from both ends if both directions are of interest.

This tool cannot currently be used for multicast tracing. There are a
couple of reasons for this. Base switch port 0 does not support
multicast, and enhanced switch port 0 is not required to support it,
so more hops would be skipped. Also, the attribute format would need
to be enhanced, and the client would need to handle multiple responses
to a single request.


SourceRoute attribute format

Actual Incoming Port Number (valid on response) - 1 byte
Current Hop Count                               - 1 byte
Vector of Incoming Port Numbers (0-63)          - 64 bytes


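The 66-byte layout above can be packed and unpacked with Python's struct
module, as sketched here. The function names are hypothetical. The sketch
assumes the fields appear in the order listed, that unused vector entries
are zero, and that the hop count gives the number of valid entries; where
the attribute sits inside the vendor MAD data is not covered.

```python
import struct
from typing import List, Tuple

SOURCE_ROUTE_FMT = "!BB64s"  # actual port, hop count, 64 port bytes
SOURCE_ROUTE_LEN = struct.calcsize(SOURCE_ROUTE_FMT)  # 66 bytes


def pack_source_route(actual_port: int, hop_count: int,
                      ports: List[int]) -> bytes:
    """Serialize the attribute; unused vector entries are zero-filled."""
    if len(ports) > 64:
        raise ValueError("at most 64 incoming ports fit in the vector")
    vector = bytes(ports).ljust(64, b"\x00")
    return struct.pack(SOURCE_ROUTE_FMT, actual_port, hop_count, vector)


def unpack_source_route(data: bytes) -> Tuple[int, int, List[int]]:
    """Parse the attribute; return only the first hop_count vector entries."""
    actual_port, hop_count, vector = struct.unpack(
        SOURCE_ROUTE_FMT, data[:SOURCE_ROUTE_LEN])
    return actual_port, hop_count, list(vector[:hop_count])
```

A request carrying two expected ports would be built with
pack_source_route(0, 2, [3, 5]), leaving the actual port field zero until
the agent fills it in on the response.
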
Outstanding Questions

Should SL be supported rather than assuming SL 0?

Should GRH be supported (and tied to GID specification in command invocation)?

Is multicast tracing important?