README

This README is for OpenSM and the InfiniBand diagnostic utilities
in this directory (management).

The master source repository is
git://git.openfabrics.org/~sashak/management.git and can be cloned by:

  git clone git://git.openfabrics.org/~sashak/management.git


Packages
--------
libibcommon - common utility code
libibumad - library interface to the ib_umad (user_mad) kernel module
libibmad - generic MAD handling library
opensm - OpenSM
infiniband-diags - various diagnostic tools


Building
--------
To build, unpack the tarballs and, in the directories libibcommon,
libibumad, libibmad, opensm, and infiniband-diags (in that order), run:

  ./configure && make && make install

(If you are building from the cloned repository, also run ./autogen.sh
first.)

Typically the autogen and configure steps need to be done only the first
time, unless configure.in or Makefile.am changes in those directories.

Libraries are installed by default in /usr/local/lib and binaries in
/usr/local/sbin.
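
The build steps can be scripted. The following is a minimal sketch,
assuming the five package directories sit side by side under the current
directory. The names build_all and DRY_RUN are illustrative, and running
autogen.sh only when configure is missing is an assumed heuristic, not
something this README specifies.

```shell
#!/bin/sh
# Build order as listed in this README.
PACKAGES="libibcommon libibumad libibmad opensm infiniband-diags"

# Builds and installs each package in order; stops at the first failure.
# With DRY_RUN=1 it only prints the commands it would run.
build_all() {
    for pkg in $PACKAGES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "cd $pkg && ./configure && make && make install"
            continue
        fi
        (
            cd "$pkg" || exit 1
            # Assumption: a missing configure script means a cloned
            # repository, so autogen.sh has to run first.
            [ -x ./configure ] || ./autogen.sh || exit 1
            ./configure && make && make install
        ) || { echo "build failed in $pkg" >&2; return 1; }
    done
}
```

Running "DRY_RUN=1 build_all" first prints the commands without executing
them. Since "make install" writes to /usr/local by default, a real run
normally needs sufficient privileges.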


Running
-------
After compiling and installing, you can run opensm by invoking

  /usr/local/sbin/opensm

opensm must be run as root. Run 'opensm --help' to see the options.

Note also that udev must create the device nodes under /dev/infiniband,
or you must create them manually.
See .../src/linux-kernel/docs/user_mad.txt. Also, the ib_umad module must
be loaded.

By default, opensm runs on the first existing port on the first IB device
(HCA). You can override that by using "-g <portguid_in_hex>".
Verify that the first port is active, which requires that it be plugged
into another IB device.

In case of problems, run opensm with -V and send the log file
(/var/log/opensm.log).

IMPORTANT:
Don't forget to modprobe ib_umad and make sure udev is configured before
using any of the userspace programs.
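
The prerequisites above can be checked before starting opensm. This is a
minimal sketch, assuming a Linux system where loaded modules are listed
in /proc/modules. The function names and the overridable paths are
illustrative and not part of OpenSM.

```shell
#!/bin/sh
# Succeeds if module $1 is listed in modules file $2 (default
# /proc/modules). A driver built into the kernel is not listed there.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

# Succeeds if directory $1 (default /dev/infiniband) exists.
devdir_present() {
    [ -d "${1:-/dev/infiniband}" ]
}

# Reports every failed prerequisite; returns non-zero if any failed.
preflight() {
    status=0
    [ "$(id -u)" -eq 0 ] ||
        { echo "opensm must be run as root" >&2; status=1; }
    module_loaded ib_umad ||
        { echo "ib_umad not loaded; try: modprobe ib_umad" >&2; status=1; }
    devdir_present ||
        { echo "/dev/infiniband missing; check udev" >&2; status=1; }
    return "$status"
}
```

A wrapper could then run "preflight && /usr/local/sbin/opensm".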


OpenSM Limitations:
1. The retry mechanism in the SM is primitive and needs enhancing to deal
with ports which are active but don't respond to SM MADs.
2. Async events are not yet supported by OpenSM, with one exception:
local LID change (and this is handled in the mthca driver). Future
versions of OpenSM may need to act on more local events.


Tuning OpenSM for Large Clusters
--------------------------------
Currently OpenSM is compiled with debug and no optimization. This
should be changed to at least -O2 (and perhaps -O4), but start
with -O2. This results in a 2x speedup for some code paths.

OpenSM supports a pipelining mode for SMPs. The default is 4
outstanding SMPs. The option -maxsmps <#> sets the number of outstanding
SMPs allowed; raising it should speed up initialization. Useful values
are 16 and 32.
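
As an illustration, a start-up wrapper might assemble the command line
like this. It is a minimal sketch: opensm_cmd is a hypothetical helper,
while the -maxsmps option, its default of 4, and the install path are
taken from this README.

```shell
#!/bin/sh
# Prints an opensm command line for $1 outstanding SMPs (default 4).
# Rejects anything that is not a plain non-negative integer.
opensm_cmd() {
    n="${1:-4}"
    case "$n" in
        *[!0-9]*)
            echo "maxsmps must be a non-negative integer: $n" >&2
            return 1
            ;;
    esac
    echo "/usr/local/sbin/opensm -maxsmps $n"
}
```

For a large cluster one would try "opensm_cmd 16" or "opensm_cmd 32" and
run the printed command as root.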

Beyond this, a problem with a link may be causing timeouts and retries
to kick in. If so, the OpenSM log should contain messages indicating
this.
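
A quick way to look for such messages is to search the log. The sketch
below is only a starting point: count_timeouts is a hypothetical helper,
and the search pattern is a guess at the wording, since this README does
not show OpenSM's actual message format.

```shell
#!/bin/sh
# Prints the number of log lines that mention timeouts or retries
# (case-insensitive). $1 is the log file, default /var/log/opensm.log.
# Like grep -c, it exits non-zero when the count is 0.
count_timeouts() {
    grep -ciE 'timeout|timed out|retr(y|ies)' "${1:-/var/log/opensm.log}"
}
```

Dropping the -c option from the grep call prints the matching lines
instead of counting them.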


Other utilities (infiniband diagnostics)
----------------------------------------
ibstat - show host adapter status
ibstatus - similar to ibstat but implemented as a script
ibnetdiscover - scan topology
ibaddr - shows the lid range and default GID of the target (default is
	the local port)
ibroute - display unicast and multicast forwarding tables of switches
ibtracert - display unicast or multicast route from source to destination
ibping - ping/pong between IB nodes (currently using vendor MADs)
ibsysstat - obtain basic information for node (hostname, cpus, memory,
	utilization) which may be remote
sminfo - query the SMInfo attribute on a node
smpdump - simple solicited SMP query tool. Output is hex dump
	(unless requested otherwise, e.g. using -s)
smpquery - formatted SMP query tool
perfquery - dump (and optionally clear) the performance (including error)
	counters of the destination port
ibcheckport - perform some basic tests on the specified port
ibchecknode - perform some basic tests on the specified node
ibcheckerrs - check if the error counters of the port/node have passed
	some predefined thresholds
ibchecknet - perform port/node/errors check on the subnet. ibnetdiscover
	output can be used as an input topology
ibswitches - scan the net or use existing net topology file and list all
	switches
ibhosts - scan the net or use existing net topology file and list all hosts
ibnodes - scan the net or use existing net topology file and list all nodes
ibportstate - get the logical and physical port state of an IB port or
	disable or enable the port (only on a switch)
ibcheckwidth - perform port width check on the subnet. Used to find ports
	with 1x link width.
ibcheckportwidth - perform 1x port width check on specified port
ibcheckstate - perform port state (and physical port state) check on
	the subnet. Used to find ports not in LinkUp physical port state
	and not Active port state
ibcheckportstate - perform port state (and physical port state) check on
	specified port
ibcheckerrors - perform error check on subnet. Used to find ports with
	error counters (PMA PortCounters) beyond the indicated thresholds
ibclearerrors - clear all error counters on subnet
ibclearcounters - clear all port counters on subnet
ibdiscover.pl - takes output of ibnetdiscover and a map file and produces
	a topology file (local node GUID and port connected to remote
	node GUID and port)
saquery - issue some SA queries

Note that the above list is not up to date and the infiniband-diags
subdirectory should be checked for the latest tools.
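
Several of these tools can share one topology scan. The following is a
minimal sketch, assuming ibswitches, ibhosts, and ibchecknet each accept
a saved ibnetdiscover output file as their only argument; check each
tool's usage text for the exact syntax. snapshot_and_check and the
default file name are illustrative.

```shell
#!/bin/sh
# Saves one ibnetdiscover scan to $1 (default ib-topology.txt) and reuses
# it for the listing and checking tools instead of rescanning each time.
snapshot_and_check() {
    topo="${1:-ib-topology.txt}"
    ibnetdiscover > "$topo" || return 1
    ibswitches "$topo"
    ibhosts "$topo"
    ibchecknet "$topo"
}
```

As with the other userspace programs, this needs ib_umad loaded and udev
configured.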