History log of /linux-master/drivers/infiniband/core/sysfs.c
Revision Date Author Comments
# 703289ce 20-Sep-2023 Or Har-Toov <ohartoov@nvidia.com>

IB/core: Add support for XDR link speed

Add new IBTA speed XDR, the new rate that was added to Infiniband spec
as part of XDR and supporting signaling rate of 200Gb.

In order to report that value to rdma-core, add new u32 field to
query_port response.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/9d235fc600a999e8274010f0e18b40fa60540e6c.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 81760bed 17-Sep-2023 Gustavo A. R. Silva <gustavoars@kernel.org>

RDMA/core: Use size_{add,sub,mul}() in calls to struct_size()

If, for any reason, the open-coded arithmetic causes a wraparound,
the protection that `struct_size()` provides against potential integer
overflows is defeated. Fix this by hardening calls to `struct_size()`
with `size_add()`, `size_sub()` and `size_mul()`.

Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs attributes")
Fixes: a4676388e2e2 ("RDMA/core: Simplify how the gid_attrs sysfs is created")
Fixes: e9dd5daf884c ("IB/umad: Refactor code to use cdev_device_add()")
Fixes: 324e227ea7c9 ("RDMA/device: Add ib_device_get_by_netdev()")
Fixes: 5aad26a7eac5 ("IB/core: Use struct_size() in kzalloc()")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZQdt4NsJFwwOYxUR@work
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 5e15ff29 07-Nov-2022 Mark Zhang <markzhang@nvidia.com>

RDMA/core: Make sure "ib_port" is valid when access sysfs node

The "ib_port" structure must be set before adding the sysfs kobject,
and reset after removing it, otherwise it may crash when accessing
the sysfs node:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e85f5ba5
[0000000000000050] pgd=0000000848fd9003, pud=000000085b387003, pmd=0000000000000000
Internal error: Oops: 96000006 [#2] PREEMPT SMP
Modules linked in: ib_umad(O) mlx5_ib(O) nfnetlink_cttimeout(E) nfnetlink(E) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_nat_ipv6(E) nf_nat_ipv4(E) nf_conncount(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) mst_pciconf(O) ipmi_devintf(E) ipmi_msghandler(E) ipmb_dev_int(OE) mlx5_core(O) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) ib_core(O) mlx_compat(O) psample(E) sbsa_gwdt(E) uio_pdrv_genirq(E) uio(E) mlxbf_pmc(OE) mlxbf_gige(OE) mlxbf_tmfifo(OE) gpio_mlxbf2(OE) pwr_mlxbf(OE) mlx_trio(OE) i2c_mlxbf(OE) mlx_bootctl(OE) bluefield_edac(OE) knem(O) ip_tables(E) ipv6(E) crc_ccitt(E) [last unloaded: mst_pci]
Process grep (pid: 3372, stack limit = 0x0000000022055c92)
CPU: 5 PID: 3372 Comm: grep Tainted: G D OE 4.19.161-mlnx.47.gadcd9e3 #1
Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.9.2-15-ga2403ab Sep 8 2022
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : hw_stat_port_show+0x4c/0x80 [ib_core]
lr : port_attr_show+0x40/0x58 [ib_core]
sp : ffff000029f43b50
x29: ffff000029f43b50 x28: 0000000019375000
x27: ffff8007b821a540 x26: ffff000029f43e30
x25: 0000000000008000 x24: ffff000000eaa958
x23: 0000000000001000 x22: ffff8007a4ce3000
x21: ffff8007baff8000 x20: ffff8007b9066ac0
x19: ffff8007bae97578 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : ffff8007a4ce4000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : ffff000000e6a280 x4 : ffff8007a4ce3000
x3 : 0000000000000000 x2 : aaaaaaaaaaaaaaab
x1 : ffff8007b9066a10 x0 : ffff8007baff8000
Call trace:
hw_stat_port_show+0x4c/0x80 [ib_core]
port_attr_show+0x40/0x58 [ib_core]
sysfs_kf_seq_show+0x8c/0x150
kernfs_seq_show+0x44/0x50
seq_read+0x1b4/0x45c
kernfs_fop_read+0x148/0x1d8
__vfs_read+0x58/0x180
vfs_read+0x94/0x154
ksys_read+0x68/0xd8
__arm64_sys_read+0x28/0x34
el0_svc_common+0x88/0x18c
el0_svc_handler+0x78/0x94
el0_svc+0x8/0xe8
Code: f2955562 aa1603e4 aa1503e0 f9405683 (f9402861)

Fixes: d8a5883814b9 ("RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port pointer")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/88867e705c42c1cd2011e45201c25eecdb9fef94.1667810736.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 01097139 03-Jan-2022 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RDMA: Use default_groups in kobj_type

There are currently 2 ways to create a set of sysfs files for a kobj_type,
through the default_attrs field, and the default_groups field. Move the
IB code to use default_groups field which has been the preferred way since
commit aa30f47cf666 ("kobject: Add support for default attribute groups to
kobj_type") so that we can soon get rid of the obsolete default_attrs
field.

Link: https://lore.kernel.org/r/20220103152259.531034-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 067113d9 26-Oct-2021 Mark Zhang <markzhang@nvidia.com>

RDMA/core: Fix missed initialization of rdma_hw_stats::lock

alloc_and_bind() creates a new rdma_hw_stats structure but misses
initializing the mutex lock.

This causes debug kernel failures:

DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 4 PID: 64464 at kernel/locking/mutex.c:575 __mutex_lock+0x9c3/0x12b0
Call Trace:
fill_res_counter_entry+0x6ee/0x1020 [ib_core]
res_get_common_dumpit+0x907/0x10a0 [ib_core]
nldev_stat_get_dumpit+0x20a/0x290 [ib_core]
netlink_dump+0x451/0x1040
__netlink_dump_start+0x583/0x830
rdma_nl_rcv_msg+0x3f3/0x7c0 [ib_core]
rdma_nl_rcv+0x264/0x410 [ib_core]
netlink_unicast+0x433/0x700
netlink_sendmsg+0x707/0xbf0
sock_sendmsg+0xb0/0xe0
__sys_sendto+0x193/0x240
__x64_sys_sendto+0xdd/0x1b0
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

Instead of requiring all users to open code initialization of the lock put
it in the general rdma_alloc_hw_stats_struct() function and remove
duplicates.

Fixes: c4ffee7c9bdb ("RDMA/netlink: Implement counter dumpit calback")
Link: https://lore.kernel.org/r/4a22986c4685058d2c735d91703ee7d865815bb9.1635237668.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 911a81c9 18-Oct-2021 wangyugui <wangyugui@e16-tech.com>

RDMA/core: Use kvzalloc when allocating the struct ib_port

The 'struct attribute' flex array contains some struct lock_class_key's
which become big when lockdep is turned on. Big enough that some drivers
will not load when CONFIG_PROVE_LOCKING=y because they cannot allocate
enough memory:

WARNING: CPU: 36 PID: 8 at mm/page_alloc.c:5350 __alloc_pages+0x27e/0x3e0
Call Trace:
kmalloc_order+0x2a/0xb0
kmalloc_order_trace+0x19/0xf0
__kmalloc+0x231/0x270
ib_setup_port_attrs+0xd8/0x870 [ib_core]
ib_register_device+0x419/0x4e0 [ib_core]
bnxt_re_task+0x208/0x2d0 [bnxt_re]

Link: https://lore.kernel.org/r/20211019002656.17745-1-wangyugui@e16-tech.com
Signed-off-by: wangyugui <wangyugui@e16-tech.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 5e2ddd1e 08-Oct-2021 Aharon Landau <aharonl@nvidia.com>

RDMA/counter: Add optional counter support

An optional counter is a driver-specific counter that may be dynamically
enabled/disabled. This enhancement allows drivers to expose counters
which are, for example, mutually exclusive and cannot be enabled at the
same time, counters that might degrades performance, optional debug
counters, etc.

Optional counters are marked with IB_STAT_FLAG_OPTIONAL flag. They are not
exported in sysfs, and must be at the end of all stats, otherwise the
attr->show() in sysfs would get wrong indexes for hwcounters that are
behind optional counters.

Link: https://lore.kernel.org/r/20211008122439.166063-7-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 0a0800ce 08-Oct-2021 Mark Zhang <markzhang@nvidia.com>

RDMA/core: Add a helper API rdma_free_hw_stats_struct

Add a new API rdma_free_hw_stats_struct to pair with
rdma_alloc_hw_stats_struct (which is also de-inlined).

This will be useful when there are more alloc/free works in following
patches.

Link: https://lore.kernel.org/r/20211008122439.166063-5-markzhang@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 13f30b0f 08-Oct-2021 Aharon Landau <aharonl@nvidia.com>

RDMA/counter: Add a descriptor in struct rdma_hw_stats

Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added. This code
extension is needed for optional-counter support in the following patches.

Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 3cea7b4a 10-Jun-2021 Wenpeng Liang <liangwenpeng@huawei.com>

RDMA/core: Fix incorrect print format specifier

There are some '%u' for 'int' and '%d' for 'unsigend int', they should be
fixed.

Link: https://lore.kernel.org/r/1623325232-30900-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d7407d16 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Change ops->init_port to ops->port_groups

init_port was only being used to register sysfs attributes against the
port kobject. Now that all users are creating static attribute_group's we
can simply set the attribute_group list in the ops and the core code can
just handle it directly.

This makes all the sysfs management quite straightforward and prevents any
driver from abusing the naked port kobject in future because no driver
code can access it.

Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 526a12c8 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/cm: Use an attribute_group on the ib_port_attribute intead of kobj's

This code is trying to attach a list of counters grouped into 4 groups to
the ib_port sysfs. Instead of creating a bunch of kobjects simply express
everything naturally as an ib_port_attribute and add a single
attribute_groups list.

Remove all the naked kobject manipulations.

Link: https://lore.kernel.org/r/0d5a7241ee0fe66622de04fcbaafaf6a791d5c7c.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 054239f4 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Expose the ib port sysfs attribute machinery

Other things outside the core code are creating attributes against the
port. This patch exposes the basic machinery to do this.

The ib_port_attribute type allows creating groups of attributes attatched
to the port and comes with the usual machinery to do this.

Link: https://lore.kernel.org/r/5c4aeae57f6fa7c59a1d6d1c5506069516ae9bbf.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d89eb509 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Remove the kobject_uevent() NOP

This call does nothing because the ib_port kobj is nested under a struct
device kobject and the dev_uevent_filter() function of the struct device
blocks uevents for any children kobj's that are not also struct devices.

A uevent for the struct device will be triggered after
ib_setup_port_attrs() returns which causes udev to pick up all the deep
"attributes" which are implemented as kobjects nested under a struct
device and assign them to the udev object for the struct device:

$ udevadm info -a /sys/class/infiniband/ibp0s9
ATTR{ports/1/counters/excessive_buffer_overrun_errors}=="0"

Link: https://lore.kernel.org/r/49231c92c7d4c60686de18f7e20932d0c82160ee.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# b7066b32 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Create the device hw_counters through the normal groups mechanism

Instead of calling device_add_groups() add the group to the existing
groups array which is managed through device_add().

This requires setting up the hw_counters before device_add(), so it gets
split up from the already split port sysfs flow.

Move all the memory freeing to the release function.

Link: https://lore.kernel.org/r/666250d937b64f6fdf45da9e2dc0b6e5e4f7abd8.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 2ca1cca4 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Simplify how the port sysfs is created

Use the same technique as gid_attrs now uses to manage the port
sysfs. Bundle everything into three allocations and use a single
sysfs_create_groups() to build everything in one shot.

All the memory is always freed in the kobj release function, removing most
of the error unwinding.

The gid_attr technique and the hw_counters are very similar, merge the two
together and combine the sysfs_create_group() call for hw_counters with
the single sysfs group setup.

Link: https://lore.kernel.org/r/b688f3340694c59f7b44b1bde40e25559ef43cf3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# a4676388 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Simplify how the gid_attrs sysfs is created

Instead of having an whole bunch of different allocations to create the
gid_attr kobjects reduce it to three, one for the kobj struct plus the
attributes, and one for the attribute list for each of the two
groups.

Move the freeing of all allocations to the release function.

Reorder the operations so all the allocations happen first then the
kobject & sysfs operations are last.

This removes the majority of the complicated error unwind since the
release function will always undo all the memory allocations. Freeing the
memory is also much simpler since there is a lot less of it.

Consolidate creating the "group of array indexes" pattern into one helper
function. Ensure kobject_del is used.

Link: https://lore.kernel.org/r/f4149d379db7178d37d11d75e3026bf550f818a1.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# a32f4335 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Split gid_attrs related sysfs from add_port()

The gid_attrs directory is a dedicated kobj nested under the port,
construct/destruct it with its own pair of functions for
understandability. This is much more readable than having it weirdly
inlined out of order into the add_port() function.

Link: https://lore.kernel.org/r/1c9434111b6770a7aef0e644a88a16eee7e325b8.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 467f432a 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Split port and device counter sysfs attributes

This code creates a 'struct hw_stats_attribute' for each sysfs entry that
contains a naked 'struct attribute' inside.

It then proceeds to attach this same structure to a 'struct device' kobj
and a 'struct ib_port' kobj. However, this violates the typing
requirements. 'struct device' requires the attribute to be a 'struct
device_attribute' and 'struct ib_port' requires the attribute to be
'struct port_attribute'.

This happens to work because the show/store function pointers in all three
structures happen to be at the same offset and happen to be nearly the
same signature. This means when container_of() was used to go between the
wrong two types it still managed to work.

However clang CFI detection notices that the function pointers have a
slightly different signature. As with show/store this was only working
because the device and port struct layouts happened to have the kobj at
the front.

Correct this by have two independent sets of data structures for the port
and device case. The two different attributes correctly include the
port/device_attribute struct and everything from there up is kept
split. The show/store function call chains start with device/port unique
functions that invoke a common show/store function pointer.

Link: https://lore.kernel.org/r/a8b3864b4e722aed3657512af6aa47dc3c5033be.1623427137.git.leonro@nvidia.com
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d8a58838 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port pointer

It is much saner to store a pointer to the kobject structure that contains
the cannonical stats pointer than to copy the stats pointers into a public
structure.

Future patches will require the sysfs pointer for other purposes.

Link: https://lore.kernel.org/r/f90551dfd296cde1cb507bbef27cca9891d19871.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 4b5f4d3f 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Split the alloc_hw_stats() ops to port and device variants

This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.

Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.

Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# b6eb7011 07-Apr-2021 Wenpeng Liang <liangwenpeng@huawei.com>

RDMA/core: Correct format of braces

Do following cleanups about braces:

- Add the necessary braces to maintain context alignment.
- Fix the open '{' that is not on the same line as "switch".
- Remove braces that are not necessary for single statement blocks.
- Fix "else" that doesn't follow close brace '}'.

Link: https://lore.kernel.org/r/1617783353-48249-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# f681967a 07-Apr-2021 Wenpeng Liang <liangwenpeng@huawei.com>

RDMA/core: Remove redundant spaces

Space is not required after '(', before ')', before ',' and between '*'
and symbol name of a definition.

Link: https://lore.kernel.org/r/1617783353-48249-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 9279c35b 07-Apr-2021 Wenpeng Liang <liangwenpeng@huawei.com>

RDMA/core: Remove the redundant return statements

The return statements at the end of a void function is meaningless.

Link: https://lore.kernel.org/r/1617783353-48249-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1fb7f897 01-Mar-2021 Mark Bloch <mbloch@nvidia.com>

RDMA: Support more than 255 rdma ports

Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# c7adf771 26-Oct-2020 Meir Lichtinger <meirl@mellanox.com>

IB/core: Add support for NDR link speed

Add new IBTA speed NDR, supporting signaling rate of 100Gb.

Link: https://lore.kernel.org/r/20201026133738.1340432-2-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# e28bf1f0 07-Oct-2020 Joe Perches <joe@perches.com>

RDMA: Convert various random sprintf sysfs _show uses to sysfs_emit

Manual changes for sysfs_emit as cocci scripts can't easily convert them.

Link: https://lore.kernel.org/r/ecde7791467cddb570c6f6d2c908ffbab9145cac.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 45808361 07-Oct-2020 Joe Perches <joe@perches.com>

RDMA: Manual changes for sysfs_emit and neatening

Make changes to use sysfs_emit in the RDMA code as cocci scripts can not
be written to handle _all_ the possible variants of various sprintf family
uses in sysfs show functions.

While there, make the code more legible and update its style to be more
like the typical kernel styles.

Miscellanea:

o Use intermediate pointers for dereferences
o Add and use string lookup functions
o return early when any intermediate call fails so normal return is
at the bottom of the function
o mlx4/mcg.c:sysfs_show_group: use scnprintf to format intermediate strings

Link: https://lore.kernel.org/r/f5c9e4c9d8dafca1b7b70bd597ee7f8f219c31c8.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1c7fd726 07-Oct-2020 Joe Perches <joe@perches.com>

RDMA: Convert sysfs device * show functions to use sysfs_emit()

Done with cocci script:

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
return
- sprintf(buf,
+ sysfs_emit(buf,
...);
...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
return
- snprintf(buf, PAGE_SIZE,
+ sysfs_emit(buf,
...);
...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
return
- scnprintf(buf, PAGE_SIZE,
+ sysfs_emit(buf,
...);
...>
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
return
- strcpy(buf, chr);
+ sysfs_emit(buf, chr);
...>
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
len =
- sprintf(buf,
+ sysfs_emit(buf,
...);
...>
return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
len =
- snprintf(buf, PAGE_SIZE,
+ sysfs_emit(buf,
...);
...>
return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
len =
- scnprintf(buf, PAGE_SIZE,
+ sysfs_emit(buf,
...);
...>
return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
<...
- len += scnprintf(buf + len, PAGE_SIZE - len,
+ len += sysfs_emit_at(buf, len,
...);
...>
return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
...
- strcpy(buf, chr);
- return strlen(buf);
+ return sysfs_emit(buf, chr);
}

Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 3ff4de8f 23-Sep-2020 Avihai Horon <avihaih@nvidia.com>

RDMA/core: Change rdma_get_gid_attr returned error code

Change the error code returned from rdma_get_gid_attr when the GID entry
is invalid but the GID index is in the gid table size range to -ENODATA
instead of -EINVAL.

This change is done in order to provide a more accurate error reporting to
be used by the new GID query API in user space. Nevertheless, -EINVAL is
still returned from sysfs in the aforementioned case to maintain
compatibility with user space that expects -EINVAL.

Link: https://lore.kernel.org/r/20200923165015.2491894-2-leon@kernel.org
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 42d5179c 30-Sep-2020 Rikard Falkeborn <rikard.falkeborn@gmail.com>

RDMA/core: Constify struct attribute_group

The only usage of the pma_table field in the ib_port struct is to pass its
address to sysfs_create_group() and sysfs_remove_group(). Make it const to
make it possible to constify a couple of static struct
attribute_group. This allows the compiler to put them in read-only memory.

Link: https://lore.kernel.org/r/20200930224004.24279-2-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 90efc8b2 14-Jul-2020 Kamal Heib <kamalheib1@gmail.com>

RDMA/core: Expose pkeys sysfs files only if pkey_tbl_len is set

Expose the pkeys sysfs files only if the pkey_tbl_len is set by the
providers.

Link: https://lore.kernel.org/r/20200714183414.61069-2-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 0b8e125e 27-May-2020 Qiushi Wu <wu000273@umn.edu>

RDMA/core: Fix several reference count leaks.

kobject_init_and_add() takes reference even when it fails. If this
function returns an error, kobject_put() must be called to properly clean
up the memory associated with the object. Previous
commit b8eb718348b8 ("net-sysfs: Fix reference count leak in
rx|netdev_queue_add_kobject") fixed a similar problem.

Link: https://lore.kernel.org/r/20200528030231.9082-1-wu000273@umn.edu
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# e26e7b88 29-Oct-2019 Leon Romanovsky <leon@kernel.org>

RDMA: Change MAD processing function to remove extra casting and parameter

All users of process_mad() converts input pointers from ib_mad_hdr to be
ib_mad, update the function declaration to use ib_mad directly.

Also remove not used input MAD size parameter.

Link: https://lore.kernel.org/r/20191029062745.7932-17-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-By: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# be4a8d46 29-Oct-2019 Leon Romanovsky <leon@kernel.org>

RDMA/mad: Allocate zeroed MAD buffer

Ensure that MAD output buffer is zero-based allocated in all the callers
of process_mad and remove the various memset()'s from the drivers.

Link: https://lore.kernel.org/r/20191029062745.7932-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d0f3ef36 23-Sep-2019 Kamal Heib <kamalheib1@gmail.com>

RDMA/core: Fix return code when modify_device isn't supported

The proper return code is "-EOPNOTSUPP" when modify_device callback is not
supported.

Link: https://lore.kernel.org/r/20190923104158.5331-2-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 72a7720f 07-Aug-2019 Kamal Heib <kamalheib1@gmail.com>

RDMA: Introduce ib_port_phys_state enum

In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Andrew Boyer <aboyer@tobark.org>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 6e7be47a 02-Jul-2019 Mark Zhang <markz@mellanox.com>

RDMA/nldev: Allow get default counter statistics through RDMA netlink

This patch adds the ability to return the hwstats of per-port default
counters (which can also be queried through sysfs nodes).

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# f34a55e4 02-Jul-2019 Mark Zhang <markz@mellanox.com>

RDMA/core: Get sum value of all counters when perform a sysfs stat read

Since a QP can only be bound to one counter, then if it is bound to a
separate counter, for backward compatibility purpose, the statistic value
must be:
* stat of default counter
+ stat of all running allocated counters
+ stat of all deallocated counters (history stats)

Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# f95be3d2 05-May-2019 Gal Pressman <galpress@amazon.com>

RDMA: Add EFA related definitions

Add EFA driver ID to the IOCTL interface uapi. This patch also adds
unspecified node/transport type that will be used by EFA (usnic is left
unchanged as it's already part of our ABI).

Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 943bd984 02-May-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Allow detaching gid attribute netdevice for RoCE

When there is active traffic through a GID, a QP/AH holds reference to
this GID entry. RoCE GID entry holds reference to its attached
netdevice. Due to this when netdevice is deleted by admin user, its
refcount is not dropped.

Therefore, while deleting RoCE GID, wait for all GID attribute's netdev
users to finish accessing netdev in rcu context. Once all users done
accessing it, release the netdev refcount.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# eb15c78b 30-Apr-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Do not invoke init_port on compat devices

The driver interface cannot manipulate the sysfs of the compat device,
only of the full device so we must avoid calling the driver sysfs APIs on
compat devices.

This prevents an oops:

Call Trace:
dump_stack+0x5a/0x73
kobject_init+0x74/0x80
kobject_init_and_add+0x35/0xb0
hfi1_create_port_files+0x6e/0x3c0 [hfi1]
ib_setup_port_attrs+0x43b/0x560 [ib_core]
add_one_compat_dev+0x16a/0x230 [ib_core]
rdma_dev_init_net+0x110/0x160 [ib_core]
ops_init+0x38/0xf0
setup_net+0xcf/0x1e0
copy_net_ns+0xb7/0x130
create_new_namespaces+0x11a/0x1b0
unshare_nsproxy_namespaces+0x55/0xa0
ksys_unshare+0x1a7/0x340
__x64_sys_unshare+0xe/0x20
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 5417783eabb2 ("RDMA/core: Support core port attributes in non init_net")
Reported-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# c87e65cf 11-Mar-2019 Leon Romanovsky <leon@kernel.org>

RDMA/cm: Move debug counters to be under relevant IB device

The sysfs layout is created by CM incorrectly presented RDMA devices with
InfiniBand link layer. Layout of such devices represents device tree of
connections. By moving CM statistics to be under relevant port of IB
device, we will fix the following issues:

* Symlink name - It used device name instead of specific identifier.
* Target location - It was supposed to point to PCI-ID/infiniband_cm/
instead of PCI-ID/infiniband/
* Target name - It created extra device file under already existing
device folder, e.g. mlx5_0/mlx5_0
* Crash during boot with RDMA persistent naming patches.

sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0'
CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178
Call Trace:
dump_stack+0xcc/0x180
sysfs_warn_dup.cold.3+0x17/0x2d
sysfs_do_create_link_sd.isra.2+0xd0/0xf0
device_add+0x7cb/0x1450
device_create_groups_vargs+0x1ae/0x220
device_create+0x93/0xc0
cm_add_one+0x38f/0xf60 [ib_cm]
add_client_context+0x167/0x210 [ib_core]
enable_device_and_get+0x230/0x3f0 [ib_core]
ib_register_device+0x823/0xbf0 [ib_core]
__mlx5_ib_add+0x45/0x150 [mlx5_ib]
mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib]
mlx5_add_device+0x130/0x3a0 [mlx5_core]
mlx5_register_interface+0x1a9/0x270 [mlx5_core]
do_one_initcall+0x14f/0x5de
do_init_module+0x247/0x7c0
load_module+0x4c2f/0x60d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

After this change:
[leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/
drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_duplicates
drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_msgs
drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_msgs
drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_retries

Fixes: 110cf374a809 ("infiniband: make cm_device use a struct device and not a kobject.")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5417783e 26-Feb-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Support core port attributes in non init_net

Now that sysfs compatibility layer for non init_net exists, add core port
attributes such as pkey and gid table to non init_net ns.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# cebe556b 26-Feb-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Introduce ib_core_device to hold device

In order to support sysfs entries in multiple net namespaces for a rdma
device, introduce a ib_core_device whose scope is limited to hold core
device and per port sysfs related entries.

This is preparation patch so that multiple ib_core_devices in each net
namespace can be created in subsequent patch who all can share ib_device.

(a) Move sysfs specific fields to ib_core_device.
(b) Make sysfs and device life cycle related routines to work on
ib_core_device.
(c) Introduce and use rdma_init_coredev() helper to initialize
coredev fields.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# ea1075ed 12-Feb-2019 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Add and use rdma_for_each_port

We have many loops iterating over all of the end port numbers on a struct
ib_device, simplify them with a for_each helper.

Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5f8f5499 13-Feb-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Move device addition deletion to device.c

Move core device addition and removal from sysfs.c to device.c as device.c
is more appropriate place for device management.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5767198a 13-Feb-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Introduce and use ib_setup_port_attrs()

Refactor code for device and port sysfs attributes for reuse.

While at it, rename counter part free function to ib_free_port_attrs.

Also attribute setup sequence is:
(a) port specific init.
(b) device stats alloc/init.

So for cleanup, follow reverse sequence:
(a) device stats dealloc
(b) port specific cleanup

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# e155755e 13-Feb-2019 Parav Pandit <parav@mellanox.com>

RDMA/core: Use simpler device_del() instead of device_unregister()

Instead of holding extra reference using get_device() that
device_unregister() releases, simplify it as below.

device_add() balances with device_del(). device_initialize() balances
with put_device(), always via ib_dealloc_device().

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 54747231 18-Dec-2018 Parav Pandit <parav@mellanox.com>

RDMA: Introduce and use rdma_device_to_ibdev()

Introduce and use rdma_device_to_ibdev() API for those drivers which are
registering one sysfs group and also use in ib_core.

In subsequent patch, device->provider_ibdev one-to-one mapping is no
longer holds true during accessing sysfs entries.
Therefore, introduce an API rdma_device_to_ibdev() that provides such
information.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# ea4baf7f 18-Dec-2018 Parav Pandit <parav@mellanox.com>

RDMA: Rename port_callback to init_port

Most provider routines are callback routines which ib core invokes.
_callback suffix doesn't convey information about when such callback is
invoked. Therefore, rename port_callback to init_port.

Additionally, store the init_port function pointer in ib_device_ops, so
that it can be accessed in subsequent patches when binding rdma device to
net namespace.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 3023a1e9 10-Dec-2018 Kamal Heib <kamalheib1@gmail.com>

RDMA: Start use ib_device_ops

Make all the required change to start use the ib_device_ops structure.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 76d865b8 17-Oct-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Fix comment for hw stats init for port == 0

When add_port() is done for port == 0, it indicates that ports hardware
counters initialization should be skipped. Reflect so in the comment.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 1ae4cfa0 06-Oct-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Rename ports_parent to ports_kobj

Normally kobj objects have kobj suffix to reflect it.
Rename ports_parent to ports_kobj.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0f6ef65d 06-Oct-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Do not expose unsupported counters

If the provider driver (such as rdma_rxe) doesn't support pma counters,
avoid exposing its directory similar to optional hw_counters directory.
If core fails to read the PMA counter, return an error so that user can
retry later if needed.

Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# e349f858 25-Sep-2018 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Fully setup the device name in ib_register_device

The current code has two copies of the device name, ibdev->dev and
dev_name(&ibdev->dev), and they are setup at different times, which is
very confusing.

Set them both up at the same time and make dev_name() the lead name, which
is the proper use of the driver core APIs. To make it very clear that the
name is not valid until registration pass it in to the
ib_register_device() call rather than messing with ibdev->name directly.

Also the reorganization now checks that dev_name is unique even if it does
not contain a %.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>


# adee9f3f 05-Sep-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Depend on device_add() to add device attributes

Instead of adding/removing device attribute files, depend on device_add()
which considers adding these device files based on NULL terminated
attributes group array.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 627212c9 03-Sep-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Replace open-coded variant of get_device

Reuse existing get_device() API to do it symmetric to already used
put_device() in commit 924b8900a49d ("RDMA/core: Replace open-coded
variant of put_device")

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 77e786fc 04-Jun-2018 Parav Pandit <parav@mellanox.com>

IB/core: Replace ib_query_gid with rdma_get_gid_attr

These call sites have a use of ib_query_gid with a simple lifetime for the
struct gid_attr pointer, with an easy conversion.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 598ff6ba 01-Apr-2018 Parav Pandit <parav@mellanox.com>

IB/core: Refactor GID modify code for RoCE

Code is refactored to prepare separate functions for RoCE which can do more
complex operations related to reference counting, while still
maintainining code readability. This includes
(a) Simplification to not perform netdevice checks and modifications
for IB link layer.
(b) Do not add RoCE GID entry which has NULL netdevice; instead return
an error.
(c) If GID addition fails at provider level add_gid(), do not add the
entry in the cache and keep the entry marked as INVALID.
(d) Simplify and reuse the ib_cache_gid_add()/del() routines so that they
can be used even for modifying default GIDs. This avoid some code
duplication in modifying default GIDs.
(e) find_gid() routine refers to the data entry flags to qualify a GID
as valid or invalid GID rather than depending on attributes and zeroness
of the GID content.
(f) gid_table_reserve_default() sets the GID default attribute at
beginning while setting up the GID table. There is no need to use
default_gid flag in low level functions such as write_gid(), add_gid(),
del_gid(), as they never need to update the DEFAULT property of the GID
entry while during GID table update.

As as result of this refactor, reserved GID 0:0:0:0:0:0:0:0 is no longer
searchable as described below.

A unicast GID entry of 0:0:0:0:0:0:0:0 is Reserved GID as per the IB
spec version 1.3 section 4.1.1, point (6) whose snippet is below.

"The unicast GID address 0:0:0:0:0:0:0:0 is reserved - referred to as
the Reserved GID. It shall never be assigned to any endport. It shall
not be used as a destination address or in a global routing header
(GRH)."

GID table cache now only stores valid GID entries. Before this patch,
Reserved GID 0:0:0:0:0:0:0:0 was searchable in the GID table using
ib_find_cached_gid_by_port() and other similar find routines.

Zero GID is no longer searchable as it shall not to be present in GRH or
path recored entry as described in IB spec version 1.3 section 4.1.1,
point (6), section 12.7.10 and section 12.7.20.

ib_cache_update() is simplified to check link layer once, use unified
locking scheme for all link layers, removed temporary gid table
allocation/free logic.

Additionally,
(a) Expand ib_gid_attr to store port and index so that GID query
routines can get port and index information from the attribute structure.
(b) Expand ib_gid_attr to store device as well so that in future code when
GID reference counting is done, device is used to reach back to the GID
table entry.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# e945130b 27-Mar-2018 Mark Bloch <markb@mellanox.com>

IB/core: Protect against concurrent access to hardware stats

Currently access to hardware stats buffer isn't protected, this can
result in multiple writes and reads at the same time to the same
memory location. This can lead to providing an incorrect value to
the user. Add a mutex to protect against it.

Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 311d0da9 15-Mar-2018 Honggang Li <honli@redhat.com>

IB/core: Set speed string to SDR for invalid active rates

Before commit f1b65df5a232 ("IB/mlx5: Add support for active_width and
active_speed in RoCE"), the mlx5_ib driver set default active_width and
active_speed to IB_WIDTH_4X and IB_SPEED_QDR.

Now, the active_width and active_speed are zeros if the RoCE port
is in DOWN state. The speed string should be set to " SDR" instead of
a blank string when active_speed is zero.

Signed-off-by: Honggang Li <honli@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 02ee9da3 03-Jan-2018 Bart Van Assche <bvanassche@acm.org>

IB/core: Fix two kernel warnings triggered by rxe registration

Eliminate the WARN_ONs that create following two warnings when
registering an rxe device:

WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/device.c:449 ib_register_device+0x591/0x640 [ib_core]
CPU: 2 PID: 1005 Comm: run_tests Not tainted 4.15.0-rc4-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:ib_register_device+0x591/0x640 [ib_core]
Call Trace:
rxe_register_device+0x3c6/0x470 [rdma_rxe]
rxe_add+0x543/0x5e0 [rdma_rxe]
rxe_net_add+0x37/0xb0 [rdma_rxe]
rxe_param_set_add+0x5a/0x120 [rdma_rxe]
param_attr_store+0x5e/0xc0
module_attr_store+0x19/0x30
sysfs_kf_write+0x3d/0x50
kernfs_fop_write+0x116/0x1a0
__vfs_write+0x23/0x120
vfs_write+0xbe/0x1b0
SyS_write+0x44/0xa0
entry_SYSCALL_64_fastpath+0x23/0x9a

WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/sysfs.c:1279 ib_device_register_sysfs+0x11d/0x160 [ib_core]
CPU: 2 PID: 1005 Comm: run_tests Tainted: G W 4.15.0-rc4-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:ib_device_register_sysfs+0x11d/0x160 [ib_core]
Call Trace:
ib_register_device+0x3f7/0x640 [ib_core]
rxe_register_device+0x3c6/0x470 [rdma_rxe]
rxe_add+0x543/0x5e0 [rdma_rxe]
rxe_net_add+0x37/0xb0 [rdma_rxe]
rxe_param_set_add+0x5a/0x120 [rdma_rxe]
param_attr_store+0x5e/0xc0
module_attr_store+0x19/0x30
sysfs_kf_write+0x3d/0x50
kernfs_fop_write+0x116/0x1a0
__vfs_write+0x23/0x120
vfs_write+0xbe/0x1b0
SyS_write+0x44/0xa0
entry_SYSCALL_64_fastpath+0x23/0x9a

The code should accept either a parent pointer or a fully specified DMA
specification without producing warnings.

Fixes: 99db9494035f ("IB/core: Remove ib_device.dma_device")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: stable@vger.kernel.org # v4.11
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 79c4d80b 15-Oct-2017 Parav Pandit <parav@mellanox.com>

IB/core: Fix unable to change lifespan entry for hw_counters

This patch fixes the case where 'lifespan' entry of the hw_counters
is not writable. Currently write callback is not exposed for for
the hw_counters sysfs operation. Due to this, modifying lifespan
value results into permission denied error in below example.

echo 10 > /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan
-bash: /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan:
Permission denied

This patch adds the hook to modify any attribute which implements
store() operation.

Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9abb0d1b 27-Jun-2017 Leon Romanovsky <leon@kernel.org>

RDMA: Simplify get firmware interface

There is a need to forward FW version to user space
application through RDMA netlink. In order to make it safe, there
is need to declare nla_policy and limit the size of FW string.

The new define IB_FW_VERSION_NAME_MAX will limit the size of
FW version string. That define was chosen to be equal to
ETHTOOL_FWVERS_LEN, because many drivers anyway are limited
by that value indirectly.

The introduction of this define allows us to remove the string size
from get_fw_str function signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# 12113a35 20-Apr-2017 Noa Osherovich <noaos@mellanox.com>

IB/core: Add HDR speed enum

Add high data rate speed to the ib_port_speed enumeration.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b312be3d 19-Mar-2017 Jack Morgenstein <jackm@dev.mellanox.co.il>

IB/core: Fix sysfs registration error flow

The kernel commit cited below restructured ib device management
so that the device kobject is initialized in ib_alloc_device.

As part of the restructuring, the kobject is now initialized in
procedure ib_alloc_device, and is later added to the device hierarchy
in the ib_register_device call stack, in procedure
ib_device_register_sysfs (which calls device_add).

However, in the ib_device_register_sysfs error flow, if an error
occurs following the call to device_add, the cleanup procedure
device_unregister is called. This call results in the device object
being deleted -- which results in various use-after-free crashes.

The correct cleanup call is device_del -- which undoes device_add
without deleting the device object.

The device object will then (correctly) be deleted in the
ib_register_device caller's error cleanup flow, when the caller invokes
ib_dealloc_device.

Fixes: 55aeed06544f6 ("IB/core: Make ib_alloc_device init the kobject")
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 97a9ea84 20-Jan-2017 Bart Van Assche <bvanassche@acm.org>

IB/core: Initialize ib_device.dev.parent earlier

Move the ib_device.dev.parent initialization code from
ib_device_register_sysfs() to ib_register_device(). Additionally,
allow HBA drivers to set ib_device.dev.parent without setting
ib_device.dma_device. This is the first step towards removing
ib_device.dma_device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bd99fdea 25-Aug-2016 Yuval Shaia <yuval.shaia@oracle.com>

IB/{core,hw}: Add constant for node_desc

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# c5a81d11 08-Jul-2016 Christoph Lameter <cl@linux.com>

IB core: Add port_xmit_wait counter

Add the missing port_xmit_wait counter. This counter is displayed through
some tools like perfquery but is not available via sysfs.

For the PORT_PMA_ATTR macro the _counter field is set to zero
allowing us to specify the offset directly like with PORT_PMA_ATTR_EXT

See also the earlier work in 2008 by Vladimir Skolovsky

https://www.mail-archive.com/general@lists.openfabrics.org/msg20313.html

Signed-off-by: Vladimir Sokolvsky <vlad@mellanox.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 41a6ae1e 15-Jun-2016 Ira Weiny <ira.weiny@intel.com>

IB/core: Export a common fw_ver sysfs entry

Now that all the devices have stopped exporting their own sysfs
entry points we can have the core export this on their behalf.

Eventually this may be removed but this provides for backwards
compatibility.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 8aec013a 04-Jun-2016 Mark Bloch <markb@mellanox.com>

IB/core: Initialize sysfs attributes before sysfs create group

For dynamically allocated sysfs attributes there is a need to call
sysfs_attr_init in order to comply with lockdep, not calling it
will result in error complaining key is not in .data section.

Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 495fbae6 07-Jun-2016 Doug Ledford <dledford@redhat.com>

IB/core: fix error unwind in sysfs hw counters code

Between the initial and final versions of the function setup_hw_stats,
the order of variable initialization was changed. However, the unwind
flow on error did not properly keep up with the flow changes. Make
the unwind flow match a proper unwind of the allocation flow, then
remove no longer needed variable initializations.

Fixes: b40f4757daa1 (IB/core: Make device counter infrastructure
dynamic)
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 41aaa99f 06-Jun-2016 Doug Ledford <dledford@redhat.com>

IB/core: Fix array length allocation

The new sysfs hw_counters code had an off by one in its array allocation
length. Fix that and the comment along with it.

Reported-by: Mark Bloch <markb@mellanox.com>
Fixes: b40f4757daa1 (IB/core: Make device counter infrastructure
dynamic)
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0147ebcf 01-Jun-2016 Colin Ian King <colin.king@canonical.com>

IB/core: fix null pointer deref and mem leak in error handling

The current error handling in setup_hw_stats has a couple of issues.
It is possible to generate a null pointer deference on the
kfree of hsag->attrs[i] because two of the early error exit paths
jump to the kfree when hsags NULL and not allocated. Fix this by
moving the kfree on stats and jumping to that, avoiding the hsag
freeing.

Secondly, there is a memory leak of stats if the hsag allocation
fails; instead of returning, jump to the kfree on stats.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b40f4757 15-May-2016 Christoph Lameter <cl@linux.com>

IB/core: Make device counter infrastructure dynamic

In practice, each RDMA device has a unique set of counters that the
hardware implements. Having a central set of counters that they must
all adhere to is limiting and causes many useful counters to not be
available.

Therefore we create a dynamic counter registration infrastructure.

The driver must implement a stats structure allocation routine, in
which the driver must place the directory name it wants, a list of
names for all of the counters, an array of u64 counters themselves,
plus a few generic configuration options.

We then implement a core routine to create a sysfs file for each
of the named stats elements, and a core routine to retrieve the
stats when any of the sysfs attribute files are read.

To avoid excessive beating on the stats generation routine in the
drivers, the core code also caches the stats for a short period of
time so that someone attempting to read all of the stats in a
given device's directory will not result in a stats generation
call per file read.

Future work will attempt to standardize just the shared stats
elements, and possibly add a method to get the stats via netlink
in addition to sysfs.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[ Add caching, make structure names more informative, add i40iw support,
other significant rewrites from the original patch ]


# ee50aeac 11-Feb-2016 Eran Ben Elisha <eranbe@mellanox.com>

IB/core: Fix reading capability mask of the port info class

When checking specific attribute from a bit mask, need to use bitwise
AND and not logical AND, fixed that.

Fixes: 145d9c541032 ('IB/core: Display extended counter set if
available')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9f780dab 25-Jan-2016 Colin Ian King <colin.king@canonical.com>

IB/sysfs: remove unused va_list args

_show_port_gid_attr performs a va_end on some unused va_list args.
Clean this up by removing the args completely.

Fixes: 470be516a226e8 ("IB/core: Add gid attributes to sysfs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 6e2a51a0 29-Dec-2015 Hal Rosenstock <hal@dev.mellanox.co.il>

IB/core: sysfs.c: Fix PerfMgt ClassPortInfo handling

Port number is not part of ClassPortInfo attribute but is
still needed as a parameter when invoking process_mad.

To properly handle this attribute, port_num is added as a
parameter to get_counter_table and get_perf_mad was changed
not to store port_num in the attribute itself when it's
querying the ClassPortInfo attribute.

This handles issue pointed out by Matan Barak <matanb@dev.mellanox.co.il>

Fixes: 145d9c541032 ('IB/core: Display extended counter set if available')

Signed-off-by: Hal Rosenstock <hal@mellanox.com>
Acked-by: Matan Barak <matanb@mellanox.com>
Acked-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 65487fdc 03-Jan-2016 Ira Weiny <ira.weiny@intel.com>

IB/sysfs: Fix sparse warning on attr_id

Attributed ID was declared as an int while the value should really be big
endian 16.

Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 145d9c54 21-Dec-2015 Christoph Lameter <cl@linux.com>

IB/core: Display extended counter set if available

Check if the extended counters are available and if so
create the proper extended and additional counters.

Signed-off-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b2788ce5 21-Dec-2015 Christoph Lameter <cl@linux.com>

IB/core: Specify attribute_id in port_table_attribute

Add the attr_id on port_table_attribute since we will have to add
a different port_table_attribute for the extended attribute soon.

Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 35c4cbb1 21-Dec-2015 Christoph Lameter <cl@linux.com>

IB/core: Create get_perf_mad function in sysfs.c

Create a new function to retrieve performance management
data from the existing code in get_pma_counter().

Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 470be516 23-Dec-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add gid attributes to sysfs

This patch set adds attributes of net device and gid type to each GID
in the GID table. Users that use verbs directly need to specify
the GID index. Since the same GID could have different types or
associated net devices, users should have the ability to query the
associated GID attributes. Adding these attributes to sysfs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 86bee4c9 18-Dec-2015 Or Gerlitz <ogerlitz@mellanox.com>

IB/core: Avoid calling ib_query_device

Use the cached copy of the attributes present on the device, except for
the case of a query originating from user-space, where we have to invoke
the driver query_device entry, so they can fill in their udata.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 55ee3ab2 15-Oct-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add netdev and gid attributes paramteres to cache

Adding an ability to query the IB cache by a netdev and get the
attributes of a GID. These parameters are necessary in order to
successfully resolve the required GID (when the netdevice is known)
and get the Ethernet L2 attributes from a GID.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 55aeed06 04-Aug-2015 Jason Gunthorpe <jgg@ziepe.ca>

IB/core: Make ib_alloc_device init the kobject

This gets rid of the weird in-between state where struct ib_device
was allocated but the kobject didn't work.

Consequently ib_device_release is now guaranteed to be called in
all situations and we needn't duplicate its kfrees on error paths.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4139032b 29-Jun-2015 Hal Rosenstock <hal@dev.mellanox.co.il>

IB: Add rdma_cap_ib_switch helper and use where appropriate

Persuant to Liran's comments on node_type on linux-rdma
mailing list:

In an effort to reform the RDMA core and ULPs to minimize use of
node_type in struct ib_device, an additional bit is added to
struct ib_device for is_switch (IB switch). This is needed
to be initialized by any IB switch device driver. This is a
NEW requirement on such device drivers which are all
"out of tree".

In addition, an ib_switch helper was added to ib_verbs.h
based on the is_switch device bit rather than node_type
(although those should be consistent).

The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
as well as (IPoIB and SRP) ULPs are updated where
appropriate to use this new helper. In some cases,
the helper is now used under the covers of using
rdma_[start end]_port rather than the open coding
previously used.

Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4cd7c947 06-Jun-2015 Ira Weiny <ira.weiny@intel.com>

IB/mad: Add support for additional MAD info to/from drivers

In order to support alternate sized MADs (and variable sized MADs on OPA
devices) add in/out MAD size parameters to the process_mad core call.

In addition, add an out_mad_pkey_index to communicate the pkey index the driver
wishes the MAD stack to use when sending OPA MAD responses.

The out MAD size and the out MAD PKey index are required by the MAD
stack to generate responses on OPA devices.

Furthermore, the in and out MAD parameters are made generic by specifying them
as ib_mad_hdr rather than ib_mad.

Drivers are modified as needed and are protected by BUG_ON flags if the MAD
sizes passed to them is incorrect.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7738613e 13-May-2015 Ira Weiny <ira.weiny@intel.com>

IB/core: Add per port immutable struct to ib_device

As of commit 5eb620c81ce3 "IB/core: Add helpers for uncached GID and P_Key
searches"; pkey_tbl_len and gid_tbl_len are immutable data which are stored in
the ib_device.

The per port core capability flags to be added later are also immutable data to
be stored in the ib_device object.

In preparation for this create a structure for per port immutable data and
place the pkey and gid table lengths within this structure.

"get_port_immutable" is added as a mandatory device function to allow the
drivers to fill in this data.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 584482ac 18-May-2014 Haggai Eran <haggaie@mellanox.com>

IB/core: Fix kobject leak on device register error flow

The ports kobject isn't being released during error flow in device
registration. This patch refactors the ports kobject cleanup into a
single function called from both the error flow in device registration
and from the unregistration function.

A couple of attributes aren't being deleted (iw_stats_group, and
ib_class_attributes). While this may be handled implicitly by the
destruction of their kobjects, it seems better to handle all the
attributes the same way.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>

[ Make free_port_list_attributes() static. - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>


# cad6d02a 18-May-2014 Haggai Eran <haggaie@mellanox.com>

IB/core: Fix port kobject deletion during error flow

When encountering an error during the add_port function, adding a port
to sysfs, the port kobject is freed without being deleted from sysfs.

Instead of freeing it directly, the patch uses kobject_put to release
the kobject and delete it.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 373c0ea1 18-May-2014 Haggai Eran <haggaie@mellanox.com>

IB/core: Remove unneeded kobject_get/put calls

The ib_core module will call kobject_get on the parent object of each
kobject it creates. This is redundant since kobject_add does that
anyway.

As a side effect, this patch should fix leaking the ports kobject and
the device kobject during unregister flow, since the previous code
didn't seem to take into account the kobject_get calls on behalf of
the child kobjects.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 5db5765e 15-Jan-2014 Upinder Malhi <umalhi@cisco.com>

IB/core: Add support for RDMA_NODE_USNIC_UDP

Add the complementary RDMA_NODE_USNIC_UDP for RDMA_TRANSPORT_USNIC_UDP.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 180771a3 09-Sep-2013 Upinder Malhi \(umalhi\) <umalhi@cisco.com>

IB/core: Add Cisco usNIC rdma node and transport types

This patch adds new rdma node and new rdma transport, and supporting
code used by Cisco's low latency driver called usNIC. usNIC uses its
own transport, distinct from IB and iWARP.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 02aa2a37 03-Jul-2013 Kees Cook <keescook@chromium.org>

drivers: avoid format string in dev_set_name

Calling dev_set_name with a single paramter causes it to be handled as a
format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents,
including wrappers like device_create*() and bdi_register().

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 80b15043 20-Jun-2013 Wei Yongjun <weiyj.lk@gmail.com>

IB/core: Fix error return code in add_port()

Fix to return -ENOMEM in the add_port() error handling case instead of
0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 0559d8dc 02-Apr-2012 Roland Dreier <roland@purestorage.com>

IB/core: Don't return EINVAL from sysfs rate attribute for invalid speeds

Commit e9319b0cb00d ("IB/core: Fix SDR rates in sysfs") changed our
sysfs rate attribute to return EINVAL to userspace if the underlying
device driver returns an invalid rate. Apparently some drivers do this
when the link is down and some userspace pukes if it gets an error when
reading this attribute, so avoid a regression by not return an error to
match the old code.

Signed-off-by: Roland Dreier <roland@purestorage.com>


# 2e96691c 28-Feb-2012 Or Gerlitz <ogerlitz@mellanox.com>

IB: Use central enum for speed instead of hard-coded values

The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file. Add that enum, and use
it all over the code.

The IB speed/width notation is also used by iWARP and IBoE HW drivers,
which use the convention of rate = speed * width to advertise their
port link rate.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# e9319b0c 27-Feb-2012 Roland Dreier <roland@purestorage.com>

IB/core: Fix SDR rates in sysfs

Commit 71eeba16 ("IB: Add new InfiniBand link speeds") introduced a bug
where eg the rate for IB 4X SDR links iss displayed as "8.5 Gb/sec"
instead of "10 Gb/sec" as it used to be. Fix that.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# fc87af74 27-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com>

infiniband: Fix up users implicitly relying on getting stat.h

They get it via module.h (via device.h) but we want to clean that up.
When we do, we'll get things like:

CC [M] drivers/infiniband/core/sysfs.o
sysfs.c:361: error: 'S_IRUGO' undeclared here (not in a function)
sysfs.c:654: error: 'S_IWUSR' undeclared here (not in a function)

so add in the stat header it is using explicitly in advance.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>


# 71eeba16 05-Oct-2011 Marcel Apfelbaum <marcela@dev.mellanox.co.il>

IB: Add new InfiniBand link speeds

Introduce support for the following extended speeds:

FDR-10: a Mellanox proprietary link speed which is 10.3125 Gbps with
64b/66b encoding rather than 8b/10b encoding.
FDR: IBA extended speed 14.0625 Gbps.
EDR: IBA extended speed 25.78125 Gbps.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 8ad330a0 22-Oct-2010 Eli Cohen <eli@mellanox.co.il>

IB/core: Add link layer type information to sysfs

Since an IB transport port may use either IB or Ethernet as its link layer,
add the file /sys/class/infiniband/<device>/ports/<port_num>/link_layer to
show the link layer for the port.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9a6edb60 06-May-2010 Ralph Campbell <ralph.campbell@qlogic.com>

IB/core: Allow device-specific per-port sysfs files

Add a new parameter to ib_register_device() so that low-level device
drivers can pass in a pointer to a callback function that will be
called for each port that is registered in sysfs. This allows
low-level device drivers to create files in

/sys/class/infiniband/<hca>/ports/<N>/

without having to poke through the internals of the RDMA sysfs handling.

There is no need for an unregister function since the kobject
reference will go to zero when ib_unregister_device() is called.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 21e3bde9 15-Mar-2010 Greg Kroah-Hartman <gregkh@suse.de>

sysfs: fix sysfs lockdep warning in infiniband code

This fixes a sysfs lockdep warning in the infiniband code.

Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 52cf25d0 18-Jan-2010 Emese Revfy <re.emese@gmail.com>

Driver core: Constify struct sysfs_ops in struct kobj_type

Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 3f7c58a0 30-Apr-2009 Greg Kroah-Hartman <gregkh@suse.de>

infiniband: remove driver_data direct access of struct device

In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.


Cc: general@lists.openfabrics.org
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 6432f366 04-Mar-2009 Roland Dreier <rolandd@cisco.com>

IB: Remove useless ibdev_is_alive() tests from sysfs code

Some attribute show functions test ibdev_is_alive() to make sure that
it's OK to access device state. However, the sysfs attributes will
not be registered until the device is fully initialized, and they'll
be unregistered before anything is torn down, so ibdev_is_alive()
doesn't do anything useful. Remove it.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9206dff1 25-Feb-2009 Roland Dreier <rolandd@cisco.com>

IB: Remove sysfs files before unregistering device

Move the ib_device_unregister_sysfs() call from ib_dealloc_device() to
ib_unregister_device(). The old code allows device unregister to
proceed even if some sysfs files are open, which leaves a window where
userspace can open a file before a device is removed but then end up
reading the file after the device is removed, which leads to various
kernel crashes either because the device data structure is freed or
because the low-level driver code is gone after module removal.

By not returning from ib_unregister_device() until after all sysfs
entries are removed, we make sure that data structures and/or module
code is not freed until after all sysfs access is done.

Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d927e38c 06-Jan-2009 Kay Sievers <kay.sievers@vrfy.org>

infiniband: struct device - replace bus_id with dev_name(), dev_set_name()

Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 5b095d989 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com>

net: replace %p6 with %pI6

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8867cd7c 28-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com>

infiniband: use %p6 for printing message ids

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7f624d02 15-Jul-2008 Steve Wise <larrystevenwise@gmail.com>

RDMA/core: Add iWARP protocol statistics attributes in sysfs

This patch adds a sysfs attribute group called "proto_stats" under
/sys/class/infiniband/$device/ and populates this group with protocol
statistics if they exist for a given device. Currently, only iWARP
stats are defined, but the code is designed to allow InfiniBand
protocol stats if they become available. These stats are per-device
and more importantly -not- per port.

Details:

- Add union rdma_protocol_stats in ib_verbs.h. This union allows
defining transport-specific stats. Currently only iwarp stats are
defined.

- Add struct iw_protocol_stats to define the current set of iwarp
protocol stats.

- Add new ib_device method called get_proto_stats() to return protocol
statistics.

- Add logic in core/sysfs.c to create iwarp protocol stats attributes
if the device is an RNIC and has a get_proto_stats() method.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f3781d2e 15-Jul-2008 Roland Dreier <rolandd@cisco.com>

RDMA: Remove subversion $Id tags

They don't get updated by git and so they're worse than useless.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f4e91eb4 21-Feb-2008 Tony Jones <tonyj@suse.de>

IB: convert struct class_device to struct device

This converts the main ib_device to use struct device instead of struct
class_device as class_device is going away.

Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# c7482b81 14-Feb-2008 Li Zefan <lizf@cn.fujitsu.com>

IB: Fix return value in ib_device_register_sysfs()

If kobject_create_and_add() fails and returns NULL, the current code
in ib_device_register_sysfs() does not set ret and hence returns 0.
Set ret to -ENOMEM for this failure, so that the caller knows that
ib_device_register_sysfs() actually failed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c10997f6 20-Dec-2007 Greg Kroah-Hartman <gregkh@suse.de>

Kobject: convert drivers/* from kobject_unregister() to kobject_put()

There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 35be0681 17-Dec-2007 Greg Kroah-Hartman <gregkh@suse.de>

Kobject: change drivers/infiniband to use kobject_init_and_add

Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <mshefty@ichips.intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 7eff2e7a 14-Aug-2007 Kay Sievers <kay.sievers@vrfy.org>

Driver core: change add_uevent_var to use a struct

This changes the uevent buffer functions to use a struct instead of a
long list of parameters. It does no longer require the caller to do the
proper buffer termination and size accounting, which is currently wrong
in some places. It fixes a known bug where parts of the uevent
environment are overwritten because of wrong index calculations.

Many thanks to Mathieu Desnoyers for finding bugs and improving the
error handling.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 7b595756 13-Jun-2007 Tejun Heo <htejun@gmail.com>

sysfs: kill unnecessary attribute->owner

sysfs is now completely out of driver/module lifetime game. After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners. Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner. Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 856c52a7 10-Jul-2007 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/core: Take sizeof the correct pointer when calling kmalloc()

When allocating out_mad in show_pma_counter(), take sizeof *out_mad
instead of sizeof *in_mad. It is true that today the type of in_mad
and out_mad are the same, but this patch will give us a cleaner code.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 1912ffbb 23-Apr-2007 Joachim Fenkes <fenkes@de.ibm.com>

IB: Set class_dev->dev in core for nice device symlink

All RDMA drivers except ehca set class_dev->dev to their dma_device
value (ehca leaves this unset). dma_device is the only value that
makes any sense, so move this assignment to core/sysfs.c. This reduce
the duplicated code in the rest of the drivers and gives ehca a nice
/sys/class/infiniband/ehcaX/device symlink.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 38abaa63 16-Feb-2007 Roland Dreier <rolandd@cisco.com>

IB/core: Fix sparse warnings about shadowed declarations

Change a couple of variable names to avoid sparse warnings about
symbols being shadowed.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 07ebafba 03-Aug-2006 Tom Tucker <tom@opengridcomputing.com>

RDMA: iWARP Core Changes.

Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
- Hook iWARP CM into the build system and use it in rdma_cm.
- Convert enum ib_node_type to enum rdma_node_type, which includes
the possibility of RDMA_NODE_RNIC, and update everything for this.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 3cd96564 22-Sep-2006 Roland Dreier <rolandd@cisco.com>

IB: Whitespace fixes

Remove some trailing whitespace that has snuck in despite the best
efforts of whitespace=error-all. Also fix a few other whitespace
bogosities.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d8b9f23b 09-May-2006 Ralph Campbell <ralph.campbell@qlogic.com>

IB: Fix display of 4-bit port counters in sysfs

The code to display local_link_integrity_errors and
excessive_buffer_overrun_errors in
/sys/class/infiniband/<hca>/ports/<n>/counters/
uses the wrong shift to extract the 4 bit values.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 048975ac 20-Mar-2006 Roland Dreier <rolandd@cisco.com>

IB: Coverity fixes to sysfs.c

Fix two bugs found by coverity:
- Memory leak in error path of alloc_group_attrs()
- Fencepost error in state_show(): the test should be < ARRAY_SIZE(),
not <= ARRAY_SIZE().

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c5bcbbb9 02-Feb-2006 Roland Dreier <rolandd@cisco.com>

IB: Allow userspace to set node description

Expose a writable "node_desc" sysfs attribute for InfiniBand devices.
This allows userspace to update the node description with information
such as the node's hostname, so that IB network management software
can tie its view to the real world.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# cf311cd4 10-Jan-2006 Sean Hefty <sean.hefty@intel.com>

IB: Add node_guid to struct ib_device

Add a node_guid field to struct ib_device. It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer. Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 312c004d 16-Nov-2005 Kay Sievers <kay.sievers@suse.de>

[PATCH] driver core: replace "hotplug" by "uevent"

Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 8c65b4a6 07-Nov-2005 Tim Schmielau <tim@physik3.uni-rostock.de>

[PATCH] fix remaining missing includes

Fix more include file problems that surfaced since I submitted the previous
fix-missing-includes.patch. This should now allow not to include sched.h
from module.h, which is done by a followup patch.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# de6eb66b 02-Nov-2005 Roland Dreier <rolandd@cisco.com>

[IB] kzalloc() conversions

Replace kmalloc()+memset(,0,) with kzalloc(), for a net savings of 35
source lines and about 500 bytes of text.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ba8e9310 18-Oct-2005 Roland Dreier <rolandd@cisco.com>

[IB] Fail sysfs queries after device is unregistered

We keep IB device structures around until the last sysfs reference is
gone, but we shouldn't ask the low-level driver to do anything after
the LLD unregisters the device. To handle this, check the reg_state
field and just fail sysfs show() requests if the device has already
been unregistered.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 82ca76b6 06-Sep-2005 Pekka Enberg <penberg@cs.helsinki.fi>

[PATCH] drivers: convert kcalloc to kzalloc

This patch converts kcalloc(1, ...) calls to use the new kzalloc() function.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# a4d61e84 25-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB: move include files to include/rdma

Move the InfiniBand headers from drivers/infiniband/include to include/rdma.
This allows InfiniBand-using code to live elsewhere, and lets us remove the
ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 97f52eb4 13-Aug-2005 Sean Hefty <sean.hefty@intel.com>

[PATCH] IB: sparse endianness cleanup

Fix sparse warnings. Use __be* where appropriate.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 2a1d9b7f 11-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB: Add copyright notices

Make some lawyers happy and add copyright notices for people who
forgot to include them when they actually touched the code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 70f2817a 29-Apr-2005 Dmitry Torokhov <dtor_core@ameritech.net>

[PATCH] sysfs: (rest) if show/store is missing return -EIO

sysfs: fix the rest of the kernel so if an attribute doesn't
implement show or store method read/write will return
-EIO instead of 0 or -EINVAL or -EPERM.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# d48593bf 28-Apr-2005 Dmitry Torokhov <dtor_core@ameritech.net>

[PATCH] Make attributes names const char *

sysfs: make attributes and attribute_group's names const char *

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!