History log of /freebsd-10-stable/sys/ofed/
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
359123 19-Mar-2020 hselasky

MFC r359014:
Fix for double unlock in ipoib.

The ipoib_unicast_send() function is not supposed to unlock the priv lock.

Sponsored by: Mellanox Technologies

358933 13-Mar-2020 hselasky

MFC r358694:
Fix some whitespace issues in ipoib.

Sponsored by: Mellanox Technologies

357429 03-Feb-2020 hselasky

MFC r356633:
Make sure the VNET is properly set when reaping mbufs in ipoib.
Else the following panic may happen:

panic()
icmp_error()
ipoib_cm_mb_reap()
linux_work_fn()
taskqueue_run_locked()
taskqueue_thread_loop()
fork_exit()
fork_trampoline()

Submitted by: Andreas Kempe <kempe@lysator.liu.se>
Sponsored by: Mellanox Technologies

339086 02-Oct-2018 hselasky

Selectivly backport fix for firmware command hang when switching from
polling-based firmware commands to event based firmware commands.

This is a direct commit.

Linux commit:
a7e1f04905e5b2b90251974dddde781301b6be37

Sponsored by: Mellanox Technologies

338615 12-Sep-2018 hselasky

Fix incorrect display of the sys.class.infiniband.xxx.ports.1.rate sysctl
entry in ibcore by adding support for new rate types.

This is a direct commit.

Sponsored by: Mellanox Technologies

332928 24-Apr-2018 hselasky

MFC r329372 and r329464:
Implement enable_irq() and disable_irq() in the LinuxKPI and add checks for
valid IRQ tag before setting up or tearing down an interrupt handler in the
LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies

332922 24-Apr-2018 hselasky

MFC r331355:
Clear old MSIX IRQ numbers in the LinuxKPI.

When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.

Sponsored by: Mellanox Technologies

332160 07-Apr-2018 brooks

MFC r331648:

Improve copy-and-pasted versions of SIOCGIFADDR.

The original implementation used a reference to ifr_data and a cast to
do the equivalent of accessing ifr_addr. This was copied multiple
times since 1996.

Approved by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14873

329835 22-Feb-2018 hselasky

Fix for LINT-NOINET kernel build.

This is a direct commit.

Reported by: rpokala@
Sponsored by: Mellanox Technologies

328656 01-Feb-2018 hselasky

MFC r328623:
Properly implement the cond_resched() function macro in the LinuxKPI.

Sponsored by: Mellanox Technologies

326704 08-Dec-2017 hselasky

Add support for IPv6 based addresses as part of the TCP unify portspace feature
in ibcore. This resolves an interopability issue when using both iWarp(T6) and
RDMA(CX-4 and CX-5) devices at the same time.

The problem is IPv4 based sockets cannot be bound to an IPv6 based address
causing sobind() to fail preventing all use of IPv6 based addresses with RDMA
when an iWarp device is present.

This is a direct commit.

Tested by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
Sponsored by: Mellanox Technologies

326054 21-Nov-2017 hselasky

MFC r299674 and r299931:
Handle case of class being set, but not parent when calling
device_register() in the LinuxKPI.

Requested by: Chelsio
Sponsored by: Mellanox Technologies

325945 17-Nov-2017 hselasky

MFC r325616:
Make sure sin_zero is zero in ibcore. Else socket address maching using
bcmp() might fail.

Sponsored by: Mellanox Technologies

325943 17-Nov-2017 hselasky

MFC r325615:
Make sure the IPv6 scope ID gets zeroed when exchanging CMA messages in ibcore.
Else the IPv6 address matching might fail. This change adds support for both
embedded and non-embedded IPv6 scope IDs when passing a IPv6 link-local socket
address to RDMA. Prior to this change only global IPv6 addresses would work
with RDMA.

Sponsored by: Mellanox Technologies

325940 17-Nov-2017 hselasky

MFC r325614:
Multiple fixes for using IPv6 link-local addresses with RDMA.

1) Fail to resolve RDMA address if rtalloc1() returns the loopback
device, lo0, as the gateway interface.

2) Use ip_dev_find() and ip6_dev_find() to lookup network interfaces
with matching IPv4 and IPv6 addresses, respectivly.

3) In addr_resolve() make sure the "ifa" pointer is always set, also when
the "ifp" is NULL. Else a NULL pointer access might happen trying to
read from the "ifa" pointer later on.

Sponsored by: Mellanox Technologies

325937 17-Nov-2017 hselasky

MFC r325533:
Make the dma_alloc_coherent() function in the LinuxKPI NULL safe with regard
to the "dev" argument.

Submitted by: Krishnamraju Eraparaju @ Chelsio
Sponsored by: Chelsio Communications

325617 09-Nov-2017 hselasky

Remove the now obsolete "unify_tcp_port_space" ibcore module parameter.
Missed as part of the MFC of r324792 in r325611.

This is a direct commit.

Sponsored by: Mellanox Technologies

325613 09-Nov-2017 hselasky

MFC r325278:
Unconditionally include "opt_inet6.h" in the LinuxKPI.
This makes sure the INET6 macro gets properly defined,
also for kernel module builds.

Sponsored by: Mellanox Technologies

325611 09-Nov-2017 hselasky

MFC r324792:
The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device present. This resolves
interopability issues between iWarp and RoCE in ibcore

Reviewed by: np @
Differential Revision: https://reviews.freebsd.org/D12563
Sponsored by: Mellanox Technologies

324685 17-Oct-2017 hselasky

MFC r289568, r300676, r300677, r300719, r300720 and r300721:
Implement LinuxKPI module parameters as SYSCTLs.

The bool module parameter is no longer supported, because there is no
equivalent in FreeBSD 10-stable. These are converted into "int" type.

There are two macros available which control the behaviour of the
LinuxKPI module parameters:

- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the modules parameters will be created.

- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
added to all created module parameters.

The LinuxKPI module parameters also have a permissions value.
If any write bits are set we are allowed to modify the module
parameter runtime. Reflect this when creating the static SYSCTL
nodes.

The module_param_call() function is no longer supported.

Sponsored by: Mellanox Technologies

324527 11-Oct-2017 hselasky

MFC r315404:
Add basic support for VIMAGE to the LinuxKPI and ibcore.

Support is implemented by mapping Linux's "struct net" into FreeBSD's
"struct vnet". Currently only vnet0 is supported by ibcore.

Sponsored by: Mellanox Technologies

324525 11-Oct-2017 hselasky

MFC r315405, r323351 and r323364:
Add helper function similar to ip_dev_find() to the LinuxKPI to lookup
a network device by its IPv6 address in the given VNET.

Sponsored by: Mellanox Technologies

322531 15-Aug-2017 hselasky

MFC r322248:
Fix for mlx4en(4) to properly call m_defrag().

The m_defrag() function can only defrag mbuf chains which have a valid
mbuf packet header. In r291699 when the mlx4en(4) driver was converted
into using BUSDMA(9), the call to m_defrag() was moved after the part
of the transmit routine which strips the header from the mbuf chain.
This effectivly disabled the mbuf defrag mechanism and such packets
simply got dropped.

This patch removes the stripping of mbufs from a chain and loads all
mbufs using busdma. If busdma finds there are no segments, unload
the DMA map and free the mbuf right away, because that means all
data in the mbuf has been inlined in the TX ring. Else proceed
as usual.

Add a per-ring rounter for the number of defrag attempts and
make sure the oversized_packets counter gets zeroed while at it.

The counters are per-ring to avoid excessive cache misses in the
TX path.

Approved by: re (kib)
Submitted by: mjoras@
Differential Revision: https://reviews.freebsd.org/D11683
Sponsored by: Mellanox Technologies

322507 14-Aug-2017 hselasky

MFC r322306:
Print maximum MTU when trying to set invalid MTU in the mlx4en(4) driver.
Useful for debugging.

Approved by: re (marius, gjb)
Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org>
Sponsored by: Mellanox Technologies

322504 14-Aug-2017 hselasky

MFC r322304:
Add support for RX and TX statistics when the mlx4en(4) PCI device
is in VF or SRIOV mode typically in a virtual machine environment.

Approved by: re (kib)
Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org>
Sponsored by: Mellanox Technologies

322500 14-Aug-2017 hselasky

MFC r314878:
Add support for constant pointer constructs to READ_ONCE() in the
LinuxKPI. When the type of the argument is constant the temporary
variable cannot be assigned after the barrier. Instead assign the
temporary variable by initialization.

Approved by: re (kib)
Sponsored by: Mellanox Technologies

322165 07-Aug-2017 hselasky

MFC r321782:
Remove some dead statistics related code and a structure field from the
mlx4en driver which is used by its Linux counterpart, but not under
FreeBSD.

Sponsored by: Mellanox Technologies

322162 07-Aug-2017 hselasky

MFC r321772:
Fix broken usage of the mlx4_read_clock() function:
- return value has too small width
- cycle_t is unsigned and cannot be less than zero

Sponsored by: Mellanox Technologies

322159 07-Aug-2017 hselasky

MFC r321780:
Make sure on-stack buffer is properly aligned.

Sponsored by: Mellanox Technologies

322156 07-Aug-2017 hselasky

MFC r321986:
Change reject message type when destroying cm_id in ibore.

This patch fixes an interopability issue between FreeBSD and non-FreeBSD
systems when the connection establishment is aborted. Refer to the
initial commit in Linux, drivers/infiniband/core/cm.c,
for a more detailed description.

Obtained from: Linux
Sponsored by: Mellanox Technologies

322153 07-Aug-2017 hselasky

MFC r321985:
Ticks are 32-bit in FreeBSD.

Sponsored by: Mellanox Technologies

321020 15-Jul-2017 dchagin

MFC r281436 (by mjg@):

fd: remove filedesc argument from fdclose

Just accept a thread instead. This makes it consistent with fdalloc.

No functional changes.

320945 13-Jul-2017 hselasky

MFC r320876:
Make sure the mlx4en RX DMA ring gets stamped with software ownership
in order to prevent the flow of QP to error in the firmware once
UPDATE_QP is called.

Sponsored by: Mellanox Technologies

320067 18-Jun-2017 hselasky

MFC r319972:
Use static device numbering instead of dynamic one when creating
mlx4en network interfaces. This prevents infinite unit number growth
typically when the mlx4en driver is used inside virtual machines which
support runtime PCI attach and detach.

Sponsored by: Mellanox Technologies

319567 04-Jun-2017 hselasky

MFC r319413:
Free hardware queue resource after port is stopped in the mlx4en(4)
driver. Else if the port is up the resource might still be busy and
the MTT free will fail.

PR: 216493
Sponsored by: Mellanox Technologies

319564 04-Jun-2017 hselasky

MFC r319414:
Allow communication between functions on the same host when using the
mlx4en(4) driver in SRIOV mode.

Place a copy of the destination MAC address in the send WQE only under
SRIOV/eSwitch configuration or when the device is in selftest. This
allows communication between functions on the same host.

PR: 216493
Sponsored by: Mellanox Technologies

318802 24-May-2017 np

MFC r314131:

Avoid NULL dereference in a couple of sysctl handlers in ibcore.
iw_cxgbe sets ib_device->dma_device to NULL (since r311880).

Sponsored by: Chelsio Communications

318628 22-May-2017 hselasky

MFC r318531:

mlx4: Use the CQ quota for SRIOV when creating completion EQs

When creating EQs to handle CQ completion events for the PF or for
VFs, we create enough EQE entries to handle completions for the max
number of CQs that can use that EQ.

When SRIOV is activated, the max number of CQs a VF (or the PF) can
obtain is its CQ quota (determined by the Hypervisor resource
tracker). Therefore, when creating an EQ, the number of EQE entries
that the VF should request for that EQ is the CQ quota value (and not
the total number of CQs available in the firmware).

Under SRIOV, the PF, also must use its CQ quota, because the resource
tracker also controls how many CQs the PF can obtain.

Using the firmware total CQs instead of the CQ quota when creating EQs
resulted wasting MTT entries, due to allocating more EQEs than were
needed.

Sponsored by: Mellanox Technologies

318540 19-May-2017 hselasky

MFC r317505:
Don't free uninitialized sysctl contexts in the mlx4en driver. This
can cause NULL pointer panics during failed device attach.

Differential Revision: https://reviews.freebsd.org/D8876
Sponsored by: Mellanox Technologies

318536 19-May-2017 hselasky

MFC r313555:
Flexible and asymmetric allocation of EQs and MSI-X vectors for PF/VFs.

Previously, the mlx4 driver queried the firmware in order to get the
number of supported EQs. Under SRIOV, since this was done before the
driver notified the firmware how many VFs it actually needs, the
firmware had to take into account a worst case scenario and always
allocated four EQs per VF, where one was used for events while the
others were used for completions. Now, when the firmware supports the
asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (-->
MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query
the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X
vectors per function. Moreover, when running in the new
firmware/driver mode, the limitation that the number of EQs should be
a power of two is lifted.

Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8867
Sponsored by: Mellanox Technologies

318533 19-May-2017 hselasky

MFC r313556:
Change mlx4 QP allocation scheme.

When using Blue-Flame, BF, the QPN overrides the VLAN, CV, and SV
fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7
unset.

The current ethernet driver code reserves a TX QP range with 256b
alignment.

This is wrong because if there are more than 64 TX QPs in use, QPNs >=
base + 65 will have bits 6/7 set.

This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.

The new mechanism introduced here will support reservation for "Eth
QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:

1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
and request "BF enabled QPs" if BF is supported for the function

2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0] - number of QPs
b. param1[31-24] - flags controlling QPs reservation

Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits
6 and 7 unset in order to be used in Ethernet.

Bits 24-30 of the flags are currently reserved.

When a function tries to allocate a QP, it states the required
attributes for this QP. Those attributes are considered "best-effort".
If an attribute, such as Ethernet BF enabled QP, is a must-have
attribute, the function has to check that attribute is supported
before trying to do the allocation.

In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those
attributes and masks out unsupported attributes as well. In order to
notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP
command. This command's mailbox is filled by the PF, which notifies
which QP allocation attributes it supports.

Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8868
Sponsored by: Mellanox Technologies

315328 15-Mar-2017 dim

MFC r310232:

After r310171, the kernel version of sscanf() has format string checking
enabled. This results in a -Werror warning in mlx4ib:

sys/dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c:90:22: error: format specifies type 'unsigned long long *' but the argument has type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
sscanf(buf, "%llx", &sysadmin_ag_val);
~~~~ ^~~~~~~~~~~~~~~~

Change sysadmin_ag_val to unsigned long long to avoid the warning.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D8831

315257 14-Mar-2017 hselasky

MFC r313778:

Improve code readability and fix compilation error when using clang 4.x.

Found by: emaste @
Sponsored by: Mellanox Technologies

314667 04-Mar-2017 avg

MFC r283291: don't use CALLOUT_MPSAFE with callout_init()

The main purpose of this MFC is to reduce conflicts for other merges.
Parts of the original change have already "trickled down" via individual MFCs.


/freebsd-10-stable/sys/amd64/amd64/mp_watchdog.c
/freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c
/freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c
/freebsd-10-stable/sys/cddl/dev/profile/profile.c
/freebsd-10-stable/sys/compat/ndis/subr_ntoskrnl.c
/freebsd-10-stable/sys/contrib/ipfilter/netinet/ip_fil_freebsd.c
/freebsd-10-stable/sys/dev/altera/jtag_uart/altera_jtag_uart_tty.c
/freebsd-10-stable/sys/dev/ath/if_ath.c
/freebsd-10-stable/sys/dev/ce/if_ce.c
/freebsd-10-stable/sys/dev/cp/if_cp.c
/freebsd-10-stable/sys/dev/ctau/if_ct.c
/freebsd-10-stable/sys/dev/cx/if_cx.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_main.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
/freebsd-10-stable/sys/dev/dcons/dcons_os.c
/freebsd-10-stable/sys/dev/drm2/drm_irq.c
/freebsd-10-stable/sys/dev/drm2/i915/intel_display.c
/freebsd-10-stable/sys/dev/glxsb/glxsb.c
/freebsd-10-stable/sys/dev/gxemul/cons/gxemul_cons.c
/freebsd-10-stable/sys/dev/hifn/hifn7751.c
/freebsd-10-stable/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
/freebsd-10-stable/sys/dev/if_ndis/if_ndis.c
/freebsd-10-stable/sys/dev/isci/isci_io_request.c
/freebsd-10-stable/sys/dev/mfi/mfi.c
/freebsd-10-stable/sys/dev/mwl/if_mwl.c
/freebsd-10-stable/sys/dev/nand/nandsim_chip.c
/freebsd-10-stable/sys/dev/ntb/ntb_hw/ntb_hw.c
/freebsd-10-stable/sys/dev/nxge/if_nxge.c
/freebsd-10-stable/sys/dev/oce/oce_if.c
/freebsd-10-stable/sys/dev/patm/if_patm_attach.c
/freebsd-10-stable/sys/dev/rndtest/rndtest.c
/freebsd-10-stable/sys/dev/safe/safe.c
/freebsd-10-stable/sys/dev/sound/midi/mpu401.c
/freebsd-10-stable/sys/dev/sound/pci/atiixp.c
/freebsd-10-stable/sys/dev/sound/pci/es137x.c
/freebsd-10-stable/sys/dev/sound/pci/hda/hdaa.c
/freebsd-10-stable/sys/dev/sound/pci/hda/hdac.c
/freebsd-10-stable/sys/dev/sound/pci/via8233.c
/freebsd-10-stable/sys/dev/twa/tw_osl_freebsd.c
/freebsd-10-stable/sys/dev/tws/tws.c
/freebsd-10-stable/sys/dev/ubsec/ubsec.c
/freebsd-10-stable/sys/dev/virtio/random/virtio_random.c
/freebsd-10-stable/sys/dev/xen/netfront/netfront.c
/freebsd-10-stable/sys/fs/nfs/nfs_commonport.c
/freebsd-10-stable/sys/gdb/gdb_cons.c
/freebsd-10-stable/sys/geom/gate/g_gate.c
/freebsd-10-stable/sys/geom/journal/g_journal.c
/freebsd-10-stable/sys/geom/mirror/g_mirror.c
/freebsd-10-stable/sys/geom/raid3/g_raid3.c
/freebsd-10-stable/sys/geom/sched/gs_rr.c
/freebsd-10-stable/sys/i386/i386/mp_watchdog.c
/freebsd-10-stable/sys/kern/init_main.c
/freebsd-10-stable/sys/kern/kern_synch.c
/freebsd-10-stable/sys/kern/kern_thread.c
/freebsd-10-stable/sys/kern/subr_vmem.c
/freebsd-10-stable/sys/kern/uipc_domain.c
/freebsd-10-stable/sys/mips/cavium/octe/ethernet.c
/freebsd-10-stable/sys/mips/cavium/octeon_rnd.c
/freebsd-10-stable/sys/mips/nlm/dev/net/xlpge.c
/freebsd-10-stable/sys/mips/rmi/dev/xlr/rge.c
/freebsd-10-stable/sys/net/if_spppsubr.c
/freebsd-10-stable/sys/net80211/ieee80211_ht.c
/freebsd-10-stable/sys/net80211/ieee80211_hwmp.c
/freebsd-10-stable/sys/net80211/ieee80211_mesh.c
/freebsd-10-stable/sys/net80211/ieee80211_node.c
/freebsd-10-stable/sys/net80211/ieee80211_proto.c
/freebsd-10-stable/sys/netgraph/netflow/ng_netflow.c
/freebsd-10-stable/sys/netgraph/netgraph.h
/freebsd-10-stable/sys/netinet/in_pcb.c
/freebsd-10-stable/sys/netinet/ip_mroute.c
/freebsd-10-stable/sys/netinet/tcp_hostcache.c
/freebsd-10-stable/sys/netinet/tcp_subr.c
/freebsd-10-stable/sys/netinet6/in6_rmx.c
/freebsd-10-stable/sys/netpfil/ipfw/ip_dummynet.c
/freebsd-10-stable/sys/netpfil/ipfw/ip_fw_dynamic.c
/freebsd-10-stable/sys/netpfil/pf/if_pfsync.c
include/linux/timer.h
include/linux/workqueue.h
/freebsd-10-stable/sys/powerpc/mambo/mambo_console.c
/freebsd-10-stable/sys/powerpc/pseries/phyp_console.c
/freebsd-10-stable/sys/sys/callout.h
/freebsd-10-stable/sys/vm/uma_core.c
/freebsd-10-stable/sys/x86/x86/mca.c
314606 03-Mar-2017 np

MFC r314400:

cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.

Sockets representing the TCP endpoints for iWARP connections are
allocated by the ibcore module. Before this revision they were closed
either by the ibcore module or the iw_cxgbe hardware driver depending on
the state transitions during connection teardown. This is error prone
and there were cases where both iw_cxgbe and ibcore closed the socket
leading to double-free panics. The fix is to let ibcore close the
sockets it creates and never do it in the driver.

- Use sodisconnect instead of soclose (preceded by solinger = 0) in the
driver to tear down an RDMA connection abruptly. This does what's
intended without releasing the socket's fd reference.

- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This
works for all kinds of sockets: clients that initiate connections,
listeners, and sockets accepted off of listeners.

Sponsored by: Chelsio Communications

311795 09-Jan-2017 hselasky

MFC r310058:
Fix initialisation of mlx4_pci_table's .driver_data fields.

Differential Revision: https://reviews.freebsd.org/D8791
Sponsored by: Mellanox Technologies
Submitted by: Dexuan Cui <decui@microsoft.com>

309378 01-Dec-2016 jhb

MFC 273806,289103,289201,289338,289578,293185,294474,294610,297124,297368,
297406,300875,300888,301158,301896,301897,304838:

Pull in most of the Chelsio and iWARP related changes from stable/11 into
stable/10. A few changes from 278886 (OFED 1.2) were also included though
the full merge is not:
- The find_gid_port() function in infiband/core/cma.c.
- Addition of the 'ord' and 'ird' fields to 'struct iw_cm_event'.

273806:
Userspace library for Chelsio's Terminator 5 based iWARP RNICs (pretty
much every T5 card that does _not_ have "-SO" in its name is RDMA
capable).

This plugs into the OFED verbs framework and allows userspace RDMA
applications to work over T5 RNICs. Tested with rping.

289103:
iw_cxgbe: fix for page fault in cm_close_handler().

This is roughly the iw_cxgbe equivalent of
https://github.com/torvalds/linux/commit/be13b2dff8c4e41846477b22cc5c164ea5a6ac2e
-----------------
RDMA/cxgb4: Connect_request_upcall fixes

When processing an MPA Start Request, if the listening endpoint is
DEAD, then abort the connection.

If the IWCM returns an error, then we must abort the connection and
release resources. Also abort_connection() should not post a CLOSE
event, so clean that up too.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
-----------------

289201:
iw_cxgbe: MPA v2 is always available.

289338:
iw_cxgbe: use correct RFC number.

289578:
Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.

293185:
iw_cxgbe: Shut down the socket but do not close the fd in case of error.
The fd is closed later in this case. This fixes a "SS_NOFDREF on enter"
panic.

294474:
iw_cxgbe: fix a couple of problems int the RDMA_TERMINATE handler.

a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.

294610:
Fix for iWARP servers that listen on INADDR_ANY.

The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.

297124:
iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux
iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.

This commit includes internal changesets 6785 8111 8149 8478 8617 8648
8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730
11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.

297368:
cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!"
messages.

297406:
Remove unnecessary dequeue_mutex (added in r294610) from the iWARP
connection manager. Examining so_comp without synchronization with
iw_so_event_handler is a harmless race.

300875:
iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations.

300888:
iw_cxgbe: Plug a lock leak in process_mpa_request().

If the parent is DEAD or connect_request_upcall() fails, the parent
mutex is left locked. This leads to a hang when process_mpa_request()
is called again for another child of the listening endpoint.

301158:
iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire
comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory
(which is where the lock resides).

301896:
Fix bug in iwcm that caused a panic in iw_cm_wq when krping is run
repeatedly in a tight loop.

301897:
iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN.
Release the hold on ep->com immediately after sending the RST. This
fixes a bug that sometimes leaves userspace iWARP tools hung when the
user presses ^C.

304838:
Do not free an uninitialized pointer on soaccept failure in the iWARP
connection manager.

Submitted by: Krishnamraju Eraparaju @ Chelsio (original patch)
Sponsored by: Chelsio Communications

308399 07-Nov-2016 hselasky

MFC r308031:
Fix indentation and remove duplicate queue stopped stats increment.

Found by: Ryan Stone <rysto32@gmail.com>
Sponsored by: Mellanox Technologies

307011 11-Oct-2016 sephe

MFC 306480

linuxkpi: Fix PCI BAR lazy allocation support.

FreeBSD supports lazy allocation of PCI BAR, that is, when a device
driver's attach method is invoked, even if the device's PCI BAR
address wasn't initialized, the invocation of bus_alloc_resource_any()
(the call chain: pci_alloc_resource() -> pci_alloc_multi_resource() ->
pci_reserve_map() -> pci_write_bar()) would allocate a proper address
for the PCI BAR and write this 'lazy allocated' address into the PCI
BAR.

This model works fine for native FreeBSD device drivers, but _not_ for
device drivers shared with Linux (e.g. dev/mlx5/mlx5_core/mlx5_main.c
and ofed/drivers/net/mlx4/main.c. Both of them use
pci_request_regions(), which doesn't work properly with the PCI BAR
lazy allocation, because pci_resource_type() -> _pci_get_rle() always
returns NULL, so pci_request_regions() doesn't have the opportunity to
invoke bus_alloc_resource_any(). We now use pci_find_bar() in
pci_resource_type(), which is able to locate all available PCI BARs
even if some of them will be lazy allocated.

Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: hps
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8071

306955 10-Oct-2016 hselasky

MFC r306454:
Set hardware stats flag to avoid double counting the number of incoming bytes.

Found by: Ben RUBSON <ben.rubson@gmail.com>
Sponsored by: Mellanox Technologies

306950 10-Oct-2016 hselasky

MFC r306451:
The IORESOURCE_XXX defines should resemble a bitmask while SYS_RES_XXX
are not bitmasks. Fix return value of pci_resource_flags() to reflect
this change.

Sponsored by: Mellanox Technologies

304846 26-Aug-2016 hselasky

MFC r304342:
Add support for setting blocking and non-blocking mode on /dev/rdma_cm
by returning success on FIONBIO and FIOASYNC IOCTLs. The actual flags
handling is done by the kern_ioctl() function.

Reported by: Alex Bowden <alex.bowden@outlook.com>
Sponsored by: Mellanox Technologies

302926 16-Jul-2016 markj

MFC r301877:
Add a missing error check for a malloc() call in idr_get().

302271 29-Jun-2016 hselasky

MFC r301544:
Fallback to arc4rand() in the LinuxKPI when read_random() returns
zero. This can happen for virtual machines.

Sponsored by: Mellanox Technologies

301264 03-Jun-2016 hselasky

MFC r294832:
Implement ether_addr_equal(), ether_addr_equal_64bits() and
random_ether_addr() for the LinuxKPI.

Sponsored by: Mellanox Technologies

298779 29-Apr-2016 hselasky

MFC r298458:
Add missing set of the current VNET when inputting IP packets in IPoIB.

This fixes a kernel panic when using IPoIB with VIMAGE and infiniband.

PR: 208957
Sponsored by: Mellanox Technologies
Tested by: Justin Clift <justin@postgresql.org>

298778 29-Apr-2016 hselasky

MFC r297968:
Remove some unused fields.

Sponsored by: Mellanox Technologies

298775 29-Apr-2016 hselasky

MFC r297967:
Ensure the received IP header gets 32-bits aligned.

The FreeBSD's TCP/IP stack assumes that the IP-header is 32-bits aligned
when decoding it. Else unaligned 32-bit memory access can happen, which
not all processor architectures support.

Sponsored by: Mellanox Technologies

298773 29-Apr-2016 hselasky

MFC r297966:
Add missing port_up checks.

When downing a mlxen network adapter we need to check the port_up variable
to ensure we don't continue to transmit data or restart timers which can
reside in freed memory.

Sponsored by: Mellanox Technologies

297658 07-Apr-2016 hselasky

MFC r294520:
LinuxKPI atomic fixes:
- Fix implementation of atomic_add_unless(). The atomic_cmpset_int()
function returns a boolean and not the previous value of the atomic
variable.
- The atomic counters should be signed according to Linux.
- Some minor cosmetics and styling while at it.

Reviewed by: alfred @
Sponsored by: Mellanox Technologies

297656 07-Apr-2016 hselasky

MFC r297444:
Fix bugs in currently unused bit searching loop.

Sponsored by: Mellanox Technologies

297652 07-Apr-2016 hselasky

MFC r296987:
Add missing curly brackets in for loop.

Sponsored by: Mellanox Technologies

297651 07-Apr-2016 hselasky

MFC r296910:
Use hardware computed Toeplitz hash for incoming flowids

Use the Toeplitz hash value as source for the flowid. This makes the
hash value more suitable for so-called hash bucket algorithms which
are used in the FreeBSD's TCP/IP stack when RSS is enabled.

Sponsored by: Mellanox Technologies

297648 07-Apr-2016 hselasky

MFC r296909:
Fix witness panic in the ipoib_ioctl() function when unloading the
ipoib module.

The bpfdetach() function is trying to turn off promiscious mode on the
network interface it is attached to while holding a mutex. The fix
consists of ignoring any further calls to the ipoib_ioctl() function
when the network interface is going to be detached. The ipoib_ioctl()
function might sleep.

Sponsored by: Mellanox Technologies

294636 23-Jan-2016 jhb

MFC 294366:
Initialize vm_page_prot to VM_MEMATTR_DEFAULT instead of 0.

If a driver's Linux mmap callback passed vm_page_prot through unchanged,
then linux_dev_mmap_single() would try to apply whatever VM_MEMATTR_xxx
value 0 is to the mapping. On x86, VM_MEMATTR_DEFAULT is the PAT value
for write-back (WB) which is 6, while 0 maps to the PAT value for
uncacheable (UC). Thus, any mmap request that did not explicitly set
page_prot was tried to map memory as UC triggering the warning in
sg_pager_getpages().

Sponsored by: Chelsio Communications

293736 12-Jan-2016 hselasky

MFC r292989:
Handle when filedescriptors are closed before initialized. An early
fdclose() call can cause fget_unlocked() to fail.

293151 04-Jan-2016 hselasky

MFC r289563,r291481,r292537,r292538,r292542,r292543,r292544 and r292834:

Update the LinuxKPI:
- Add more functions and types.
- Implement ACCESS_ONCE(), WRITE_ONCE() and READ_ONCE().
- Implement sleepable RCU mechanism using shared exclusive locks.
- Minor workqueue cleanup:
- Make some functions global instead of inline to ease debugging.
- Fix some minor style issues.
- In the zero delay case in queue_delayed_work() use the return value
from taskqueue_enqueue() instead of reading "ta_pending" unlocked and
also ensure the callout is stopped before proceeding.
- Implement drain_workqueue() function.
- Reduce memory consumption when allocating kobject strings in the
LinuxKPI. Compute string length before allocating memory instead of
using fixed size allocations. Make kobject_set_name_vargs() global
instead of inline to save some bytes when compiling.

Sponsored by: Mellanox Technologies

292907 30-Dec-2015 ngie

MFC r270212,r270332:

This helps reduce the diff in pci(4) between head and stable/10 to help pave
the way for bringing in IOV/nv(9) more cleanly

Differential Revision: https://reviews.freebsd.org/D4728
Relnotes: yes
Reviewed by: hselasky (ofed piece), royger (overall change)
Sponsored by: EMC / Isilon Storage Division

r270212 (by royger):

pci: make MSI(-X) enable and disable methods of the PCI bus

Make the functions pci_disable_msi, pci_enable_msi and pci_enable_msix
methods of the newbus PCI bus. This code should not include any
functional change.

Sponsored by: Citrix Systems R&D
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D354

dev/pci/pci.c:
- Convert the mentioned functions to newbus methods.
- Fix the callers of the converted functions.

sys/dev/pci/pci_private.h:
dev/pci/pci_if.m:
- Declare the new methods.

dev/pci/pcivar.h:
- Add helpers to call the newbus methods.

ofed/include/linux/pci.h:
- Add define to prevent the ofed version of pci_enable_msix from
clashing with the FreeBSD native version.

r270332 (by royger):

pci: add a new pci_child_added newbus method.

This is needed so when running under Xen the calls to pci_child_added
can be intercepted and a custom Xen method can be used to register
those devices with Xen. This should not include any functional
change, since the Xen implementation will be added in a following
patch and the native implementation is a noop.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb

dev/pci/pci.c:
dev/pci/pci_if.m:
dev/pci/pci_private.h:
dev/pci/pcivar.h:
- Add the pci_child_added newbus method.

292192 14-Dec-2015 hselasky

MFC r290003:
Add support for binding IRQs to CPUs in the LinuxKPI. The new function
added is for BSD only and does not exist in Linux.

Sponsored by: Mellanox Technologies

292136 13-Dec-2015 ngie

MFC r291753:

Fix scope of bridge_header and bridge_pcix_cap in mthca_reset(..)

They're only used in the __linux__ case

Differential Revision: https://reviews.freebsd.org/D4332
Reported by: cppcheck
Reviewed by: hselasky
Sponsored by: EMC / Isilon Storage Division

292113 11-Dec-2015 hselasky

Enable the mlx4en TSO limits.

This is a direct commit to stable/10.

Sponsored by: Mellanox Technologies

292107 11-Dec-2015 hselasky

MFC r290710, r291694, r291699 and r291793:
- Fix print formatting compile warnings for Sparc64 and PowerPC platforms.
- Updated the mlx4 and mlxen drivers to the latest version, v2.1.6:
- Added support for dumping the SFP EEPROM content to dmesg.
- Fixed handling of network interface capability IOCTLs.
- Fixed race when loading and unloading the mlxen driver by applying
appropriate locking.
- Removed two unused C-files.
- Convert the mlxen driver to use the BUSDMA(9) APIs instead of
vtophys() when loading mbufs for transmission and reception. While at
it all pointer arithmetic and cast qualifier issues were fixed, mostly
related to transmission and reception.
- Fix i386 build WITH_OFED=YES. Remove some redundant KASSERTs.

Sponsored by: Mellanox Technologies
Differential Revision: https://reviews.freebsd.org/D4283
Differential Revision: https://reviews.freebsd.org/D4284

292105 11-Dec-2015 hselasky

MFC r291693:
Add some structures and defines which will be used when decoding small
form factor, SFF, standards compliant ethernet EEPROMs.

Obtained from: Linux
Sponsored by: Mellanox Technologies

292103 11-Dec-2015 hselasky

MFC r291690:
Remove incorrect defines. The proper version of these macros is
defined in linux/etherdevice.h.

Sponsored by: Mellanox Technologies

291185 23-Nov-2015 ngie

MFC r291047:

Don't leak work if __mlx4_register_vlan(..) fails in
mlx4_master_immediate_activate_vlan_qos(..)

Differential Revision: https://reviews.freebsd.org/D4203
Submitted by: Miles Olrich <miles.olrich@isilon.com>
Sponsored by: EMC / Isilon Storage Division

287637 11-Sep-2015 jhb

MFC 287440:
Currently the Linux character device mmap handling only supports mmap
operations that map a single page that has an associated vm_page_t.
This does not permit mapping larger regions (such as a PCI memory
BAR) and it does not permit mapping addresses beyond the top of RAM
(such as a 64-bit BAR located above the top of RAM).

Instead of using a single OBJT_DEVICE object and passing the physaddr via
the offset as a hack, create a new sglist and OBJT_SG object for each
mmap request. The requested memory attribute is applied to the object
thus affecting all pages mapped by the request.

Sponsored by: Chelsio

287229 28-Aug-2015 markj

MFC r286418:
ipv4_is_zeronet() and ipv4_is_loopback() expect an address in network
order, but IN_ZERONET and IN_LOOPBACK expect it in host order.

286841 17-Aug-2015 glebius

Merge r283612:
Add SIOCGI2C ioctl support to the driver. Would work only on ConnectX-3
with fresh firmware. The low level code is based on code provided by
Mellanox.

Thanks to Mellanox and their distributor Must (http://mustcompany.ru)
for providing hardware.

In collaboration with: Andre Melkoumian <andre mellanox.com>
Reviewed by: hselasky
Sponsored by: Netflix
Sponsored by: Nginx, Inc.

285410 11-Jul-2015 hselasky

MFC r285088:
Fix broken implementation of "kvasprintf()" function by adding missing
kmalloc() call. Make function global instead of static inline to fix
compiler warnings about passing variable argument lists to inline
functions.

Sponsored by: Mellanox Technologies
Approved by: re, gjb

284530 17-Jun-2015 np

MFC r277229:

Use parentheses instead of close proximity to ensure layer + 1 is evaluated
before the rest of the expression.

283675 29-May-2015 markj

MFC r282331:
Don't drop the idr lock before verifying that the newly-inserted element
is present in the tree.

MFC r282741:
find_next_bit() and find_next_zero_bit(): if the caller-specified offset
lies within the last block of the bit set and no bits are set beyond the
offset, terminate the search immediately instead of continuing as though
there are further blocks in the set and subsequently returning an incorrect
result.

MFC r282743:
Ensure that msecs_to_jiffies(0) == 0.

283175 21-May-2015 hselasky

MFC r282817:
Apply proper locking when iterating the multicast addresses and add a
missing check for NULL from a non-blocking "kzalloc()" function call.

Sponsored by: Mellanox Technologies

282513 05-May-2015 hselasky

MFC r277396, r278681, r278865, r278924, r279205, r280208,
r280210, r280764 and r280768:

Update the Linux compatibility layer:
- Add more functions.
- Add some missing includes which are needed when the header files
are not included in a particular order.
- The kasprintf() function cannot be inlined due to using a variable
number of arguments. Move it to a C-file.
- Fix problems about 32-bit ticks wraparound and unsigned long
conversion. Jiffies or ticks in FreeBSD have integer type and are
not long.
- Add missing "order_base_2()" macro.
- Fix BUILD_BUG_ON() macro.
- Declare a missing symbol which is needed when compiling without -O2
- Clean up header file inclusions in the linux/completion.h, linux/in.h
and linux/fs.h header files.

Sponsored by: Mellanox Technologies

281955 24-Apr-2015 hiren

MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@

r275358:
Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

r275483:
Remove M_FLOWID from SCTP code.

r276982:
Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr
manpage.

Note: The FreeBSD version has been bumped.

Reviewed by: hps, tuexen
Sponsored by: Limelight Networks


/freebsd-10-stable/share/man/man9/netisr.9
/freebsd-10-stable/sys/dev/bxe/bxe.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
/freebsd-10-stable/sys/dev/cxgbe/t4_main.c
/freebsd-10-stable/sys/dev/cxgbe/t4_sge.c
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/ixgbe/ixgbe.c
/freebsd-10-stable/sys/dev/ixgbe/ixv.c
/freebsd-10-stable/sys/dev/ixl/ixl_txrx.c
/freebsd-10-stable/sys/dev/mxge/if_mxge.c
/freebsd-10-stable/sys/dev/netmap/netmap_freebsd.c
/freebsd-10-stable/sys/dev/oce/oce_if.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_isr.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_os.c
/freebsd-10-stable/sys/dev/qlxge/qls_isr.c
/freebsd-10-stable/sys/dev/qlxge/qls_os.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_rx.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_tx.c
/freebsd-10-stable/sys/dev/virtio/network/if_vtnet.c
/freebsd-10-stable/sys/dev/vmware/vmxnet3/if_vmx.c
/freebsd-10-stable/sys/dev/vxge/vxge.c
/freebsd-10-stable/sys/net/flowtable.c
/freebsd-10-stable/sys/net/ieee8023ad_lacp.c
/freebsd-10-stable/sys/net/if_lagg.c
/freebsd-10-stable/sys/net/if_lagg.h
/freebsd-10-stable/sys/net/netisr.c
/freebsd-10-stable/sys/netinet/in_pcb.h
/freebsd-10-stable/sys/netinet/ip_output.c
/freebsd-10-stable/sys/netinet/sctp_indata.c
/freebsd-10-stable/sys/netinet/sctp_input.c
/freebsd-10-stable/sys/netinet/sctp_output.c
/freebsd-10-stable/sys/netinet/sctp_pcb.c
/freebsd-10-stable/sys/netinet/sctp_structs.h
/freebsd-10-stable/sys/netinet/sctputil.c
/freebsd-10-stable/sys/netinet/tcp_input.c
/freebsd-10-stable/sys/netinet/tcp_syncache.c
/freebsd-10-stable/sys/netinet6/sctp6_usrreq.c
drivers/net/mlx4/en_rx.c
drivers/net/mlx4/en_tx.c
/freebsd-10-stable/sys/sys/mbuf.h
/freebsd-10-stable/sys/sys/param.h
280540 25-Mar-2015 hselasky

MFC r280211:
Add missing void pointer argument to SYSINIT() functions.

Sponsored by: Mellanox Technologies

280018 15-Mar-2015 hselasky

MFC r279865:
Ensure setting promiscious mode when a network interface is up, is
always non-blocking by not locking a SX type of mutex.

Sponsored by: Mellanox Technologies

279737 07-Mar-2015 hselasky

MFC r279587:
Define PTR_ALIGN() macro which will be needed coming Mellanox driver
releases.

Sponsored by: Mellanox Technologies

279732 07-Mar-2015 hselasky

MFC r278866:
Define standard formatting strings to print GIDs
in a separate header file.

Sponsored by: Mellanox Technologies

279731 07-Mar-2015 hselasky

MFC r279584:
Updates for the Mellanox ethernet driver

> List of fixes:
* use correct format for GID printouts
* double array indexing
* spelling in printouts
* void pointer arithmetic
* allow more receive rings
* correct maximum number of transmit rings
* use "const" instead of "static" for constants
* check for invalid VLAN tags
* check for lack of IRQ resources
> Added more hardware specific defines
> Added more verbose printouts of firmware status codes

Sponsored by: Mellanox Technologies

279014 19-Feb-2015 hselasky

MFC r278856:
The "frag_info" pointer is already pointing to an array index.
Don't index twice.

Sponsored by: Mellanox Technologies

277139 13-Jan-2015 hselasky

MFC r276749:
Fixes and updates for the Linux compatibility layer:
- Remove unsupported "bus" field from "struct pci_dev".
- Fix logic inside "pci_enable_msix()" when the number of allocated
interrupts are less than the number of available interrupts.
- Update header files included from "list.h".
- Ensure that "idr_destroy()" removes all entries before destroying
the IDR root node(s).
- Set the "device->release" function so that we don't leak memory at
device destruction.
- Use FreeBSD's "log()" function for certain debug printouts.
- Put parenthesis around arguments inside the min, max, min_t and max_t macros.
- Make sure we don't leak file descriptors by dropping the extra file
reference counts done by the FreeBSD kernel when calling falloc()
and fget_unlocked().

MFC after: 1 week
Sponsored by: Mellanox Technologies

277137 13-Jan-2015 hselasky

MFC r276879:
Don't mask the IP-address when doing multicast IP over infiniband.

PR: 196631
Sponsored by: Mellanox Technologies

276744 06-Jan-2015 rodrigc

Merge r275599:
Use CURVNET macros inside inet_get_local_port_range() function.
Without this fix, a kernel with VIMAGE + Infiniband will panic on bootup.

Certain necessary #include statements require LIST_HEAD.
Add these includes to ofed/include/linux/list.h, because
LIST_HEAD is specifically overridden in this file.

PR: 191468
Differential Revision: D1279
Reviewed by: hselasky

275724 12-Dec-2014 hselasky

MFC r275636:
Move OFED init a bit earlier so that PXE boot works.

Sponsored by: Mellanox Technologies

274043 03-Nov-2014 hselasky

MFC r271946 and r272595:
Improve transmit sending offload, TSO, algorithm in general. This
change allows all HCAs from Mellanox Technologies to function properly
when TSO is enabled. See r271946 and r272595 for more details about
this commit.

Sponsored by: Mellanox Technologies

273880 31-Oct-2014 hselasky

MFC r273867:
Fix compile warning by removing unused variable.

Sponsored by: Mellanox Technologies

273879 31-Oct-2014 hselasky

MFC r273593:

Update the network interface baudrate integer according to the actual
line rate.

Sponsored by: Mellanox Technologies

273736 27-Oct-2014 hselasky

MFC r263710, r273377, r273378, r273423 and r273455:

- De-vnet hash sizes and hash masks.
- Fix multiple issues related to arguments passed to SYSCTL macros.

Sponsored by: Mellanox Technologies


/freebsd-10-stable/sys/amd64/amd64/fpu.c
/freebsd-10-stable/sys/arm/arm/busdma_machdep-v6.c
/freebsd-10-stable/sys/arm/arm/busdma_machdep.c
/freebsd-10-stable/sys/cam/scsi/scsi_sa.c
/freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
/freebsd-10-stable/sys/cddl/dev/dtrace/dtrace_sysctl.c
/freebsd-10-stable/sys/compat/ndis/kern_ndis.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_asus.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_asus_wmi.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_hp.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_ibm.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_rapidstart.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_sony.c
/freebsd-10-stable/sys/dev/bxe/bxe.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
/freebsd-10-stable/sys/dev/cxgbe/t4_main.c
/freebsd-10-stable/sys/dev/e1000/if_em.c
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/e1000/if_lem.c
/freebsd-10-stable/sys/dev/hatm/if_hatm.c
/freebsd-10-stable/sys/dev/ixgbe/ixgbe.c
/freebsd-10-stable/sys/dev/ixgbe/ixv.c
/freebsd-10-stable/sys/dev/ixl/if_ixl.c
/freebsd-10-stable/sys/dev/mpr/mpr.c
/freebsd-10-stable/sys/dev/mps/mps.c
/freebsd-10-stable/sys/dev/mrsas/mrsas.c
/freebsd-10-stable/sys/dev/mrsas/mrsas.h
/freebsd-10-stable/sys/dev/mxge/if_mxge.c
/freebsd-10-stable/sys/dev/oce/oce_sysctl.c
/freebsd-10-stable/sys/dev/qlxgb/qla_os.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_os.c
/freebsd-10-stable/sys/dev/rt/if_rt.c
/freebsd-10-stable/sys/dev/sound/pci/hda/hdaa.c
/freebsd-10-stable/sys/dev/vxge/vxge.c
/freebsd-10-stable/sys/dev/xen/netfront/netfront.c
/freebsd-10-stable/sys/fs/devfs/devfs_devs.c
/freebsd-10-stable/sys/fs/fuse/fuse_main.c
/freebsd-10-stable/sys/fs/fuse/fuse_vfsops.c
/freebsd-10-stable/sys/fs/nfsserver/nfs_nfsdkrpc.c
/freebsd-10-stable/sys/geom/geom_kern.c
/freebsd-10-stable/sys/kern/kern_cpuset.c
/freebsd-10-stable/sys/kern/kern_descrip.c
/freebsd-10-stable/sys/kern/kern_mib.c
/freebsd-10-stable/sys/kern/kern_synch.c
/freebsd-10-stable/sys/kern/subr_devstat.c
/freebsd-10-stable/sys/kern/subr_kdb.c
/freebsd-10-stable/sys/kern/subr_uio.c
/freebsd-10-stable/sys/kern/vfs_cache.c
/freebsd-10-stable/sys/mips/mips/busdma_machdep.c
/freebsd-10-stable/sys/net/if_lagg.c
/freebsd-10-stable/sys/net/pfvar.h
/freebsd-10-stable/sys/net80211/ieee80211_ht.c
/freebsd-10-stable/sys/net80211/ieee80211_hwmp.c
/freebsd-10-stable/sys/net80211/ieee80211_mesh.c
/freebsd-10-stable/sys/net80211/ieee80211_superg.c
/freebsd-10-stable/sys/netgraph/bluetooth/common/ng_bluetooth.c
/freebsd-10-stable/sys/netgraph/ng_base.c
/freebsd-10-stable/sys/netgraph/ng_socket.c
/freebsd-10-stable/sys/netinet/cc/cc_chd.c
/freebsd-10-stable/sys/netinet/tcp_reass.c
/freebsd-10-stable/sys/netipsec/ipsec.h
/freebsd-10-stable/sys/netipx/ipx_proto.c
/freebsd-10-stable/sys/netpfil/pf/if_pfsync.c
/freebsd-10-stable/sys/netpfil/pf/pf.c
/freebsd-10-stable/sys/netpfil/pf/pf_ioctl.c
drivers/net/mlx4/mlx4_en.h
/freebsd-10-stable/sys/powerpc/powermac/fcu.c
/freebsd-10-stable/sys/powerpc/powermac/smu.c
/freebsd-10-stable/sys/powerpc/powerpc/busdma_machdep.c
/freebsd-10-stable/sys/powerpc/powerpc/cpu.c
/freebsd-10-stable/sys/sys/sysctl.h
/freebsd-10-stable/sys/vm/memguard.c
/freebsd-10-stable/sys/vm/vm_kern.c
/freebsd-10-stable/sys/x86/x86/busdma_bounce.c
273379 21-Oct-2014 hselasky

MFC r272683:
- Fix compile warning when compiling with GCC.
- Add missed chunk in previous driver code MFC.

Sponsored by: Mellanox Technologies

273246 18-Oct-2014 hselasky

MFC r273135:
Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
cancelling and starting a timeout.
- Implement more Linux scatterlist
functions.

Sponsored by: Mellanox Technologies

272407 02-Oct-2014 hselasky

MFC r272027:

Hardware driver update from Mellanox Technologies, including:
- improved performance
- better stability
- new features
- bugfixes

Supported HCAs:
- ConnectX-2
- ConnectX-3
- ConnectX-3 Pro

NOTE:
- TSO feature needs r271946, which is not yet merged.

Sponsored by: Mellanox Technologies
Approved by: re, glebius


/freebsd-10-stable/contrib/ofed/libibverbs/examples/asyncwatch.c
/freebsd-10-stable/contrib/ofed/libibverbs/examples/device_list.c
/freebsd-10-stable/contrib/ofed/libibverbs/examples/devinfo.c
/freebsd-10-stable/contrib/ofed/libmlx4/src/mlx4-abi.h
/freebsd-10-stable/sys/conf/files
/freebsd-10-stable/sys/modules/mlx4/Makefile
/freebsd-10-stable/sys/modules/mlxen/Makefile
drivers/infiniband/hw/mlx4/mad.c
drivers/infiniband/hw/mlx4/main.c
drivers/infiniband/hw/mlx4/qp.c
drivers/net/mlx4/alloc.c
drivers/net/mlx4/catas.c
drivers/net/mlx4/cmd.c
drivers/net/mlx4/cq.c
drivers/net/mlx4/en_cq.c
drivers/net/mlx4/en_ethtool.c
drivers/net/mlx4/en_main.c
drivers/net/mlx4/en_netdev.c
drivers/net/mlx4/en_port.c
drivers/net/mlx4/en_port.h
drivers/net/mlx4/en_resources.c
drivers/net/mlx4/en_rx.c
drivers/net/mlx4/en_selftest.c
drivers/net/mlx4/en_tx.c
drivers/net/mlx4/eq.c
drivers/net/mlx4/fw.c
drivers/net/mlx4/fw.h
drivers/net/mlx4/icm.c
drivers/net/mlx4/icm.h
drivers/net/mlx4/intf.c
drivers/net/mlx4/main.c
drivers/net/mlx4/mcg.c
drivers/net/mlx4/mlx4.h
drivers/net/mlx4/mlx4_en.h
drivers/net/mlx4/mlx4_stats.h
drivers/net/mlx4/mr.c
drivers/net/mlx4/pd.c
drivers/net/mlx4/port.c
drivers/net/mlx4/profile.c
drivers/net/mlx4/qp.c
drivers/net/mlx4/reset.c
drivers/net/mlx4/resource_tracker.c
drivers/net/mlx4/sense.c
drivers/net/mlx4/srq.c
drivers/net/mlx4/sys_tune.c
drivers/net/mlx4/utils.c
drivers/net/mlx4/utils.h
include/linux/mlx4/cmd.h
include/linux/mlx4/cq.h
include/linux/mlx4/device.h
include/linux/mlx4/driver.h
include/linux/mlx4/qp.h
include/linux/mlx4/srq.h
271127 04-Sep-2014 hselasky

MFC r270710 and r270821:
- Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

Sponsored by: Mellanox Technologies


/freebsd-10-stable/sys/contrib/rdma/krping/krping.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_osdep.h
/freebsd-10-stable/sys/dev/cxgbe/iw_cxgbe/cm.c
/freebsd-10-stable/sys/dev/cxgbe/iw_cxgbe/qp.c
/freebsd-10-stable/sys/modules/mlx4/Makefile
/freebsd-10-stable/sys/modules/mlx4ib/Makefile
/freebsd-10-stable/sys/modules/mlxen/Makefile
drivers/infiniband/core/addr.c
drivers/infiniband/core/cm.c
drivers/infiniband/core/device.c
drivers/infiniband/core/iwcm.c
drivers/infiniband/core/sa_query.c
drivers/infiniband/core/sysfs.c
drivers/infiniband/core/ucm.c
drivers/infiniband/core/user_mad.c
drivers/infiniband/core/uverbs_cmd.c
drivers/infiniband/core/uverbs_main.c
drivers/infiniband/hw/mlx4/alias_GUID.c
drivers/infiniband/hw/mlx4/cm.c
drivers/infiniband/hw/mlx4/mad.c
drivers/infiniband/hw/mlx4/main.c
drivers/infiniband/hw/mlx4/mlx4_ib.h
drivers/infiniband/hw/mlx4/mr.c
drivers/infiniband/hw/mlx4/qp.c
drivers/infiniband/hw/mlx4/sysfs.c
drivers/infiniband/hw/mthca/mthca_allocator.c
drivers/infiniband/hw/mthca/mthca_main.c
drivers/infiniband/hw/mthca/mthca_provider.c
drivers/infiniband/hw/mthca/mthca_reset.c
drivers/infiniband/ulp/ipoib/ipoib_main.c
drivers/infiniband/ulp/sdp/sdp.h
drivers/net/mlx4/alloc.c
drivers/net/mlx4/cmd.c
drivers/net/mlx4/cq.c
drivers/net/mlx4/en_netdev.c
drivers/net/mlx4/en_rx.c
drivers/net/mlx4/eq.c
drivers/net/mlx4/fw.c
drivers/net/mlx4/main.c
drivers/net/mlx4/mcg.c
drivers/net/mlx4/mr.c
drivers/net/mlx4/pd.c
drivers/net/mlx4/qp.c
drivers/net/mlx4/reset.c
drivers/net/mlx4/resource_tracker.c
drivers/net/mlx4/sense.c
drivers/net/mlx4/srq.c
drivers/net/mlx4/xrcd.c
include/asm/atomic-long.h
include/asm/atomic.h
include/asm/byteorder.h
include/asm/current.h
include/asm/fcntl.h
include/asm/io.h
include/asm/page.h
include/asm/pgtable.h
include/asm/semaphore.h
include/asm/system.h
include/asm/types.h
include/asm/uaccess.h
include/linux/atomic.h
include/linux/bitmap.h
include/linux/bitops.h
include/linux/cache.h
include/linux/cdev.h
include/linux/clocksource.h
include/linux/compat.h
include/linux/compiler.h
include/linux/completion.h
include/linux/ctype.h
include/linux/delay.h
include/linux/device.h
include/linux/dma-attrs.h
include/linux/dma-mapping.h
include/linux/dmapool.h
include/linux/err.h
include/linux/errno.h
include/linux/etherdevice.h
include/linux/ethtool.h
include/linux/file.h
include/linux/fs.h
include/linux/gfp.h
include/linux/hardirq.h
include/linux/idr.h
include/linux/if_arp.h
include/linux/if_ether.h
include/linux/if_vlan.h
include/linux/in.h
include/linux/in6.h
include/linux/inet.h
include/linux/inetdevice.h
include/linux/init.h
include/linux/interrupt.h
include/linux/io-mapping.h
include/linux/io.h
include/linux/ioctl.h
include/linux/jiffies.h
include/linux/kdev_t.h
include/linux/kernel.h
include/linux/kmod.h
include/linux/kobject.h
include/linux/kref.h
include/linux/kthread.h
include/linux/ktime.h
include/linux/linux_compat.c
include/linux/linux_idr.c
include/linux/linux_radix.c
include/linux/list.h
include/linux/lockdep.h
include/linux/log2.h
include/linux/math64.h
include/linux/miscdevice.h
include/linux/mm.h
include/linux/module.h
include/linux/moduleparam.h
include/linux/mount.h
include/linux/mutex.h
include/linux/net.h
include/linux/netdevice.h
include/linux/notifier.h
include/linux/page.h
include/linux/pci.h
include/linux/poll.h
include/linux/radix-tree.h
include/linux/random.h
include/linux/rbtree.h
include/linux/rtnetlink.h
include/linux/rwlock.h
include/linux/rwsem.h
include/linux/scatterlist.h
include/linux/sched.h
include/linux/semaphore.h
include/linux/slab.h
include/linux/socket.h
include/linux/spinlock.h
include/linux/stddef.h
include/linux/string.h
include/linux/sysfs.h
include/linux/timer.h
include/linux/types.h
include/linux/uaccess.h
include/linux/vmalloc.h
include/linux/wait.h
include/linux/workqueue.h
include/net/addrconf.h
include/net/arp.h
include/net/if_inet6.h
include/net/ip.h
include/net/ip6_route.h
include/net/ipv6.h
include/net/neighbour.h
include/net/netevent.h
include/net/tcp.h
include/rdma/ib_umem.h
include/rdma/ib_verbs.h
270166 19-Aug-2014 hselasky

MFC r269859:
Fix for memory leak.

Sponsored by: Mellanox Technologies

269862 12-Aug-2014 hselasky

MFC r268316:
Fix OFED startup order: All SYSINIT()'s and modules should be loaded
prior to starting "/sbin/init" which will run all the "/etc/rc.d/xxx"
scripts. Else there can be a race configuring the interfaces via
"/etc/rc.conf".

Sponsored by: Mellanox Technologies

269861 12-Aug-2014 hselasky

MFC r268315:
Fix compile warning.

Sponsored by: Mellanox Technologies

269860 12-Aug-2014 hselasky

MFC r268314:
Fix some compile warnings.

Sponsored by: Mellanox Technologies

267517 15-Jun-2014 hselasky

MFC r267395:
- Fix out of range shifting bug in bitops.h.
- Make code a bit easier to read by adding parenthesis.

261455 04-Feb-2014 eadler

MFC r258779,r258780,r258787,r258822:

Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.

Similar to the (1 << 31) case it is not defined to do (2 << 30).

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.


/freebsd-10-stable/lib/libc/sparc64/fpu/fpu.c
/freebsd-10-stable/lib/libc/sparc64/fpu/fpu_sqrt.c
/freebsd-10-stable/lib/libc/xdr/xdr_rec.c
/freebsd-10-stable/sys/amd64/pci/pci_cfgreg.c
/freebsd-10-stable/sys/amd64/vmm/intel/vmcs.h
/freebsd-10-stable/sys/amd64/vmm/intel/vmx_controls.h
/freebsd-10-stable/sys/amd64/vmm/intel/vtd.c
/freebsd-10-stable/sys/arm/arm/cpufunc_asm_pj4b.S
/freebsd-10-stable/sys/arm/arm/db_trace.c
/freebsd-10-stable/sys/arm/arm/pl190.c
/freebsd-10-stable/sys/arm/at91/if_macbvar.h
/freebsd-10-stable/sys/arm/broadcom/bcm2835/bcm2835_dma.c
/freebsd-10-stable/sys/arm/econa/if_ece.c
/freebsd-10-stable/sys/arm/freescale/imx/imx6_anatopreg.h
/freebsd-10-stable/sys/arm/freescale/imx/imx6_usbphy.c
/freebsd-10-stable/sys/arm/freescale/imx/imx_gptreg.h
/freebsd-10-stable/sys/arm/include/armreg.h
/freebsd-10-stable/sys/arm/lpc/if_lpereg.h
/freebsd-10-stable/sys/arm/lpc/lpcreg.h
/freebsd-10-stable/sys/arm/mv/mv_pci.c
/freebsd-10-stable/sys/arm/samsung/exynos/ehci_exynos5.c
/freebsd-10-stable/sys/arm/xscale/i8134x/i81342reg.h
/freebsd-10-stable/sys/arm/xscale/ixp425/ixp425reg.h
/freebsd-10-stable/sys/boot/arm/at91/libat91/mci_device.h
/freebsd-10-stable/sys/boot/i386/libfirewire/fwohci.h
/freebsd-10-stable/sys/boot/i386/libfirewire/fwohcireg.h
/freebsd-10-stable/sys/dev/aac/aacvar.h
/freebsd-10-stable/sys/dev/acpica/acpi_video.c
/freebsd-10-stable/sys/dev/agp/agp_i810.c
/freebsd-10-stable/sys/dev/ahci/ahci.h
/freebsd-10-stable/sys/dev/bktr/bktr_core.c
/freebsd-10-stable/sys/dev/cesa/cesa.h
/freebsd-10-stable/sys/dev/drm/i915_reg.h
/freebsd-10-stable/sys/dev/drm/mach64_drv.h
/freebsd-10-stable/sys/dev/drm/mga_drv.h
/freebsd-10-stable/sys/dev/drm/r128_drv.h
/freebsd-10-stable/sys/dev/drm/r300_reg.h
/freebsd-10-stable/sys/dev/drm/r600_blit.c
/freebsd-10-stable/sys/dev/drm/radeon_cp.c
/freebsd-10-stable/sys/dev/drm/radeon_drv.h
/freebsd-10-stable/sys/dev/drm/via_irq.c
/freebsd-10-stable/sys/dev/drm2/i915/i915_reg.h
/freebsd-10-stable/sys/dev/drm2/radeon/evergreen_blit_kms.c
/freebsd-10-stable/sys/dev/drm2/radeon/evergreen_cs.c
/freebsd-10-stable/sys/dev/drm2/radeon/evergreend.h
/freebsd-10-stable/sys/dev/drm2/radeon/nid.h
/freebsd-10-stable/sys/dev/drm2/radeon/r200.c
/freebsd-10-stable/sys/dev/drm2/radeon/r300.c
/freebsd-10-stable/sys/dev/drm2/radeon/r300_reg.h
/freebsd-10-stable/sys/dev/drm2/radeon/r500_reg.h
/freebsd-10-stable/sys/dev/drm2/radeon/r600_blit.c
/freebsd-10-stable/sys/dev/drm2/radeon/r600_blit_kms.c
/freebsd-10-stable/sys/dev/drm2/radeon/r600_cs.c
/freebsd-10-stable/sys/dev/drm2/radeon/r600d.h
/freebsd-10-stable/sys/dev/drm2/radeon/radeon_cp.c
/freebsd-10-stable/sys/dev/drm2/radeon/radeon_drv.h
/freebsd-10-stable/sys/dev/drm2/radeon/radeon_reg.h
/freebsd-10-stable/sys/dev/drm2/radeon/rv770d.h
/freebsd-10-stable/sys/dev/drm2/radeon/sid.h
/freebsd-10-stable/sys/dev/drm2/ttm/ttm_bo.c
/freebsd-10-stable/sys/dev/e1000/e1000_82575.h
/freebsd-10-stable/sys/dev/e1000/e1000_ich8lan.c
/freebsd-10-stable/sys/dev/e1000/e1000_regs.h
/freebsd-10-stable/sys/dev/etherswitch/arswitch/arswitchreg.h
/freebsd-10-stable/sys/dev/ffec/if_ffecreg.h
/freebsd-10-stable/sys/dev/firewire/firewire.c
/freebsd-10-stable/sys/dev/firewire/fwohci.c
/freebsd-10-stable/sys/dev/firewire/fwohcireg.h
/freebsd-10-stable/sys/dev/firewire/sbp.c
/freebsd-10-stable/sys/dev/firewire/sbp.h
/freebsd-10-stable/sys/dev/firewire/sbp_targ.c
/freebsd-10-stable/sys/dev/hatm/if_hatmreg.h
/freebsd-10-stable/sys/dev/hwpmc/hwpmc_piv.h
/freebsd-10-stable/sys/dev/iwn/if_iwnreg.h
/freebsd-10-stable/sys/dev/mge/if_mgevar.h
/freebsd-10-stable/sys/dev/mpt/mpt_cam.c
/freebsd-10-stable/sys/dev/msk/if_mskreg.h
/freebsd-10-stable/sys/dev/mvs/mvs.h
/freebsd-10-stable/sys/dev/mxge/mxge_mcp.h
/freebsd-10-stable/sys/dev/qlxge/qls_dump.c
/freebsd-10-stable/sys/dev/ral/rt2560reg.h
/freebsd-10-stable/sys/dev/ral/rt2661reg.h
/freebsd-10-stable/sys/dev/ral/rt2860reg.h
/freebsd-10-stable/sys/dev/sound/pci/hda/hdaa.h
/freebsd-10-stable/sys/dev/usb/controller/ehci.h
/freebsd-10-stable/sys/dev/usb/wlan/if_rumreg.h
/freebsd-10-stable/sys/dev/usb/wlan/if_runreg.h
/freebsd-10-stable/sys/dev/usb/wlan/if_uralreg.h
/freebsd-10-stable/sys/dev/usb/wlan/if_urtwreg.h
/freebsd-10-stable/sys/dev/usb/wlan/if_zydreg.h
/freebsd-10-stable/sys/dev/wpi/if_wpireg.h
/freebsd-10-stable/sys/geom/raid/tr_raid1e.c
/freebsd-10-stable/sys/i386/pci/pci_cfgreg.c
/freebsd-10-stable/sys/mips/atheros/ar71xxreg.h
/freebsd-10-stable/sys/mips/atheros/ar934xreg.h
/freebsd-10-stable/sys/mips/atheros/if_argevar.h
/freebsd-10-stable/sys/mips/malta/gt_pci.c
/freebsd-10-stable/sys/mips/nlm/dev/net/nae.c
/freebsd-10-stable/sys/mips/nlm/xlp_machdep.c
/freebsd-10-stable/sys/mips/rmi/pic.h
drivers/infiniband/hw/mlx4/qp.c
drivers/infiniband/hw/mthca/mthca_mcg.c
drivers/infiniband/hw/mthca/mthca_qp.c
drivers/net/mlx4/mcg.c
/freebsd-10-stable/sys/powerpc/fpu/fpu_emu.c
/freebsd-10-stable/sys/powerpc/fpu/fpu_sqrt.c
/freebsd-10-stable/sys/powerpc/powermac/nvbl.c
/freebsd-10-stable/sys/sys/consio.h
/freebsd-10-stable/sys/x86/iommu/intel_reg.h
/freebsd-10-stable/usr.sbin/bluetooth/bthidd/kbd.c
260495 09-Jan-2014 dim

MFC r260102:

Similar to r260020, only use -fms-extensions with gcc, for all other
modules which require this flag to compile. Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.

MFC r260322:

In addition to r260102, also define GCC_MS_EXTENSIONS in bsd.sys.mk,
since kernel module builds do not use kern.pre.mk.

260321 05-Jan-2014 dim

Revert MFC of r260102 for now, until I can merge the required fix from
head. This should fix building modules which require -fms-extensions to
compile them with gcc.

260268 04-Jan-2014 dim

MFC r260020:

For sys/dev/drm2/radeon, only use -fms-extensions with gcc. This flag
is only to stop gcc complaining about anonymous unions, which clang does
not do. For clang 3.4 however, -fms-extensions enables the Microsoft
__wchar_t type, which clashes with our own types.h.

MFC r260102:

Similar to r260020, only use -fms-extensions with gcc, for all other
modules which require this flag to compile. Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.

259608 19-Dec-2013 alfred

Defer start/stop port to workqueues.

MFC: 259411

258280 17-Nov-2013 alfred

MFC: 258276

Fix creating a vlan over lagg over mlxen crash.

PR: 181931
Submitted by: Shahar Klein (shahark mellanox.com)

Approved by: re

258242 17-Nov-2013 alfred

MFC: 257542

Fix API mismatch exposed by lagg.

When destroying a lagg the driver tries to restore the old mac and
fails due to API mismatch.

Submitted by: Shahar Klein (shahark at mellanox.com)
Approved by: re

257867 08-Nov-2013 alfred

MFC: r257862, r257863, r257864

r257862:

Use explicit long cast to avoid overflow in bitopts.

This was causing problems with the buddy allocator inside of
ofed.

r257863:

Fix for bad performance when mtu is increased.

Update the auto moderation behavior in the mlxen driver to match
the new LINUX OFED code.

r257864:

Do not use a sleep lock when protecting the driver flags.

This was causing a locking issue with lagg.

Approved by: re

256810 20-Oct-2013 alfred

Fix resource free.

The order of releasing resources in mlxen was wrong, which caused
panic on reload of the module.

MFC: 256682

Submitted by: Shahar Klein (shahark at mellanox.com)
Approved by: re

256686 17-Oct-2013 alfred

Fix __free_pages() in the linux shim.

__free_pages() is actaully supposed to take a "struct page *" not
an address.

MFC: 256546

Approved by: re

256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


256269 10-Oct-2013 alfred

Fix for When more than one NIC is present.

The device name was incorrect due to a specific function we ported
from the Linux driver that is not FBSD compatible. This resulted
with a false sysctl registration and some more problematic issues.

The patch basically revokes it all together.

Submitted by: Meny Yossefi (menyy mellanox.com)

Approved by: re


256179 09-Oct-2013 dim

Remove redundant declaration of cmclass in
sys/ofed/drivers/infiniband/core/ucm.c, to silence a gcc warning.

Approved by: re (kib)
X-MFC-With: r255932


256116 07-Oct-2013 dim

Give an unnamed union in sys/ofed/include/rdma/ib_verbs.h a name, to
silence a gcc warning.

Approved by: re (gjb)
MFC after: 3 days


255973 01-Oct-2013 alfred

Fixed kernel crash when running devinfo

When calling to ib_uverbs_cleanup_ucontext, there is a call to
mutex_lock of xrcd_table_mutex, which was not initialized.
Added missing initialization for xrcd_table_mutex.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by: re


255972 01-Oct-2013 alfred

Enable ib_dev.mmap function

Removed the ifdef linux from this function.
Added stub function for contiguous pages to avoid compilation
errors.

Submitted by: Orit Moskovich (oritm mellanox.com)
Approved by: re


255970 01-Oct-2013 alfred

Fixed 'Couldn't Create QP' issue when running rc_pingpong, uc_pingpong,
srq_pingpong IBverbs

Removed refrences using 'ifdef __linux__' to qpg functions and
related fields in struct
ib_qp_init_attr.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by: re


255969 01-Oct-2013 alfred

Fixed kernel crash when removing IPOIB_CM option from configuration file

Changed module init from module_init() to module_init_order() with
SI_ORDER_MIDDLE flag
Submitted by: Orit Moskovich (oritm mellanox.com)
Approved by: re


255968 01-Oct-2013 alfred

Fix mis-merge of upstream fix.

We would accidentally make the string one byte too short.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by: re


255932 29-Sep-2013 alfred

Update OFED to Linux 3.7 and update Mellanox drivers.

Update the OFED Infiniband core to the version supplied in Linux
version 3.7.

The update to OFED is nearly all additional defines and functions
with the exception of the addition of additional parameters to
ib_register_device() and the reg_user_mr callback.

In addition the ibcore (Infiniband core) and ipoib (IP over Infiniband)
have both been made into completely loadable modules to facilitate
testing of the OFED stack in FreeBSD.

Finally the Mellanox Infiniband drivers are now updated to the
latest version shipping with Linux 3.7.

Submitted by: Mellanox FreeBSD driver team:
Oded Shanoon (odeds mellanox.com),
Meny Yossefi (menyy mellanox.com),
Orit Moskovich (oritm mellanox.com)

Approved by: re


255240 05-Sep-2013 pjd

Handle cases where capability rights are not provided.

Reported by: kib


254832 25-Aug-2013 andre

Change m->pkthdr.header to m->pkthdr.PH_loc.ptr after r254804
to transiently store pointers to packet headers.

Sponsored by: The FreeBSD Foundation


254734 23-Aug-2013 np

Fix implementation of sock_getname.

MFC after: 1 week


254576 20-Aug-2013 jhb

Stop an ipoib interface before detaching it.

PR: kern/181225
Submitted by: Shahar Klein
Obtained from: Mellanox
MFC after: 1 week


254523 19-Aug-2013 andre

Add m_clrprotoflags() to clear protocol specific mbuf flags at up and
downwards layer crossings.

Consistently use it within IP, IPv6 and ethernet protocols.

Discussed with: trociny, glebius


254356 15-Aug-2013 glebius

Make sendfile() a method in the struct fileops. Currently only
vnode backed file descriptors have this method implemented.

Reviewed by: kib
Sponsored by: Nginx, Inc.
Sponsored by: Netflix


254122 09-Aug-2013 jeff

- Reserve a special AF for SDP. The one we were incorrectly using before
was taken by another AF.

Sponsored by: EMC / Isilon Storage Division


254121 09-Aug-2013 jeff

- Correctly handle various edge cases in sysfs emulation.

Sponsored by: EMC / Isilon Storage Division


254120 09-Aug-2013 jeff

- Use the correct type in the linux bitops emulation.

Submitted by: Maxim Ignatenko <gelraen.ua@gmail.com>


254065 07-Aug-2013 kib

Split the pagequeues per NUMA domains, and split pageademon process
into threads each processing queue in a single domain. The structure
of the pagedaemons and queues is kept intact, most of the changes come
from the need for code to find an owning page queue for given page,
calculated from the segment containing the page.

The tie between NUMA domain and pagedaemon thread/pagequeue split is
rather arbitrary, the multithreaded daemon could be allowed for the
single-domain machines, or one domain might be split into several page
domains, to further increase concurrency.

Right now, each pagedaemon thread tries to reach the global target,
precalculated at the start of the pass. This is not optimal, since it
could cause excessive page deactivation and freeing. The code should
be changed to re-check the global page deficit state in the loop after
some number of iterations.

The pagedaemons reach the quorum before starting the OOM, since one
thread inability to meet the target is normal for split queues. Only
when all pagedaemons fail to produce enough reusable pages, OOM is
started by single selected thread.

Launder is modified to take into account the segments layout with
regard to the region for which cleaning is performed.

Based on the preliminary patch by jeff, sponsored by EMC / Isilon
Storage Division.

Reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation


254025 07-Aug-2013 jeff

Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division


253785 29-Jul-2013 jhb

Add a missing prototype.

Pointy hat: me


253774 29-Jul-2013 jhb

Various fixes to the mlxen(4) driver:
- Remove an incorrect assertion that can trigger when downing an interface.
- Stop the interface during detach to avoid panics when unloading the
driver.
- A few locking fixes to be more consistent with other FreeBSD drivers:
- Protect if_drv_flags with the driver lock, not atomic ops
- Hold the driver lock when adjusting multicast state.
- Hold the driver lock while adjusting if_capenable.

PR: kern/180791 [1,2]
Submitted by: Shakar Klein @ Mellanox [1,2]
MFC after: 3 days


253653 25-Jul-2013 jhb

Avoid trashing IP fragments:
- Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is set.
- Only enable IP hardware checksums if CSUM_IP is set.

PR: kern/180430
Submitted by: Meny Yossefi <menyy@mellanox.com>
MFC after: 1 week


253604 24-Jul-2013 avg

rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST

Also directly call swapper() at the end of mi_startup instead of
relying on swapper being the last thing in sysinits order.

Rationale:

- "RUN_SCHEDULER" was misleading, scheduling already takes place at that stage
- "scheduler" was misleading, the function swaps in the swapped out processes
- another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
invoked depending on its relative order with scheduler; this was not obvious
and the bug actually used to exist

Reviewed by: kib (ealier version)
MFC after: 14 days


253449 18-Jul-2013 jhb

Rework the previous fix for the IB vs Ethernet sysctl handler to be more
generic and apply to all sysfs attributes:
- Use sysctl_handle_string() instead of reimplementing it.
- Remove trailing newline from the current value before passing it to
userland and append a newline to the new string value before passing it
to the attribute's store function.
- Don't leak the temporary buffer if the first error check triggers.
- Revert earlier change to mlx4 port mode handler.

PR: kern/174213
Submitted by: Garrett Cooper
Reviewed by: Shakar Klein @ Mellanox
MFC after: 1 week


253423 17-Jul-2013 jhb

Remove check forbidding requests that would result in one port being set
to Ethernet and the subsequent port being set to IB.

Submitted by: Shakar Klein @ Mellanox
Tested by: Morgan Robertson <morganrobertson@gmail.com>
MFC after: 1 week


253048 08-Jul-2013 jhb

Allow mlx4 devices to switch from Ethernet to Infiniband (and vice versa):
- Fix sysctl wrapper for sysfs attributes to properly handle new string
values similar to sysctl_handle_string() (only copyin the user's
supplied length and nul-terminate the string).
- Don't check for a trailing newline when evaluating the desired operating
mode of a mlx4 device.

PR: kern/179999
Submitted by: Shahar Klein <shahark@mellanox.com>
MFC after: 1 week


251617 11-Jun-2013 jhb

Store a reference to the vnode associated with a file descriptor in the
linux_file structure and use it instead of directly accessing td_fpop
when destroying the linux_file structure. The td_fpop pointer is not
valid when a cdevpriv destructor is run, and the type-specific close
method has already been called, so f_vnode may not be valid (and the
vnode might have been recycled without our own reference).

Tested by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
MFC after: 1 week


250460 10-May-2013 eadler

Fxi a bunch of typos.

PR: misc/174625
Submitted by: Jeremy Chadwick <jdc@koitsu.org>


250374 08-May-2013 delphij

According to the documentation, on Linux, cancel_delayed_work() does not
do drain (flush_workqueue() in Linux terms) but instead returns true if
the work was removed before it is run, or false otherwise.

Simulate this by removing the taskqueue_drain() and return the value
derived from taskqueue_cancel()'s return value.

This would solve a witness warning caused by calling taskqueue_drain()
with a non-sleepable lock held, like:

taskqueue_drain with the following non-sleepable locks held:
exclusive rw lle (lle) r = 0 (0xfffffe001450b410) locked @
/usr/src/sys/netinet/in.c:1484
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff848d4f7690
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff848d4f7740
witness_warn() at witness_warn+0x4a8/frame 0xffffff848d4f7800
taskqueue_drain() at taskqueue_drain+0x3a/frame 0xffffff848d4f7840
set_timeout() at set_timeout+0x4a/frame 0xffffff848d4f7860
netevent_callback() at netevent_callback+0x16/frame 0xffffff848d4f7870
arpintr() at arpintr+0x9b5/frame 0xffffff848d4f7930

This do not affect kernel without OFED compiled in.

Reported by: Garrett Cooper <yaneurabeya gmail com>
(who also tested an earlier version of this patch,
but bugs are mine)
MFC after: 2 weeks


249976 27-Apr-2013 glebius

Add const qualifier to the dst parameter of the ifnet if_output method.


249066 03-Apr-2013 jhb

Check for SS_NBIO in the socket state field rather than socket buffer
flags.

Submitted by: Vijay Singh
MFC after: 1 week


248084 09-Mar-2013 attilio

Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho


247675 02-Mar-2013 mav

Add protective parentheses for macro argument, missed in r247671.


247671 02-Mar-2013 mav

MFcalloutng:
Give OFED Linux wrapper own "expires" field instead of abusing callout's
c_time, which will change its type and units with calloutng commit.


247602 02-Mar-2013 pjd

Merge Capsicum overhaul:

- Capability is no longer separate descriptor type. Now every descriptor
has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
cap_new(2), which limits capability rights of the given descriptor
without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
ioctls can be retrived with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
that can be used with the new cap_fcntls_limit(2) syscall and retrive
them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
heavly modified.

- The audit subsystem, kdump and procstat tools were updated to
recognize new syscalls.

- Capability rights were revised and eventhough I tried hard to provide
backward API and ABI compatibility there are some incompatible changes
that are described in detail below:

CAP_CREATE old behaviour:
- Allow for openat(2)+O_CREAT.
- Allow for linkat(2).
- Allow for symlinkat(2).
CAP_CREATE new behaviour:
- Allow for openat(2)+O_CREAT.

Added CAP_LINKAT:
- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
- Allow to be target for renameat(2).

Added CAP_SYMLINKAT:
- Allow for symlinkat(2).

Removed CAP_DELETE. Old behaviour:
- Allow for unlinkat(2) when removing non-directory object.
- Allow to be source for renameat(2).

Removed CAP_RMDIR. Old behaviour:
- Allow for unlinkat(2) when removing directory.

Added CAP_RENAMEAT:
- Required for source directory for the renameat(2) syscall.

Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
- Allow for unlinkat(2) on any object.
- Required if target of renameat(2) exists and will be removed by this
call.

Removed CAP_MAPEXEC.

CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
PROT_WRITE.
CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.

Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNODE to CAP_MKNODEAT.

CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).
CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).

CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).
CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).

Added convinient defines:

#define CAP_PREAD (CAP_SEEK | CAP_READ)
#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
#define CAP_RECV CAP_READ
#define CAP_SEND CAP_WRITE

#define CAP_SOCK_CLIENT \
(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
#define CAP_SOCK_SERVER \
(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
CAP_SETSOCKOPT | CAP_SHUTDOWN)

Added defines for backward API compatibility:

#define CAP_MAPEXEC CAP_MMAP_X
#define CAP_DELETE CAP_UNLINKAT
#define CAP_MKDIR CAP_MKDIRAT
#define CAP_RMDIR CAP_UNLINKAT
#define CAP_MKFIFO CAP_MKFIFOAT
#define CAP_MKNOD CAP_MKNODAT
#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib


246581 09-Feb-2013 delphij

Fix LINT build on amd64.


246482 07-Feb-2013 rrs

This fixes a out-of-order problem with several
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then when sending with xmit(), encounting
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.

The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue() only with drbr_peek(). We most
of the time, in putback, would not need to copy it
back since most likey the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimial case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will need a test and atomic in the put back possibly
a seperate putback_mc() in the ring buf.

Reviewed by: jhb@freebsd.org, jlv@freebsd.org


243882 05-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


242933 12-Nov-2012 dim

Redo r242842, now actually fixing the warnings, as follows:
- In sys/ofed/drivers/infiniband/core/cma.c, an enum struct member is
interpreted as an int, so cast it to an int.
- In sys/ofed/drivers/infiniband/core/ud_header.c, initialize the
packet_length variable in ib_ud_header_init(), to prevent undefined
behaviour.
- In sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c, call rdma_notify()
with the correct enum type and value.
- In sys/ofed/include/linux/pci.h, change the PCI_DEVICE and PCI_VDEVICE
macros to use C99 struct initializers, so additional members can be
overridden.

Reviewed by: delphij, Garrett Cooper <yanegomi@gmail.com>
MFC after: 1 week


242841 09-Nov-2012 delphij

Use %s when calling make_dev with a string pointer. This makes
clang happy.

MFC after: 2 weeks


241844 22-Oct-2012 eadler

remove duplicate semicolons where possible.

Approved by: cperciva
MFC after: 1 week


241697 18-Oct-2012 jhb

Take advantage of if_baudrate_pf and calculate an effective baud rate on
all platforms (not just amd64) to compute an equivalent IB rate.


241696 18-Oct-2012 jhb

Use if_initbaudrate().


241037 28-Sep-2012 glebius

The drbr(9) API appeared to be so unclear, that most drivers in
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.

o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.

o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.

o Drivers handle their stats theirselves: if_obytes, if_omcasts.

o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats theirselves.

o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.

o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.

o mxge(4) just no longer needs NO_SLOW_STATS define.

o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.

Reviewed by: jfv, gnn


240680 18-Sep-2012 gavin

Align the PCI Express #defines with the style used for the PCI-X
#defines. This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with: jhb
MFC after: 1 week


240082 04-Sep-2012 melifaro

Remove unneeded ipfw headers introduced in r213447 from Infiniband code.

MFC after: 2 weeks


239303 15-Aug-2012 hselasky

Streamline use of cdevpriv and correct some corner cases.

1) It is not useful to call "devfs_clear_cdevpriv()" from
"d_close" callbacks, hence for example read, write, ioctl and
so on might be sleeping at the time of "d_close" being called
and then then freed private data can still be accessed.
Examples: dtrace, linux_compat, ksyms (all fixed by this patch)

2) In sys/dev/drm* there are some cases in which memory will
be freed twice, if open fails, first by code in the open
routine, secondly by the cdevpriv destructor. Move registration
of the cdevpriv to the end of the drm open routines.

3) devfs_clear_cdevpriv() is not called if the "d_open" callback
registered cdevpriv data and the "d_open" callback function
returned an error. Fix this.

Discussed with: phk
MFC after: 2 weeks


239065 05-Aug-2012 kib

After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason
to pull vm_param.h was removed. Other big dependency of vm_page.h on
vm_param.h are PA_LOCK* definitions, which are only needed for
in-kernel code, because modules use KBI-safe functions to lock the
pages.

Stop including vm_param.h into vm_page.h. Include vm_param.h
explicitely for the kernel code which needs it.

Suggested and reviewed by: alc
MFC after: 2 weeks


237563 25-Jun-2012 np

Fix clang warning when compiling iw_cxgb.

Reported by: rene, dim


237263 19-Jun-2012 np

- Updated TOE support in the kernel.

- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)


234946 03-May-2012 melifaro

Revert r234834 per luigi@ request.

Cleaner solution (e.g. adding another header) should be done here.

Original log:
Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by: luigi
Approved by: kib(mentor)


234834 30-Apr-2012 melifaro

Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by: ae(mentor)
MFC after: 2 weeks


234618 23-Apr-2012 bz

Do not announce IPv6 TSO support yet. The driver seems to make assumptions
based on IPv4 header parsing only.

MFC after: 1 week


234183 12-Apr-2012 jhb

Add OFED and the associated options and drivers to x86 LINT builds:
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
if INET is disabled (including when both INET and INET6 are disabled).

Reviewed by: bz
MFC after: 2 weeks


234182 12-Apr-2012 jhb

Don't update if_obytes when transmitting packets. That is already done
in IFQ_HANDOFF() when the packet is passed to the start routine, so doing
it here resulted in double counting.

Reported by: Alex Tutubalin lexa lexa ru
MFC after: 1 week


234099 10-Apr-2012 jhb

Properly parse 40G media types from newer Mellanox adapters that are
40G capable. For now, map all 40G links to 40GBase-CR4.

MFC after: 2 weeks


233870 04-Apr-2012 jhb

Fix build on i386.


233547 27-Mar-2012 jhb

Use VM_MEMATTR_UNCACHEABLE instead of VM_MEMATTR_UNCACHED for UC mappings.
VM_MEMATTR_UNCACHED is actually the x86-specific UC- mode (where a WC
MTRR can override the PAT setting).


233198 19-Mar-2012 jhb

Fix build of OFED bits with debugging options enabled.


233040 16-Mar-2012 jhb

Fix build with INET6 disabled.


230135 15-Jan-2012 uqs

Remove spurious 8bit chars, turning files into plain ASCII.


228469 13-Dec-2011 ed

Replace __signed by signed.

The signed keyword is an integral part of the C syntax. There's no need
to use __signed.


228443 12-Dec-2011 mdf

Do not define bool/true/false if the symbols already exist.

MFC after: 2 weeks
Sponsored by: Isilon Systems, LLC


227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


227293 07-Nov-2011 ed

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


226436 16-Oct-2011 eadler

- change "is is" to "is" or "it is"
- change "the the" to "the"

Approved by: lstewart
Approved by: sahil (mentor)
MFC after: 3 days


224914 16-Aug-2011 kib

Add the fo_chown and fo_chmod methods to struct fileops and use them
to implement fchown(2) and fchmod(2) support for several file types
that previously lacked it. Add MAC entries for chown/chmod done on
posix shared memory and (old) in-kernel posix semaphores.

Based on the submission by: glebius
Reviewed by: rwatson
Approved by: re (bz)


222813 07-Jun-2011 attilio

etire the cpumask_t type and replace it with cpuset_t usage.

This is intended to fix the bug where cpu mask objects are
capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever
value. Anyway, as long as several structures in the kernel are
statically allocated and sized as MAXCPU, it is suggested to keep it
as low as possible for the time being.

Technical notes on this commit itself:
- More functions to handle with cpuset_t objects are introduced.
The most notable are cpusetobj_ffs() (which calculates a ffs(3)
for a cpuset_t object), cpusetobj_strprint() (which prepares a string
representing a cpuset_t object) and cpusetobj_strscan() (which
creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are target to be removed soon.
With the moving from cpumask_t to cpuset_t they are now inefficient
and not really useful. Anyway, for the time being, please note that
access to pcpu datas is protected by sched_pin() in order to avoid
migrating the CPU while reading more than one (possible) word
- Please note that size of cpuset_t objects may differ between kernel
and userland. While this is not directly related to the patch itself,
it is good to understand that concept and possibly use the patch
as a reference on how to deal with cpuset_t objects in userland, when
accessing kernland members.
- KTR_CPUMASK is changed and now is represented through a string, to be
set as the example reported in NOTES.

Please additively note that no MAXCPU is bumped in this patch, but
private testing has been done until to MAXCPU=128 on a real 8x8x2(htt)
machine (amd64).

Please note that the FreeBSD version is not yet bumped because of
the upcoming pcpu changes. However, note that this patch is not
targeted for MFC.

People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested
several revision of the patches and really helped in improving
stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed
patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the
patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of
the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific
implementations of the patch.
- Other people have made contributions on other patches that have been
already committed and have been listed separately.

Companies that should be mentioned for having participated at several
degrees:
- Yahoo! for having offered the machines used for testing on big
count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance,
which has been instrumental.
- Sandvine for having offered offices and infrastructure during
development.

(I really hope I didn't forget anyone, if it happened I apologize in
advance).


222330 26-May-2011 delphij

In ipoib_cm_handle_rx_wc(): Count incoming packets and
bytes toward incoming counters.

Reviewed by: jeff


221055 26-Apr-2011 jeff

- Catch up to falloc() changes.
- PHOLD() before using a task structure on the stack.
- Fix a LOR between the sleepq lock and thread lock in _intr_drain().


220555 12-Apr-2011 bz

Even though this block is not compiled currently, properly assign
CSUM_TSO to if_hwassist rather than if_capabilities to avoid future
errors.

Reviewed by: jeff


220016 26-Mar-2011 jeff

- Implement wake-on-lan support in mlxen.


219902 23-Mar-2011 jhb

Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.


219893 23-Mar-2011 jeff

- Correct the vlan filter programming. The device filter is built in
reverse order.
- Name the cq taskqueues according to whether they handle rx or tx.
- Default LRO to on.


219859 22-Mar-2011 jeff

- Don't use a separate set of rx queues for UDP, hash them into the same
set as TCP.
- Eliminate the fully linear non-scatter/gather rx path, there is no
harm in using arrays of clusters for both TCP and UDP.
- Implement support for enabling/disabling per-vlan priority pause and
queues via sysctl.


219846 21-Mar-2011 kib

Allow the ofed modules to be compiled on i386.

Reviewed by: jeff


219820 21-Mar-2011 jeff

- Merge in OFED 1.5.3 from projects/ofed/head