History log of /freebsd-current/sys/dev/vmware/vmxnet3/if_vmx.c
Revision Date Author Comments
# 3ff0dc1a 10-Jun-2024 Kristof Provost <kp@FreeBSD.org>

vmxnet3: make descriptor count checks more robust

When we update credits there is a potential for a race causing an
overflow of vxcr_next (i.e. incrementing it past vxcr_ndesc). Change the
check to >= rather than == to be more robust against this.

Reviewed by: emaste
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43712


# 8c127413 31-Aug-2023 Kristof Provost <kp@FreeBSD.org>

vmxnet3: do restart on VLAN changes

At least one user reports issues with vmx interfaces after 725e4008ef,
where we default to not resetting the interface on VLAN changes. This
was on an ESXi 7.0.3 setup.

Reported by: Marcos Mendoza <mmendoza@netgate.com>
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# b6b75424 24-Aug-2023 Kevin Bowling <kbowling@FreeBSD.org>

vmxnet3: Don't restart on VLAN changes

In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This re-init is unintentional for vmxnet3(4).

MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558


# 51e23514 16-Aug-2023 Marius Strobl <marius@FreeBSD.org>

iflib drivers: Constify PCI ID LUTs

Since d49e83eac3baf16a22b1c5d42e8438b68b17e6f9, iflib(9) is ready
for this change.
While at it, make isc_driver_version strings (static) const where
not apparently un-const on purpose, too.
This reduces the size of the amd64 GENERIC by about 10 KiB.


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 402810d3 20-Oct-2021 Justin Hibbits <jhibbits@FreeBSD.org>

Convert iflib(4) and iflib-based drivers to the DrvAPI

Summary:
Convert iflib(4) and the following drivers:
* axgbe
* em
* ice
* ixl
* vmxnet

Sponsored by: Juniper Networks, Inc.
Reviewed by: kbowling, #iflib
Differential Revision: https://reviews.freebsd.org/D37768


# 43df074d 06-May-2022 John Baldwin <jhb@FreeBSD.org>

vmware: Remove unused devclass arguments to DRIVER_MODULE.


# 94b09888 14-Dec-2021 Mateusz Guzik <mjg@FreeBSD.org>

vmx: plug set-but-not-used var

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 9c612a5d 06-Dec-2021 Andriy Gapon <avg@FreeBSD.org>

vmxnet3: skip zero-length descriptor in the middle of a packet

Passing up such descriptors to iflib is obviously wasteful.
But the main conern is that we may overrun iri_frags array because of
them. That's been observed in practice.

Also, assert that the number of fragments / descriptors / segments is
less than IFLIB_MAX_RX_SEGS.

Reviewed by: gallatin, pkelsey
MFC after: 3 weeks
Sponsored by: Panzura LLC
Differential Revision: https://reviews.freebsd.org/D33189


# 81be6552 18-Dec-2020 Matt Macy <mmacy@FreeBSD.org>

iflib: ensure that tx interrupts enabled and cleanups

Doing a 'dd' over iscsi will reliably cause stalls. Tx
cleaning _should_ reliably happen as data is sent.
However, currently if the transmit queue fills it will
wait until the iflib timer (hz/2) runs.

This change causes the the tx taskq thread to be run
if there are completed descriptors.

While here:

- make timer interrupt delay a sysctl

- simplify txd_db_check handling

- comment on INTR types

Background on the change:

Initially doorbell updates were minimized by only writing to the register
on every fourth packet. If txq_drain would return without writing to the
doorbell it scheduled a callout on the next tick to do the doorbell write
to ensure that the write otherwise happened "soon". At that time a sysctl
was added for users to avoid the potential added latency by simply writing
to the doorbell register on every packet. This worked perfectly well for
e1000 and ixgbe ... and appeared to work well on ixl. However, as it
turned out there was a race to this approach that would lockup the ixl MAC.
It was possible for a lower producer index to be written after a higher one.
On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial
response was to add a lock around doorbell writes - fixing the problem but
adding an unacceptable amount of lock contention.

The next iteration was to use transmit interrupts to drive delayed doorbell
writes. If there were no packets in the queue all doorbell writes would be
immediate as the queue started to fill up we could delay doorbell writes
further and further. At the start of drain if we've cleaned any packets we
know we've moved the state machine along and we write the doorbell (an
obvious missing optimization was to skip that doorbell write if db_pending
is zero). This change required that tx interrupts be scheduled periodically
as opposed to just when the hardware txq was full. However, that just leads
to our next problem.

Initially dedicated msix vectors were used for both tx and rx. However, it
was often possible to use up all available vectors before we set up all the
queues we wanted. By having rx and tx share a vector for a given queue we
could halve the number of vectors used by a given configuration. The problem
here is that with this change only e1000 passed the necessary value to have
the fast interrupt drive tx when appropriate.

Reported by: mav@
Tested by: mav@
Reviewed by: gallatin@
MFC after: 1 month
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D27683


# 4eb2ed07 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

vmware: clean up empty lines in .c and .h files


# 35d8a463 01-Sep-2020 Vincenzo Maffione <vmaffione@FreeBSD.org>

iflib: leave only 1 receive descriptor unused

The pidx argument of isc_rxd_flush() indicates which is the last valid
receive descriptor to be used by the NIC. However, current code has
multiple issues:
- Intel drivers write pidx to their RDT register, which means that
NICs will only use the descriptors up to pidx-1 (modulo ring size N),
and won't actually use the one pointed by pidx. This does not break
reception, but it is anyway confusing and suboptimal (the NIC will
actually see only N-2 descriptors as available, rather than N-1).
Other drivers (if_vmx, if_bnxt, if_mgb) adhere to this semantic).
- The semantic used by Intel (RDT is one descriptor past the last
valid one) is used by most (if not all) NICs, and it is also used
on the TX side (also in iflib). Since iflib is not currently
using this semantic for RX, it must decrement fl->ifl_pidx
(modulo N) before calling isc_rxd_flush(), and then the
per-driver callback implementation must increment the index
again (to match the real semantic). This is confusing and suboptimal.
- The iflib refill function is also called at initialization.
However, in case the ring size is smaller than 128 (e.g. if_mgb),
the refill function will actually prepare all the receive
descriptors (N), without leaving one unused, as most of NICs assume
(e.g. to avoid RDT to overrun RDH). I can speculate that the code
looks like this right now because this issue showed up during
testing (e.g. with if_mgb), and it was easy to workaround by
decrementing pidx before isc_rxd_flush().

The goal of this change is to simplify the code (removing a bunch
of instructions from the RX fast path), and to make the semantic of
isc_rxd_flush() consistent across drivers. To achieve this, we:
- change the semantics of the pidx argument to the usual one (that
is the index one past the last valid one), so that both iflib and
drivers avoid the decrement/increment dance.
- fix the initialization code to prepare at most N-1 descriptors.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26191


# 5d1899ee 16-Mar-2020 Patrick Kelsey <pkelsey@FreeBSD.org>

Restore power-of-2 queue count constraint from r290948

When vmx(4) was converted to an iflib driver in r343291, the
power-of-2 queue count constraint was removed as it appeared that
current implementations of the VMXNET3 virtual device no longer
required that constraint. It turns out that some of the
implementations still do, and on such systems, the device will fail to
initialize when configured with a non-power-of-2 RX or TX queue count.

PR: 237321
Reported by: ncrogers@gmail.com
MFC after: 1 week


# 1342c8c6 14-Mar-2020 Patrick Kelsey <pkelsey@FreeBSD.org>

Adjust if_vmx default receive parameters for better out-of-box performance

These adjustments improve performance with jumbo frames and/or LRO
enabled (i.e., when there may be multiple descriptors per packet) by
increasing the default size of the receive queues and by always using
page-sized buffers for the body type receive ring.

This patch also adjust the initialization of the max frame size to
remove cases where certain configuration sequences would result in 2K
receive buffers being used instead of 4K ones when jumbo frames were
enabled.

Reviewed by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23950


# f50375ee 14-Mar-2020 Patrick Kelsey <pkelsey@FreeBSD.org>

Fix if_vmx receive checksum offload bug and harden against the device skipping receive descriptors

This fixes a bug where the checksum offload status of received packets
was being taken from the first descriptor instead of the last, which
affected LRO packets.

The driver has been hardened against the device skipping receive
descriptors, although it is not believed that this can occur given the
way this implementation configures the receive rings.

Additionally, for packets received with the error indicator set, the
driver now forces the length of all fragments in that packet to zero
prior to passing it to iflib. Such packets should wind up being
discarded at some point in the stack anyway, but this removes any
questions by killing them in the driver.

Counters have been added (and exposed via sysctls) for skipped receive
descriptors, zero-length packets received, and packets received with
the error indicator set so that these conditions can be easily
observed in the field.

PR: 243126, 243392, 240628
Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
Reviewed by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23949


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 281cab4d 23-Jan-2020 Andriy Gapon <avg@FreeBSD.org>

vmxnet3: add support for RSS kernel option

We observe at least one problem: if a UDP socket is connect(2)-ed, then a
received packet that matches the connection cannot be matched to the
corresponding PCB because of an incorrect flow ID. That was oberved for DNS
requests from the libc resolver. We got this problem because FreeBSD
r343291 enabled code that can set rsstype of received packets to values
other than M_HASHTYPE_OPAQUE_HASH. Earlier that code was under 'ifdef
notyet'.

The essence of this change is to use the system-wide RSS key instead of
some historic hardcoded key when the software RSS is enabled and it is
configured to use Toeplitz algorithm (the default).
In all other cases, the driver reports the opaque hash type for received
packets while still using Toeplitz algorithm with the internal key.

PR: 242890
Reviewed by: pkelsey
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D23147


# f55f37d9 13-Jan-2020 Vincenzo Maffione <vmaffione@FreeBSD.org>

vmx: fix initialization of TSO related descriptor fields

Fix a mistake introduced by r343291, which ported the vmx(4)
driver to iflib.
In case of TSO, the hlen field of the (first) tx descriptor must
be initialized to the cumulative length of Ethernet, IP and TCP
headers. The length of the TCP header was missing.

PR: 236999
Reported by: pkelsey
Reviewed by: avg
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22967


# d6b5965b 21-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Convert to if_foreach_llmaddr() KPI.


# 41669133 30-Sep-2019 Mark Johnston <markj@FreeBSD.org>

Add IFLIB_SINGLE_IRQ_RX_ONLY.

As of r347221 the iflib legacy interrupt mode setup assumes that drivers
perform both receive and transmit processing from the interrupt handler.
This assumption is invalid in the vmxnet3 driver, so introduce the
IFLIB_SINGLE_IRQ_RX_ONLY flag to make iflib avoid tx processing in the
interrupt handler.

PR: 239118
Reported and tested by: Juraj Lutter <otis@sk.freebsd.org>
Obtained from: marius
Reviewed by: gallatin
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21831


# 769d56ec 02-Feb-2019 Patrick Kelsey <pkelsey@FreeBSD.org>

Fix interrupt index configuratoin when using MSI interrupts.

When in MSI mode, the device was only being configured with one
interrupt index, but it needs two - one for the actual interrupt and
one to park the tx queue at.

Also clarified comments relating to interrupt index assignment.

Reported by: Yuri Pankov <yuripv@yuripv.net>
MFC after: 1 day


# b97de13a 30-Jan-2019 Marius Strobl <marius@FreeBSD.org>

- Stop iflib(4) from leaking MSI messages on detachment by calling
bus_teardown_intr(9) before pci_release_msi(9).
- Ensure that iflib(4) and associated drivers pass correct RIDs to
bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9)
on the corresponding resources instead of using the RIDs initially
passed to bus_alloc_resource_any(9) as the latter function may
change those RIDs. Solely em(4) for the ioport resource (but not
others) and bnxt(4) were using the correct RIDs by caching the ones
returned by bus_alloc_resource_any(9).
- Change the logic of iflib_msix_init() around to only map the MSI-X
BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns
> 0. Otherwise the "Unable to map MSIX table " message triggers for
devices that simply don't support MSI-X and the user may think that
something is wrong while in fact everything works as expected.
- Put some (mostly redundant) debug messages emitted by iflib(4)
and em(4) during attachment under bootverbose. The non-verbose
output of em(4) seen during attachment now is close to the one
prior to the conversion to iflib(4).
- Replace various variants of spelling "MSI-X" (several in messages)
with "MSI-X" as used in the PCI specifications.
- Remove some trailing whitespace from messages emitted by iflib(4)
and change them to consistently start with uppercase.
- Remove some obsolete comments about releasing interrupts from
drivers and correct a few others.

Reviewed by: erj, Jacob Keller, shurd
Differential Revision: https://reviews.freebsd.org/D18980


# 8f82136a 21-Jan-2019 Patrick Kelsey <pkelsey@FreeBSD.org>

onvert vmx(4) to being an iflib driver.

Also, expose IFLIB_MAX_RX_SEGS to iflib drivers and add
iflib_dma_alloc_align() to the iflib API.

Performance is generally better with the tunable/sysctl
dev.vmx.<index>.iflib.tx_abdicate=1.

Reviewed by: shurd
MFC after: 1 week
Relnotes: yes
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D18761


# d7c5a620 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32
4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32
4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32
4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32
4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32
4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32
4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32

After the patch

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51
5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51
5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51
5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51
5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52
5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366


# ac2fffa4 21-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by: wosch
PR: 225197


# 26c1d774 13-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

dev: make some use of mallocarray(9).

Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.


# 1b108b19 27-Nov-2017 Bryan Venteicher <bryanv@FreeBSD.org>

Correctly report the vmxnet3 link down media status

Reported by: lew@perftech.com
MFC after: 1 week


# ced98d78 25-May-2017 Andriy Gapon <avg@FreeBSD.org>

fix vmxnet3 crash when LRO is enabled

The crash can occur when all of the following conditions are true:
- a packet consists of multiple segements (requires LRO enabled)
- there has been a failure to allocate an mbuf for the packet and
the packet has to be dropped
- a host (vmware) still owned at least one segment of the packet,
so the driver had to wait for another interrupt to proceed to
discarding the remaning segment(s)

Reviewed by: rstone
MFC after: 2 weeks
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D10874


# d0cefbdc 18-Jul-2016 Alexander Motin <mav@FreeBSD.org>

Update if_hwassist when interface options are changed.

In particular for me this fixes checksum problem when if_bridge attached
to the interface requests TXCSUM to be disabled, but effectively ignored.

MFC after: 3 days
Sponsored by: iXsystems, Inc.


# 19419093 16-Nov-2015 John Baldwin <jhb@FreeBSD.org>

Only use a power of 2 for the number of receive and transmit queues.
Using other values causes VMXNET3_CMD_ENABLE to fail. The Linux
driver also enforces this restriction.

Reviewed by: bryanv
MFC after: 1 week
Sponsored by: Norse
Differential Revision: https://reviews.freebsd.org/D4139


# c2529042 01-Dec-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after: 1 month
Sponsored by: Mellanox Technologies


# 0d286aa4 25-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Provide vmxnet3_get_counter() to return counters that are not collected,
but taken from hardware.


# 9fd573c3 22-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week


# 72f31000 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Revert r271504. A new patch to solve this issue will be made.

Suggested by: adrian @


# eb93b77a 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# a88f19d8 02-Jul-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Remove some write only variables

MFC after: 3 days


# efd48b30 28-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Give each interrupt a descriptive name when using MSIX

MFC after: 3 days


# 2a87457c 19-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Increment the pending packets more aggressively for TSO

Assume the number of description used is reasonable value to
increment this otherwise opaque field by.

While here, reduce a minor difference between the legacy and
multiqueue transmit paths.

MFC after: 1 week


# 1204e374 19-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Handle multiple calls to rxq_eof for single packet completion

This requires the VMware vmxnet3 device to flip the start of packet
descriptor's generation before the rest of the packet's descriptors
have been loaded into the Rx ring. I've never observed this behavior,
and it seems to make the most sense not to do it this way. But it is
not a lot of work for the driver to handle this situation just in case.

MFC after: 1 week


# d701c199 19-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Remove an unnecessary kick of the host at the end of transmitting

MFC after: 1 week


# 5098dbb2 18-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix GCC compile warning: Variable(s) can be used uninitialized.


# 9945063d 14-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Don't check the dma map address for a static DMA mapping against NULL
to determine if the mapping is valid.

Submitted by: jhb


# e41318fc 08-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Fix TSO support on VMware Fusion

Apparently for VMware Fusion (and presumably VMware Workstation/Player
since the PR states TSO is broken there too, but I cannot test), the
TCP header pseudo checksum calculated should only include the protocol
(IPPROTO_TCP) value, not also the lengths as the stack does instead.

VMware ESXi seems to ignore whatever value is in the TCP header checksum,
and it is a bit surprising there is a different behavior between the
VMware products. And it is unfortunate that on ESXi we are forced to do
this extra bit of work.

PR: kern/185849
MFC after: 3 days


# 75f4c886 08-Jun-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Remove an unnecessary variable reassignment

And it would be bad if 'm' was different from '*m0' at this
point, since we've already populated the SG list.

MFC after: 3 days


# c7156fe9 06-Jun-2014 Luigi Rizzo <luigi@FreeBSD.org>

make sure if_transmit returns 0 if the mbuf is enqueued.
ixgbe/ixv.c still needs a similar fix but it takes a little
more restructuring of the code.

MFC after: 3 days


# e557c1dd 16-Mar-2014 Bryan Venteicher <bryanv@FreeBSD.org>

Add Tx/Rx multiqueue support to vmx(4)

As a prerequisite for multiple queues, the guest must have MSIX enabled.
Unfortunately, to work around device passthrough bugs, FreeBSD disables
MSIX when running as a VMWare guest due to the hw.pci.honor_msi_blacklist
tunable; this tunable must be disabled for multiple queues.

Also included is various minor changes from the projects/vmxnet branch.

MFC after: 1 month


# b245f96c 12-Mar-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
notion of data (packet counters, etc) are by no means MD. And it is a
bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
which at modern speeds overflow within a second.

This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
make future changes to if_data less ABI breaking. Unfortunately the
8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with: emax
Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# c3322cb9 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 76039bc8 26-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# ce3be286 11-Oct-2013 Bryan Venteicher <bryanv@FreeBSD.org>

Do not provide a hint of the guest's OS version

The calculation can overflow if __FreeBSD_version is big
enough, and it does not appear to be required.

Reported by: grehan
Approved by: re (gjb)


# 3c5dfe89 29-Aug-2013 Bryan Venteicher <bryanv@FreeBSD.org>

Few more minor if_vmx tweaks

- Allow the Rx/Tx queue sizes to be configured by tunables
- Bail out earlier if the Tx queue unlikely has enough free
descriptors to hold the frame
- Cleanup some of the offloading capabilities handling


# 3c965775 26-Aug-2013 Bryan Venteicher <bryanv@FreeBSD.org>

Couple minor if_vmx tweaks

- Use queue size fields from the Tx/Rx queues in various places
instead of (currently the same values) from the softc.
- Fix potential crash in detach if the attached failed to alloc
queue memory.
- Move the VMXNET3_MAX_RX_SEGS define to a better spot.
- Tweak frame size calculation w.r.t. ETHER_ALIGN. This could be
tweaked some more, or removed since it probably doesn't matter
much for x86 (and the x86 class of machines this driver will
be used on).


# e3c97c2c 23-Aug-2013 Bryan Venteicher <bryanv@FreeBSD.org>

Add vmx(4), a VMware VMXNET3 ethernet driver ported from OpenBSD