History log of /freebsd-current/sys/dev/netmap/netmap_freebsd.c
Revision Date Author Comments
# 71625ec9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c comment pattern

Remove /^/[*/]\s*\$FreeBSD\$.*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 5f6d3778 05-Apr-2023 Mark Johnston <markj@FreeBSD.org>

netmap: Handle packet batches in generic mode

ifnets are allowed to pass batches of multiple packets to if_input,
linked by the m_nextpkt pointer. iflib_rxeof() sometimes does this, for
example. Netmap's generic mode did not handle this and would only
deliver the first packet in the batch, leaking the rest.

PR: 270636
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39426


# ce12afaa 04-Apr-2023 Mark Johnston <markj@FreeBSD.org>

netmap: Fix queue stalls with generic interfaces

In emulated mode, the FreeBSD netmap port attempts to perform zero-copy
transmission. This works as follows: the kernel ring is populated with
mbuf headers to which netmap buffers are attached. When transmitting,
the mbuf refcount is initialized to 2, and when the counter value has
been decremented to 1 netmap infers that the driver has freed the mbuf
and thus transmission is complete.

This scheme does not generalize to the situation where netmap is
attaching to a software interface which may transmit packets among
multiple "queues", as is the case with bridge or lagg interfaces. In
that case, we would be relying on backing hardware drivers to free
transmitted mbufs promptly, but this isn't guaranteed; a driver may
reasonably defer freeing a small number of transmitted buffers
indefinitely. If such a buffer ends up at the tail of a netmap transmit
ring, further transmits can end up blocked indefinitely.

Fix the problem by removing the zero-copy scheme (which is also not
implemented in the Linux port of netmap). Instead, the kernel ring is
populated with regular mbuf clusters into which netmap buffers are
copied by nm_os_generic_xmit_frame(). The refcounting scheme is
preserved, and this lets us avoid allocating a fresh cluster per
transmitted packet in the common case. If the transmit ring is full, a
callout is used to free the "stuck" mbuf, avoiding the queue deadlock
described above.

Furthermore, when recycling mbuf clusters, be sure to fully reinitialize
the mbuf header instead of simply re-setting M_PKTHDR. Some software
interfaces, like if_vlan, may set fields in the header which should be
reset before the mbuf is reused.

Reviewed by: vmaffione
MFC after: 1 month
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38065


# 6c9fe357 14-Mar-2023 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: get rid of save_if_input for emulated adapters

The save_if_input function pointer was meant to save the previous
value of ifp->if_input before replacing it with the emulated
adapter hook.
However, the same pointer value is already stored in the if_input
field of the netmap_adapter struct, to be used for host TX ring processing.

Reuse the netmap_adapter if_input field to simplify the code
and save some space.

MFC after: 14 days


# 626d1e4a 09-Mar-2023 Mark Johnston <markj@FreeBSD.org>

netmap: Remove obsolete compatibility defines

No functional change intended.

Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39001


# e330262f 12-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

Mechanically convert netmap(4) to IfAPI

Reviewed by: vmaffione, zlei
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37814


# fa3f6655 08-Feb-2023 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: drop redundant if_mtu assignment

Reported by: zlei
MFC after 3 days


# 110ce09c 13-Jan-2023 Tom Jones <thj@FreeBSD.org>

if_lagg: Allow lagg interfaces to be used with netmap

Reviewed by: zlei
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37436


# 3da494d3 24-Dec-2022 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: drop compatibility FreeBSD code

Netmap users on FreeBSD are not supposed to import code from the
github netmap repository anymore. They should use the code that
is available in the src repo. We can therefore drop the compatibility
code.

MFC after: 7 days


# a9424e0f 10-May-2022 John Baldwin <jhb@FreeBSD.org>

netmap: Remove unused devclass arguments to DRIVER_MODULE.


# 45c67e8f 02-Apr-2021 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: several typo fixes

No functional changes intended.


# a6d768d8 29-Mar-2021 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: add kernel support for the "offsets" feature

This feature enables applications to ask netmap to transmit or
receive packets starting at a user-specified offset from the
beginning of the netmap buffer. This is meant to ease those
packet manipulation operations such as pushing or popping packet
headers, that may be useful to implement software switches,
routers and other packet processors.
To use the feature, drivers (e.g., iflib, vtnet, etc.) must have
explicit support. This change does not add support for any driver,
but introduces the necessary kernel changes. However, offsets support
is already included for VALE ports and pipes.


# ee7ffaa2 20-Mar-2021 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: fix issues in nm_os_extmem_create()

- Call vm_object_reference() before vm_map_lookup_done().
- Use vm_mmap_to_errno() to convert vm_map_* return values to errno.
- Fix memory leak of e->obj.

Reported by: markj
Reviewed by: markj
MFC after: 1 week


# 3cf3b4e6 21-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

Make page busy state deterministic on free. Pages must be xbusy when
removed from objects including calls to free. Pages must not be xbusy
when freed and not on an object. Strengthen assertions to match these
expectations. In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.

Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state. This removes redundant and
potentially problematic code that has proliferated.

Discussed with: markj
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D22822


# 0012f373 14-Oct-2019 Jeff Roberson <jeff@FreeBSD.org>

(4/6) Protect page valid with the busy lock.

Atomics are used for page busy and valid state when the shared busy is
held. The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by: kib, markj
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D21594


# fee2a2fa 09-Sep-2019 Mark Johnston <markj@FreeBSD.org>

Change synchonization rules for vm_page reference counting.

There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator. In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well. These references are protected by the page lock, which must
therefore be acquired for many per-page operations. This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter. A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held. As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed. The former
requires that either the object lock or the busy lock is held. The
latter no longer has a return value and may free the page if it releases
the last reference to that page. vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold(). It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler. vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state). In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths. In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock. In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped. The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by: jeff (earlier version)
Tested by: gallatin (earlier version), pho
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20486


# 23ced944 01-Jul-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: fix two panics with emulated adapter

This patch fixes 2 panics. The first one is due to the current VNET not
being set in the emulated adapter transmission path. The second one
is caused by the M_PKTHDR flag not being set when preallocated mbufs
are recycled in the transmit path.

Submitted by: aleksandr.fedorov@itglobal.com
Reviewed by: vmaffione
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20824


# 04e0c883 20-May-2019 Conrad Meyer <cem@FreeBSD.org>

Add two missing eventhandler.h headers

These are obviously missing from the .c files, but don't show up in any
tinderbox configuration (due to latent header pollution of some kind). It
seems some configurations don't have this pollution, and the includes are
obviously missing, so go ahead and add them.

Reported by: Peter Jeremy <peter AT rulingia.com>
X-MFC-With: r347984


# 45100257 18-Feb-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: don't schedule kqueue notify task when kqueue is not used

This change adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.

Reviewed by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19177


# 75f4f3ed 04-Feb-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: refactor logging macros and pipes

Changelist:
- Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
and nm_prlim, to avoid possible naming conflicts.
- Add netmap_krings_mode_commit() helper function and use that
to reduce code duplication.
- Refactor pipes control code to export some functions that
can be reused by the veth driver (on Linux) and epair(4).
- Add check to reject API requests with version less than 11.
- Small code refactoring for the null adapter.

MFC after: 1 week


# 19c4ec08 30-Jan-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: fix lock order reversal related to kqueue usage

When using poll(), select() or kevent() on netmap file descriptors,
netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands,
before collecting the events that are ready. In other words, the
poll/kevent callback has side effects. This is done to avoid the
overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).

When the kqueue subsystem invokes the kqueue(9) f_event callback
(netmap_knrw), it holds the lock of the struct knlist object associated
to the netmap port (the lock is provided at initialization, by calling
knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even
the same one), which means that it may need to call knote().
Since knote() needs the lock of the struct knlist object associated to
the to-be-wake-up netmap port, it is possible to have a lock order reversal
problem (AB/BA deadlock).

This change prevents the deadlock by executing the knote() call in a
per-selinfo taskqueue, where it is possible to hold a mutex.

Reviewed by: aleksandr.fedorov_itglobal.com
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18956


# a56136a1 29-Jan-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: add notifications on kloop stop

On sync-kloop stop, send a wake-up signal to the kloop, so that
waiting for the timeout is not needed.
Also, improve logging in netmap_freebsd.c.

MFC after: 3 days


# 8c9874f5 23-Jan-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: fix knote() argument to match the mutex state

The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relavant lock state.
Also, comments have been updated to reflect current code.

PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18876


# b6e66be2 05-Dec-2018 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: align codebase to the current upstream (760279cfb2730a585)

Changelist:
- Replace netmap passthrough host support with a more general
mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop.
No kernel threads are used to use this feature: the application
is required to spawn a thread (or a process) and issue a
SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The
kernel loop is executed by the ioctl implementation, which returns
to userspace only when a different thread calls SYNC_KLOOP_STOP
or the netmap file descriptor is closed.
- Update the if_ptnet driver to cope with the new data structures,
and prune all the obsolete ptnetmap code.
- Add support for "null" netmap ports, useful to allocate netmap_if,
netmap_ring and netmap buffers to be used by specialized applications
(e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
- Various fixes and code refactoring.

Sponsored by: Sunny Valley Networks
Differential Revision: https://reviews.freebsd.org/D18015


# d55913f5 28-Nov-2018 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: set IFCAP_NETMAP in if_capabilities

Revision r307394 removed (by mistake) the code that sets IFCAP_NETMAP
in if_capabilities on netmap_attach. This patch reverts this change.

Reviewed by: np
Approved by: gnn (mentor)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D17987


# 2a7db7a6 23-Oct-2018 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: align codebase to the current upstream (sha 8374e1a7e6941)

Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch].
This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake
rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG

Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D17364


# 53e992cf 14-Aug-2018 David Bright <dab@FreeBSD.org>

Fix several memory leaks.

The libkqueue tests have several places that leak memory by using an
idiom like:

puts(kevent_to_str(kevp));

Rework to save the pointer returned from kevent_to_str() and then
free() it after it has been used.

Reported by: asomers (pointer to Coverity), Coverity
CID: 1296063, 1296064, 1296065, 1296066, 1296067, 1350287, 1394960
Sponsored by: Dell EMC


# 3535fae8 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

netmap: compare e1 with e2, not with itself


# cfa866f6 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

netmap: pull fix for 32-bit support from upstream

Approved by: sbruno


# 1315f9b5 13-Apr-2018 Brooks Davis <brooks@FreeBSD.org>

Fix build on 32-bit systems.


# 2ff91c17 12-Apr-2018 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: align codebase to the current upstream (commit id 3fb001303718146)

Changelist:
- Turn tx_rings and rx_rings arrays into arrays of pointers to kring
structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
vtnet and ptnet drivers to cope with the change.
- Generalize the nm_config() callback to accept a struct containing many
parameters.
- Introduce NKR_FAKERING to support buffers sharing (used for netmap
pipes)
- Improved API for external VALE modules.
- Various bug fixes and improvements to the netmap memory allocator,
including support for externally (userspace) allocated memory.
- Refactoring of netmap pipes: now linked rings share the same netmap
buffers, with a separate set of kring pointers (rhead, rcur, rtail).
Buffer swapping does not need to happen anymore.
- Large refactoring of the control API towards an extensible solution;
the goal is to allow the addition of more commands and extension of
existing ones (with new options) without the need of hacks or the
risk of running out of configuration space.
A new NIOCCTRL ioctl has been added to handle all the requests of the
new control API, which cover all the functionalities so far supported.
The netmap API bumps from 11 to 12 with this patch. Full backward
compatibility is provided for the old control command (NIOCREGIF), by
means of a new netmap_legacy module. Many parts of the old netmap.h
header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by: hrs (mentor)


# 4f80b14c 09-Apr-2018 Vincenzo Maffione <vmaffione@FreeBSD.org>

netmap: align codebase to upstream version v11.4

Changelist:
- remove unused nkr_slot_flags
- new nm_intr adapter callback to enable/disable interrupts
- remove unused sysctls and document the other sysctls
- new infrastructure to support NS_MOREFRAG for NIC ports
- support for external memory allocator (for now linux-only),
including linux-specific changes in common headers
- optimizations within netmap pipes datapath
- improvements on VALE control API
- new nm_parse() helper function in netmap_user.h
- various bug fixes and code clean up

Approved by: hrs (mentor)


# 718cf2cc 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# 4dd44461 20-Jul-2017 Luiz Otavio O Souza <loos@FreeBSD.org>

Restore the changes done in r313982: Replace zero with NULL for pointers.

Spotted by: Harry Schmalzbauer
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (Netgate)


# c3e9b4db 12-Jun-2017 Luiz Otavio O Souza <loos@FreeBSD.org>

Update the current version of netmap to bring it in sync with the github
version.

This commit contains mostly refactoring, a few fixes and minor added
functionality.

Submitted by: Vincenzo Maffione <v.maffione at gmail.com>
Requested by: many
Sponsored by: Rubicon Communications, LLC (Netgate)


# 4d24901a 19-Feb-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: Replace zero with NULL for pointers.

Makes things easier to read, plus architectures may set NULL to something
different than zero.

Found with: devel/coccinelle
MFC after: 3 weeks


# 67ca1051 01-Jan-2017 Adrian Chadd <adrian@FreeBSD.org>

[netmap] call RLOCK /and/ RUNLOCK.

Reported by: olivier


# 869d8878 30-Dec-2016 Adrian Chadd <adrian@FreeBSD.org>

[netmap] fix locking regressions

* Firmware oriented NICs may need to sleep in their configuration paths.
Use RLOCK instead of WLOCK to allow this to again occur.

This fixes netmap on cxgbe.

* Change the worker lock to a normal mutex rather than a spin lock.
Drivers shouldn't be doing netmap work from the fast interrupt
handlers, so it's not required to be a spinlock.

Submitted by: luigi, Vincenzo Maffione <v.maffione@gmail.com>
Reviewed by: jhb


# 54c7693f 29-Nov-2016 Ed Maste <emaste@FreeBSD.org>

netmap: add cast to fix powerpc64 LINT kernel

Attempt to fix powerpc64 LINT kernel broken by r308000. Netmap's use of
a uint64_t wchan seems odd, but in the interest of minimizing this
change just cast through uintptr_t to silence the compiler warning.

Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8669


# 844a6f0c 27-Oct-2016 Luigi Rizzo <luigi@FreeBSD.org>

Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail:
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range
(thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work.


# ffaa5deb 21-Oct-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

netmap: Unbreak LINT-VIMAGE building

Sponsored by: Microsoft


# e3f94e51 21-Oct-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

netmap: Unbreak i386 LINT building

Sponsored by: Microsoft


# a2a74091 18-Oct-2016 Luigi Rizzo <luigi@FreeBSD.org>

remove stale and unused code from various files
fix build on 32 bit platforms
simplify logic in netmap_virt.h

The commands (in net/netmap.h) to configure communication with the
hypervisor may be revised soon.
At the moment they are unused so this will not be a change of API.


# 37e3a6d3 16-Oct-2016 Luigi Rizzo <luigi@FreeBSD.org>

Import the current version of netmap, aligned with the one on github.

This commit, long overdue, contains contributions in the last 2 years
from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including:
+ fixes on monitor ports
+ the 'ptnet' virtual device driver, and ptnetmap backend, for
high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit)
+ improved emulated netmap mode
+ more robust error handling
+ removal of stale code
+ various fixes to code and documentation (some mixup between RX and TX
parameters, and private and public variables)

We also include an additional tool, nmreplay, which is functionally
equivalent to tcpreplay but operating on netmap ports.


# 847adfb7 19-Jul-2015 Luigi Rizzo <luigi@FreeBSD.org>

add a use count so the netmap module cannot be unloaded while in use.


# 5f94000e 13-Jul-2015 Luigi Rizzo <luigi@FreeBSD.org>

set the refcount for the structure (dropped by mistake in the last commit).


# 8fd44c93 10-Jul-2015 Luigi Rizzo <luigi@FreeBSD.org>

staticize functions only used in netmap.c
(detected by jenkins run with gcc 4.9)

Update documentation on the use of netmap_priv_d,
rename the refcount and use the same structure in
FreeBSD and linux

No functional changes.


# 847bf383 09-Jul-2015 Luigi Rizzo <luigi@FreeBSD.org>

Sync netmap sources with the version in our private tree.
This commit contains large contributions from Giuseppe Lettieri and
Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
and brings in the following:

- fix zerocopy monitor ports and introduce copying monitor ports
(the latter are lower performance but give access to all traffic
in parallel with the application)

- exclusive open mode, useful to implement solutions that recover
from crashes of the main netmap client (suggested by Patrick Kelsey)

- revised memory allocator in preparation for the 'passthrough mode'
(ptnetmap) recently presented at bsdcan. ptnetmap is described in
S. Garzarella, G. Lettieri, L. Rizzo;
Virtual device passthrough for high speed VM networking,
ACM/IEEE ANCS 2015, Oakland (CA) May 2015
http://info.iet.unipi.it/~luigi/research.html

- fix rx CRC handing on ixl

- add module dependencies for netmap when building drivers as modules

- minor simplifications to device-specific routines (*txsync, *rxsync)

- general code cleanup (remove unused variables, introduce macros
to access rings and remove duplicate code,

Applications do not need to be recompiled, unless of course
they want to use the new features (monitors and exclusive open).

Those willing to try this code on stable/10 can just update the
sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
and apply the small patches to individual device drivers.

MFC after: 1 month
Sponsored by: (partly) Verisign, Cisco


# 735c8d95 23-Feb-2015 Luigi Rizzo <luigi@FreeBSD.org>

add MODULE_VERSION, needed to track module dependencies

MFC after: 3 days


# c2529042 01-Dec-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after: 1 month
Sponsored by: Mellanox Technologies


# 0e73f29a 12-Nov-2014 Luigi Rizzo <luigi@FreeBSD.org>

add support for private knote lock (reduces lock contention),
adapting OS_selrecord accordingly.
Problem and fix suggested by adrian and jmg


# 4e93beff 10-Nov-2014 Luigi Rizzo <luigi@FreeBSD.org>

initialize *color if passed as an argument


# 4bf50f18 16-Aug-2014 Luigi Rizzo <luigi@FreeBSD.org>

Update to the current version of netmap.
Mostly bugfixes or features developed in the past 6 months,
so this is a 10.1 candidate.

Basically no user API changes (some bugfixes in sys/net/netmap_user.h).

In detail:

1. netmap support for virtio-net, including in netmap mode.
Under bhyve and with a netmap backend [2] we reach over 1Mpps
with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.

2. (kernel) add support for multiple memory allocators, so we can
better partition physical and virtual interfaces giving access
to separate users. The most visible effect is one additional
argument to the various kernel functions to compute buffer
addresses. All netmap-supported drivers are affected, but changes
are mechanical and trivial

3. (kernel) simplify the prototype for *txsync() and *rxsync()
driver methods. All netmap drivers affected, changes mostly mechanical.

4. add support for netmap-monitor ports. Think of it as a mirroring
port on a physical switch: a netmap monitor port replicates traffic
present on the main port. Restrictions apply. Drive carefully.

5. if_lem.c: support for various paravirtualization features,
experimental and disabled by default.
Most of these are described in our ANCS'13 paper [1].
Paravirtualized support in netmap mode is new, and beats the
numbers in the paper by a large factor (under qemu-kvm,
we measured gues-host throughput up to 10-12 Mpps).

A lot of refactoring and additional documentation in the files
in sys/dev/netmap, but apart from #2 and #3 above, almost nothing
of this stuff is visible to other kernel parts.

Example programs in tools/tools/netmap have been updated with bugfixes
and to support more of the existing features.

This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.

A lot of this code has been contributed by my colleagues at UNIPI,
including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.

MFC after: 3 days.


# fcc34a23 11-Jul-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Fix style bug: rename the refcount field of m_ext to ext_cnt, to match
other members.

Sponsored by: Nginx, Inc.


# e4166283 06-Jun-2014 Luigi Rizzo <luigi@FreeBSD.org>

better handling of netmap emulation over standard device drivers:
plug a potential mbuf leak, and detect bogus drivers that
return ENOBUFS even when the packet has been queued.

MFC after: 3 days


# 0dc809c0 06-Jun-2014 Luigi Rizzo <luigi@FreeBSD.org>

move netmap_getna() to a freebsd-specific file


# 5a067ae1 19-Feb-2014 Luigi Rizzo <luigi@FreeBSD.org>

compile with NOINET


# f0ea3689 14-Feb-2014 Luigi Rizzo <luigi@FreeBSD.org>

This new version of netmap brings you the following:

- netmap pipes, providing bidirectional blocking I/O while moving
100+ Mpps between processes using shared memory channels
(no mistake: over one hundred million. But mind you, i said
*moving* not *processing*);

- kqueue support (BHyVe needs it);

- improved user library. Just the interface name lets you select a NIC,
host port, VALE switch port, netmap pipe, and individual queues.
The upcoming netmap-enabled libpcap will use this feature.

- optional extra buffers associated to netmap ports, for applications
that need to buffer data yet don't want to make copies.

- segmentation offloading for the VALE switch, useful between VMs.

and a number of bug fixes and performance improvements.

My colleagues Giuseppe Lettieri and Vincenzo Maffione did a substantial
amount of work on these features so we owe them a big thanks.

There are some external repositories that can be of interest:

https://code.google.com/p/netmap
our public repository for netmap/VALE code, including
linux versions and other stuff that does not belong here,
such as python bindings.

https://code.google.com/p/netmap-libpcap
a clone of the libpcap repository with netmap support.
With this any libpcap client has access to most netmap
feature with no recompilation. E.g. tcpdump can filter
packets at 10-15 Mpps.

https://code.google.com/p/netmap-ipfw
a userspace version of ipfw+dummynet which uses netmap
to send/receive packets. Speed is up in the 7-10 Mpps
range per core for simple rulesets.

Both netmap-libpcap and netmap-ipfw will be merged upstream at some
point, but while this happens it is useful to have access to them.

And yes, this code will be merged soon. It is infinitely better
than the version currently in 10 and 9.

MFC after: 3 days


# 17885a7b 05-Jan-2014 Luigi Rizzo <luigi@FreeBSD.org>

It is 2014 and we have a new version of netmap.
Most relevant features:

- netmap emulation on any NIC, even those without native netmap support.

On the ixgbe we have measured about 4Mpps/core/queue in this mode,
which is still a lot more than with sockets/bpf.

- seamless interconnection of VALE switch, NICs and host stack.

If you disable accelerations on your NIC (say em0)

ifconfig em0 -txcsum -txcsum

you can use the VALE switch to connect the NIC and the host stack:

vale-ctl -h valeXX:em0

allowing sharing the NIC with other netmap clients.

- THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers
instead of pointers/count as before). This was unavoidable to support,
in the future, multiple threads operating on the same rings.
Netmap clients require very small source code changes to compile again.
On the plus side, the new API should be easier to understand
and the internals are a lot simpler.

The manual page has been updated extensively to reflect the current
features and give some examples.

This is the result of work of several people including Giuseppe Lettieri,
Vincenzo Maffione, Michio Honda and myself, and has been financially
supported by EU projects CHANGE and OPENLAB, from NetApp University
Research Fund, NEC, and of course the Universita` di Pisa.


# f9790aeb 15-Dec-2013 Luigi Rizzo <luigi@FreeBSD.org>

split netmap code according to functions:
- netmap.c base code
- netmap_freebsd.c FreeBSD-specific code
- netmap_generic.c emulate netmap over standard drivers
- netmap_mbq.c simple mbuf tailq
- netmap_mem2.c memory management
- netmap_vale.c VALE switch

simplify devce-specific code