#
73fdbfb9 |
|
26-Mar-2024 |
Tom Jones <thj@FreeBSD.org> |
netmap: Address errors on memory free in netmap_generic netmap_generic keeps a pool of mbufs for handling transfers, these mbufs have an external buffer attached to them. If some cases other parts of the network stack can chain these mbufs, when this happens the normal pool destructor function can end up free'ing the pool mbufs twice: - A first time if a pool mbuf has been chained with another mbuf when its chain is freed - A second time when its entry in the pool is freed Additionally, if other parts of the stack demote a pool mbuf its interface reference will be cleared. In this case we deference a NULL pointer when trying to free the mbuf through the destructor. Store a reference to the adapter in ext_arg1 with the destructor callback so we can find the correct adapter when free'ing a pool mbuf. This change enables using netmap with epair interfaces. Reviewed By: vmaffione MFC after: 1 week Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D44371
|
#
99efa2c8 |
|
27-Dec-2023 |
Mark Johnston <markj@FreeBSD.org> |
netmap: Ignore errors in CSB_WRITE() The CSB_WRITE() and _READ() macros respectively write to and read from userspace memory and so can in principle fault. However, we do not check for errors and will proceed blindly if they fail. Add assertions to verify that they do not. This is in preparation for annotating copyin() and related functions with __result_use_check. Reviewed by: vmaffione MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43200
|
#
2ff63af9 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .h pattern Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
ce12afaa |
|
04-Apr-2023 |
Mark Johnston <markj@FreeBSD.org> |
netmap: Fix queue stalls with generic interfaces In emulated mode, the FreeBSD netmap port attempts to perform zero-copy transmission. This works as follows: the kernel ring is populated with mbuf headers to which netmap buffers are attached. When transmitting, the mbuf refcount is initialized to 2, and when the counter value has been decremented to 1 netmap infers that the driver has freed the mbuf and thus transmission is complete. This scheme does not generalize to the situation where netmap is attaching to a software interface which may transmit packets among multiple "queues", as is the case with bridge or lagg interfaces. In that case, we would be relying on backing hardware drivers to free transmitted mbufs promptly, but this isn't guaranteed; a driver may reasonably defer freeing a small number of transmitted buffers indefinitely. If such a buffer ends up at the tail of a netmap transmit ring, further transmits can end up blocked indefinitely. Fix the problem by removing the zero-copy scheme (which is also not implemented in the Linux port of netmap). Instead, the kernel ring is populated with regular mbuf clusters into which netmap buffers are copied by nm_os_generic_xmit_frame(). The refcounting scheme is preserved, and this lets us avoid allocating a fresh cluster per transmitted packet in the common case. If the transmit ring is full, a callout is used to free the "stuck" mbuf, avoiding the queue deadlock described above. Furthermore, when recycling mbuf clusters, be sure to fully reinitialize the mbuf header instead of simply re-setting M_PKTHDR. Some software interfaces, like if_vlan, may set fields in the header which should be reset before the mbuf is reused. Reviewed by: vmaffione MFC after: 1 month Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38065
|
#
6c9fe357 |
|
14-Mar-2023 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: get rid of save_if_input for emulated adapters The save_if_input function pointer was meant to save the previous value of ifp->if_input before replacing it with the emulated adapter hook. However, the same pointer value is already stored in the if_input field of the netmap_adapter struct, to be used for host TX ring processing. Reuse the netmap_adapter if_input field to simplify the code and save some space. MFC after: 14 days
|
#
22bf2a47 |
|
11-Mar-2023 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: get rid of WNA() macro MFC after: 7 days
|
#
e330262f |
|
12-Jan-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Mechanically convert netmap(4) to IfAPI Reviewed by: vmaffione, zlei Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D37814
|
#
854b2f30 |
|
23-Jan-2023 |
Mark Johnston <markj@FreeBSD.org> |
netmap: Correct a comment Reviewed by: vmaffione MFC after: 1 week Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38063
|
#
92e8b4a6 |
|
24-Dec-2022 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
debug_put_get: don't crash on null pointers MFC after: 7 days
|
#
3da494d3 |
|
24-Dec-2022 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: drop compatibility FreeBSD code Netmap users on FreeBSD are not supposed to import code from the github netmap repository anymore. They should use the code that is available in the src repo. We can therefore drop the compatibility code. MFC after: 7 days
|
#
4ad57c7a |
|
03-Dec-2022 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap_update_config: update na->name to cope with reconfigurations MFC after: 1 week
|
#
21cc0918 |
|
16-Aug-2021 |
Elliott Mitchell <ehem+freebsd@m5p.com> |
sys: Nuke double-semicolons A distinct number of double-semicolons have ended up in FreeBSD. Take a pass at getting rid of many of these harmless typos. Reviewed by: emaste, rrs Pull Request: https://github.com/freebsd/freebsd-src/pull/609 Differential Revision: https://reviews.freebsd.org/D31716
|
#
dd6ab49a |
|
06-Mar-2022 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: add a tunable for the maximum number of VALE switches The new dev.netmap.max_bridges sysctl tunable can be set in loader.conf(5) to change the default maximum number of VALE switches that can be created. Current defaults is 8. MFC after: 2 weeks
|
#
98399ab0 |
|
22-Aug-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: import changes from upstream - make sure rings are disabled during resets - introduce netmap_update_hostrings_mode(), with support for multiple host rings - always initialize ni_bufs_head in netmap_if ni_bufs_head was not properly initialized when no external buffers were requestedx and contained the ni_bufs_head from the last request. This was causing spurious buffer frees when alternating between apps that used external buffers and apps that did not use them. - check na validitity under lock on detach - netmap_mem: fix leak on error path - nm_dispatch: fix compilation on Raspberry Pi MFC after: 2 weeks
|
#
13c46411 |
|
17-Apr-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: make sure rings are disabled during resets Explicitly disable ring synchronization before calling callbacks that may result in a hardware reset. Before this patch we relied on capturing the down/up events which, however, may not be issued by all drivers.
|
#
45c67e8f |
|
02-Apr-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: several typo fixes No functional changes intended.
|
#
66671ae5 |
|
02-Apr-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: fix typo bug in netmap_compute_buf_len
|
#
a6d768d8 |
|
29-Mar-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: add kernel support for the "offsets" feature This feature enables applications to ask netmap to transmit or receive packets starting at a user-specified offset from the beginning of the netmap buffer. This is meant to ease those packet manipulation operations such as pushing or popping packet headers, that may be useful to implement software switches, routers and other packet processors. To use the feature, drivers (e.g., iflib, vtnet, etc.) must have explicit support. This change does not add support for any driver, but introduces the necessary kernel changes. However, offsets support is already included for VALE ports and pipes.
|
#
ee0005f1 |
|
24-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: simplify parameter passing Changes imported from the netmap github.
|
#
66823237 |
|
11-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: introduce netmap_kring_on() This function returns NULL if the ring identified by queue id and direction is in netmap mode. Otherwise return the corresponding kring. Use this function to replace vtnet_netmap_queue_on(). MFC after: 1 week
|
#
723180da |
|
07-Feb-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: improve netmap(4) and vale(4) man pages Clean up obsolete sysctl descriptions and add missing ones. PR: 243838 Reviewed by: bcr MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23546
|
#
2ec213ab |
|
13-Jan-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: disable passthrough with no hypervisor support The netmap passthrough subsystem requires proper support in the hypervisor. In particular, two PCI device ids (from the Red Hat PCI vendor id 0x1b36) need to be assigned to the two netmap virtual devices. We then disable these devices until the ids have not been assigned, in order to avoid conflicts with other virtual devices emulated by upstream QEMU. PR: 241774 MFC after: 3 days
|
#
253b2ec1 |
|
01-Sep-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: import changes from upstream (SHA 137f537eae513) - Rework option processing. - Use larger integers for memory size values in the memory management code. MFC after: 2 weeks
|
#
45100257 |
|
18-Feb-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: don't schedule kqueue notify task when kqueue is not used This change adds a counter (kqueue_users) to keep track of how many kqueue users are referencing a given struct nm_selinfo. In this way, nm_os_selwakeup() can schedule the kevent notification task only when kqueue is actually being used. This is important to avoid wasting CPU in the common case where kqueue is not used. Reviewed by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19177
|
#
75f4f3ed |
|
04-Feb-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: refactor logging macros and pipes Changelist: - Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr and nm_prlim, to avoid possible naming conflicts. - Add netmap_krings_mode_commit() helper function and use that to reduce code duplication. - Refactor pipes control code to export some functions that can be reused by the veth driver (on Linux) and epair(4). - Add check to reject API requests with version less than 11. - Small code refactoring for the null adapter. MFC after: 1 week
|
#
5faab778 |
|
02-Feb-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: upgrade sync-kloop support Add SYNC_KLOOP_MODE option, and add support for direct mode, where application executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback. MFC after: 5 days
|
#
19c4ec08 |
|
30-Jan-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: fix lock order reversal related to kqueue usage When using poll(), select() or kevent() on netmap file descriptors, netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands, before collecting the events that are ready. In other words, the poll/kevent callback has side effects. This is done to avoid the overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)). When the kqueue subsystem invokes the kqueue(9) f_event callback (netmap_knrw), it holds the lock of the struct knlist object associated to the netmap port (the lock is provided at initialization, by calling knlist_init_mtx). However, netmap_knrw() may need to wake up another netmap port (or even the same one), which means that it may need to call knote(). Since knote() needs the lock of the struct knlist object associated to the to-be-wake-up netmap port, it is possible to have a lock order reversal problem (AB/BA deadlock). This change prevents the deadlock by executing the knote() call in a per-selinfo taskqueue, where it is possible to hold a mutex. Reviewed by: aleksandr.fedorov_itglobal.com MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18956
|
#
f79ba6d7 |
|
23-Jan-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: improvements to the netmap kloop (CSB mode) Changelist: - Add the proper memory barriers in the kloop ring processing functions. - Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write, nm_sync_kloop_appl_read). - Fix nm_kr_txempty() helper to look at rhead rather than rcur. This is important since the kloop can read a value of rcur which is ahead of the value of rhead (see explanation in nm_sync_kloop_appl_write) - Remove obsolete ptnetmap_guest_write_kring_csb() and ptnet_guest_read_kring_csb(), and update if_ptnet(4) to use those. - Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(), to make the kloop faster. - Provide kernel and user implementation for nm_ldld_barrier() and nm_ldst_barrier() MFC after: 2 weeks
|
#
8c9874f5 |
|
23-Jan-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: fix knote() argument to match the mutex state The nm_os_selwakeup function needs to call knote() to wake up kqueue(9) users. However, this function can be called from different code paths, with different lock requirements. This patch fixes the knote() call argument to match the relavant lock state. Also, comments have been updated to reflect current code. PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846 Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18876
|
#
77a2baf5 |
|
21-Dec-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: move buf_size validation code to its own function This code validates the netmap buf_size against the interface MTU and maximum descriptor size, to make sure the values are consistent. Moving this functionality to its own function is needed because this function is also called by Linux-specific code. MFC after: 3 days
|
#
b6e66be2 |
|
05-Dec-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align codebase to the current upstream (760279cfb2730a585) Changelist: - Replace netmap passthrough host support with a more general mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop. No kernel threads are used to use this feature: the application is required to spawn a thread (or a process) and issue a SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The kernel loop is executed by the ioctl implementation, which returns to userspace only when a different thread calls SYNC_KLOOP_STOP or the netmap file descriptor is closed. - Update the if_ptnet driver to cope with the new data structures, and prune all the obsolete ptnetmap code. - Add support for "null" netmap ports, useful to allocate netmap_if, netmap_ring and netmap buffers to be used by specialized applications (e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect. - Various fixes and code refactoring. Sponsored by: Sunny Valley Networks Differential Revision: https://reviews.freebsd.org/D18015
|
#
2a7db7a6 |
|
23-Oct-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align codebase to the current upstream (sha 8374e1a7e6941) Changelist: - Move large parts of VALE code to a new file and header netmap_bdg.[ch]. This is useful to reuse the code within upcoming projects. - Improvements and bug fixes to pipes and monitors. - Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to handle differences between FreeBSD and Linux. - Introduce some new helper functions to handle more host rings and fake rings (netmap_all_rings(), netmap_real_rings(), ...) - Added new sysctl to enable/disable hw checksum in emulated netmap mode. - nm_inject: add support for NS_MOREFRAG Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D17364
|
#
cfa866f6 |
|
17-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
netmap: pull fix for 32-bit support from upstream Approved by: sbruno
|
#
2ff91c17 |
|
12-Apr-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align codebase to the current upstream (commit id 3fb001303718146) Changelist: - Turn tx_rings and rx_rings arrays into arrays of pointers to kring structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib, vtnet and ptnet drivers to cope with the change. - Generalize the nm_config() callback to accept a struct containing many parameters. - Introduce NKR_FAKERING to support buffers sharing (used for netmap pipes) - Improved API for external VALE modules. - Various bug fixes and improvements to the netmap memory allocator, including support for externally (userspace) allocated memory. - Refactoring of netmap pipes: now linked rings share the same netmap buffers, with a separate set of kring pointers (rhead, rcur, rtail). Buffer swapping does not need to happen anymore. - Large refactoring of the control API towards an extensible solution; the goal is to allow the addition of more commands and extension of existing ones (with new options) without the need of hacks or the risk of running out of configuration space. A new NIOCCTRL ioctl has been added to handle all the requests of the new control API, which cover all the functionalities so far supported. The netmap API bumps from 11 to 12 with this patch. Full backward compatibility is provided for the old control command (NIOCREGIF), by means of a new netmap_legacy module. Many parts of the old netmap.h header has now been moved to netmap_legacy.h (included by netmap.h). Approved by: hrs (mentor)
|
#
4f80b14c |
|
09-Apr-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align codebase to upstream version v11.4 Changelist: - remove unused nkr_slot_flags - new nm_intr adapter callback to enable/disable interrupts - remove unused sysctls and document the other sysctls - new infrastructure to support NS_MOREFRAG for NIC ports - support for external memory allocator (for now linux-only), including linux-specific changes in common headers - optimizations within netmap pipes datapath - improvements on VALE control API - new nm_parse() helper function in netmap_user.h - various bug fixes and code clean up Approved by: hrs (mentor)
|
#
46023447 |
|
04-Apr-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align if_ptnet guest driver to the upstream code (commit 0e15788) The change upgrades the driver to use the split Communication Status Block (CSB) format. In this way the variables written by the guest and read by the host are allocated in a different cacheline than the variables written by the host and read by the guest; this is needed to avoid cache thrashing. Approved by: hrs (mentor)
|
#
718cf2cc |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.
|
#
c3e9b4db |
|
12-Jun-2017 |
Luiz Otavio O Souza <loos@FreeBSD.org> |
Update the current version of netmap to bring it in sync with the github version. This commit contains mostly refactoring, a few fixes and minor added functionality. Submitted by: Vincenzo Maffione <v.maffione at gmail.com> Requested by: many Sponsored by: Rubicon Communications, LLC (Netgate)
|
#
844a6f0c |
|
27-Oct-2016 |
Luigi Rizzo <luigi@FreeBSD.org> |
Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail: - use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range (thanks to RedHat) - export memory pool information through PCI registers - improve mechanism for configuring passthrough on different hypervisors Code is from Vincenzo Maffione as a follow up to his GSOC work.
|
#
a2a74091 |
|
18-Oct-2016 |
Luigi Rizzo <luigi@FreeBSD.org> |
remove stale and unused code from various files fix build on 32 bit platforms simplify logic in netmap_virt.h The commands (in net/netmap.h) to configure communication with the hypervisor may be revised soon. At the moment they are unused so this will not be a change of API.
|
#
225d33ff |
|
18-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
Restore svn r306772 that was overwritten by netmap import at svn r307394 #include <sys/selinfo.h> should be here as all drivers that support netmap need to use this file regardless.
|
#
37e3a6d3 |
|
16-Oct-2016 |
Luigi Rizzo <luigi@FreeBSD.org> |
Import the current version of netmap, aligned with the one on github. This commit, long overdue, contains contributions in the last 2 years from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including: + fixes on monitor ports + the 'ptnet' virtual device driver, and ptnetmap backend, for high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit) + improved emulated netmap mode + more robust error handling + removal of stale code + various fixes to code and documentation (some mixup between RX and TX parameters, and private and public variables) We also include an additional tool, nmreplay, which is functionally equivalent to tcpreplay but operating on netmap ports.
|
#
87de0cd1 |
|
06-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
Move netmap selinfo.h in to sensible location. netmap_kern.h currently requires all drivers including it to include selinfo.h. Submitted by: mmacy@nextbsd.org Reviewed by: gnn MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D5334
|
#
847adfb7 |
|
19-Jul-2015 |
Luigi Rizzo <luigi@FreeBSD.org> |
add a use count so the netmap module cannot be unloaded while in use.
|
#
85fe4e7c |
|
19-Jul-2015 |
Luigi Rizzo <luigi@FreeBSD.org> |
small documentation update
|
#
8fd44c93 |
|
10-Jul-2015 |
Luigi Rizzo <luigi@FreeBSD.org> |
staticize functions only used in netmap.c (detected by jenkins run with gcc 4.9) Update documentation on the use of netmap_priv_d, rename the refcount and use the same structure in FreeBSD and linux No functional changes.
|
#
847bf383 |
|
09-Jul-2015 |
Luigi Rizzo <luigi@FreeBSD.org> |
Sync netmap sources with the version in our private tree. This commit contains large contributions from Giuseppe Lettieri and Stefano Garzarella, is partly supported by grants from Verisign and Cisco, and brings in the following: - fix zerocopy monitor ports and introduce copying monitor ports (the latter are lower performance but give access to all traffic in parallel with the application) - exclusive open mode, useful to implement solutions that recover from crashes of the main netmap client (suggested by Patrick Kelsey) - revised memory allocator in preparation for the 'passthrough mode' (ptnetmap) recently presented at bsdcan. ptnetmap is described in S. Garzarella, G. Lettieri, L. Rizzo; Virtual device passthrough for high speed VM networking, ACM/IEEE ANCS 2015, Oakland (CA) May 2015 http://info.iet.unipi.it/~luigi/research.html - fix rx CRC handing on ixl - add module dependencies for netmap when building drivers as modules - minor simplifications to device-specific routines (*txsync, *rxsync) - general code cleanup (remove unused variables, introduce macros to access rings and remove duplicate code, Applications do not need to be recompiled, unless of course they want to use the new features (monitors and exclusive open). Those willing to try this code on stable/10 can just update the sys/dev/netmap/*, sys/net/netmap* with the version in HEAD and apply the small patches to individual device drivers. MFC after: 1 month Sponsored by: (partly) Verisign, Cisco
|
#
0e73f29a |
|
12-Nov-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
add support for private knote lock (reduces lock contention), adapting OS_selrecord accordingly. Problem and fix suggested by adrian and jmg
|
#
039dd540 |
|
10-Nov-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
in the Linux section, properly define the NMG_LOCK type. Also import WITH_GENERIC in preparation to adding fine-grained options to disable specific netmap components.
|
#
6435a0dc |
|
10-Nov-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
fix a typo
|
#
7f154b71 |
|
25-Sep-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
adapt the code to different freebsd versions. Not necessary to MFC
|
#
997d2d83 |
|
31-Aug-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Provide pointer from struct ifnet to struct netmap_adapter, instead of abusing spare field.
|
#
9721a22d |
|
20-Aug-2014 |
Navdeep Parhar <np@FreeBSD.org> |
Change netmap's global lock to sx instead of a mutex. Reviewed by: luigi@ MFC after: 1 day
|
#
4bf50f18 |
|
16-Aug-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail: 1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode. 2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial 3. (kernel) simplify the prototype for *txsync() and *rxsync() driver methods. All netmap drivers affected, changes mostly mechanical. 4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully. 5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured gues-host throughput up to 10-12 Mpps). A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days.
|
#
46aa1303 |
|
09-Jun-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
sync the code with the one in stable/10 (wrap the if_t compatibilty function into a __FreeBSD_version conditional block)
|
#
89cc2556 |
|
06-Jun-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
align comments with the ones in our development trunk
|
#
43ed1d3c |
|
05-Jun-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
whitespace change: remove trailing whitespace
|
#
62d76917 |
|
02-Jun-2014 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Introduce a procedural interface to the ifnet structure. The new interface allows the ifnet structure to be defined as an opaque type in NIC drivers. This then allows the ifnet structure to be changed without a need to change or recompile NIC drivers. Put differently, NIC drivers can be written and compiled once and be used with different network stack implementations, provided of course that those network stack implementations have an API and ABI compatible interface. This commit introduces the 'if_t' type to replace 'struct ifnet *' as the type of a network interface. The 'if_t' type is defined as 'void *' to enable the compiler to perform type conversion to 'struct ifnet *' and vice versa where needed and without warnings. The functions that implement the API are the only functions that need to have an explicit cast. The MII code has been converted to use the driver API to avoid unnecessary code churn. Code churn comes from having to work with both converted and unconverted drivers in correlation with having callback functions that take an interface. By converting the MII code first, the callback functions can be defined so that the compiler will perform the typecasts automatically. As soon as all drivers have been converted, the if_t type can be redefined as needed and the API functions can be fix to not need an explicit cast. The immediate benefactors of this change are: 1. Juniper Networks - The network stack implementation in Junos is entirely different from FreeBSD's one and this change allows Juniper to build "stock" NIC drivers that can be used in combination with both the FreeBSD and Junos stacks. 2. FreeBSD - This change opens the door towards changing ifnet and implementing new features and optimizations in the network stack without it requiring a change in the many NIC drivers FreeBSD has. Submitted by: Anuranjan Shukla <anshukla@juniper.net> Reviewed by: glebius@ Obtained from: Juniper Networks, Inc.
|
#
f0ea3689 |
|
14-Feb-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
This new version of netmap brings you the following: - netmap pipes, providing bidirectional blocking I/O while moving 100+ Mpps between processes using shared memory channels (no mistake: over one hundred million. But mind you, i said *moving* not *processing*); - kqueue support (BHyVe needs it); - improved user library. Just the interface name lets you select a NIC, host port, VALE switch port, netmap pipe, and individual queues. The upcoming netmap-enabled libpcap will use this feature. - optional extra buffers associated to netmap ports, for applications that need to buffer data yet don't want to make copies. - segmentation offloading for the VALE switch, useful between VMs. and a number of bug fixes and performance improvements. My colleagues Giuseppe Lettieri and Vincenzo Maffione did a substantial amount of work on these features so we owe them a big thanks. There are some external repositories that can be of interest: https://code.google.com/p/netmap our public repository for netmap/VALE code, including linux versions and other stuff that does not belong here, such as python bindings. https://code.google.com/p/netmap-libpcap a clone of the libpcap repository with netmap support. With this any libpcap client has access to most netmap feature with no recompilation. E.g. tcpdump can filter packets at 10-15 Mpps. https://code.google.com/p/netmap-ipfw a userspace version of ipfw+dummynet which uses netmap to send/receive packets. Speed is up in the 7-10 Mpps range per core for simple rulesets. Both netmap-libpcap and netmap-ipfw will be merged upstream at some point, but while this happens it is useful to have access to them. And yes, this code will be merged soon. It is infinitely better than the version currently in 10 and 9. MFC after: 3 days
|
#
fb25194f |
|
07-Jan-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
fix use after free when releasing a netmap adapter. Submitted by: Giuseppe Lettieri
|
#
17885a7b |
|
05-Jan-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
It is 2014 and we have a new version of netmap. Most relevant features: - netmap emulation on any NIC, even those without native netmap support. On the ixgbe we have measured about 4Mpps/core/queue in this mode, which is still a lot more than with sockets/bpf. - seamless interconnection of VALE switch, NICs and host stack. If you disable accelerations on your NIC (say em0) ifconfig em0 -txcsum -txcsum you can use the VALE switch to connect the NIC and the host stack: vale-ctl -h valeXX:em0 allowing sharing the NIC with other netmap clients. - THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers instead of pointers/count as before). This was unavoidable to support, in the future, multiple threads operating on the same rings. Netmap clients require very small source code changes to compile again. On the plus side, the new API should be easier to understand and the internals are a lot simpler. The manual page has been updated extensively to reflect the current features and give some examples. This is the result of work of several people including Giuseppe Lettieri, Vincenzo Maffione, Michio Honda and myself, and has been financially supported by EU projects CHANGE and OPENLAB, from NetApp University Research Fund, NEC, and of course the Universita` di Pisa.
|
#
2e159ef0 |
|
16-Dec-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
fix the build using __builtin_prefetch() instead of redefining prefetch()
|
#
f9790aeb |
|
15-Dec-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
split netmap code according to functions: - netmap.c base code - netmap_freebsd.c FreeBSD-specific code - netmap_generic.c emulate netmap over standard drivers - netmap_mbq.c simple mbuf tailq - netmap_mem2.c memory management - netmap_vale.c VALE switch simplify devce-specific code
|
#
ce3ee1e7 |
|
01-Nov-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
update to the latest netmap snapshot. This includes the following: - use separate memory regions for VALE ports - locking fixes - some simplifications in the NIC-specific routines - performance improvements for the VALE switch - some new features in the pkt-gen test program - documentation updates There are small API changes that require programs to be recompiled (NETMAP_API has been bumped so you will detect old binaries at runtime). In particular: - struct netmap_slot now is 16 bytes to support an extra pointer, which may save one data copy when using VALE ports or VMs; - the struct netmap_if has two extra fields; MFC after: 3 days
|
#
f18be576 |
|
30-May-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
Bring in a number of new features, mostly implemented by Michio Honda: - the VALE switch now support up to 254 destinations per switch, unicast or broadcast (multicast goes to all ports). - we can attach hw interfaces and the host stack to a VALE switch, which means we will be able to use it more or less as a native bridge (minor tweaks still necessary). A 'vale-ctl' program is supplied in tools/tools/netmap to attach/detach ports the switch, and list current configuration. - the lookup function in the VALE switch can be reassigned to something else, similar to the pf hooks. This will enable attaching the firewall, or other processing functions (e.g. in-kernel openvswitch) directly on the netmap port. The internal API used by device drivers does not change. Userspace applications should be recompiled because we bump NETMAP_API as we now use some fields in the struct nmreq that were previously ignored -- otherwise, data structures are the same. Manpages will be committed separately.
|
#
849bec0e |
|
30-Apr-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
Partial cleanup in preparation for upcoming changes: - netmap_rx_irq()/netmap_tx_irq() can now be called by FreeBSD drivers hiding the logic for handling NIC interrupts in netmap mode. This also simplifies the case of NICs attached to VALE switches. Individual drivers will be updated with separate commits. - use the same refcount() API for FreeBSD and linux - plus some comments, typos and formatting fixes Portions contributed by Michio Honda
|
#
d4b42e08 |
|
29-Apr-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
whitespace changes: remove $Id$ lines, and add blank lines around some #if / #elif /#endif
|
#
2579e2d7 |
|
19-Apr-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
mostly whitespace changes: - remove vestiges of the old memory allocator - clean up some comments
|
#
ae10d1af |
|
22-Jan-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
control some debugging messages with dev.netmap.verbose add infrastracture to adapt to changes in number of queues and buffers at runtime
|
#
1dce924d |
|
17-Jan-2013 |
Luigi Rizzo <luigi@FreeBSD.org> |
add some definition and driver changes in preparation for two upcoming features: semi-transparent mode: when a device is opened in this mode, the user program will be able to mark slots that must be forwarded to the "other" side (i.e. from NIC to host stack, or viceversa), and the forwarding will occur automatically at the next netmap syscall. This saves the need to open another file descriptor and do the forwarding manually. direct-forwarding mode: when operating with a VALE port, the user can specify in the slot the actual destination port, overriding the forwarding decision made by a lookup of the destination MAC. This can be useful to implement packet dispatchers. No API changes will be introduced. No new functionality in this patch yet.
|
#
8241616d |
|
18-Oct-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
This is an import of code, mostly from Giuseppe Lettieri, that revises the netmap memory allocator so that the various parameters (number and size of buffers, rings, descriptors) can be modified at runtime through sysctl variables. The changes become effective when no netmap clients are active. The API is mostly unchanged, although the NIOCUNREGIF ioctl now does not bring the interface back to normal mode: and you need to close the file descriptor for that. This change was necessary to track who is using the mapped region, and since it is a simplification of the API there was no incentive in trying to preserve NIOCUNREGIF. We will remove the ioctl from the kernel next time we need a real API change (and version bump). Among other things, buffer allocation when opening devices is now much faster: it used to take O(N^2) time, now it is linear. Submitted by: Giuseppe Lettieri
|
#
24e57ec9 |
|
08-Aug-2012 |
Ed Maste <emaste@FreeBSD.org> |
Clarify comments about number of tx / rx rings
|
#
b3d53016 |
|
02-Aug-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
fix some signed/unsigned warnings in the netmap code. Unfortunately the original drivers still have a lot of sign conversion/comparison warnings.
|
#
d198a63d |
|
30-Jul-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
remove a redundant MALLOC_DECLARE
|
#
826e7ddb |
|
27-Jul-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
remove unused definition, whitespace cleanup
|
#
f196ce38 |
|
26-Jul-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
Add support for VALE bridges to the netmap core, see http://info.iet.unipi.it/~luigi/vale/ VALE lets you dynamically instantiate multiple software bridges that talk the netmap API (and are *extremely* fast), so you can test netmap applications without the need for high end hardware. This is particularly useful as I am completing a netmap-aware version of ipfw, and VALE provides an excellent testing platform. Also, I also have netmap backends for qemu mostly ready for commit to the port, and this too will let you interconnect virtual machines at high speed without fiddling with bridges, tap or other slow solutions. The API for applications is unchanged, so you can use the code in tools/tools/netmap (which i will update soon) on the VALE ports. This commit also syncs the code with the one in my internal repository, so you will see some conditional code for other platforms. The code should run mostly unmodified on stable/9 so people interested in trying it can just copy sys/dev/netmap/ and sys/net/netmap*.h from HEAD VALE is joint work with my colleague Giuseppe Lettieri, and is partly supported by the EU Projects CHANGE and OPENLAB
|
#
d76bf4ff |
|
13-Apr-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
A bit of cleanup in the names of fields of netmap-related structures. Use the name 'ring' instead of 'queue' in all fields. Bump NETMAP_API.
|
#
3c0caf6c |
|
12-Apr-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
Some code restructuring to bring the memory allocator out of netmap.c and make it easier to replace it with a different implementation. On passing, also fix indentation. NOTE: I know that #include "foo.c" is ugly, but the alternative (add another entry to sys/conf/files, add a separate header with structs and prototypes, and expose functions that are meant to be private) looks even worse to me. We need a more modular way to specify dependencies and build options.
|
#
64ae02c3 |
|
27-Feb-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
A bunch of netmap fixes: USERSPACE: 1. add support for devices with different number of rx and tx queues; 2. add better support for zero-copy operation, adding an extra field to the netmap ring to indicate how many buffers we have already processed but not yet released (with help from Eddie Kohler); 3. The two changes above unfortunately require an API change, so while at it add a version field and some spares to the ioctl() argument to help detect mismatches. 4. update the manual page for the two changes above; 5. update sample applications in tools/tools/netmap KERNEL: 1. simplify the internal structures moving the global wait queues to the 'struct netmap_adapter'; 2. simplify the functions that map kring<->nic ring indexes 3. normalize device-specific code, helps mainteinance; 4. start exploring the impact of micro-optimizations (prefetch etc.) in the ixgbe driver. Use 'legacy' descriptors on the tx ring and prefetch slots gives about 20% speedup at 900 MHz. Another 7-10% would come from removing the explict calls to bus_dmamap* in the core (they are effectively NOPs in this case, but it takes expensive load of the per-buffer dma maps to figure out that they are all NULL. Rx performance not investigated. I am postponing the MFC so i can import a few more improvements before merging.
|
#
5644ccec |
|
15-Feb-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
(This commit only touches code within the DEV_NETMAP blocks) Introduce some functions to map NIC ring indexes into netmap ring indexes and vice versa. This way we can implement the bound checks only in one place (and hopefully in a correct way). On passing, make the code and comments more uniform across the various drivers.
|
#
1a26580e |
|
13-Feb-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
- use struct ifnet as explicit type of the argument to the txsync() and rxsync() callbacks, removing some variables made useless by this change; - add generic lock and irq handling routines. These can be useful in case there are no driver locks that we can reuse; - add a few macros to reduce differences with the Linux version.
|
#
5819da83 |
|
08-Feb-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
- change the buffer size from a constant to a TUNABLE variable (hw.netmap.buf_size) so we can experiment with values different from 2048 which may give better cache performance. - rearrange the memory allocation code so it will be easier to replace it with a different implementation. The current code relies on a single large contiguous chunk of memory obtained through contigmalloc. The new implementation (not committed yet) uses multiple smaller chunks which are easier to fit in a fragmented address space.
|
#
2157a17c |
|
26-Jan-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
ixgbe changes: - remove experimental code for disabling CRC - use the correct constant for conversion between interrupt rate and EITR values (the previous values were off by a factor of 2) - make dev.ix.N.queueM.interrupt_rate a RW sysctl variable. Changing individual values affects the queue immediately, and propagates to all interfaces at the next reinit. - add dev.ix.N.queueM.irqs rdonly sysctl, to export the actual interrupt counts Netmap-related changes for ixgbe: - use the "new" format for TX descriptors in netmap mode. - pass interrupt mitigation delays to the user process doing poll() on a netmap file descriptor. On the RX side this means we will not check the ring more than once per interrupt. This gives the process a chance to sleep and process packets in larger batches, thus reducing CPU usage. On the TX side we take this even further: completed transmissions are reclaimed every half ring even if the NIC interrupts more often. This saves even more CPU without any additional tx delays. Generic Netmap-related changes: - align the netmap_kring to cache lines so that there is no false sharing (possibly useful for multiqueue NICs and MSIX interrupts, which are handled by different cores). It's a minor improvement but it does not cost anything. Reviewed by: Jack Vogel Approved by: Jack Vogel
|
#
bcda432e |
|
13-Jan-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
indentation and whitespace fixes
|
#
6dba29a2 |
|
13-Jan-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
Two performance-related fixes: 1. as reported by Alexander Fiveg, the allocator was reporting half of the allocated memory. Fix this by exiting from the loop earlier (not too critical because this code is going away soon). 2. following a discussion on freebsd-current http://lists.freebsd.org/pipermail/freebsd-current/2012-January/031144.html turns out that (re)loading the dmamap was expensive and not optimized. This operation is in the critical path when doing zero-copy forwarding between interfaces. At least on netmap and i386/amd64, the bus_dmamap_load can be completely bypassed if the map is NULL, so we do it. The latter change gives an almost 3x improvement in forwarding performance, from the previous 9.5Mpps at 2.9GHz to the current line rate (14.2Mpps) at 1.733GHz. (this is for 64+4 byte packets, in other configurations the PCIe bus is a bottleneck).
|
#
6e10c8b8 |
|
10-Jan-2012 |
Luigi Rizzo <luigi@FreeBSD.org> |
small code cleanup in preparation for future modifications in the memory allocator used by netmap. No functional change, two small bug fixes: - in if_re.c add a missing bus_dmamap_sync() - in netmap.c comment out a spurious free() in an error handling block
|
#
d0c7b075 |
|
23-Dec-2011 |
Luigi Rizzo <luigi@FreeBSD.org> |
1. don't use if_pspare directly, but through a macro WMA() 2. move a variable declaration at the beginning of a block
|
#
506cc70c |
|
04-Dec-2011 |
Luigi Rizzo <luigi@FreeBSD.org> |
1. Fix the handling of link reset while in netmap more. A link reset now is completely transparent for the netmap client: even if the NIC resets its own ring (e.g. restarting from 0), the client will not see any change in the current rx/tx positions, because the driver will keep track of the offset between the two. 2. make the device-specific code more uniform across different drivers There were some inconsistencies in the implementation of the netmap support routines, now drivers have been aligned to a common code structure. 3. import netmap support for ixgbe . This is implemented as a very small patch for ixgbe.c (233 lines, 11 chunks, mostly comments: in total the patch has only 54 lines of new code) , as most of the code is in an external file sys/dev/netmap/ixgbe_netmap.h , following some initial comments from Jack Vogel about making changes less intrusive. (Note, i have emailed Jack multiple times asking if he had comments on this structure of the code; i got no reply so i assume he is fine with it). Support for other drivers (em, lem, re, igb) will come later. "ixgbe" is now the reference driver for netmap support. Both the external file (sys/dev/netmap/ixgbe_netmap.h) and the device-specific patches (in sys/dev/ixgbe/ixgbe.c) are heavily commented and should serve as a reference for other device drivers. Tested on i386 and amd64 with the pkt-gen program in tools/tools/netmap, the sender does 14.88 Mpps at 1050 Mhz and 14.2 Mpps at 900 MHz on an i7-860 with 4 cores and 82599 card. Haven't tried yet more aggressive optimizations such as adding 'prefetch' instructions in the time-critical parts of the code.
|
#
68b8534b |
|
16-Nov-2011 |
Luigi Rizzo <luigi@FreeBSD.org> |
Bring in support for netmap, a framework for very efficient packet I/O from userspace, capable of line rate at 10G, see http://info.iet.unipi.it/~luigi/netmap/ At this time I am bringing in only the generic code (sys/dev/netmap/ plus two headers under sys/net/), and some sample applications in tools/tools/netmap. There is also a manpage in share/man/man4 [1] In order to make use of the framework you need to build a kernel with "device netmap", and patch individual drivers with the code that you can find in sys/dev/netmap/head.diff The file will go away as the relevant pieces are committed to the various device drivers, which should happen in a few days after talking to the driver maintainers. Netmap support is available at the moment for Intel 10G and 1G cards (ixgbe, em/lem/igb), and for the Realtek 1G card ("re"). I have partial patches for "bge" and am starting to work on "cxgbe". Hopefully changes are trivial enough so interested third parties can submit their patches. Interested people can contact me for advice on how to add netmap support to specific devices. CREDITS: Netmap has been developed by Luigi Rizzo and other collaborators at the Universita` di Pisa, and supported by EU project CHANGE (http://www.change-project.eu/) The code is distributed under a BSD Copyright. [1] In my opinion is a bad idea to have all manpage in one directory. We should place kernel documentation in the same dir that contains the code, which would make it much simpler to keep doc and code in sync, reduce the clutter in share/man/ and incidentally is the policy used for all of userspace code. Makefiles and doc tools can be trivially adjusted to find the manpages in the relevant subdirs.
|